# Zonos2 Text to Speech

> Zonos2 is a text-to-speech model that clones a voice from a short sample and speaks naturally across many languages.


## Overview

- **Endpoint**: `https://fal.run/fal-ai/zonos2`
- **Model ID**: `fal-ai/zonos2`
- **Category**: text-to-speech
- **Kind**: inference
- **Description**: Zonos 2 is an open-source, real-time text-to-speech model from Zyphra that creates natural, expressive speech and clones a voice from just a short audio sample. Give it a few seconds of reference audio plus a line of text, and it speaks that text back in the cloned voice.

**Tags**: text-to-speech, tts, voice cloning



## Pricing

- **Price**: $0.01 per minute

For more details, see [fal.ai pricing](https://fal.ai/pricing).
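
As a rough illustration of the rate above, the sketch below estimates cost from audio duration. It assumes billing is proportional to the duration of the generated audio and is not rounded up to whole minutes; neither detail is stated here, so treat the result as an approximation and check the pricing page. `estimate_cost_usd` is a hypothetical helper, not part of any fal library.

```python
from decimal import Decimal

# Rate listed in the Pricing section above.
PRICE_PER_MINUTE_USD = Decimal("0.01")


def estimate_cost_usd(audio_seconds: float) -> Decimal:
    """Estimate cost, ASSUMING pro-rata billing by generated audio duration."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    minutes = Decimal(str(audio_seconds)) / Decimal(60)
    return minutes * PRICE_PER_MINUTE_USD


# Example: a 90-second clip is 1.5 minutes of audio.
print(estimate_cost_usd(90))
```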

## API Information

This model can be used via our HTTP API or more conveniently via our client libraries.
See the input and output schema below, as well as the usage examples.


### Input Schema

The API accepts the following input parameters:


- **`reference_audio_url`** (`string`, _required_):
  Reference audio to clone the voice from.
  - Examples: "https://storage.googleapis.com/falserverless/model_tests/zonos/demo_voice_zonos.wav"

- **`text`** (`string`, _optional_):
  Text to synthesize in the cloned voice. Default value: `"\"Fal\" is the fastest solution for your audio generation."`
  - Default: `"\"Fal\" is the fastest solution for your audio generation."`
  - Examples: "\"Fal\" is the fastest solution for your audio generation."

- **`language`** (`string`, _optional_):
  Text-normalization language code. Supported: en_us, en_gb, fr_fr, de, es, it, pt_br, ja, cmn, ko. Unsupported codes skip normalization. Default value: `"en_us"`
  - Default: `"en_us"`
  - Examples: "en_us"

- **`seed`** (`integer`, _optional_):
  Seed for reproducibility. Random when omitted.
  - Range: `0` to `2147483647`

- **`accurate_mode`** (`boolean`, _optional_):
  When `true`, favors a closer match to the reference voice; when `false`, favors more expressive speech. Default value: `true`
  - Default: `true`

- **`clean_speaker_background`** (`boolean`, _optional_):
  Mark the reference audio as having a clean background.
  - Default: `false`

- **`temperature`** (`float`, _optional_):
  Sampling temperature. Default value: `1.15`
  - Default: `1.15`
  - Range: `0` to `2`

- **`top_p`** (`float`, _optional_):
  Nucleus sampling probability (0 disables).
  - Default: `0`
  - Range: `0` to `1`

- **`min_p`** (`float`, _optional_):
  Minimum-probability sampling threshold. Default value: `0.18`
  - Default: `0.18`
  - Range: `0` to `1`

- **`top_k`** (`integer`, _optional_):
  Top-k sampling cutoff (0 disables). Default value: `106`
  - Default: `106`
  - Range: `0` to `2048`

- **`max_tokens`** (`integer`, _optional_):
  Maximum number of audio frames to generate. Defaults to the model context limit.
  - Range: `1` to `6144`



**Required Parameters Example**:

```json
{
  "reference_audio_url": "https://storage.googleapis.com/falserverless/model_tests/zonos/demo_voice_zonos.wav"
}
```

**Full Example**:

```json
{
  "reference_audio_url": "https://storage.googleapis.com/falserverless/model_tests/zonos/demo_voice_zonos.wav",
  "text": "\"Fal\" is the fastest solution for your audio generation.",
  "language": "en_us",
  "accurate_mode": true,
  "temperature": 1.15,
  "min_p": 0.18,
  "top_k": 106
}
```
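
Because several parameters have documented ranges, it can help to check values before sending a request. The sketch below is one possible approach: `build_input` is a hypothetical helper (not part of the fal client) that copies the ranges and language codes from the input schema above and rejects out-of-range values. Option names it does not recognize are passed through unchanged.

```python
# Language codes listed as supported in the input schema.
SUPPORTED_LANGUAGES = {
    "en_us", "en_gb", "fr_fr", "de", "es", "it", "pt_br", "ja", "cmn", "ko",
}

# Inclusive (min, max) ranges copied from the input schema.
RANGES = {
    "seed": (0, 2147483647),
    "temperature": (0, 2),
    "top_p": (0, 1),
    "min_p": (0, 1),
    "top_k": (0, 2048),
    "max_tokens": (1, 6144),
}


def build_input(reference_audio_url: str, **options) -> dict:
    """Hypothetical helper: build a request payload, rejecting out-of-range values."""
    if not reference_audio_url:
        raise ValueError("reference_audio_url is required")
    for name, value in options.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside {low}..{high}")
    language = options.get("language")
    if language is not None and language not in SUPPORTED_LANGUAGES:
        # Per the schema, unsupported codes are not an error; they skip normalization.
        print(f"warning: {language!r} is unsupported; text normalization is skipped")
    return {"reference_audio_url": reference_audio_url, **options}


payload = build_input(
    "https://storage.googleapis.com/falserverless/model_tests/zonos/demo_voice_zonos.wav",
    text="Hello from a cloned voice.",
    language="en_gb",
    seed=1234,
)
print(payload)
```

Omitted options are left out of the payload so that the API applies its own defaults.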


### Output Schema

The API returns the following output format:

- **`audio`** (`File`, _required_):
  The generated audio (WAV, 44.1kHz mono).

- **`seed`** (`integer`, _required_):
  The seed used for generation.



**Example Response**:

```json
{
  "audio": {
    "url": "",
    "content_type": "audio/wav",
    "file_name": "z9RV14K95DvU.wav",
    "file_size": 4404019
  }
}
```
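
A response shaped like the output schema can be unpacked and the audio saved locally. The following is a minimal sketch using only the Python standard library; `extract_audio` and `save_audio` are hypothetical helper names, and the sketch assumes the `url` field points to a directly downloadable file.

```python
import urllib.request
from typing import Optional, Tuple


def extract_audio(result: dict) -> Tuple[str, Optional[int]]:
    """Return (audio URL, seed) from a response shaped like the output schema."""
    audio = result.get("audio") or {}
    url = audio.get("url")
    if not url:
        raise ValueError("response has no audio URL")
    return url, result.get("seed")


def save_audio(result: dict, path: str = "output.wav") -> str:
    """Download the generated audio to a local file (performs a network request)."""
    url, _seed = extract_audio(result)
    urllib.request.urlretrieve(url, path)
    return path


# Example with a hand-written response; the URL is a placeholder.
url, seed = extract_audio(
    {"audio": {"url": "https://example.com/speech.wav"}, "seed": 1234}
)
print(url, seed)
```

Keeping the returned `seed` lets you pass it back as the `seed` input when you want to reproduce a generation.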


## Usage Examples

### cURL

```bash
curl --request POST \
  --url https://fal.run/fal-ai/zonos2 \
  --header "Authorization: Key $FAL_KEY" \
  --header "Content-Type: application/json" \
  --data '{
     "reference_audio_url": "https://storage.googleapis.com/falserverless/model_tests/zonos/demo_voice_zonos.wav"
   }'
```
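
For environments where neither cURL nor a client library is available, the same HTTP call can be sketched with Python's standard library. This mirrors the cURL command above (same URL, headers, and JSON body); `build_request` and `synthesize` are hypothetical helper names, and the sketch assumes the API key is in the `FAL_KEY` environment variable and that the endpoint replies with a JSON body.

```python
import json
import os
import urllib.request

ENDPOINT = "https://fal.run/fal-ai/zonos2"


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build the POST request without sending it."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(payload: dict) -> dict:
    """Send the request and decode the JSON response (performs a network call)."""
    request = build_request(payload, os.environ["FAL_KEY"])
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `synthesize({"reference_audio_url": "..."})` with a real reference URL sends the request; `build_request` on its own is useful for inspecting what would be sent.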

### Python

Ensure you have the Python client installed:

```bash
pip install fal-client
```

Then use the API client to make requests:

```python
import fal_client

def on_queue_update(update):
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])

result = fal_client.subscribe(
    "fal-ai/zonos2",
    arguments={
        "reference_audio_url": "https://storage.googleapis.com/falserverless/model_tests/zonos/demo_voice_zonos.wav"
    },
    with_logs=True,
    on_queue_update=on_queue_update,
)
print(result)
```

### JavaScript

Ensure you have the JavaScript client installed:

```bash
npm install --save @fal-ai/client
```

Then use the API client to make requests:

```javascript
import { fal } from "@fal-ai/client";

const result = await fal.subscribe("fal-ai/zonos2", {
  input: {
    reference_audio_url: "https://storage.googleapis.com/falserverless/model_tests/zonos/demo_voice_zonos.wav"
  },
  logs: true,
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      update.logs.forEach((log) => console.log(log.message));
    }
  },
});
console.log(result.data);
console.log(result.requestId);
```


## Additional Resources

### Documentation

- [Model Playground](https://fal.ai/models/fal-ai/zonos2)
- [API Documentation](https://fal.ai/models/fal-ai/zonos2/api)
- [OpenAPI Schema](https://fal.ai/api/openapi/queue/openapi.json?endpoint_id=fal-ai/zonos2)

### fal.ai Platform

- [Platform Documentation](https://docs.fal.ai)
- [Python Client](https://docs.fal.ai/clients/python)
- [JavaScript Client](https://docs.fal.ai/clients/javascript)
