# LTX-2 19B Distilled

> Generate video with audio from audio, text and images using LTX-2 Distilled and custom LoRA


## Overview

- **Endpoint**: `https://fal.run/fal-ai/ltx-2-19b/distilled/audio-to-video/lora`
- **Model ID**: `fal-ai/ltx-2-19b/distilled/audio-to-video/lora`
- **Category**: audio-to-video
- **Kind**: inference


## API Information

This model can be used via our HTTP API or more conveniently via our client libraries.
See the input and output schema below, as well as the usage examples.


### Input Schema

The API accepts the following input parameters:


- **`prompt`** (`string`, _required_):
  The prompt to generate the video from.
  - Examples: "A woman speaks to the camera"

- **`audio_url`** (`string`, _required_):
  The URL of the audio to generate the video from.
  - Examples: "https://storage.googleapis.com/falserverless/example_inputs/ltx-2-a2v-input-audio.mp3"

- **`image_url`** (`string`, _optional_):
  Optional URL of an image to use as the first frame of the video.
  - Examples: "https://storage.googleapis.com/falserverless/example_inputs/ltx-2-a2v-input-image.png"

- **`match_audio_length`** (`boolean`, _optional_):
  When enabled, the number of frames will be calculated based on the audio duration and FPS. When disabled, use the specified num_frames. Default value: `true`
  - Default: `true`

- **`num_frames`** (`integer`, _optional_):
  The number of frames to generate. Default value: `121`
  - Default: `121`
  - Range: `9` to `481`

- **`video_size`** (`ImageSize | Enum`, _optional_):
  The size of the generated video. Use 'auto' to match the input image dimensions if provided. Default value: `landscape_4_3`
  - Default: `"landscape_4_3"`
  - One of: ImageSize | Enum

- **`use_multiscale`** (`boolean`, _optional_):
  Whether to use multi-scale generation. If True, the model will generate the video at a smaller scale first, then use the smaller video to guide the generation of a video at or above your requested size. This results in better coherence and details. Default value: `true`
  - Default: `true`

- **`fps`** (`float`, _optional_):
  The frames per second of the generated video. Default value: `25`
  - Default: `25`
  - Range: `1` to `60`

- **`acceleration`** (`AccelerationEnum`, _optional_):
  The acceleration level to use. Default value: `"none"`
  - Default: `"none"`
  - Options: `"none"`, `"regular"`, `"high"`, `"full"`
  - Examples: "none"

- **`camera_lora`** (`CameraLoRAEnum`, _optional_):
  The camera LoRA to use. This allows you to control the camera movement of the generated video more accurately than just prompting the model to move the camera. Default value: `"none"`
  - Default: `"none"`
  - Options: `"dolly_in"`, `"dolly_out"`, `"dolly_left"`, `"dolly_right"`, `"jib_up"`, `"jib_down"`, `"static"`, `"none"`
  - Examples: "none"

- **`camera_lora_scale`** (`float`, _optional_):
  The scale of the camera LoRA to use. This allows you to control the camera movement of the generated video more accurately than just prompting the model to move the camera. Default value: `1`
  - Default: `1`
  - Range: `0` to `1`

- **`negative_prompt`** (`string`, _optional_):
  The negative prompt to generate the video from. Default value: `"blurry, out of focus, overexposed, underexposed, low contrast, washed out colors, excessive noise, grainy texture, poor lighting, flickering, motion blur, distorted proportions, unnatural skin tones, deformed facial features, asymmetrical face, missing facial features, extra limbs, disfigured hands, wrong hand count, artifacts around text, inconsistent perspective, camera shake, incorrect depth of field, background too sharp, background clutter, distracting reflections, harsh shadows, inconsistent lighting direction, color banding, cartoonish rendering, 3D CGI look, unrealistic materials, uncanny valley effect, incorrect ethnicity, wrong gender, exaggerated expressions, wrong gaze direction, mismatched lip sync, silent or muted audio, distorted voice, robotic voice, echo, background noise, off-sync audio,incorrect dialogue, added dialogue, repetitive speech, jittery movement, awkward pauses, incorrect timing, unnatural transitions, inconsistent framing, tilted camera, flat lighting, inconsistent tone, cinematic oversaturation, stylized filters, or AI artifacts."`
  - Default: `"blurry, out of focus, overexposed, underexposed, low contrast, washed out colors, excessive noise, grainy texture, poor lighting, flickering, motion blur, distorted proportions, unnatural skin tones, deformed facial features, asymmetrical face, missing facial features, extra limbs, disfigured hands, wrong hand count, artifacts around text, inconsistent perspective, camera shake, incorrect depth of field, background too sharp, background clutter, distracting reflections, harsh shadows, inconsistent lighting direction, color banding, cartoonish rendering, 3D CGI look, unrealistic materials, uncanny valley effect, incorrect ethnicity, wrong gender, exaggerated expressions, wrong gaze direction, mismatched lip sync, silent or muted audio, distorted voice, robotic voice, echo, background noise, off-sync audio,incorrect dialogue, added dialogue, repetitive speech, jittery movement, awkward pauses, incorrect timing, unnatural transitions, inconsistent framing, tilted camera, flat lighting, inconsistent tone, cinematic oversaturation, stylized filters, or AI artifacts."`

- **`seed`** (`integer`, _optional_):
  The seed for the random number generator.

- **`enable_prompt_expansion`** (`boolean`, _optional_):
  Whether to enable prompt expansion.
  - Default: `false`

- **`enable_safety_checker`** (`boolean`, _optional_):
  Whether to enable the safety checker. Default value: `true`
  - Default: `true`

- **`video_output_type`** (`VideoOutputTypeEnum`, _optional_):
  The output type of the generated video. Default value: `"X264 (.mp4)"`
  - Default: `"X264 (.mp4)"`
  - Options: `"X264 (.mp4)"`, `"VP9 (.webm)"`, `"PRORES4444 (.mov)"`, `"GIF (.gif)"`

- **`video_quality`** (`VideoQualityEnum`, _optional_):
  The quality of the generated video. Default value: `"high"`
  - Default: `"high"`
  - Options: `"low"`, `"medium"`, `"high"`, `"maximum"`

- **`video_write_mode`** (`VideoWriteModeEnum`, _optional_):
  The write mode of the generated video. Default value: `"balanced"`
  - Default: `"balanced"`
  - Options: `"fast"`, `"balanced"`, `"small"`

- **`sync_mode`** (`boolean`, _optional_):
  If `True`, the media will be returned as a data URI and the output data won't be available in the request history.
  - Default: `false`

- **`loras`** (`list<LoRAInput>`, _required_):
  The LoRAs to use for the generation.
  - Array of LoRAInput

- **`image_strength`** (`float`, _optional_):
  The strength of the image to use for the video generation. Default value: `1`
  - Default: `1`
  - Range: `0` to `1`

- **`audio_strength`** (`float`, _optional_):
  Audio conditioning strength. Values below 1.0 will allow the model to change the audio, while a value of exactly 1.0 will use the input audio without modification. Default value: `1`
  - Default: `1`
  - Range: `0` to `1`

- **`preprocess_audio`** (`boolean`, _optional_):
  Whether to preprocess the audio before using it as conditioning. Default value: `true`
  - Default: `true`


**Required Parameters Example**:

```json
{
  "prompt": "A woman speaks to the camera",
  "audio_url": "https://storage.googleapis.com/falserverless/example_inputs/ltx-2-a2v-input-audio.mp3",
  "loras": [
    {
      "path": "",
      "scale": 1
    }
  ]
}
```

**Full Example**:

```json
{
  "prompt": "A woman speaks to the camera",
  "audio_url": "https://storage.googleapis.com/falserverless/example_inputs/ltx-2-a2v-input-audio.mp3",
  "image_url": "https://storage.googleapis.com/falserverless/example_inputs/ltx-2-a2v-input-image.png",
  "match_audio_length": true,
  "num_frames": 121,
  "video_size": "landscape_4_3",
  "use_multiscale": true,
  "fps": 25,
  "acceleration": "none",
  "camera_lora": "none",
  "camera_lora_scale": 1,
  "negative_prompt": "blurry, out of focus, overexposed, underexposed, low contrast, washed out colors, excessive noise, grainy texture, poor lighting, flickering, motion blur, distorted proportions, unnatural skin tones, deformed facial features, asymmetrical face, missing facial features, extra limbs, disfigured hands, wrong hand count, artifacts around text, inconsistent perspective, camera shake, incorrect depth of field, background too sharp, background clutter, distracting reflections, harsh shadows, inconsistent lighting direction, color banding, cartoonish rendering, 3D CGI look, unrealistic materials, uncanny valley effect, incorrect ethnicity, wrong gender, exaggerated expressions, wrong gaze direction, mismatched lip sync, silent or muted audio, distorted voice, robotic voice, echo, background noise, off-sync audio,incorrect dialogue, added dialogue, repetitive speech, jittery movement, awkward pauses, incorrect timing, unnatural transitions, inconsistent framing, tilted camera, flat lighting, inconsistent tone, cinematic oversaturation, stylized filters, or AI artifacts.",
  "enable_safety_checker": true,
  "video_output_type": "X264 (.mp4)",
  "video_quality": "high",
  "video_write_mode": "balanced",
  "loras": [
    {
      "path": "",
      "scale": 1
    }
  ],
  "image_strength": 1,
  "audio_strength": 1,
  "preprocess_audio": true
}
```


### Output Schema

The API returns the following output format:

- **`video`** (`VideoFile`, _required_):
  The generated video.
  - Examples: {"file_name":"ltx-2-a2v-output.mp4","content_type":"video/mp4","url":"https://storage.googleapis.com/falserverless/example_outputs/ltx-2-a2v-output.mp4"}

- **`seed`** (`integer`, _required_):
  The seed used for the random number generator.
  - Examples: 175932751

- **`prompt`** (`string`, _required_):
  The prompt used for the generation.
  - Examples: "A woman speaks to the camera"


**Example Response**:

```json
{
  "video": {
    "file_name": "ltx-2-a2v-output.mp4",
    "content_type": "video/mp4",
    "url": "https://storage.googleapis.com/falserverless/example_outputs/ltx-2-a2v-output.mp4"
  },
  "seed": 175932751,
  "prompt": "A woman speaks to the camera"
}
```


## Usage Examples

### cURL

```bash
curl --request POST \
  --url https://fal.run/fal-ai/ltx-2-19b/distilled/audio-to-video/lora \
  --header "Authorization: Key $FAL_KEY" \
  --header "Content-Type: application/json" \
  --data '{
     "prompt": "A woman speaks to the camera",
     "audio_url": "https://storage.googleapis.com/falserverless/example_inputs/ltx-2-a2v-input-audio.mp3",
     "loras": [
       {
         "path": "",
         "scale": 1
       }
     ]
   }'
```

### Python

Ensure you have the Python client installed:

```bash
pip install fal-client
```

Then use the API client to make requests:

```python
import fal_client

def on_queue_update(update):
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
           print(log["message"])

result = fal_client.subscribe(
    "fal-ai/ltx-2-19b/distilled/audio-to-video/lora",
    arguments={
        "prompt": "A woman speaks to the camera",
        "audio_url": "https://storage.googleapis.com/falserverless/example_inputs/ltx-2-a2v-input-audio.mp3",
        "loras": [{
            "path": "",
            "scale": 1
        }]
    },
    with_logs=True,
    on_queue_update=on_queue_update,
)
print(result)
```

### JavaScript

Ensure you have the JavaScript client installed:

```bash
npm install --save @fal-ai/client
```

Then use the API client to make requests:

```javascript
import { fal } from "@fal-ai/client";

const result = await fal.subscribe("fal-ai/ltx-2-19b/distilled/audio-to-video/lora", {
  input: {
    prompt: "A woman speaks to the camera",
    audio_url: "https://storage.googleapis.com/falserverless/example_inputs/ltx-2-a2v-input-audio.mp3",
    loras: [{
      path: "",
      scale: 1
    }]
  },
  logs: true,
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      update.logs.map((log) => log.message).forEach(console.log);
    }
  },
});
console.log(result.data);
console.log(result.requestId);
```


## Additional Resources

### Documentation

- [Model Playground](https://fal.ai/models/fal-ai/ltx-2-19b/distilled/audio-to-video/lora)
- [API Documentation](https://fal.ai/models/fal-ai/ltx-2-19b/distilled/audio-to-video/lora/api)
- [OpenAPI Schema](https://fal.ai/api/openapi/queue/openapi.json?endpoint_id=fal-ai/ltx-2-19b/distilled/audio-to-video/lora)

### fal.ai Platform

- [Platform Documentation](https://docs.fal.ai)
- [Python Client](https://docs.fal.ai/clients/python)
- [JavaScript Client](https://docs.fal.ai/clients/javascript)