# LTX 2.3 Trainer (V2) - Keyframe Interpolation

> Train a LoRA that generates the video between keyframes — supply first/last (and optional middle) frames at inference and the model fills the in-between motion.


## Overview

- **Endpoint**: `https://fal.run/fal-ai/ltx23-trainer-v2/interpolate`
- **Model ID**: `fal-ai/ltx23-trainer-v2/interpolate`
- **Category**: training
- **Kind**: training


## Pricing

Training cost scales linearly with the number of steps: cost (USD) = 0.0024 × `number_of_steps`. For example, 1000 steps cost **$2.40**, and the default of 2000 steps costs **$4.80**.
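
The pricing formula above can be sketched as a small helper. This is a minimal illustration, not part of the fal client: the function name `estimate_training_cost` is hypothetical, and the step limits are copied from the `number_of_steps` range in the input schema.

```python
from decimal import Decimal

# Per-step price taken from the pricing formula above (USD per training step).
PRICE_PER_STEP_USD = Decimal("0.0024")


def estimate_training_cost(number_of_steps: int) -> Decimal:
    """Return the estimated cost in USD for a run with `number_of_steps` steps."""
    # The input schema limits number_of_steps to the range 100..20000.
    if not 100 <= number_of_steps <= 20000:
        raise ValueError("number_of_steps must be between 100 and 20000")
    return PRICE_PER_STEP_USD * number_of_steps


for steps in (1000, 2000, 20000):
    print(f"{steps} steps -> ${estimate_training_cost(steps):.2f}")
```

`Decimal` is used instead of `float` so that the per-step price multiplies out without binary rounding noise.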

For more details, see [fal.ai pricing](https://fal.ai/pricing).

## API Information

This model can be used via our HTTP API or more conveniently via our client libraries.
See the input and output schema below, as well as the usage examples.


### Input Schema

The API accepts the following input parameters:


- **`training_data_url`** (`string`, _required_):
  URL to a `.zip` archive of your training data. The exact file layout depends on the training mode — see this endpoint's documentation for the required structure. You can include a `.txt` caption alongside each media file (same base name).

- **`rank`** (`RankEnum`, _optional_):
  The rank of the LoRA adaptation. Higher values increase capacity but use more memory. Default value: `32`
  - Default: `32`
  - Options: `8`, `16`, `32`, `64`, `128`
  - Examples: 32

- **`number_of_steps`** (`integer`, _optional_):
  The number of training steps. Default value: `2000`
  - Default: `2000`
  - Range: `100` to `20000`
  - Examples: 2000

- **`learning_rate`** (`float`, _optional_):
  Learning rate for optimization. Default value: `0.0002`
  - Default: `0.0002`
  - Range: `0.000001` to `1`

- **`number_of_frames`** (`integer`, _optional_):
  Number of frames per training sample. Must satisfy `frames % 8 == 1`, which within the allowed range means one of 9, 17, 25, 33, 41, 49, 57, 65, 73, 81, 89, 97, 105, 113, 121. Default value: `89`
  - Default: `89`
  - Range: `9` to `121`
  - Examples: 89

- **`frame_rate`** (`integer`, _optional_):
  Target frames per second for the training video (LTX-2.3 native is 24). Default value: `24`
  - Default: `24`
  - Range: `8` to `60`
  - Examples: 24

- **`resolution`** (`ResolutionEnum`, _optional_):
  Resolution to use for training. Higher resolutions require more memory. Default value: `"medium"`
  - Default: `"medium"`
  - Options: `"low"`, `"medium"`, `"high"`
  - Examples: "medium"

- **`aspect_ratio`** (`AspectRatioEnum`, _optional_):
  Aspect ratio to use for training. Default value: `"1:1"`
  - Default: `"1:1"`
  - Options: `"16:9"`, `"1:1"`, `"9:16"`
  - Examples: "1:1"

- **`trigger_phrase`** (`string`, _optional_):
  A phrase that triggers the LoRA style. It is prepended to captions during training. Default value: `""`
  - Default: `""`
  - Examples: ""

- **`auto_scale_input`** (`boolean`, _optional_):
  If true, videos will be automatically scaled to the target frame count and fps. This option has no effect on image datasets.
  - Default: `false`
  - Examples: false

- **`split_input_into_scenes`** (`boolean`, _optional_):
  If true, videos longer than `split_input_duration_threshold` seconds will be split into scenes. Default value: `true`
  - Default: `true`
  - Examples: true

- **`split_input_duration_threshold`** (`float`, _optional_):
  The duration threshold in seconds. If a video is longer than this, it will be split into scenes. Default value: `30`
  - Default: `30`
  - Range: `1` to `60`
  - Examples: 30

- **`debug_dataset`** (`boolean`, _optional_):
  When enabled, the trainer returns a downloadable archive of your preprocessed training data for manual inspection. Use this to verify that your videos, images, and captions were processed correctly before committing to a full training run.
  - Default: `false`

- **`validation`** (`list<InterpolateValidation>`, _optional_):
  A list of validation inputs, each with a prompt and keyframe images.
  - Default: `[]`
  - Array of InterpolateValidation

- **`validation_negative_prompt`** (`string`, _optional_):
  A negative prompt to use for validation. Note: validation previews are single-stage approximations of the production two-stage (distilled) inference, so preview quality and guidance differ from final inference.
  - Default: `"blurry, out of focus, overexposed, underexposed, low contrast, washed out colors, excessive noise, grainy texture, poor lighting, flickering, motion blur, distorted proportions, unnatural skin tones, deformed facial features, asymmetrical face, missing facial features, extra limbs, disfigured hands, wrong hand count, artifacts around text, inconsistent perspective, camera shake, incorrect depth of field, background too sharp, background clutter, distracting reflections, harsh shadows, inconsistent lighting direction, color banding, cartoonish rendering, 3D CGI look, unrealistic materials, uncanny valley effect, incorrect ethnicity, wrong gender, exaggerated expressions, wrong gaze direction, mismatched lip sync, silent or muted audio, distorted voice, robotic voice, echo, background noise, off-sync audio, incorrect dialogue, added dialogue, repetitive speech, jittery movement, awkward pauses, incorrect timing, unnatural transitions, inconsistent framing, tilted camera, flat lighting, inconsistent tone, cinematic oversaturation, stylized filters, or AI artifacts."`

- **`validation_number_of_frames`** (`integer`, _optional_):
  The number of frames in validation videos. Default value: `89`
  - Default: `89`
  - Range: `9` to `121`
  - Examples: 89

- **`validation_frame_rate`** (`integer`, _optional_):
  Target frames per second for validation videos (LTX-2.3 native is 24). Default value: `24`
  - Default: `24`
  - Range: `8` to `60`
  - Examples: 24

- **`validation_resolution`** (`ValidationResolutionEnum`, _optional_):
  The resolution to use for validation. Default value: `"high"`
  - Default: `"high"`
  - Options: `"low"`, `"medium"`, `"high"`
  - Examples: "high"

- **`validation_aspect_ratio`** (`ValidationAspectRatioEnum`, _optional_):
  The aspect ratio to use for validation. Default value: `"1:1"`
  - Default: `"1:1"`
  - Options: `"16:9"`, `"1:1"`, `"9:16"`
  - Examples: "1:1"

- **`stg_scale`** (`float`, _optional_):
  STG (Spatio-Temporal Guidance) scale. 0.0 disables STG. Recommended value is 1.0. Default value: `1`
  - Default: `1`
  - Range: `0` to `3`

- **`include_middle_keyframe`** (`boolean`, _optional_):
  Also keep a middle keyframe, so the LoRA learns first + middle + last -> video instead of only first + last. When `true`, every validation sample must provide `middle_image_url`.
  - Default: `false`
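
The `number_of_frames` constraint (`frames % 8 == 1`, range 9 to 121) can be checked locally before submitting a job. The sketch below is illustrative only: both helper names are hypothetical, and rounding down to the nearest valid count is an assumption about what a caller might want, not documented trainer behavior.

```python
MIN_FRAMES = 9    # lower bound from the `number_of_frames` range
MAX_FRAMES = 121  # upper bound from the `number_of_frames` range


def is_valid_frame_count(frames: int) -> bool:
    """True if `frames` is in range and satisfies frames % 8 == 1."""
    return MIN_FRAMES <= frames <= MAX_FRAMES and frames % 8 == 1


def round_down_to_valid_frame_count(frames: int) -> int:
    """Clamp `frames` to the allowed range, then round down to a valid count."""
    clamped = max(MIN_FRAMES, min(MAX_FRAMES, frames))
    return clamped - ((clamped - 1) % 8)


valid_counts = [n for n in range(MIN_FRAMES, MAX_FRAMES + 1) if is_valid_frame_count(n)]
print(valid_counts)
print(round_down_to_valid_frame_count(100))
print(f"{89 / 24:.2f} s per sample at the defaults (89 frames, 24 fps)")
```

At the defaults, each training sample covers a little under four seconds of video.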


**Required Parameters Example**:

```json
{
  "training_data_url": ""
}
```
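
Since captions are matched to media files by base name, it can help to check the archive contents before uploading. The following is a rough sketch under stated assumptions: the helper name and the set of media extensions are hypothetical, and the required folder layout for this training mode is not covered here.

```python
from pathlib import PurePosixPath

# Assumed set of media extensions, for illustration only; the formats the
# trainer actually accepts are not listed in this document.
MEDIA_EXTENSIONS = {".mp4", ".mov", ".png", ".jpg", ".jpeg"}


def media_without_captions(archive_names: list[str]) -> list[str]:
    """Return media files that have no `.txt` file with the same base name."""
    names = set(archive_names)
    missing = []
    for name in archive_names:
        path = PurePosixPath(name)
        if path.suffix.lower() not in MEDIA_EXTENSIONS:
            continue  # not a media file (for example, a caption)
        caption = str(path.with_suffix(".txt"))
        if caption not in names:
            missing.append(name)
    return missing


# Example with hypothetical file names; for a real archive, pass
# zipfile.ZipFile("data.zip").namelist() instead.
print(media_without_captions(["clip_01.mp4", "clip_01.txt", "clip_02.mp4"]))
```

Captions are optional, so a non-empty result is a prompt to double-check the dataset rather than an error.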

**Full Example**:

```json
{
  "training_data_url": "",
  "rank": 32,
  "number_of_steps": 2000,
  "learning_rate": 0.0002,
  "number_of_frames": 89,
  "frame_rate": 24,
  "resolution": "medium",
  "aspect_ratio": "1:1",
  "trigger_phrase": "",
  "auto_scale_input": false,
  "split_input_into_scenes": true,
  "split_input_duration_threshold": 30,
  "validation": [],
  "validation_negative_prompt": "blurry, out of focus, overexposed, underexposed, low contrast, washed out colors, excessive noise, grainy texture, poor lighting, flickering, motion blur, distorted proportions, unnatural skin tones, deformed facial features, asymmetrical face, missing facial features, extra limbs, disfigured hands, wrong hand count, artifacts around text, inconsistent perspective, camera shake, incorrect depth of field, background too sharp, background clutter, distracting reflections, harsh shadows, inconsistent lighting direction, color banding, cartoonish rendering, 3D CGI look, unrealistic materials, uncanny valley effect, incorrect ethnicity, wrong gender, exaggerated expressions, wrong gaze direction, mismatched lip sync, silent or muted audio, distorted voice, robotic voice, echo, background noise, off-sync audio, incorrect dialogue, added dialogue, repetitive speech, jittery movement, awkward pauses, incorrect timing, unnatural transitions, inconsistent framing, tilted camera, flat lighting, inconsistent tone, cinematic oversaturation, stylized filters, or AI artifacts.",
  "validation_number_of_frames": 89,
  "validation_frame_rate": 24,
  "validation_resolution": "high",
  "validation_aspect_ratio": "1:1",
  "stg_scale": 1
}
```
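
The full example leaves `validation` empty. When `include_middle_keyframe` is enabled, every validation sample must provide `middle_image_url`, which can be verified before the request is sent. The sketch below uses a hypothetical helper name and placeholder URLs; the other fields of `InterpolateValidation` (the prompt and the first/last keyframes) are omitted because their exact names are not given in this document.

```python
def check_middle_keyframes(payload: dict) -> list[int]:
    """Return indices of validation samples that lack `middle_image_url`.

    Only relevant when `include_middle_keyframe` is true; otherwise nothing
    is checked and an empty list is returned.
    """
    if not payload.get("include_middle_keyframe", False):
        return []
    return [
        index
        for index, sample in enumerate(payload.get("validation", []))
        if not sample.get("middle_image_url")
    ]


payload = {
    "training_data_url": "",  # left empty here, as in the examples above
    "include_middle_keyframe": True,
    "validation": [
        {"middle_image_url": "https://example.com/mid_0.png"},  # placeholder URL
        {},  # missing middle keyframe
    ],
}
print(check_middle_keyframes(payload))
```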


### Output Schema

The API returns the following output format:

- **`video`** (`File`, _optional_):
  Combined validation video preview (video-producing endpoints), if any.

- **`audio`** (`File`, _optional_):
  Combined validation audio preview for the audio-only-output endpoints (/v2a, /a2a, /t2a, /audio-extend-prefix, /audio-extend-suffix, /audio-inpaint), if any.

- **`lora_file`** (`File`, _required_):
  URL to the trained LoRA weights (.safetensors).

- **`config_file`** (`File`, _required_):
  Configuration used for setting up inference endpoints.

- **`debug_dataset`** (`File`, _optional_):
  Downloadable archive of the preprocessed training data, when debug_dataset is enabled.


**Example Response**:

```json
{
  "lora_file": {
    "url": "",
    "content_type": "image/png",
    "file_name": "z9RV14K95DvU.png",
    "file_size": 4404019
  },
  "config_file": {
    "url": "",
    "content_type": "image/png",
    "file_name": "z9RV14K95DvU.png",
    "file_size": 4404019
  }
}
```
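
A response shaped like the one above can be reduced to the download URLs a caller usually needs. This is a minimal sketch: it assumes the result is available as a plain dictionary matching the output schema, and the helper name `collect_output_urls` is hypothetical. The `audio` field is skipped because it applies to the audio-only endpoints.

```python
def collect_output_urls(result: dict) -> dict[str, str]:
    """Map each File field present in a training result to its URL."""
    # lora_file and config_file are required; video and debug_dataset are optional.
    fields = ("lora_file", "config_file", "video", "debug_dataset")
    urls = {}
    for field in fields:
        file_info = result.get(field)
        if file_info:  # skip fields that are absent or null
            urls[field] = file_info["url"]
    missing = {"lora_file", "config_file"} - set(urls)
    if missing:
        raise KeyError(f"result is missing required fields: {sorted(missing)}")
    return urls


# Placeholder URLs for illustration only.
sample_result = {
    "lora_file": {"url": "https://example.com/lora.safetensors"},
    "config_file": {"url": "https://example.com/config.json"},
    "video": None,
}
print(collect_output_urls(sample_result))
```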


## Usage Examples

### cURL

```bash
curl --request POST \
  --url https://fal.run/fal-ai/ltx23-trainer-v2/interpolate \
  --header "Authorization: Key $FAL_KEY" \
  --header "Content-Type: application/json" \
  --data '{
     "training_data_url": ""
   }'
```

### Python

Ensure you have the Python client installed:

```bash
pip install fal-client
```

Then use the API client to make requests:

```python
import fal_client

def on_queue_update(update):
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])

result = fal_client.subscribe(
    "fal-ai/ltx23-trainer-v2/interpolate",
    arguments={
        "training_data_url": ""
    },
    with_logs=True,
    on_queue_update=on_queue_update,
)
print(result)
```

### JavaScript

Ensure you have the JavaScript client installed:

```bash
npm install --save @fal-ai/client
```

Then use the API client to make requests:

```javascript
import { fal } from "@fal-ai/client";

const result = await fal.subscribe("fal-ai/ltx23-trainer-v2/interpolate", {
  input: {
    training_data_url: ""
  },
  logs: true,
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      update.logs.map((log) => log.message).forEach(console.log);
    }
  },
});
console.log(result.data);
console.log(result.requestId);
```


## Additional Resources

### Documentation

- [Model Playground](https://fal.ai/models/fal-ai/ltx23-trainer-v2/interpolate)
- [API Documentation](https://fal.ai/models/fal-ai/ltx23-trainer-v2/interpolate/api)
- [OpenAPI Schema](https://fal.ai/api/openapi/queue/openapi.json?endpoint_id=fal-ai/ltx23-trainer-v2/interpolate)

### fal.ai Platform

- [Platform Documentation](https://docs.fal.ai)
- [Python Client](https://docs.fal.ai/clients/python)
- [JavaScript Client](https://docs.fal.ai/clients/javascript)