fal-ai/ltx-2.3/audio-to-video

LTX-2.3 is a high-quality, fast AI video model available in Pro and Fast variants for text-to-video, image-to-video, and audio-to-video.

Input

Audio URL (required): an uploaded audio file or a URL. Accepted file types: mp3, ogg, wav, m4a, aac.

Image URL: an uploaded image file or a URL. Accepted file types: jpg, jpeg, png, webp, gif, avif.

Prompt

Additional Settings: further controls for customizing the input.
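The inputs listed above can be assembled into a request programmatically. The following is a minimal sketch, not the documented schema: the argument keys `audio_url`, `image_url`, and `prompt` are assumptions inferred from the field labels, and the helpers `build_arguments` and `generate` are hypothetical names. The extension check is only a client-side convenience based on the accepted file types listed on this page. The call itself assumes the third-party `fal-client` package and a `FAL_KEY` environment variable.

```python
from typing import Optional
from urllib.parse import urlparse

MODEL_ID = "fal-ai/ltx-2.3/audio-to-video"

# Accepted file types as listed on this page.
AUDIO_TYPES = {"mp3", "ogg", "wav", "m4a", "aac"}
IMAGE_TYPES = {"jpg", "jpeg", "png", "webp", "gif", "avif"}


def _extension(url: str) -> str:
    """Return the lowercase file extension of the URL path, or '' if none."""
    path = urlparse(url).path
    return path.rsplit(".", 1)[-1].lower() if "." in path else ""


def build_arguments(
    audio_url: str,
    prompt: Optional[str] = None,
    image_url: Optional[str] = None,
) -> dict:
    """Build the request payload. Key names are assumed, not confirmed."""
    if _extension(audio_url) not in AUDIO_TYPES:
        raise ValueError(f"unsupported audio type: {audio_url}")
    arguments = {"audio_url": audio_url}
    if image_url is not None:
        if _extension(image_url) not in IMAGE_TYPES:
            raise ValueError(f"unsupported image type: {image_url}")
        arguments["image_url"] = image_url
    if prompt:
        arguments["prompt"] = prompt
    return arguments


def generate(arguments: dict):
    """Submit the request and wait for the result (needs network and FAL_KEY)."""
    import fal_client  # third-party: pip install fal-client

    return fal_client.subscribe(MODEL_ID, arguments=arguments)
```

A call might look like `generate(build_arguments("https://example.com/clip.mp3", prompt="A singer on stage"))`, where the URL is a placeholder. Check the Schema tab of the model page for the actual parameter names before relying on these keys.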

LTX-2.3 Audio to Video

Generate video synchronized to an audio clip with LTX-2.3 by Lightricks. Provide an audio input and the model generates matching visuals with accurate timing.

What's new in 2.3

Cleaner audio-visual synchronization with improved timing accuracy
New VAE architecture for sharper fine details, textures, and facial features
Better prompt understanding for more accurate scene composition
Portrait 9:16 support in addition to landscape
24/48 FPS options for smoother motion
LoRA fine-tuning support for custom styles and characters

Pricing

$0.10 per second of generated video. Pay-per-second, no minimums.
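As a worked example of the per-second rate, the sketch below estimates the charge for a clip of a given length. It assumes billing is a straight multiplication of generated video seconds by $0.10 and rounds to the cent for display; how fal actually rounds fractional seconds is not stated here. The function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# Rate quoted on this page: $0.10 per second of generated video.
PRICE_PER_SECOND_USD = Decimal("0.10")


def estimate_cost(video_seconds) -> Decimal:
    """Estimate the cost in USD for a generated video of the given length."""
    seconds = Decimal(str(video_seconds))
    if seconds < 0:
        raise ValueError("video_seconds must be non-negative")
    # Rounding to cents is an assumption made for display purposes.
    return (seconds * PRICE_PER_SECOND_USD).quantize(Decimal("0.01"))


print(estimate_cost(12.5))  # 12.5 s * $0.10/s, prints 1.25
```

Using `Decimal` rather than floats keeps the currency arithmetic exact.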

About LTX-2.3

LTX-2.3 is the latest open-source video generation model from Lightricks, built on a DiT-based architecture. It is released under the Apache 2.0 license, which permits commercial use and fine-tuning.