fal-ai/ltx-2.3/audio-to-video
Input
Audio: upload a file or provide a URL. Accepted file types: mp3, ogg, wav, m4a, aac
Image: upload a file or provide a URL. Accepted file types: jpg, jpeg, png, webp, gif, avif
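The accepted file types listed above can be checked locally before uploading. The following is a minimal sketch, not part of the fal API: the helper name `has_accepted_extension` and the `ACCEPTED` table are hypothetical, and the check looks only at the file extension, not at the actual file contents.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions copied from the accepted file types listed on this page.
ACCEPTED = {
    "audio": {"mp3", "ogg", "wav", "m4a", "aac"},
    "image": {"jpg", "jpeg", "png", "webp", "gif", "avif"},
}


def has_accepted_extension(path_or_url: str, kind: str) -> bool:
    """Return True if the file name or URL ends in an accepted extension.

    Hypothetical helper: `kind` is "audio" or "image". Query strings and
    fragments in URLs are ignored, and the comparison is case-insensitive.
    """
    path = urlparse(path_or_url).path
    suffix = PurePosixPath(path).suffix.lower().lstrip(".")
    return suffix in ACCEPTED[kind]


if __name__ == "__main__":
    print(has_accepted_extension("https://example.com/clips/voice.MP3?v=2", "audio"))
    print(has_accepted_extension("voice.flac", "audio"))
```

A check like this only catches obvious mismatches early; the service itself remains the authority on what it accepts.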

LTX-2.3 Audio to Video
Generate video synchronized to an audio clip with LTX-2.3 by Lightricks. Provide an audio input and the model generates matching visuals with accurate timing.
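As a rough illustration of calling this endpoint from Python, the sketch below uses the `fal_client` package and its `subscribe` function with the endpoint ID shown at the top of this page. The argument names `audio_url`, `image_url`, and `prompt` are assumptions made for illustration and are not confirmed by this page; check the endpoint's API schema for the real field names. The helper names `build_arguments` and `generate` are hypothetical.

```python
ENDPOINT = "fal-ai/ltx-2.3/audio-to-video"


def build_arguments(audio_url, prompt=None, image_url=None):
    """Assemble the request payload.

    The keys "audio_url", "prompt", and "image_url" are ASSUMED names,
    not taken from this page; verify them against the endpoint schema.
    """
    arguments = {"audio_url": audio_url}
    if prompt is not None:
        arguments["prompt"] = prompt
    if image_url is not None:
        arguments["image_url"] = image_url
    return arguments


def generate(arguments):
    """Submit the request and wait for the result.

    Requires `pip install fal-client` and a FAL_KEY environment variable.
    The import is local so the payload helper works without the package.
    """
    import fal_client

    return fal_client.subscribe(ENDPOINT, arguments=arguments, with_logs=True)


if __name__ == "__main__":
    payload = build_arguments(
        "https://example.com/voiceover.wav",  # placeholder URL
        prompt="A singer on a dimly lit stage",
    )
    print(payload)
    # result = generate(payload)  # uncomment once FAL_KEY is configured
```

Keeping payload construction separate from the network call makes it possible to inspect or test the request without spending credits.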
What's new in 2.3
- Cleaner audio-visual synchronization with improved timing accuracy
- New VAE architecture for sharper fine details, textures, and facial features
- Better prompt understanding for more accurate scene composition
- Portrait 9:16 support in addition to landscape
- 24/48 FPS options for smoother motion
- LoRA fine-tuning support for custom styles and characters
Pricing
$0.10 per second of generated video. Pay-per-second, no minimums.
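The per-second rate makes cost easy to estimate up front. The sketch below multiplies the length of the generated video by the $0.10 rate stated above. The function name `estimate_cost` is hypothetical, and rounding to whole cents is an assumption; actual billing may round differently.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rate stated on this page: $0.10 per second of generated video.
PRICE_PER_SECOND = Decimal("0.10")


def estimate_cost(video_seconds):
    """Estimate the price in USD for a video of the given length.

    Uses Decimal to avoid binary floating-point drift. Rounding to the
    nearest cent is an assumption, not documented billing behavior.
    """
    seconds = Decimal(str(video_seconds))
    if seconds < 0:
        raise ValueError("video_seconds must be non-negative")
    cost = seconds * PRICE_PER_SECOND
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print(estimate_cost(8))     # 8 s x $0.10 = 0.80
    print(estimate_cost(12.5))  # 12.5 s x $0.10 = 1.25
```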
About LTX-2.3
LTX-2.3 is the latest open-source video generation model from Lightricks, built on a DiT-based architecture. It is released under the Apache 2.0 license, which permits commercial use and fine-tuning.