LTX Video 2.0 Fast Image to Video

fal-ai/ltxv-2/image-to-video/fast
Create high-fidelity video with audio from images with LTX-2 Fast

Your request will cost $0.04 per second for 1080p, $0.08 per second for 1440p or $0.16 per second for 2160p.

LTX Video 2.0 Fast | [image-to-video]

Lightricks' LTX Video 2.0 Fast delivers image-to-video generation with integrated audio at $0.04 per second (1080p). The model generates 6-20 second clips with synchronized audio from a single reference image and text prompt. It suits social media creators and rapid-prototyping workflows where iteration speed matters more than 4K output.

Use Cases: Social Media Content | Product Demo Videos | Concept Visualization


Performance

At $0.04 per second for 1080p output ($0.08 for 1440p, $0.16 for 2160p), LTX Video 2.0 Fast is a cost-efficient image-to-video solution with native audio generation, eliminating the need for separate audio workflows.

| Metric | Result | Context |
|---|---|---|
| Duration Range | 6-20 seconds | Extended durations (12-20s) require 25 FPS + 1080p |
| Resolution Options | 1080p / 1440p / 2160p | Tiered pricing: $0.04 / $0.08 / $0.16 per second |
| Cost per Second | $0.04 (1080p) | 25 seconds of video per $1.00 at base resolution |
| Audio Generation | Integrated | Native audio synthesis from visual context |
| Frame Rate | 25 or 50 FPS | 50 FPS limited to durations ≤10s |
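
The pricing tiers and duration limits above can be checked before submitting a request. The following is a minimal sketch, not part of any official client: the function names `check_settings` and `estimate_cost` are hypothetical, durations are assumed to be whole seconds, and any duration above 10 seconds is treated as "extended" (an interpretation of the 12-20s and ≤10s limits quoted in the table).

```python
from decimal import Decimal

# Per-second prices quoted on this page, keyed by resolution.
PRICE_PER_SECOND = {
    "1080p": Decimal("0.04"),
    "1440p": Decimal("0.08"),
    "2160p": Decimal("0.16"),
}


def check_settings(duration_s: int, resolution: str, fps: int) -> None:
    """Raise ValueError if the settings violate the limits listed above."""
    if resolution not in PRICE_PER_SECOND:
        raise ValueError(f"unsupported resolution: {resolution}")
    if fps not in (25, 50):
        raise ValueError(f"unsupported frame rate: {fps}")
    if not 6 <= duration_s <= 20:
        raise ValueError("duration must be between 6 and 20 seconds")
    # Assumption: anything longer than 10 s counts as an extended duration,
    # which the page says requires 25 FPS and 1080p.
    if duration_s > 10 and (fps != 25 or resolution != "1080p"):
        raise ValueError("durations above 10 s require 25 FPS and 1080p")


def estimate_cost(duration_s: int, resolution: str, fps: int = 25) -> Decimal:
    """Return the estimated price in USD for one clip."""
    check_settings(duration_s, resolution, fps)
    return PRICE_PER_SECOND[resolution] * duration_s


print(estimate_cost(6, "1080p"))       # shortest clip at base resolution
print(estimate_cost(10, "2160p", 50))  # 10 s clip at 2160p and 50 FPS
```

Under these prices, the shortest 1080p clip (6 seconds) costs $0.24, so $1.00 covers roughly four such clips rather than 25.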

Image-to-Video with Contextual Audio

LTX Video 2.0 Fast uses transformer-based video diffusion with audio conditioning, generating synchronized sound based on visual scene context rather than requiring separate audio prompts. This contrasts with traditional pipelines that treat audio as a post-processing step.

What this means for you:

  • Single-input workflow: One image URL + text prompt produces video with matching audio (street ambience, nature sounds, mechanical effects)

  • Flexible duration control: Scale from 6-second social clips to 20-second product demos without re-prompting

  • Resolution-speed tradeoffs: Choose 1080p at 50 FPS for smooth motion or 2160p at 25 FPS for detail-critical work

  • Aspect ratio locked: 16:9 landscape output suits horizontal platforms such as standard YouTube video; vertical formats like YouTube Shorts and Instagram Reels need cropping or reframing
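
The single-input workflow described above might look like the following sketch. The endpoint ID is the one shown on this page; everything else is an assumption to verify against the API documentation: the argument names (`image_url`, `prompt`, `duration`, `resolution`, `fps`, `generate_audio`), the helper names `build_arguments` and `generate`, and the use of the third-party `fal_client` package with a `FAL_KEY` environment variable.

```python
MODEL_ID = "fal-ai/ltxv-2/image-to-video/fast"  # endpoint ID shown on this page


def build_arguments(image_url: str, prompt: str, duration: int = 6,
                    resolution: str = "1080p", fps: int = 25) -> dict:
    """Assemble the request arguments (field names are assumptions)."""
    if not image_url or not prompt:
        raise ValueError("both an image URL and a text prompt are required")
    return {
        "image_url": image_url,      # public URL or base64 data URI
        "prompt": prompt,            # describes the desired motion and scene
        "duration": duration,        # seconds, 6-20
        "resolution": resolution,    # "1080p", "1440p", or "2160p"
        "fps": fps,                  # 25 or 50
        "generate_audio": True,      # assumed flag for the integrated audio
    }


def generate(arguments: dict):
    """Submit the request; needs `pip install fal-client` and FAL_KEY set."""
    import fal_client  # imported lazily so the payload helper works without it
    return fal_client.subscribe(MODEL_ID, arguments=arguments)
```

Calling `generate(build_arguments("https://example.com/street.jpg", "Camera slowly pans across a rainy street"))` would submit one request; the example URL and prompt are placeholders, and the shape of the returned result should be taken from the API documentation rather than from this sketch.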


Technical Specifications

| Spec | Details |
|---|---|
| Architecture | LTX Video 2.0 Fast |
| Input Formats | PNG, JPEG, WebP, AVIF, HEIF (via URL or base64 data URI) |
| Output Formats | MP4 video with audio |
| Duration Range | 6-20 seconds (extended durations require 25 FPS + 1080p) |
| License | Commercial use permitted |
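
When the reference image is not hosted at a public URL, it can be passed as a base64 data URI instead. A minimal sketch using only the Python standard library follows; the helper name `to_data_uri` is hypothetical, and the MIME types are the standard ones for the input formats listed above.

```python
import base64
import os

# MIME types for the input formats listed in the table above.
MIME_BY_EXT = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
    ".avif": "image/avif",
    ".heif": "image/heif",
}


def to_data_uri(data: bytes, filename: str) -> str:
    """Encode raw image bytes as a base64 data URI based on the file extension."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in MIME_BY_EXT:
        raise ValueError(f"unsupported image format: {ext or filename}")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{MIME_BY_EXT[ext]};base64,{encoded}"


def file_to_data_uri(path: str) -> str:
    """Read an image file from disk and return it as a data URI."""
    with open(path, "rb") as fh:
        return to_data_uri(fh.read(), path)
```

Data URIs grow by about a third relative to the raw file size, so a hosted URL is usually the lighter option for large images.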


How It Stacks Up

Kling Video v2.6 Image to Video – LTX Video 2.0 Fast prioritizes rapid generation with integrated audio for social media workflows. Kling Video v2.6 emphasizes cinematic motion quality and extended duration support for professional video production.

MiniMax Hailuo 2.3 [Pro] – LTX Video 2.0 Fast trades maximum resolution options for faster iteration cycles at $0.04/second (1080p). MiniMax Hailuo 2.3 Pro targets high-fidelity output with extended context understanding for narrative video sequences.