LTX Video 2.0 Pro Text to Video
Your request will cost $0.06 per second for 1080p, $0.12 per second for 1440p or $0.24 per second for 2160p.
LTX Video 2.0 Pro | [text-to-video]
Lightricks' LTXV-13B delivers high-fidelity video with synchronized audio at $0.06-$0.24 per second, processing 30x faster than comparable video generation systems through multiscale rendering. The model makes professional video production accessible on standard hardware without requiring enterprise GPU setups, targeting marketing teams and content creators who need 4K output at production scale.
Use Cases: Marketing content creation | Social media video production | Concept visualization and prototyping
Performance
LTX Video 2.0 Pro is positioned as a premium text-to-video solution with tiered pricing that scales with resolution, and is roughly 10x more cost-effective than enterprise alternatives once infrastructure requirements are factored in.
| Metric | Result | Context |
|---|---|---|
| Processing Speed | 30x faster than competitors | Multiscale rendering processes structure before detail vs traditional frame-by-frame diffusion |
| Cost per Second | $0.06 (1080p) / $0.12 (1440p) / $0.24 (2160p) | 16.7, 8.3, or 4.2 seconds per $1.00 respectively |
| Duration | 6-10 seconds | Configurable in 2-second increments (6, 8, or 10 seconds) |
| Resolution Range | 1080p to 2160p (4K) | 16:9 aspect ratio at 25-50 fps |
| Audio Generation | Synchronized soundtrack | Automatic audio synthesis included by default |
| Related Endpoints | LTX Video 2.0 Fast | Speed-optimized variant for rapid iteration workflows |
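The per-second pricing in the table can be turned into a quick budget check. The sketch below is illustrative only: it assumes billing is exactly rate × duration with no minimum charge or rounding, which this page does not confirm, and the function names are hypothetical.

```python
from decimal import Decimal

# Per-second prices in USD, as quoted in the table above.
PRICE_PER_SECOND = {
    "1080p": Decimal("0.06"),
    "1440p": Decimal("0.12"),
    "2160p": Decimal("0.24"),
}


def estimate_cost(resolution: str, duration_seconds: int) -> Decimal:
    """Estimate the cost of one generation (assumes simple rate x duration billing)."""
    if resolution not in PRICE_PER_SECOND:
        raise ValueError(f"unknown resolution: {resolution!r}")
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return PRICE_PER_SECOND[resolution] * duration_seconds


def seconds_per_dollar(resolution: str) -> Decimal:
    """Seconds of video that $1.00 buys, rounded to one decimal place."""
    return (Decimal(1) / PRICE_PER_SECOND[resolution]).quantize(Decimal("0.1"))


if __name__ == "__main__":
    for res in PRICE_PER_SECOND:
        print(res, "10 s costs $", estimate_cost(res, 10),
              "|", seconds_per_dollar(res), "s per $1.00")
```

Decimal arithmetic is used instead of floats so that small per-second amounts add up without binary rounding noise.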
Cinematic Quality at Production Scale
LTXV-13B uses multiscale rendering to generate video progressively, starting with low-resolution structure and refining detail in passes rather than processing every frame at full resolution simultaneously. This architectural choice means you get 4K output without the typical GPU memory bottlenecks.
What this means for you:
- Synchronized audio-visual output: Generate video with matching soundtrack in a single inference, no separate audio generation or manual syncing required
- Resolution flexibility: Scale from 1080p for social content ($0.06/sec) to 4K for broadcast-quality output ($0.24/sec) based on actual delivery requirements
- Extended duration control: Configure 6, 8, or 10-second sequences with frame rate options at 25 or 50 fps for different motion quality needs
- Hardware accessibility: Run professional video generation on consumer GPUs through efficient multiscale processing instead of requiring enterprise infrastructure
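To give a rough sense of why a coarse-to-fine approach reduces work in the early passes, the sketch below lists the frame sizes of a hypothetical resolution pyramid. The halving factor and the number of levels are assumptions chosen for illustration; the page does not state the schedule the model actually uses.

```python
def resolution_pyramid(width: int, height: int, levels: int) -> list[tuple[int, int]]:
    """Return (width, height) for each pass, coarsest first.

    Assumes each coarser level halves both dimensions (illustrative only).
    """
    if levels < 1:
        raise ValueError("levels must be at least 1")
    sizes = [(width >> i, height >> i) for i in range(levels)]
    return sizes[::-1]


def pixels_per_frame(size: tuple[int, int]) -> int:
    """Number of pixels in a single frame of the given size."""
    w, h = size
    return w * h


if __name__ == "__main__":
    # 2160p (4K) target with three hypothetical passes.
    final = pixels_per_frame((3840, 2160))
    for size in resolution_pyramid(3840, 2160, levels=3):
        share = pixels_per_frame(size) / final
        print(f"{size[0]}x{size[1]}: {share:.4f} of the final pixel count")
```

Under these assumptions the first pass works on one sixteenth of the pixels of the final 4K frame, which is the intuition behind laying out structure at low resolution before refining detail.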
Technical Specifications
| Spec | Details |
|---|---|
| Architecture | LTXV-13B |
| Input Formats | Text prompts with cinematic descriptors |
| Output Formats | MP4 video with audio (video/mp4) |
| Resolution Options | 1080p, 1440p, 2160p at 16:9 aspect ratio |
| Frame Rate | 25 fps or 50 fps configurable |
| Duration Range | 6-10 seconds per generation |
| License | Commercial use permitted (Partner tier) |
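The constraints in the specification table can be checked client-side before a request is sent. This is a minimal sketch: the field names (`prompt`, `resolution`, `duration`, `fps`, `generate_audio`) and the defaults are hypothetical and may not match the real API schema, so consult the API documentation for the actual parameter names.

```python
from dataclasses import dataclass, asdict

# Allowed values taken from the specification table on this page.
ALLOWED_RESOLUTIONS = ("1080p", "1440p", "2160p")
ALLOWED_DURATIONS = (6, 8, 10)  # seconds, in 2-second increments
ALLOWED_FPS = (25, 50)


@dataclass(frozen=True)
class VideoRequest:
    """Hypothetical request shape; field names are assumptions, not the real schema."""
    prompt: str
    resolution: str = "1080p"
    duration: int = 6
    fps: int = 25
    generate_audio: bool = True

    def __post_init__(self) -> None:
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.resolution not in ALLOWED_RESOLUTIONS:
            raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
        if self.duration not in ALLOWED_DURATIONS:
            raise ValueError(f"duration must be one of {ALLOWED_DURATIONS}")
        if self.fps not in ALLOWED_FPS:
            raise ValueError(f"fps must be one of {ALLOWED_FPS}")

    def to_payload(self) -> dict:
        """Return a plain dict suitable for JSON serialization."""
        return asdict(self)


if __name__ == "__main__":
    request = VideoRequest(
        prompt="Slow dolly shot of a lighthouse at dusk, cinematic lighting",
        resolution="2160p",
        duration=10,
        fps=50,
    )
    print(request.to_payload())
```

Validating locally catches an unsupported value such as a 7-second duration before any billed generation is attempted.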
How It Stacks Up
LTX Video 2.0 Fast – LTX Video 2.0 Pro trades inference speed for maximum resolution and audio capabilities at higher cost per second. The Fast variant prioritizes rapid iteration for preview workflows where 4K output and synchronized audio aren't required.
Hunyuan Video V1.5 – LTX Video 2.0 Pro emphasizes hardware efficiency through multiscale rendering for consumer GPU deployment. Hunyuan Video V1.5 focuses on extended duration capabilities and different motion dynamics for narrative-driven video sequences.
Sora 2 (OpenAI) – LTX Video 2.0 Pro delivers 4K output with audio in 6-10 second sequences optimized for marketing and social content. Sora 2 targets longer-form video up to 60 seconds, with an emphasis on physical world simulation for complex scene understanding.