
LTX Video 2.0 Pro Image to Video Developer Guide


LTX Video 2.0 Pro transforms static images into 4K video with synchronized audio through fal's API, generating production-ready content in seconds instead of hours.

Last updated: 1/7/2026
Edited by: Zachary Roth
Read time: 6 minutes

Generating Video from Static Images

LTX Video 2.0 Pro generates video at up to 4K (2160p) with synchronized audio from a single input image, priced at $0.06 to $0.24 per second depending on resolution. The model's transformer-based architecture achieves a 1:192 compression ratio through spatiotemporal downscaling, enabling full self-attention across frames while maintaining temporal coherence [1].

This guide covers fal API integration from authentication through production deployment, including pricing, response handling, and optimization strategies.

Pricing and Model Variants

fal offers two LTX-2 image-to-video endpoints with different cost and capability tradeoffs:

| Model | Endpoint | Cost/Second | Duration | Use Case |
| --- | --- | --- | --- | --- |
| Pro | fal-ai/ltx-2/image-to-video | $0.06 (1080p), $0.12 (1440p), $0.24 (2160p) | 6-10s | Production quality with audio |
| Fast | fal-ai/ltx-2/image-to-video/fast | $0.04 (1080p), $0.08 (1440p), $0.16 (2160p) | 6-20s | Rapid iteration, prototyping |

A 6-second 1080p video costs $0.36 with the Pro model. The Fast model supports durations up to 20 seconds, but clips longer than 10 seconds require 25 FPS and 1080p resolution.
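
The pricing arithmetic can be sketched as a small lookup. This is a minimal sketch, assuming the per-second rates listed in this guide are current; `RATES` and `estimateCost` are illustrative names, not part of the fal client.

```javascript
// Per-second rates in USD, copied from the pricing listed in this guide.
// RATES and estimateCost are hypothetical helpers, not fal client APIs.
const RATES = {
  pro: { "1080p": 0.06, "1440p": 0.12, "2160p": 0.24 },
  fast: { "1080p": 0.04, "1440p": 0.08, "2160p": 0.16 },
};

function estimateCost(variant, resolution, durationSeconds) {
  const rate = RATES[variant]?.[resolution];
  if (rate === undefined) {
    throw new Error(`Unknown variant/resolution: ${variant}/${resolution}`);
  }
  // Work in integer cents to avoid floating-point drift.
  const cents = Math.round(rate * 100) * durationSeconds;
  return cents / 100;
}

console.log(estimateCost("pro", "1080p", 6)); // 0.36
console.log(estimateCost("pro", "2160p", 10)); // 2.4
```

The function only multiplies rate by duration; it does not enforce the Fast model's 25 FPS and 1080p requirement for clips longer than 10 seconds.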

API Setup

Store your API key as an environment variable:

export FAL_KEY="your-api-key-here"

Install the client library:

npm install --save @fal-ai/client  # JavaScript
pip install fal-client              # Python

Request Parameters

The endpoint fal-ai/ltx-2/image-to-video requires two parameters: image_url, a publicly accessible URL or base64 data URI in PNG, JPEG, WebP, AVIF, or HEIF format, and prompt, text describing motion and scene dynamics. The remaining parameters are optional:

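| Parameter | Options | Default |
| --- | --- | --- |
| duration | 6, 8, 10 (seconds) | 6 |
| resolution | 1080p, 1440p, 2160p | 1080p |
| aspect_ratio | 16:9 | 16:9 |
| fps | 25, 50 | 25 |
| generate_audio | true, false | true |

import { fal } from "@fal-ai/client";

// subscribe submits the request and waits for the result.
const result = await fal.subscribe("fal-ai/ltx-2/image-to-video", {
  input: {
    image_url: "https://your-domain.com/image.jpg",
    prompt:
      "The camera slowly dollies in as gentle ripples move across the water surface",
    duration: 6,
    resolution: "1080p",
  },
  logs: true,
  // Print progress logs while the request is running.
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      update.logs.map((log) => log.message).forEach((message) => console.log(message));
    }
  },
});

// The response body is on result.data; requestId identifies the request.
console.log(result.data.video.url, result.requestId);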
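
The parameter constraints listed in this section can also be checked client-side before a request is submitted. The following is a minimal sketch; `ALLOWED` and `validateInput` are hypothetical names, and the API performs its own validation regardless.

```javascript
// Allowed values for the Pro endpoint, taken from the parameter list in this guide.
// validateInput is a hypothetical helper, not part of the fal client.
const ALLOWED = {
  duration: [6, 8, 10],
  resolution: ["1080p", "1440p", "2160p"],
  aspect_ratio: ["16:9"],
  fps: [25, 50],
  generate_audio: [true, false],
};

function validateInput(input) {
  const errors = [];
  for (const field of ["image_url", "prompt"]) {
    if (typeof input[field] !== "string" || input[field].trim() === "") {
      errors.push(`${field} is required`);
    }
  }
  for (const [field, allowed] of Object.entries(ALLOWED)) {
    // Optional fields may be omitted; the API then applies its default.
    if (input[field] !== undefined && !allowed.includes(input[field])) {
      errors.push(`${field} must be one of: ${allowed.join(", ")}`);
    }
  }
  return errors;
}
```

An empty array means the input passed every local check. Returning all errors at once, rather than throwing on the first, lets a form or log show everything that needs fixing.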

Response Schema

The API returns video metadata, which the JavaScript client exposes as result.data:

{
  "video": {
    "url": "https://v3.fal.media/files/...",
    "content_type": "video/mp4",
    "file_name": "ltxv-2-i2v-output.mp4"
  }
}

The requestId returned by fal.subscribe enables status polling and debugging. Video URLs are temporary; persist outputs to your own storage for production use.
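
Because the URLs are temporary, a typical next step is to pull the video metadata out of the result and download the bytes for your own storage. The sketch below assumes the client wraps the response as `{ data, requestId }`; `extractVideo` and `downloadVideo` are hypothetical helpers, and the fallback values are assumptions.

```javascript
// Pulls the video metadata out of a fal.subscribe result.
// Assumes the client wraps the response as { data, requestId }.
function extractVideo(result) {
  const video = result?.data?.video;
  if (!video || typeof video.url !== "string") {
    throw new Error(`No video in response (requestId: ${result?.requestId})`);
  }
  return {
    url: video.url,
    // Fallbacks are assumptions for responses that omit these fields.
    contentType: video.content_type ?? "video/mp4",
    fileName: video.file_name ?? `${result.requestId}.mp4`,
    requestId: result.requestId,
  };
}

// Downloads the temporary URL so the bytes can be written to your own storage.
// Uses the global fetch available in Node 18+ and browsers by default.
async function downloadVideo(video, fetchImpl = fetch) {
  const response = await fetchImpl(video.url);
  if (!response.ok) {
    throw new Error(`Download failed with status ${response.status}`);
  }
  return new Uint8Array(await response.arrayBuffer());
}
```

The returned bytes can then be uploaded to whichever object store or CDN you use; that step is left out because it depends on your infrastructure.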

Prompt Structure

Effective prompts describe camera movement and subject motion rather than static scene descriptions. The model responds well to cinematographic language:

  • Camera movement: "dollies in," "pans left," "tracking shot," "static wide shot"
  • Subject motion: "walks slowly," "hair blowing in wind," "ripples across surface"
  • Temporal pacing: "gradually," "sudden," "continuous"

Ineffective: "A woman in a denim jacket on a city street at night"

Effective: "The camera slowly dollies in toward her face as people blur past, city lights flicker and reflections shift across her denim jacket"

Avoid prompts that redescribe the input image. Focus on what should change.
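
If prompts are assembled programmatically, the three categories above can be kept as separate fields and joined at request time. This is a minimal sketch; `buildMotionPrompt` is a hypothetical helper, and joining fragments with commas is only one possible convention.

```javascript
// Joins motion-focused fragments into a single prompt string.
// buildMotionPrompt is a hypothetical helper, not part of the fal client.
function buildMotionPrompt({ camera, subject, pacing }) {
  const parts = [camera, subject, pacing]
    .filter((part) => typeof part === "string" && part.trim() !== "")
    // Drop trailing punctuation so the joined sentence reads cleanly.
    .map((part) => part.trim().replace(/[.,;]+$/, ""));
  if (parts.length === 0) {
    throw new Error("Describe at least one kind of motion");
  }
  return parts.join(", ");
}

const prompt = buildMotionPrompt({
  camera: "The camera slowly dollies in toward her face",
  subject: "people blur past as city lights flicker",
  pacing: "reflections shift gradually across her denim jacket",
});
console.log(prompt);
```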

Async and Webhook Integration

For high-volume applications, use the queue API with webhooks instead of blocking:

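// image_url and prompt hold the same values as in the earlier request.
const { request_id } = await fal.queue.submit("fal-ai/ltx-2/image-to-video", {
  input: { image_url, prompt },
  // The result is delivered to this URL when generation finishes.
  webhookUrl: "https://your-domain.com/webhook",
});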

The webhook receives the complete response payload when generation finishes. Use request_id to correlate webhook responses with original requests.
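
Correlating deliveries with submissions can be sketched with an in-memory map. The webhook body shape used here (`request_id`, `status`, `payload`, `error`) is an assumption that should be checked against fal's webhook documentation; `trackRequest` and `handleWebhook` are hypothetical helpers.

```javascript
// Tracks submitted jobs so webhook deliveries can be matched to them.
// The body shape ({ request_id, status, payload, error }) is an assumption
// to verify against fal's webhook documentation.
const pending = new Map();

function trackRequest(requestId, context) {
  pending.set(requestId, context);
}

function handleWebhook(body) {
  const context = pending.get(body.request_id);
  if (context === undefined) {
    // Unknown or already-processed request; ignore duplicates safely.
    return { matched: false };
  }
  pending.delete(body.request_id);
  if (body.status !== "OK") {
    return { matched: true, ok: false, context, error: body.error ?? "unknown error" };
  }
  return { matched: true, ok: true, context, videoUrl: body.payload?.video?.url };
}
```

Call `trackRequest(request_id, { userId })` right after `fal.queue.submit` resolves, then pass the parsed webhook body to `handleWebhook`. In a multi-process deployment the map would need to be a database or shared cache.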

Error Handling

Common failure modes:

  • Invalid image URL: Image not publicly accessible or returns non-image content type
  • Unsupported format: Format not in PNG, JPEG, WebP, AVIF, HEIF
  • Rate limiting: Too many concurrent requests from your API key
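
Of these, rate limiting is usually worth retrying, while invalid URLs and unsupported formats are not. The sketch below retries with exponential backoff, assuming rate-limit errors carry a `status` of 429; that assumption, and the helper names `backoffDelay` and `withRetry`, are illustrative and should be adjusted to the errors you actually observe.

```javascript
// Exponential backoff: 1s, 2s, 4s, ... capped at maxMs.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Retries only when the thrown error carries status 429 (an assumption
// about how rate limiting surfaces). Other errors are rethrown immediately.
async function withRetry(task, { retries = 3, sleep = defaultSleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      if (error?.status !== 429 || attempt >= retries) throw error;
      await sleep(backoffDelay(attempt));
    }
  }
}
```

A call such as `withRetry(() => fal.subscribe(endpoint, { input }))` would wrap the request from the earlier example. The `sleep` option exists so tests can skip the real delay.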

Constraints

The API currently supports only the 16:9 aspect ratio. Input images are cropped to fit, so plan compositions accordingly. When generate_audio is set to false, the output video contains no audio track.
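
To preview what a 16:9 crop keeps, the region can be computed locally. This is a minimal sketch that assumes a centered crop; the guide does not state which crop anchor the API uses, and `centerCrop16x9` is a hypothetical helper.

```javascript
// Computes the largest centered 16:9 region inside an image.
// Center cropping is an assumption; the API's actual crop anchor is not
// stated in this guide.
function centerCrop16x9(width, height) {
  const target = 16 / 9;
  let cropWidth = width;
  let cropHeight = Math.round(width / target);
  if (cropHeight > height) {
    // Image is wider than 16:9, so trim the sides instead.
    cropHeight = height;
    cropWidth = Math.round(height * target);
  }
  return {
    x: Math.floor((width - cropWidth) / 2),
    y: Math.floor((height - cropHeight) / 2),
    width: cropWidth,
    height: cropHeight,
  };
}

console.log(centerCrop16x9(1024, 1024)); // { x: 0, y: 224, width: 1024, height: 576 }
```

For a square 1024 by 1024 input, 224 pixels would be lost from both the top and the bottom under this assumption, which is a quick way to check whether the subject survives the crop.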

Cost Optimization

Resolution dominates cost. A 10-second 4K video costs $2.40, while the same duration at 1080p costs $0.60. For preview workflows, generate at 1080p first, then regenerate approved outputs at higher resolutions.

Cache generated videos with their input parameters (image hash + prompt) to avoid regenerating identical content.
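
One way to sketch that cache is a deterministic key over every input that affects the output. Here `imageHash` is assumed to be a content hash (for example a SHA-256 hex digest) computed elsewhere, the defaults mirror the parameter defaults in this guide, and `cacheKey` and `getOrGenerate` are hypothetical helpers.

```javascript
// Builds a deterministic cache key from the inputs that affect the output.
// imageHash is assumed to be a content hash computed elsewhere.
function cacheKey({ imageHash, prompt, duration = 6, resolution = "1080p", fps = 25, generateAudio = true }) {
  // A fixed field order keeps the key stable across call sites.
  return JSON.stringify([imageHash, prompt.trim(), duration, resolution, fps, generateAudio]);
}

const cache = new Map();

// generate must return a promise, e.g. a function wrapping fal.subscribe.
async function getOrGenerate(params, generate) {
  const key = cacheKey(params);
  if (!cache.has(key)) {
    // Store the promise so concurrent identical requests share one generation.
    const inFlight = generate(params).catch((error) => {
      cache.delete(key); // Do not cache failures.
      throw error;
    });
    cache.set(key, inFlight);
  }
  return cache.get(key);
}
```

An in-memory map only lasts for the life of the process; a persistent store keyed the same way would be needed to save cost across restarts.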

Production Checklist

Before deploying:

  • Validate that image URLs return a 200 status before submitting to avoid wasted API calls
  • Implement webhook handlers for async queue results
  • Store videos in your own CDN; fal URLs are temporary
  • Log requestId for debugging failed generations
  • Set up cost alerts based on expected usage volume
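
The first checklist item can be sketched as a preflight request. This is a minimal sketch under stated assumptions: `preflightImage` is a hypothetical helper, the MIME type list is inferred from the supported formats named in this guide, and some hosts reject HEAD requests, in which case a GET may be needed instead. It does not apply to base64 data URIs.

```javascript
// MIME types inferred from the formats this guide lists as supported.
const SUPPORTED_TYPES = ["image/png", "image/jpeg", "image/webp", "image/avif", "image/heif"];

// Checks that an image URL is reachable and serves a supported image type.
// Uses the global fetch available in Node 18+ and browsers by default.
async function preflightImage(url, fetchImpl = fetch) {
  const response = await fetchImpl(url, { method: "HEAD" });
  if (response.status !== 200) {
    return { ok: false, reason: `HTTP ${response.status}` };
  }
  // Strip parameters such as "; charset=binary" before comparing.
  const header = response.headers.get("content-type") ?? "";
  const mime = header.split(";")[0].trim().toLowerCase();
  if (!SUPPORTED_TYPES.includes(mime)) {
    return { ok: false, reason: `Unsupported content type: ${mime || "none"}` };
  }
  return { ok: true, mime };
}
```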

The fal platform scales automatically. The same integration handles both development testing and production traffic without code changes.

Next Steps

The retake-video endpoint enables selective editing of specific video segments at $0.10/second. For different quality and motion characteristics, evaluate Kling Video v2.6 or Pixverse. Complete API schema and additional examples are available in the fal documentation.


References

  1. HaCohen, Y., et al. "LTX-Video: Realtime Video Latent Diffusion." arXiv preprint arXiv:2501.00103, 2024. https://arxiv.org/abs/2501.00103

about the author
Zachary Roth
A generative media engineer with a focus on growth, Zach has deep expertise in building RAG architecture for complex content systems.
