LTX Video-0.9.7 13B Distilled Video to Video
LTX Video-0.9.7 13B Distilled [multiconditioning]
Transform your existing videos and images into stunning new visual compositions using advanced multi-conditioning AI that seamlessly blends prompts, images, and video inputs into cohesive generated content.
Overview
LTX Video-0.9.7 13B Distilled delivers powerful video-to-video transformation capabilities through a 13-billion parameter architecture optimized for multi-conditional generation. Built to handle complex creative workflows, this model excels at combining text prompts with image and video inputs to produce high-quality video outputs. Whether you're transforming existing footage, applying artistic styles, or creating hybrid compositions, the model delivers professional results without requiring specialized video editing expertise.
Key Capabilities
- Multi-conditional generation combining text, images, and video inputs
- Flexible resolution support with 480p and 720p output options
- Custom LoRA integration for specialized style and content control
- Production-ready output with configurable frame rates and compression
Popular Use Cases
Video Style Transfer and Artistic Reimagining
Transform existing video footage into entirely new artistic styles by combining source videos with style reference images and descriptive prompts. Apply kaleidoscopic effects, convert realistic footage into abstract compositions, or reimagine scenes through the lens of specific artistic movements like impressionism or pixel art aesthetics.
Content Variation and A/B Testing
Generate multiple variations of video content for marketing campaigns by modifying specific visual elements while maintaining core composition. Test different color palettes, backgrounds, or stylistic treatments across the same base footage to optimize engagement and identify the most effective creative direction.
Hybrid Media Composition
Create unique video content by blending multiple image references with video inputs and detailed text prompts. Construct complex visual narratives that seamlessly transition between different artistic styles, combine disparate visual elements into cohesive sequences, or generate videos that match specific brand guidelines and aesthetic requirements.
Getting Started
Getting up and running with LTX Video-0.9.7 13B Distilled takes just a few minutes. Here's how to begin:
- Get your API key at https://fal.ai/login to access the model
- Install the client library for your preferred programming language
- Make your first API call to generate a video from your inputs
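The three steps above can be sketched in Python with the fal_client library. The endpoint ID, argument names, and response shape shown here are assumptions based on this page, so check the API reference for the exact schema.

```python
# Minimal request sketch. Assumes: pip install fal-client and a FAL_KEY set
# in the environment. Endpoint ID and field names are illustrative
# assumptions, not the confirmed schema.
arguments = {
    "prompt": (
        "A vibrant, abstract composition featuring a person with "
        "outstretched arms, rendered in a kaleidoscope of colors"
    ),
    "video_url": "https://example.com/source.mp4",  # hypothetical source clip
    "resolution": "720p",
}

# Uncomment to run against the live API (requires credentials):
# import fal_client
# result = fal_client.subscribe(
#     "fal-ai/ltx-video-13b-distilled/multiconditioning",  # assumed endpoint ID
#     arguments=arguments,
# )
# print(result["video"]["url"])
```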
Technical Specifications
Model Architecture
- 13 billion parameter distilled model optimized for video generation
- Multi-conditional architecture supporting text, image, and video inputs
- Two-pass inference system for enhanced quality and detail refinement
- Custom LoRA support for specialized fine-tuning
Input Capabilities
- Resolution options: 480p and 720p output
- Aspect ratios: 9:16 (portrait), 1:1 (square), 16:9 (landscape), or automatic detection
- Frame count: Configurable up to 121 frames per generation
- Frame rate: Adjustable, default 30 fps
- Multiple image and video conditioning inputs with frame-level control
- Supported input formats: JPG, JPEG, PNG, WEBP, GIF, AVIF for images
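The resolution, aspect ratio, and frame options above might map onto request fields like the following. The field names are assumptions for illustration, not the confirmed schema.

```python
# Illustrative settings covering the input options listed above.
# Field names are assumptions; consult the API reference for exact names.
settings = {
    "resolution": "480p",    # "480p" or "720p"
    "aspect_ratio": "9:16",  # "9:16", "1:1", "16:9", or automatic detection
    "num_frames": 121,       # configurable up to 121 frames
    "frame_rate": 30,        # default 30 fps
}

# Simple sanity checks before sending a request:
assert settings["resolution"] in {"480p", "720p"}
assert 1 <= settings["num_frames"] <= 121
```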
Performance Features
- Configurable inference steps for quality-speed balance
- Built-in safety checker for content moderation
- Automatic prompt expansion using language models
- Constant rate factor (CRF) compression for optimized input processing (default: 35)
- Reverse video playback option for creative effects
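Taken together, the performance options above might appear in a request body as follows; these field names are hypothetical stand-ins, not the confirmed schema.

```python
# Hypothetical performance-related fields based on the features listed above.
performance = {
    "enable_safety_checker": True,   # built-in content moderation
    "expand_prompt": True,           # automatic prompt expansion via an LLM
    "constant_rate_factor": 35,      # CRF compression for inputs (default: 35)
    "reverse_video": False,          # reverse playback for creative effects
}
```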
Best Practices
Achieve optimal results with these proven approaches:
Layer Your Conditioning Inputs Strategically
When combining multiple images or videos, consider the temporal placement and strength of each input. Start with a base image at frame 0 with higher strength (0.7-0.9) to establish the foundational composition, then introduce additional conditioning at later frames with lower strength (0.3-0.6) to guide transitions. This approach creates smoother visual evolution rather than abrupt changes. For example, when transforming a portrait into an abstract style, use your source portrait at full strength initially, then introduce style references at mid-sequence frames with reduced strength to create gradual artistic transformation.
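The layering strategy above can be sketched as a list of conditioning entries. The field names (image_url, start_frame_num, strength) and URLs are assumptions for illustration.

```python
# Layered image conditioning per the guidance above: a strong base at
# frame 0, then a weaker style reference introduced mid-sequence.
# Field names are illustrative assumptions, not the confirmed schema.
image_conditions = [
    {"image_url": "https://example.com/portrait.png",
     "start_frame_num": 0,
     "strength": 0.85},   # 0.7-0.9 establishes the foundational composition
    {"image_url": "https://example.com/style-ref.png",
     "start_frame_num": 60,
     "strength": 0.45},   # 0.3-0.6 guides the transition
]

# Check that strengths fall in the recommended ranges:
base, style = image_conditions
assert 0.7 <= base["strength"] <= 0.9
assert 0.3 <= style["strength"] <= 0.6
```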
Craft Detailed, Compositional Prompts
Instead of generic descriptions like "make it artistic," construct prompts that describe the complete visual composition, color palette, artistic movement, and camera behavior. A prompt like "A vibrant, abstract composition featuring a person with outstretched arms, rendered in a kaleidoscope of colors against a deep, dark background, with intricate swirling patterns in hues of orange, yellow, blue, and green. The camera slowly zooms into the face" provides clear guidance for coherent generation across all frames.
Balance Quality and Efficiency with Inference Steps
The default two-pass system uses 8 steps per pass, which balances quality and generation time effectively. For faster iterations during creative exploration, reduce both passes to 4-6 steps; for final production renders requiring maximum detail, increase both to 10-12 steps. Two additional parameters (defaults: 1 and 5) control the division of labor between passes: the first pass handles broad composition while the second refines details.
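As a sketch, draft and final-render presets under the two-pass guidance might look like this. The parameter names are hypothetical stand-ins for the real step parameters in the API schema.

```python
# Hypothetical step presets for the two-pass system (default: 8 steps per pass).
# Parameter names are illustrative; check the API reference for the real ones.
draft = {
    "first_pass_steps": 5,    # 4-6 for fast creative exploration
    "second_pass_steps": 5,
}
final_render = {
    "first_pass_steps": 11,   # 10-12 for maximum detail
    "second_pass_steps": 11,
}
```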
Advanced Features
Custom LoRA Integration
Enhance your generations with specialized LoRA weights trained for specific styles, subjects, or artistic approaches. The model accepts multiple LoRA weights simultaneously, each with adjustable scale parameters to control their influence on the final output. This enables precise style control beyond what text prompts alone can achieve.
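Multiple LoRA weights with per-weight scales might be passed like this; the field names and URLs are assumptions for illustration.

```python
# Illustrative LoRA list: each entry carries its own scale to control its
# influence on the output. Field names and URLs are assumptions.
loras = [
    {"path": "https://example.com/pixel-art-lora.safetensors", "scale": 0.8},
    {"path": "https://example.com/film-grain-lora.safetensors", "scale": 0.4},
]
```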
Video Conditioning
Beyond image inputs, you can use existing video footage as conditioning to guide motion, composition, and temporal consistency. Specify the starting frame and strength for each video input to control how it influences different segments of your generated output.
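A video conditioning entry follows the same frame-and-strength pattern as image conditioning; the field names here are assumptions, not the confirmed schema.

```python
# Hypothetical video conditioning entry: the source clip guides motion and
# composition from frame 0 at moderate strength.
video_conditions = [
    {"video_url": "https://example.com/motion-ref.mp4",
     "start_frame_num": 0,
     "strength": 0.6},
]
```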
File Upload Support
For optimal performance with the fal.ai platform, you can upload your conditioning images and videos directly to fal.ai's storage system. This approach reduces latency and ensures reliable access during generation, particularly important for larger video files.
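With the fal_client Python library, uploading a local file and receiving a hosted URL looks roughly like the sketch below; fal_client.upload_file is part of the client, but verify the call against the current client documentation. The live call is commented out so the sketch stays runnable offline.

```python
# Upload a local conditioning asset to fal.ai storage, then reference the
# returned URL in your request (requires fal-client and a FAL_KEY).
local_path = "assets/style-ref.png"  # hypothetical local file

# import fal_client
# hosted_url = fal_client.upload_file(local_path)
# arguments["image_url"] = hosted_url
```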
API Reference
Pricing and Usage
This model costs $0.04 per video generation. For $1, you can generate approximately 25 videos, making it cost-effective for both experimentation and production workflows. The pricing applies regardless of resolution (480p or 720p), aspect ratio, or frame count within supported limits (up to 121 frames).
For high-volume projects or enterprise deployments, contact sales for custom pricing and dedicated support options.
View complete pricing details at https://fal.ai/pricing
Support and Resources
We're here to help you succeed with LTX Video-0.9.7 13B Distilled:
- Documentation: Explore comprehensive guides and tutorials at fal.ai documentation
- API Playground: Test the model interactively and experiment with parameters on the model playground page
- API Reference: Access detailed endpoint documentation and code examples at the API documentation
- Community Support: Connect with other developers on Discord and share your creations to discover new techniques and applications
Ready to transform your video content with multi-conditional AI generation? Sign up now at https://fal.ai and start creating with LTX Video-0.9.7 13B Distilled.