Model Gallery
Veo 3
Veo 3 by Google, the most advanced AI video generation model in the world. Now available at fal with sound on!
Kling 2.1 Master
Kling 2.1 Master: The premium endpoint for Kling 2.1, designed for top-tier image-to-video generation with unparalleled motion fluidity, cinematic visuals, and exceptional prompt precision.
Featured Models
Check out some of our most popular models
MiniMax Hailuo-02 Image To Video API (Standard, 768p): Advanced image-to-video generation model with 768p resolution
FLUX.1 Kontext [pro] handles both text and reference images as inputs, seamlessly enabling targeted, local edits and complex transformations of entire scenes.
Google’s highest quality image generation model
Generate high quality video clips from text and image prompts using PixVerse v4.5
Generate video clips from your images using Kling 2.0 Master
Wan Effects generates high-quality videos with popular effects from images
Veo 2 creates videos from images with realistic motion and very high quality output.
All Models
Explore all available models provided by fal.ai
Wan-2.1 Pro is a premium image-to-video model that generates high-quality 1080p videos at 30fps, up to 6 seconds long, with exceptional visual quality and motion diversity.
Wan-2.1 is an image-to-video model that generates videos with high visual quality and motion diversity from images
Generate video clips from your images using Kling 1.6 (pro)
Train styles, people and other subjects at blazing speeds.
Generate natural-sounding multi-speaker dialogues and audio. Perfect for expressive outputs, storytelling, games, animations, and interactive media.
FLUX1.1 [pro] ultra is the newest version of FLUX1.1 [pro], maintaining professional-grade image quality while delivering up to 2K resolution with improved photo realism.
Recraft V3 is a text-to-image model that can generate long texts, vector art, images in brand style, and much more. As of today, it is SOTA in image generation according to Hugging Face's industry-leading Text-to-Image Benchmark by Artificial Analysis.
Generate video clips from your images using MiniMax Video model
MiniMax Hailuo-02 Text To Video API (Standard, 768p): Advanced video generation model with 768p resolution
Kling 2.1 Standard is a cost-efficient endpoint for the Kling 2.1 model, delivering high-quality image-to-video generation
Generate lip sync using Tavus' state-of-the-art model for high-quality synchronization.
Generate video clips from your prompts using Kling 2.0 Master
HiDream-I1 full is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds.
HiDream-I1 dev is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds.
HiDream-I1 fast is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within 16 steps.
FLUX.1 [dev] is a 12 billion parameter flow transformer that generates high-quality images from text. It is suitable for personal and commercial use.
MMAudio generates synchronized audio given video and/or text inputs. It can be combined with video models to get videos with audio.
Generate high-quality images, posters, and logos with Ideogram V2. Features exceptional typography handling and realistic outputs optimized for commercial and creative use.
FLUX LoRA training optimized for portrait generation, with bright highlights, excellent prompt following and highly detailed results.
Stable Diffusion 3.5 Large is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency.
Super fast endpoint for the FLUX.1 [dev] inpainting model with LoRA support, enabling rapid and high-quality image inpainting using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs.
A versatile endpoint for the FLUX.1 [dev] model that supports multiple AI extensions including LoRA, ControlNet conditioning, and IP-Adapter integration, enabling comprehensive control over image generation through various guidance methods.
Super fast endpoint for the FLUX.1 [dev] model with LoRA support, enabling rapid and high-quality image generation using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs.
FLUX.1 Image-to-Image is a high-performance endpoint for the FLUX.1 [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.
Upscale your images with AuraSR.
Clarity upscaler for upscaling images with very high fidelity.
VACE is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources.