Model Gallery
Fal.ai demos with unmatched AI speed
![Live Portrait - Realtime thumbnail](/demos/live-portrait.webp)
![Live Portrait - Realtime animated thumbnail](/demos/live-portrait-animated.webp)
![Lightning thumbnail](/demos/sdxl-lightning.webp)
![Lightning animated thumbnail](/demos/sdxl-lightning-animated.webp)
![Dynamic by Fal thumbnail](/demos/dynamic.webp)
![Dynamic by Fal animated thumbnail](/demos/dynamic-animated.webp)
Explore Models
AuraFlow
Fully open, flow-based text-to-image model
Stable Diffusion V3
Run SD3 at the speed of light
Stable Diffusion XL
Run SDXL at the speed of light
Stable Diffusion with LoRAs
Run any Stable Diffusion model with customizable LoRA weights.
AuraSR
Upscale your images with AuraSR.
Stable Cascade
Stable Cascade: Image generation on a smaller & cheaper latent space.
High Quality Stable Video Diffusion
Generate short video clips from your images using SVD v1.1
BiRefNet Background Removal
Bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS).
Creative Upscaler
Create creatively upscaled images.
Clarity Upscaler
Clarity upscaler for images with high fidelity.
CCSR Upscaler
SOTA Image Upscaler
Stable Diffusion Turbo (v1.5/XL)
Run SDXL at the speed of light
Latent Consistency Models (v1.5/XL)
Run SDXL at the speed of light
Whisper
Whisper is a model for speech transcription and translation.
Wizper (Whisper v3 -- fal.ai edition)
[Experimental] Whisper v3 Large -- but optimized by our inference wizards. Same WER, double the performance!
Stable Diffusion XL Lightning
Run SDXL at the speed of light
Hyper SDXL
Hyper-charge SDXL's performance and creativity.
Playground v2.5
State-of-the-art open-source model in aesthetic quality
AMT Interpolation
Interpolate between video frames
T2V Turbo - Video Crafter
Generate short video clips from your prompts
SD 1.5 Depth ControlNet
SD 1.5 ControlNet
ControlNet Tile Upscaler
ControlNet Tile Upscaler
PhotoMaker
Customizing Realistic Human Photos via Stacked ID Embedding
Latent Consistency (SDXL & SDv1.5)
Produce high-quality images with minimal inference steps.
Optimized Latent Consistency (SDv1.5)
Produce high-quality images with minimal inference steps. Optimized for 512x512 input image size.
Fooocus
Default parameters with automated optimizations and quality improvements.
InstantID
Zero-shot Identity-Preserving Generation in Seconds
AnimateDiff Video-to-Video Evolved
Re-animate your videos with evolved consistency!
AnimateDiff
Animate your ideas!
AnimateDiff Turbo
Animate your ideas at lightning speed!
MetaVoice
MetaVoice-1B is a 1.2B-parameter base model trained on 100K hours of speech for TTS (text-to-speech).
Illusion Diffusion
Create illusions conditioned on an image.
Segment Anything Model
SAM.
TinySAM Distilled Segment Anything Model
TinySAM.
Midas Depth Estimation
Create depth maps using Midas depth estimation.
Remove Background
Remove the background from an image.
Upscale Images
Upscale images by a given factor.
ControlNet SDXL
Generate Images with ControlNet.
Inpainting SDXL and SD
Inpaint images with SD and SDXL.
AnimateDiff SparseCtrl LCM
Animate Your Drawings with Latent Consistency Models!
Swap Face
Swap a face between two images.
PuLID
Tuning-free ID customization.
IP Adapter Face ID
High-quality zero-shot personalization
Marigold Depth Estimation
Create depth maps using Marigold depth estimation.
XTTS
Stable Audio Open
Open source text-to-audio model.
DiffusionEdge
Diffusion-based, high-quality edge detection
Stable Diffusion XL Image to Image with LoRAs
Run Stable Diffusion XL with customizable LoRA weights.
TripoSR
State-of-the-art image-to-3D object generation
Face Retoucher
Automatically retouches faces to smooth skin and remove blemishes.
LLaVA v1.5 13B
Vision
LLaVA v1.6 34B
Vision
NSFW Filter
Predict the probability of an image being NSFW.
SUPIR Upscaler
A Powerful Image Upscaler
Face to Sticker
Create stickers from faces.
Moondream
Answer questions about images.
Sad Talker
Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Dreamshaper
Dreamshaper model.
Realistic Vision
Generate realistic images.
Lightning Models
Collection of SDXL Lightning models.
Omni Zero
Any pose, any style, any identity
LLaVA Phi 3 Mini
A LLaVA model fine-tuned from microsoft/Phi-3-mini-4k-instruct and CLIP-ViT-Large-patch14-336 with ShareGPT4V-PT and InternVL-SFT by XTuner.
Lipsync
A lipsync model that synchronizes speech to face movements.
Qwen VL Chat 7B Int4
A visual multimodal version of Qwen (abbr. Tongyi Qianwen), the large model series proposed by Alibaba Cloud. Qwen-VL accepts images, text, and bounding boxes as input and outputs text and bounding boxes.
LLaVA Llama3 8B
A model fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct and CLIP-ViT-Large-patch14-336 with LLaVA-Pretrain and LLaVA-Instruct by XTuner.
DWPose Pose Prediction
Predict poses.
SoteDiffusion
Anime finetune of Würstchen V3.
Florence-2 Large
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks.
Live Portrait
Transfer expression from a video to a portrait.
MusePose
Animate a reference image with a driving video using MusePose.
LaMa
Remove objects from an image using a mask