Model Gallery

Wan-2.1 Pro is a premium image-to-video model that generates high-quality 1080p videos at 30fps, up to 6 seconds long, with exceptional visual quality and motion diversity.

image to video

motion

fal-ai/veo2/image-to-video

image-to-video

Veo 2 creates videos from images with realistic motion and very high quality output.

Wan-2.1 is an image-to-video model that generates videos with high visual quality and motion diversity from images.

image to video

motion

fal-ai/kling-video/v1.6/pro/image-to-video

image-to-video

Generate video clips from your images using Kling 1.6 (pro).

fal-ai/flux-lora-fast-training

training

Train styles, people and other subjects at blazing speeds.

lora

personalization

All Models

Explore all available models provided by fal.ai

fal-ai/playai/tts/dialog

text-to-audio

Generate natural-sounding multi-speaker dialogues and audio. Perfect for expressive outputs, storytelling, games, animations, and interactive media.

audio

fal-ai/flux-pro/v1.1-ultra

text-to-image

FLUX1.1 [pro] ultra is the newest version of FLUX1.1 [pro], maintaining professional-grade image quality while delivering up to 2K resolution with improved photo realism.

high-res

realism

fal-ai/recraft/v3/text-to-image

text-to-image

Recraft V3 is a text-to-image model that can generate long texts, vector art, images in brand style, and much more. As of today, it is state of the art (SOTA) in image generation, as shown by the Text-to-Image Benchmark by Artificial Analysis on Hugging Face.

fal-ai/minimax/video-01/image-to-video

image-to-video

Generate video clips from your images using the MiniMax Video model.

motion

transformation

fal-ai/tavus/hummingbird-lipsync/v0

video-to-video

Generate lip sync using Tavus' state-of-the-art model for high-quality synchronization.

fal-ai/kling-video/v2/master/text-to-video

text-to-video

Generate video clips from your prompts using Kling 2.0 Master.

fal-ai/hidream-i1-full

text-to-image

HiDream-I1 full is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds.

fal-ai/hidream-i1-dev

text-to-image

HiDream-I1 dev is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds.

fal-ai/hidream-i1-fast

text-to-image

HiDream-I1 fast is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within 16 steps.

cassetteai/video-sound-effects-generator

video-to-video

Add sound effects to your videos

FLUX.1 [dev] is a 12 billion parameter flow transformer that generates high-quality images from text. It is suitable for personal and commercial use.

cassetteai/music-generator

text-to-audio

CassetteAI’s model generates a 30-second sample in under 2 seconds and a full 3-minute track in under 10 seconds. At 44.1 kHz stereo audio, expect a level of professional consistency with no breaks, no squeaks, and no random interruptions in your creations.

MMAudio generates synchronized audio given video and/or text inputs. It can be combined with video models to get videos with audio.

Generate high-quality images, posters, and logos with Ideogram V2. Features exceptional typography handling and realistic outputs optimized for commercial and creative use.

realism

typography

fal-ai/flux-lora-portrait-trainer

training

FLUX LoRA training optimized for portrait generation, with bright highlights, excellent prompt following and highly detailed results.

lora

personalization

fal-ai/stable-diffusion-v35-large

text-to-image

Stable Diffusion 3.5 Large is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency.

diffusion

typography

style

fal-ai/flux-lora/inpainting

text-to-image

Super fast endpoint for the FLUX.1 [dev] inpainting model with LoRA support, enabling rapid and high-quality image inpainting using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs.

A versatile endpoint for the FLUX.1 [dev] model that supports multiple AI extensions including LoRA, ControlNet conditioning, and IP-Adapter integration, enabling comprehensive control over image generation through various guidance methods.

Super fast endpoint for the FLUX.1 [dev] model with LoRA support, enabling rapid and high-quality image generation using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs.
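Each entry in this gallery pairs an endpoint ID with a category label (for example, text-to-image or image-to-video). As a rough illustration of how such a listing could be organized in code, the sketch below groups a few of the endpoint IDs shown above by their category. The `CATALOG` list and the `group_by_category` helper are hypothetical names introduced only for this example; they are not part of any fal.ai client library, and the sketch makes no network calls.

```python
from collections import defaultdict

# Hypothetical in-memory copy of a few gallery entries listed above:
# (endpoint ID, category label), exactly as they appear on the page.
CATALOG = [
    ("fal-ai/veo2/image-to-video", "image-to-video"),
    ("fal-ai/kling-video/v1.6/pro/image-to-video", "image-to-video"),
    ("fal-ai/flux-lora-fast-training", "training"),
    ("fal-ai/playai/tts/dialog", "text-to-audio"),
    ("fal-ai/flux-pro/v1.1-ultra", "text-to-image"),
    ("fal-ai/recraft/v3/text-to-image", "text-to-image"),
    ("cassetteai/music-generator", "text-to-audio"),
]


def group_by_category(entries):
    """Return a dict mapping each category to its endpoint IDs, in listing order."""
    groups = defaultdict(list)
    for endpoint_id, category in entries:
        groups[category].append(endpoint_id)
    return dict(groups)


if __name__ == "__main__":
    # Print one block per category, categories sorted alphabetically.
    for category, endpoint_ids in sorted(group_by_category(CATALOG).items()):
        print(category)
        for endpoint_id in endpoint_ids:
            print("  " + endpoint_id)
```

Grouping by the category label rather than by the last path segment of the ID is deliberate: several IDs above (such as fal-ai/playai/tts/dialog) do not end in their category name.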