Model Gallery | fal.ai

Model Gallery

See all available model APIs provided by fal.ai

FLUX.1 Kontext

FLUX.1 Kontext is an image-to-image model that understands what's going on in your image—and lets you tweak, transform, and experiment with it like never before. You can try FLUX.1 Kontext [pro] or FLUX.1 Kontext [max] now. FLUX.1 Kontext [dev] will be available soon.

Featured Models

Check out some of our most popular models

fal-ai/kling-video/v2.1/master/image-to-video

Kling 2.1 Master: The premium endpoint for Kling 2.1, designed for top-tier image-to-video generation with unparalleled motion fluidity, cinematic visuals, and exceptional prompt precision.

fal-ai/kling-video/v2.1/standard/image-to-video

Kling 2.1 Standard is a cost-efficient endpoint for the Kling 2.1 model, delivering high-quality image-to-video generation

fal-ai/flux-pro/kontext

FLUX.1 Kontext [pro] handles both text and reference images as inputs, seamlessly enabling targeted, local edits and complex transformations of entire scenes.

fal-ai/imagen4/preview

Google’s highest quality image generation model

fal-ai/pixverse/v4.5/image-to-video

Generate high quality video clips from text and image prompts using PixVerse v4.5

fal-ai/kling-video/v2/master/image-to-video

Generate video clips from your images using Kling 2.0 Master

fal-ai/wan-effects

Wan Effects generates high-quality videos with popular effects from images

fal-ai/veo2/image-to-video

Veo 2 creates videos from images with realistic motion and very high quality output.

fal-ai/wan-pro/image-to-video

Wan-2.1 Pro is a premium image-to-video model that generates high-quality 1080p videos at 30fps with up to 6 seconds duration, delivering exceptional visual quality and motion diversity from images

All Models

Explore all available models provided by fal.ai

Wan-2.1 is an image-to-video model that generates videos with high visual quality and motion diversity from images

fal-ai/kling-video/v1.6/pro/image-to-video

Generate video clips from your images using Kling 1.6 (pro)

fal-ai/flux-lora-fast-training

Train styles, people and other subjects at blazing speeds.

personalization

fal-ai/playai/tts/dialog

Generate natural-sounding multi-speaker dialogues and audio. Perfect for expressive outputs, storytelling, games, animations, and interactive media.

fal-ai/flux-pro/v1.1-ultra

FLUX1.1 [pro] ultra is the newest version of FLUX1.1 [pro], maintaining professional-grade image quality while delivering up to 2K resolution with improved photo realism.

fal-ai/recraft/v3/text-to-image

Recraft V3 is a text-to-image model with the ability to generate long texts, vector art, images in brand style, and much more. As of today, it is SOTA in image generation, proven by Hugging Face's industry-leading Text-to-Image Benchmark by Artificial Analysis.

fal-ai/minimax/video-01/image-to-video

Generate video clips from your images using MiniMax Video model

fal-ai/tavus/hummingbird-lipsync/v0

Generate lip sync using Tavus' state-of-the-art model for high-quality synchronization.

fal-ai/kling-video/v2/master/text-to-video

Generate video clips from your prompts using Kling 2.0 Master

fal-ai/hidream-i1-full

HiDream-I1 full is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds.

fal-ai/hidream-i1-dev

HiDream-I1 dev is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds.

fal-ai/hidream-i1-fast

HiDream-I1 fast is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within 16 steps.

cassetteai/video-sound-effects-generator

Add sound effects to your videos

fal-ai/flux/dev

FLUX.1 [dev] is a 12 billion parameter flow transformer that generates high-quality images from text. It is suitable for personal and commercial use.

cassetteai/music-generator

CassetteAI’s model generates a 30-second sample in under 2 seconds and a full 3-minute track in under 10 seconds. At 44.1 kHz stereo audio, expect a level of professional consistency with no breaks, no squeaks, and no random interruptions in your creations.

fal-ai/mmaudio-v2

MMAudio generates synchronized audio given video and/or text inputs. It can be combined with video models to get videos with audio.

fal-ai/ideogram/v2

Generate high-quality images, posters, and logos with Ideogram V2. Features exceptional typography handling and realistic outputs optimized for commercial and creative use.

fal-ai/flux-lora-portrait-trainer

FLUX LoRA training optimized for portrait generation, with bright highlights, excellent prompt following and highly detailed results.

personalization

fal-ai/stable-diffusion-v35-large

Stable Diffusion 3.5 Large is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency.

fal-ai/flux-lora/inpainting

Super fast endpoint for the FLUX.1 [dev] inpainting model with LoRA support, enabling rapid and high-quality image inpainting using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs.

personalization

fal-ai/flux-general

A versatile endpoint for the FLUX.1 [dev] model that supports multiple AI extensions including LoRA, ControlNet conditioning, and IP-Adapter integration, enabling comprehensive control over image generation through various guidance methods.

fal-ai/flux-lora

Super fast endpoint for the FLUX.1 [dev] model with LoRA support, enabling rapid and high-quality image generation using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs.

personalization

fal-ai/flux/dev/image-to-image

FLUX.1 Image-to-Image is a high-performance endpoint for the FLUX.1 [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.

Upscale your images with AuraSR.

fal-ai/clarity-upscaler

Clarity upscaler for upscaling images with very high fidelity.

fal-ai/image-editing/weather-effect

Add realistic weather effects like snowfall, rain, or fog to your photos while maintaining the scene's mood.

fal-ai/image-editing/time-of-day

Transform your photos to any time of day, from golden hour to midnight, with appropriate lighting and atmosphere.

fal-ai/image-editing/style-transfer

Transform your photos into artistic masterpieces inspired by famous styles like Van Gogh's Starry Night or any artistic style you choose.

fal-ai/image-editing/scene-composition

Place your subject in any scene you imagine, from enchanted forests to urban settings, with professional composition and lighting

fal-ai/image-editing/professional-photo

Turn your casual photos into stunning professional studio portraits with perfect lighting and high-end photography style.

fal-ai/image-editing/object-removal

Remove unwanted objects or people from your photos while seamlessly blending the background.

fal-ai/image-editing/hair-change

Experiment with different hairstyles, from bald to any style you can imagine, while maintaining natural lighting and realistic results.

fal-ai/image-editing/face-enhancement

Enhance facial features with professional retouching while maintaining a natural, realistic look

fal-ai/image-editing/expression-change

Change facial expressions in photos to any emotion you desire, from smiles to serious looks.

fal-ai/image-editing/color-correction

Perfect your photos with professional color grading, balanced tones, and vibrant yet natural colors

fal-ai/image-editing/cartoonify

Transform your photos into vibrant cool cartoons with bold outlines and rich colors.

fal-ai/image-editing/background-change

Replace your photo's background with any scene you desire, from beach sunsets to urban landscapes, with perfect lighting and shadows

fal-ai/image-editing/age-progression

See how you or others might look at different ages, from younger to older, while preserving core facial features.

fal-ai/flux-pro/kontext/max/multi

Experimental version of FLUX.1 Kontext [max] with multi image handling capabilities

fal-ai/flux-pro/kontext/multi

Experimental version of FLUX.1 Kontext [pro] with multi image handling capabilities

fal-ai/hunyuan-avatar

HunyuanAvatar is a high-fidelity audio-driven human animation model for multiple characters.

fal-ai/flux-pro/kontext/max

FLUX.1 Kontext [max] is a model with greatly improved prompt adherence and typography generation, offering premium consistency for editing without compromising on speed.

fal-ai/flux-pro/kontext/max/text-to-image

FLUX.1 Kontext [max] text-to-image is a new premium model that brings maximum performance across all aspects, with greatly improved prompt adherence.

fal-ai/kling-video/v2.1/master/text-to-video

Kling 2.1 Master: The premium endpoint for Kling 2.1, designed for top-tier text-to-video generation with unparalleled motion fluidity, cinematic visuals, and exceptional prompt precision.

fal-ai/kling-video/v2.1/pro/image-to-video

Kling 2.1 Pro is an advanced endpoint for the Kling 2.1 model, offering professional-grade videos with enhanced visual fidelity, precise camera movements, and dynamic motion control, perfect for cinematic storytelling.

fal-ai/flux-pro/kontext/text-to-image

FLUX.1 Kontext [pro] text-to-image delivers state-of-the-art image generation results with unprecedented prompt following, photorealistic rendering, and flawless typography.

Generate realistic lip-sync from any audio using VEED's latest model

veed/avatars/text-to-video

Generate high-quality videos with UGC-like avatars from text

veed/avatars/audio-to-video

Generate high-quality videos with UGC-like avatars from audio

fal-ai/hunyuan-portrait

HunyuanPortrait is a diffusion-based framework for generating lifelike, temporally consistent portrait animations.

fal-ai/bagel/understand

Bagel is a 7B parameter multimodal model from Bytedance-Seed that can generate both text and images.

fal-ai/bagel/edit

Bagel is a 7B parameter multimodal model from Bytedance-Seed that can generate both images and text.

Bagel is a 7B parameter multimodal model from Bytedance-Seed that can generate both text and images.

Lyria 2 is Google's latest music generation model; you can use it to generate any type of music.

fal-ai/kling-video/v1.6/standard/elements

Generate video clips from multiple image references using Kling 1.6 (standard)

fal-ai/kling-video/v1.6/pro/elements

Generate video clips from multiple image references using Kling 1.6 (pro)

DreamO is an image customization framework designed to support a wide range of tasks while facilitating seamless integration of multiple conditions.

fal-ai/ltx-video-13b-distilled/extend

Extend videos using LTX Video-0.9.7 13B Distilled and custom LoRA

fal-ai/ltx-video-13b-distilled/multiconditioning

Generate videos from prompts, images, and videos using LTX Video-0.9.7 13B Distilled and custom LoRA

multicondition-to-video

fal-ai/ltx-video-13b-distilled/image-to-video

Generate videos from prompts and images using LTX Video-0.9.7 13B Distilled and custom LoRA

fal-ai/ltx-video-13b-dev/multiconditioning

Generate videos from prompts, images, and videos using LTX Video-0.9.7 13B and custom LoRA

multicondition-to-video

fal-ai/ltx-video-13b-dev/extend

Extend videos using LTX Video-0.9.7 13B and custom LoRA

fal-ai/ltx-video-13b-dev/image-to-video

Generate videos from prompts and images using LTX Video-0.9.7 13B and custom LoRA

fal-ai/ltx-video-13b-dev

Generate videos from prompts using LTX Video-0.9.7 13B and custom LoRA

fal-ai/ltx-video-13b-distilled

Generate videos from prompts using LTX Video-0.9.7 13B Distilled and custom LoRA

easel-ai/easel-gifswap

Swap faces on GIFs

fal-ai/flux-lora/stream

Super fast endpoint for the FLUX.1 [dev] model with LoRA support, enabling rapid and high-quality image generation using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs.

personalization

fal-ai/ltx-video-lora/multiconditioning

Generate videos from prompts, images, and videos using LTX Video-0.9.7 and custom LoRA

multicondition-to-video

fal-ai/ltx-video-lora/image-to-video

Generate videos from prompts and images using LTX Video-0.9.7 and custom LoRA

fal-ai/pixverse/v4.5/transition

Create seamless transitions between images using PixVerse v4.5

fal-ai/pixverse/v4.5/image-to-video/fast

Generate high quality video clips quickly from text and image prompts using PixVerse v4.5

fal-ai/pixverse/v4.5/text-to-video/fast

Generate high quality and fast video clips from text and image prompts using PixVerse v4.5 fast

fal-ai/pixverse/v4.5/text-to-video

Generate high quality video clips from text and image prompts using PixVerse v4.5

fal-ai/pixverse/v4.5/effects

Generate high quality video clips with different effects using PixVerse v4.5

fal-ai/hunyuan-custom

HunyuanCustom revolutionizes video generation with unmatched identity consistency across multiple input types. Its innovative fusion modules and alignment networks outperform competitors, maintaining subject integrity while responding flexibly to text, image, audio, and video conditions.

fal-ai/framepack/f1

Framepack is an efficient Image-to-video model that autoregressively generates videos.

fal-ai/ace-step/audio-outpaint

Extend the beginning or end of provided audio with lyrics and/or style using ACE-Step

fal-ai/ace-step/audio-inpaint

Modify a portion of provided audio with lyrics and/or style using ACE-Step

fal-ai/ace-step/audio-to-audio

Generate music from lyrics and example audio using ACE-Step

fal-ai/ace-step/prompt-to-audio

Generate music from a simple prompt using ACE-Step

smoretalk-ai/rembg-enhance

Rembg-enhance is optimized for 2D vector images, 3D graphics, and photos by leveraging matting technology.

background removal

high resolution

fal-ai/vidu/q1/start-end-to-video

Vidu Q1 Start-End to Video generates smooth transition 1080p videos between specified start and end images.

fal-ai/vidu/q1/text-to-video

Vidu Q1 Text to Video generates high-quality 1080p videos with exceptional visual quality and motion diversity

fal-ai/vidu/q1/image-to-video

Vidu Q1 Image to Video generates high-quality 1080p videos with exceptional visual quality and motion diversity from a single image

fal-ai/ace-step

Generate music with lyrics from text using ACE-Step

fal-ai/ltx-video-trainer

Train LTX Video 0.9.7 for custom styles and effects.

fal-ai/recraft/upscale/creative

Enhances a given raster image using the 'creative upscale' tool, increasing image resolution, making the image sharper and cleaner.

fal-ai/recraft/upscale/crisp

Enhances a given raster image using 'crisp upscale' tool, boosting resolution with a focus on refining small details and faces.

fal-ai/recraft/v3/create-style

Recraft V3 Create Style is capable of creating unique styles for Recraft V3 based on your images.

personalization

fal-ai/recraft/v3/image-to-image

Recraft V3 is a text-to-image model with the ability to generate long texts, vector art, images in brand style, and much more. As of today, it is SOTA in image generation, proven by Hugging Face's industry-leading Text-to-Image Benchmark by Artificial Analysis.

easel-ai/easel-avatar

Create scenes with one or two people using just selfies and a text prompt (without LoRAs)

image-generation

fal-ai/minimax/voice-clone

Clone a voice from a sample audio and generate speech from text prompts using the MiniMax model, which leverages advanced AI techniques to create high-quality text-to-speech.

fal-ai/minimax/speech-02-turbo

Generate fast speech from text prompts and different voices using the MiniMax Speech-02 Turbo model, which leverages advanced AI techniques to create high-quality text-to-speech.

fal-ai/minimax/speech-02-hd

Generate speech from text prompts and different voices using the MiniMax Speech-02 HD model, which leverages advanced AI techniques to create high-quality text-to-speech.

fal-ai/minimax/image-01/subject-reference

Generate images from text and a reference image using MiniMax Image-01 for consistent character appearance.

fal-ai/minimax/image-01

Generate high quality images from text prompts using MiniMax Image-01. Longer text prompts will result in better quality images.

fal-ai/hidream-i1-full/image-to-image

HiDream-I1 full is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds.

fal-ai/trellis/multi

Generate 3D models from multiple images using Trellis. A native 3D generative model enabling versatile and high-quality 3D asset creation.

fal-ai/ideogram/v3/reframe

Extend existing images with Ideogram V3's reframe feature. Create expanded versions and adaptations while preserving the main image and adding new creative directions through prompt guidance.

fal-ai/ideogram/v3

Generate high-quality images, posters, and logos with Ideogram V3. Features exceptional typography handling and realistic outputs optimized for commercial and creative use.

fal-ai/ideogram/v3/replace-background

Replace backgrounds in existing images with Ideogram V3's replace background feature. Create variations and adaptations while preserving core elements and adding new creative directions through prompt guidance.

fal-ai/ideogram/v3/remix

Reimagine existing images with Ideogram V3's remix feature. Create variations and adaptations while preserving core elements and adding new creative directions through prompt guidance.

fal-ai/ideogram/v3/edit

Transform existing images with Ideogram V3's editing capabilities. Modify, adjust, and refine images while maintaining high fidelity and realistic outputs with precise prompt control.

fal-ai/hidream-e1-full

Edit images with natural language

fal-ai/f-lite/standard

F Lite is a 10B parameter diffusion model created by Fal and Freepik, trained exclusively on copyright-safe and SFW content.

fal-ai/f-lite/texture

F Lite is a 10B parameter diffusion model created by Fal and Freepik, trained exclusively on copyright-safe and SFW content. This is a high texture density variant of the model.

fal-ai/moondream2/visual-query

Moondream2 is a highly efficient open-source vision language model that combines powerful image understanding capabilities with a remarkably small footprint.

fal-ai/moondream2

Moondream2 is a highly efficient open-source vision language model that combines powerful image understanding capabilities with a remarkably small footprint.

fal-ai/moondream2/point-object-detection

Moondream2 is a highly efficient open-source vision language model that combines powerful image understanding capabilities with a remarkably small footprint.

fal-ai/moondream2/object-detection

Moondream2 is a highly efficient open-source vision language model that combines powerful image understanding capabilities with a remarkably small footprint.

fal-ai/step1x-edit

Step1X-Edit transforms your photos with simple instructions into stunning, professional-quality edits—rivaling top proprietary tools.

fal-ai/image2svg

Image2SVG transforms raster images into clean vector graphics, preserving visual quality while enabling scalable, customizable SVG outputs with precise control over detail levels.

An AI model that transforms input images into new ones based on text prompts, blending reference visuals with your creative directions.

fal-ai/gpt-image-1/edit-image/byok

OpenAI's latest image generation and editing model: gpt-image-1. Currently powered with bring-your-own-key.

fal-ai/gpt-image-1/text-to-image/byok

OpenAI's latest image generation and editing model: gpt-image-1. Currently powered with bring-your-own-key.

fal-ai/magi/extend-video

MAGI-1 extends videos with an exceptional understanding of physical interactions and prompts

MAGI-1 is a video generation model with exceptional understanding of physical interactions and cinematic prompts

fal-ai/magi/image-to-video

MAGI-1 generates videos from images with exceptional understanding of physical interactions and prompting

fal-ai/pixverse/v4/effects

Generate high quality video clips with different effects using PixVerse v4

fal-ai/magi-distilled/extend-video

MAGI-1 distilled extends videos faster with an exceptional understanding of physical interactions and prompts

fal-ai/magi-distilled/image-to-video

MAGI-1 distilled generates videos faster from images with exceptional understanding of physical interactions and prompting

fal-ai/dia-tts/voice-clone

Clone dialog voices from a sample audio and generate dialogs from text prompts using Dia TTS, which leverages advanced AI techniques to create high-quality text-to-speech.

fal-ai/framepack/flf2v

Framepack is an efficient Image-to-video model that autoregressively generates videos.

Dia directly generates realistic dialogue from transcripts. Audio conditioning enables emotion control. Produces natural nonverbals like laughter and throat clearing.

fal-ai/magi-distilled

MAGI-1 distilled is a faster video generation model with exceptional understanding of physical interactions and cinematic prompts

fal-ai/smart-turn

An open source, community-driven and native audio turn detection model by Pipecat AI.

rundiffusion-fal/juggernaut-flux-lora/inpainting

Juggernaut Base Flux LoRA Inpainting by RunDiffusion is a drop-in replacement for Flux [Dev] inpainting that delivers sharper details, richer colors, and enhanced realism to all your LoRAs and LyCORIS with full compatibility.

fal-ai/fashn/tryon/v1.5

FASHN v1.5 delivers precise virtual try-on capabilities, accurately rendering garment details like text and patterns at 576x864 resolution from both on-model and flat-lay photo references.

fal-ai/plushify

Turn any image into a cute plushie!

fal-ai/instant-character

InstantCharacter creates high-quality, consistent characters from text prompts, supporting diverse poses, styles, and appearances with strong identity control.

personalization

fal-ai/wan-flf2v

Wan-2.1 flf2v generates dynamic videos by intelligently bridging a given first frame to a desired end frame through smooth, coherent motion sequences.

fal-ai/turbo-flux-trainer

A blazing fast FLUX dev LoRA trainer for subjects and styles.

fal-ai/framepack

Framepack is an efficient Image-to-video model that autoregressively generates videos.

fal-ai/cartoonify

Transform images into 3D cartoon artwork using an AI model that applies cartoon stylization while preserving the original image's composition and details.

fal-ai/wan-vace

Vace is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources.

fal-ai/finegrain-eraser/mask

Finegrain Eraser removes any object selected with a mask—along with its shadows, reflections, and lighting artifacts—seamlessly reconstructing the scene with contextually accurate content.

fal-ai/finegrain-eraser/bbox

Finegrain Eraser removes any object selected with a bounding box—along with its shadows, reflections, and lighting artifacts—seamlessly reconstructing the scene with contextually accurate content.

fal-ai/finegrain-eraser

Finegrain Eraser removes objects—along with their shadows, reflections, and lighting artifacts—using only natural language, seamlessly filling the scene with contextually accurate content.

fal-ai/speech-to-text/turbo

Leverage the rapid processing capabilities of AI models to enable accurate and efficient real-time speech-to-text transcription.

fal-ai/speech-to-text/turbo/stream

Leverage the rapid processing capabilities of AI models to enable accurate and efficient real-time speech-to-text transcription.

fal-ai/speech-to-text/stream

Leverage the rapid processing capabilities of AI models to enable accurate and efficient real-time speech-to-text transcription.

fal-ai/speech-to-text

Leverage the rapid processing capabilities of AI models to enable accurate and efficient real-time speech-to-text transcription.

cassetteai/sound-effects-generator

Create stunningly realistic sound effects in seconds - CassetteAI's Sound Effects Model generates high-quality SFX up to 30 seconds long in just 1 second of processing time

fal-ai/sync-lipsync/v2

Generate realistic lipsync animations from audio using advanced algorithms for high-quality synchronization with Sync Lipsync 2.0 model

fal-ai/star-vector

AI vectorization model that transforms raster images into scalable SVG graphics, preserving visual details while enabling infinite scaling and easy editing capabilities.

fal-ai/pixverse/v4/image-to-video/fast

Generate high quality video clips quickly from text and image prompts using PixVerse v4

fal-ai/pixverse/v4/image-to-video

Generate high quality video clips from text and image prompts using PixVerse v4

fal-ai/pixverse/v3.5/effects

Generate high quality video clips with different effects using PixVerse v3.5

fal-ai/pixverse/v4/text-to-video

Generate high quality video clips from text and image prompts using PixVerse v4

fal-ai/pixverse/v3.5/transition

Create seamless transitions between images using PixVerse v3.5

fal-ai/pixverse/v4/text-to-video/fast

Generate high quality and fast video clips from text and image prompts using PixVerse v4 fast

fal-ai/ghiblify

Reimagine and transform your ordinary photos into enchanting Studio Ghibli style artwork

fal-ai/orpheus-tts

Orpheus TTS is a state-of-the-art, Llama-based Speech-LLM designed for high-quality, empathetic text-to-speech generation. This model has been finetuned to deliver human-level speech synthesis, achieving exceptional clarity, expressiveness, and real-time performance.

voice synthesis

fal-ai/sana/sprint

Sana Sprint is a text-to-image model capable of generating 4K images with exceptional speed.

fal-ai/sana/v1.5/1.6b

Sana v1.5 1.6B is a lightweight text-to-image model that delivers 4K image generation with impressive efficiency.

fal-ai/sana/v1.5/4.8b

Sana v1.5 4.8B is a powerful text-to-image model that generates ultra-high quality 4K images with remarkable detail.

fal-ai/kling-video/lipsync/text-to-video

Kling LipSync is a text-to-video model that generates realistic lip movements from text input.

fal-ai/kling-video/lipsync/audio-to-video

Kling LipSync is an audio-to-video model that generates realistic lip movements from audio input.

fal-ai/latentsync

LatentSync is a video-to-video model that generates lip sync animations from audio using advanced algorithms for high-quality synchronization.

fal-ai/wan-t2v-lora

Add custom LoRAs to Wan-2.1, a text-to-video model that generates videos with high visual quality and motion diversity from text prompts

"text to video"

fal-ai/wan-trainer

Train custom LoRAs for Wan-2.1

Fix low-resolution images with the speed and quality of Thera.

fal-ai/mix-dehaze-net

An advanced dehaze model to remove atmospheric haze, restoring clarity and detail in images through intelligent neural network processing.

fal-ai/gemini-flash-edit

Gemini Flash Edit is a model that can edit a single image using a text prompt and a reference image.

fal-ai/gemini-flash-edit/multi

Gemini Flash Edit Multi Image is a model that can edit multiple images using a text prompt and a reference image.

fal-ai/hunyuan3d/v2/multi-view

Generate 3D models from your images using Hunyuan 3D. A native 3D generative model enabling versatile and high-quality 3D asset creation.

fal-ai/hunyuan3d/v2/mini

Generate 3D models from your images using Hunyuan 3D. A native 3D generative model enabling versatile and high-quality 3D asset creation.

fal-ai/hunyuan3d/v2/mini/turbo

Generate 3D models from your images using Hunyuan 3D. A native 3D generative model enabling versatile and high-quality 3D asset creation.

fal-ai/hunyuan3d/v2

Generate 3D models from your images using Hunyuan 3D. A native 3D generative model enabling versatile and high-quality 3D asset creation.

fal-ai/hunyuan3d/v2/multi-view/turbo

Generate 3D models from your images using Hunyuan 3D. A native 3D generative model enabling versatile and high-quality 3D asset creation.

fal-ai/hunyuan3d/v2/turbo

Generate 3D models from your images using Hunyuan 3D. A native 3D generative model enabling versatile and high-quality 3D asset creation.

fal-ai/luma-dream-machine/ray-2-flash/image-to-video

Ray2 Flash is a fast video generative model capable of creating realistic visuals with natural, coherent motion.

fal-ai/luma-dream-machine/ray-2-flash

Ray2 Flash is a fast video generative model capable of creating realistic visuals with natural, coherent motion.

fal-ai/pika/v2/turbo/image-to-video

Pika v2 Turbo creates videos from images with high quality output.

fal-ai/pika/v2.1/text-to-video

Pika v2.1 creates videos from a text prompt with high quality output.

fal-ai/pika/v1.5/pikaffects

Pika Effects are AI-powered video effects designed to modify objects, characters, and environments in a fun, engaging, and visually compelling manner.

fal-ai/pika/v2/turbo/text-to-video

Pika v2 Turbo creates videos from a text prompt with high quality output.

fal-ai/pika/v2.1/image-to-video

Pika v2.1 creates videos from images with high quality output.

fal-ai/pika/v2.2/image-to-video

Pika v2.2 creates videos from images with high quality output.

fal-ai/pika/v2.2/text-to-video

Pika v2.2 creates videos from a text prompt with high quality output.

fal-ai/pika/v2.2/pikascenes

Pika Scenes v2.2 creates videos from images with high quality output.

fal-ai/invisible-watermark

Invisible Watermark is a model that can add an invisible watermark to an image.

CSM (Conversational Speech Model) is a speech generation model from Sesame that generates RVQ audio codes from text and audio inputs.

fal-ai/vidu/start-end-to-video

Vidu Start-End to Video generates smooth transition videos between specified start and end images.

fal-ai/vidu/reference-to-video

Vidu Reference to Video creates videos by using reference images and combining them with a prompt.

fal-ai/vidu/image-to-video

Vidu Image to Video generates high-quality videos with exceptional visual quality and motion diversity from a single image

fal-ai/vidu/template-to-video

Vidu Template to Video lets you create different effects by applying motion templates to your images.

fal-ai/wan-pro/text-to-video

Wan-2.1 Pro is a premium text-to-video model that generates high-quality 1080p videos at 30fps with up to 6 seconds duration, delivering exceptional visual quality and motion diversity from text prompts

easel-ai/advanced-face-swap

Swap faces of one or two people at once, while preserving user and scene details!

fal-ai/kling-video/v1.6/pro/effects

Generate video clips from your prompts using Kling 1.6 (pro)

fal-ai/kling-video/v1/standard/effects

Generate video clips from your prompts using Kling 1.0

fal-ai/kling-video/v1.5/pro/effects

Generate video clips from your prompts using Kling 1.5 (pro)

fal-ai/kling-video/v1.6/standard/effects

Generate video clips from your prompts using Kling 1.6 (std)

fal-ai/hunyuan-video-image-to-video

Image to Video for the high-quality Hunyuan Video I2V model.

fal-ai/ltx-video-v095/image-to-video

Generate videos from prompts and images using LTX Video-0.9.5

fal-ai/ltx-video-v095/multiconditioning

Generate videos from prompts, images, and videos using LTX Video-0.9.5

fal-ai/ltx-video-v095

Generate videos from prompts using LTX Video-0.9.5

fal-ai/ltx-video-v095/extend

Generate videos from prompts and videos using LTX Video-0.9.5

rundiffusion-fal/juggernaut-flux/base/image-to-image

Juggernaut Base Flux by RunDiffusion is a drop-in replacement for Flux [Dev] that delivers sharper details, richer colors, and enhanced realism, while instantly boosting LoRAs and LyCORIS with full compatibility.

image generation

rundiffusion-fal/juggernaut-flux-lora

Juggernaut Base Flux LoRA by RunDiffusion is a drop-in replacement for Flux [Dev] that delivers sharper details, richer colors, and enhanced realism to all your LoRAs and LyCORIS with full compatibility.

image generation

rundiffusion-fal/juggernaut-flux/base

Juggernaut Base Flux by RunDiffusion is a drop-in replacement for Flux [Dev] that delivers sharper details, richer colors, and enhanced realism, while instantly boosting LoRAs and LyCORIS with full compatibility.

image generation

rundiffusion-fal/juggernaut-flux/pro

Juggernaut Pro Flux by RunDiffusion is the flagship Juggernaut model rivaling some of the most advanced image models available, often surpassing them in realism. It combines Juggernaut Base with RunDiffusion Photo and features enhancements like reduced background blurriness.

image generation

rundiffusion-fal/juggernaut-flux/lightning

Juggernaut Lightning Flux by RunDiffusion provides blazing-fast, high-quality images rendered at five times the speed of Flux. Perfect for mood boards and mass ideation, this model excels in both realism and prompt adherence.

image generation

rundiffusion-fal/juggernaut-flux/pro/image-to-image

Juggernaut Pro Flux by RunDiffusion is the flagship Juggernaut model rivaling some of the most advanced image models available, often surpassing them in realism. It combines Juggernaut Base with RunDiffusion Photo and features enhancements like reduced background blurriness.

image generation

rundiffusion-fal/rundiffusion-photo-flux

RunDiffusion Photo Flux provides insane realism. With this enhancer, textures and skin details burst to life, turning your favorite prompts into vivid, lifelike creations. Recommended to keep it at 0.65 to 0.80 weight. Supports resolutions up to 1536x1536.

image generation
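
The description above recommends a weight between 0.65 and 0.80 and a maximum resolution of 1536x1536. A minimal client-side sketch of those two constraints follows; the helper names `clamp_weight` and `fits_resolution` are hypothetical and are not part of any fal.ai or RunDiffusion API.

```python
# Hypothetical client-side helpers; names are illustrative, not part of any fal.ai API.
RECOMMENDED_WEIGHT = (0.65, 0.80)  # range quoted in the model description
MAX_SIDE = 1536                    # maximum supported width/height per the description


def clamp_weight(weight: float) -> float:
    """Clamp an enhancer weight into the recommended range."""
    low, high = RECOMMENDED_WEIGHT
    return max(low, min(high, weight))


def fits_resolution(width: int, height: int) -> bool:
    """Return True when both sides are positive and within the supported maximum."""
    return 0 < width <= MAX_SIDE and 0 < height <= MAX_SIDE
```

A check like this could run before a request is sent, so that out-of-range values are caught locally rather than by the endpoint.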

fal-ai/cogview4

Generate high quality images from text prompts using CogView4. Longer text prompts will result in better quality images.

fal-ai/diffrhythm

DiffRhythm is a blazing fast model for transforming lyrics into full songs. It can generate a full song in less than 30 seconds.

fal-ai/topaz/upscale/video

Professional-grade video upscaling using Topaz technology. Enhance your videos with high-quality upscaling.

fal-ai/docres/dewarp

Enhance warped, folded documents with the superior quality of docres for sharper, clearer results.

image-enhancement

Enhance low-resolution, blurred, shadowed documents with the superior quality of docres for sharper, clearer results.

image-enhancement

Enhance low-resolution images with the superior quality of Swin2SR for sharper, clearer results.

image-enhancement

fal-ai/wan/v2.1/1.3b/text-to-video

Wan-2.1 1.3B is a text-to-video model that generates videos with high visual quality and motion diversity from text prompts at faster speeds.

fal-ai/kling-video/v1.6/pro/text-to-video

Generate video clips from your prompts using Kling 1.6 (pro)

fal-ai/elevenlabs/sound-effects

Generate sound effects using ElevenLabs' advanced sound effects model.

fal-ai/ideogram/v2a/turbo/remix

Rapidly create image variations with Ideogram V2A Turbo Remix. Fast and efficient reimagining of existing images while maintaining creative control through prompt guidance.

fal-ai/elevenlabs/tts/multilingual-v2

Generate multilingual text-to-speech audio using ElevenLabs TTS Multilingual v2.

fal-ai/elevenlabs/tts/turbo-v2.5

Generate high-speed text-to-speech audio using ElevenLabs TTS Turbo v2.5.

fal-ai/ideogram/v2a

Generate high-quality images, posters, and logos with Ideogram V2A. Features exceptional typography handling and realistic outputs optimized for commercial and creative use.

fal-ai/elevenlabs/audio-isolation

Isolate audio tracks using ElevenLabs' advanced audio isolation technology.

fal-ai/ideogram/v2a/turbo

Accelerated image generation with Ideogram V2A Turbo. Create high-quality visuals, posters, and logos with enhanced speed while maintaining Ideogram's signature quality.

fal-ai/ideogram/v2a/remix

Create variations of existing images with Ideogram V2A Remix while maintaining creative control through prompt guidance.

fal-ai/elevenlabs/speech-to-text

Generate text from speech using ElevenLabs' advanced speech-to-text model.

EVF-SAM2 combines natural language understanding with advanced segmentation capabilities, allowing you to precisely mask image regions using intuitive positive and negative text prompts.

Bring colors into old or new black and white photos with DDColor.

image-recolorization

Wan-2.1 is a text-to-video model that generates videos with high visual quality and motion diversity from text prompts

fal-ai/sam2/auto-segment

SAM 2 is a model for segmenting images automatically. It can return individual masks or a single mask for the entire image.

fal-ai/video-prompt-generator

Generate video prompts using a variety of techniques including camera direction, style, pacing, special effects and more.

fal-ai/drct-super-resolution

Upscale your images with DRCT-Super-Resolution.

fal-ai/minimax/video-01-director/image-to-video

Generate video clips that more accurately follow the initial image and natural language descriptions, using camera movement instructions for shot control.

camera-controls

Veo 2 creates videos with realistic motion and high quality output. Explore different styles and find your own with extensive camera controls.

fal-ai/nafnet/deblur

Use NAFNet to fix issues like blurriness and noise in your images. This model specializes in image restoration and can help enhance the overall quality of your photography.

image-restoration

fal-ai/nafnet/denoise

Use NAFNet to fix issues like blurriness and noise in your images. This model specializes in image restoration and can help enhance the overall quality of your photography.

image-restoration

fal-ai/skyreels-i2v

SkyReels V1 is the first and most advanced open-source human-centric video foundation model, built by fine-tuning HunyuanVideo on O(10M) high-quality film and television clips.

fal-ai/post-processing

Post Processing is an endpoint that can enhance images using a variety of techniques including grain, blur, sharpen, and more.

fal-ai/kokoro/french

An expressive and natural French text-to-speech model for both European and Canadian French.

fal-ai/kokoro/american-english

Kokoro is a lightweight text-to-speech model that delivers comparable quality to larger models while being significantly faster and more cost-efficient.

fal-ai/kokoro/hindi

A fast and expressive Hindi text-to-speech model with clear pronunciation and accurate intonation.

fal-ai/kokoro/mandarin-chinese

A highly efficient Mandarin Chinese text-to-speech model that captures natural tones and prosody.

fal-ai/kokoro/british-english

A high-quality British English text-to-speech model offering natural and expressive voice synthesis.

Clone any person's voice and speak anything in that voice using Zonos voice cloning.

fal-ai/kokoro/brazilian-portuguese

A natural and expressive Brazilian Portuguese text-to-speech model optimized for clarity and fluency.

fal-ai/luma-dream-machine/ray-2/image-to-video

Ray2 is a large-scale video generative model capable of creating realistic visuals with natural, coherent motion.

fal-ai/kokoro/spanish

A natural-sounding Spanish text-to-speech model optimized for Latin American and European Spanish.

fal-ai/kokoro/italian

A high-quality Italian text-to-speech model delivering smooth and expressive speech synthesis.

fal-ai/kokoro/japanese

A fast and natural-sounding Japanese text-to-speech model optimized for smooth pronunciation.

fal-ai/flowedit

The model provides high-quality image editing capabilities.

fal-ai/got-ocr/v2

GOT-OCR2 works on a wide range of tasks, including plain document OCR, scene text OCR, formatted document OCR, and even OCR for tables, charts, mathematical formulas, geometric shapes, molecular formulas and sheet music.

optical character recognition

fal-ai/ben/v2/video

A model for high quality and smooth background removal for videos.

background removal

fal-ai/flux-control-lora-depth/image-to-image

FLUX Control LoRA Depth is a high-performance endpoint that uses a control image, via a depth map, to transfer structure to the generated image, and a second initial image to guide color.

fal-ai/minimax/video-01-director

Generate video clips that more accurately follow natural language descriptions, using camera movement instructions for shot control.

camera-controls

fal-ai/flux-control-lora-canny/image-to-image

FLUX Control LoRA Canny is a high-performance endpoint that uses a control image, via a Canny edge map, to transfer structure to the generated image, and a second initial image to guide color.

fal-ai/ben/v2/image

A fast and high quality model for image background removal.

background removal

fal-ai/flux-control-lora-depth

FLUX Control LoRA Depth is a high-performance endpoint that uses a control image to transfer structure to the generated image, using a depth map.

fal-ai/flux-control-lora-canny

FLUX Control LoRA Canny is a high-performance endpoint that uses a control image to transfer structure to the generated image, using a Canny edge map.

fal-ai/ideogram/upscale

Ideogram Upscale increases the resolution of the reference image by up to 2X and may also enhance the image itself. Optionally refine outputs with a prompt for guided improvements.

Imagen3 is a high-quality text-to-image model that generates realistic images from text prompts.

fal-ai/imagen3/fast

Imagen3 Fast is a high-quality text-to-image model that generates realistic images from text prompts.

fal-ai/hunyuan-video-img2vid-lora

Image to Video for the Hunyuan Video model using a custom trained LoRA.

fal-ai/lumina-image/v2

Lumina-Image-2.0 is a 2 billion parameter flow-based diffusion transformer which features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency.

fal-ai/codeformer

Fix distorted or blurred photos of people with CodeFormer.

image-restoration

fal-ai/hunyuan-video-lora/video-to-video

Hunyuan Video is an open video generation model with high visual quality, motion diversity, text-video alignment, and generation stability. Use this endpoint to generate videos from videos.

fal-ai/hunyuan-video/video-to-video

Hunyuan Video is an open video generation model with high visual quality, motion diversity, text-video alignment, and generation stability. Use this endpoint to generate videos from videos.

fal-ai/pixverse/v3.5/image-to-video/fast

Generate high quality video clips from text and image prompts quickly using PixVerse v3.5 Fast

fal-ai/pixverse/v3.5/text-to-video/fast

Generate high quality video clips quickly from text prompts using PixVerse v3.5 Fast

fal-ai/pixverse/v3.5/image-to-video

Generate high quality video clips from text and image prompts using PixVerse v3.5

fal-ai/pixverse/v3.5/text-to-video

Generate high quality video clips from text prompts using PixVerse v3.5

YuE is a groundbreaking series of open-source foundation models designed for music generation, specifically for transforming lyrics into full songs.

DeepSeek Janus-Pro is a novel text-to-image model that unifies multimodal understanding and generation through an autoregressive framework

fal-ai/luma-dream-machine/ray-2

Ray2 is a large-scale video generative model capable of creating realistic visuals with natural, coherent motion.

fal-ai/kling/v1-5/kolors-virtual-try-on

Kling Kolors Virtual TryOn v1.5 is a high quality image based Try-On endpoint which can be used for commercial try on.

fal-ai/ffmpeg-api/metadata

Get encoding metadata from video and audio files using FFmpeg API.

fal-ai/ffmpeg-api/waveform

Get waveform data from audio files using FFmpeg API.

fal-ai/ffmpeg-api/compose

Compose videos from multiple media sources using FFmpeg API.

fal-ai/minimax/video-01-subject-reference

Generate video clips maintaining consistent, realistic facial features and identity across dynamic video content

fal-ai/moondream-next/batch

MoonDreamNext Batch is a multimodal vision-language model for batch captioning.

fal-ai/hunyuan-video-lora

Hunyuan Video is an open video generation model with high visual quality, motion diversity, text-video alignment, and generation stability

fal-ai/flux-pro/v1/canny-finetuned

Utilize Flux.1 [pro] Controlnet with a fine-tuned LoRA to generate high-quality images with precise control over composition, style, and structure through advanced edge detection and guidance mechanisms.

fal-ai/flux-pro/v1/fill-finetuned

FLUX.1 [pro] Fill Fine-tuned is a high-performance endpoint for the FLUX.1 [pro] model with a fine-tuned LoRA that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.

fal-ai/flux-pro-trainer

FLUX LoRA for Pro endpoints.

personalization

fal-ai/flux-lora-canny

Utilize Flux.1 [dev] Controlnet to generate high-quality images with precise control over composition, style, and structure through advanced edge detection and guidance mechanisms.

fal-ai/flux-pro/v1/canny

Utilize Flux.1 [pro] Controlnet to generate high-quality images with precise control over composition, style, and structure through advanced edge detection and guidance mechanisms.

fal-ai/flux-pro/v1.1-ultra-finetuned

FLUX1.1 [pro] ultra fine-tuned is the newest version of FLUX1.1 [pro] with a fine-tuned LoRA, maintaining professional-grade image quality while delivering up to 2K resolution with improved photo realism.

fal-ai/flux-pro/v1/depth

Generate high-quality images from depth maps using Flux.1 [pro] depth estimation model. The model produces accurate depth representations for scene understanding and 3D visualization.

fal-ai/flux-pro/v1/depth-finetuned

Generate high-quality images from depth maps using Flux.1 [pro] depth estimation model with a fine-tuned LoRA. The model produces accurate depth representations for scene understanding and 3D visualization.

fal-ai/flux-pro/v1.1

FLUX1.1 [pro] is an enhanced version of FLUX.1 [pro] with improved image generation capabilities, delivering superior composition, detail, and artistic fidelity compared to its predecessor.

fal-ai/cogvideox-5b

Generate videos from prompts using CogVideoX-5B

fal-ai/transpixar

Transform text into stunning videos with TransPixar - an AI model that generates both RGB footage and alpha channels, enabling seamless compositing and creative video effects.

fal-ai/hunyuan-video-lora-training

Train Hunyuan Video LoRA on people, objects, characters and more!

personalization

fal-ai/sync-lipsync

Generate realistic lipsync animations from audio using advanced algorithms for high-quality synchronization.

fal-ai/sa2va/4b/video

Sa2VA is an MLLM capable of question answering, visual prompt understanding, and dense object segmentation at both image and video levels

fal-ai/sa2va/8b/image

Sa2VA is an MLLM capable of question answering, visual prompt understanding, and dense object segmentation at both image and video levels

fal-ai/sa2va/8b/video

Sa2VA is an MLLM capable of question answering, visual prompt understanding, and dense object segmentation at both image and video levels

fal-ai/sa2va/4b/image

Sa2VA is an MLLM capable of question answering, visual prompt understanding, and dense object segmentation at both image and video levels

fal-ai/moondream-next/detection

MoonDreamNext Detection is a multimodal vision-language model for gaze detection, bbox detection, point detection, and more.

fal-ai/moondream-next

MoonDreamNext is a multimodal vision-language model for captioning, gaze detection, bbox detection, point detection, and more.

fal-ai/kling-video/v1.6/standard/image-to-video

Generate video clips from your images using Kling 1.6 (std)

fal-ai/kling-video/v1.6/standard/text-to-video

Generate video clips from your prompts using Kling 1.6 (std)

fal-ai/auto-caption

Automatically generates text captions for your videos from the audio as per text colour/font specifications

fal-ai/switti/512

Switti is a scale-wise transformer for fast text-to-image generation that outperforms existing T2I AR models and competes with state-of-the-art T2I diffusion models while being faster than distilled diffusion models.

Switti is a scale-wise transformer for fast text-to-image generation that outperforms existing T2I AR models and competes with state-of-the-art T2I diffusion models while being faster than distilled diffusion models.

This endpoint delivers seamlessly localized videos by generating lip-synced dubs in multiple languages, ensuring natural and immersive multilingual experiences

fal-ai/mmaudio-v2/text-to-audio

MMAudio generates synchronized audio given text inputs. It can generate sounds described by a prompt.

fal-ai/sadtalker/reference

Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

fal-ai/bria/background/remove

Bria RMBG 2.0 enables seamless removal of backgrounds from images, ideal for professional editing tasks. Trained exclusively on licensed data for safe and risk-free commercial use. Model weights for commercial use are available here: https://share-eu1.hsforms.com/2GLpEVQqJTI2Lj7AMYwgfIwf4e04

background removal

image segmentation

high resolution

fal-ai/bria/text-to-image/base

Bria's Text-to-Image model, trained exclusively on licensed data for safe and risk-free commercial use. Available also as source code and weights. For access to weights: https://bria.ai/contact-us

image generation

fal-ai/bria/background/replace

Bria Background Replace allows for efficient swapping of backgrounds in images via text prompts or reference image, delivering realistic and polished results. Trained exclusively on licensed data for safe and risk-free commercial use

fal-ai/bria/text-to-image/fast

Bria's Text-to-Image model with perfect harmony of latency and quality. Trained exclusively on licensed data for safe and risk-free commercial use. Available also as source code and weights. For access to weights: https://bria.ai/contact-us

image generation

fal-ai/bria/expand

Bria Expand expands images beyond their borders in high quality. Trained exclusively on licensed data for safe and risk-free commercial use. Access the model's source code and weights: https://bria.ai/contact-us

fal-ai/bria/genfill

Bria GenFill enables high-quality object addition or visual transformation. Trained exclusively on licensed data for safe and risk-free commercial use. Access the model's source code and weights: https://bria.ai/contact-us

fal-ai/playai/tts/v3

Blazing-fast text-to-speech. Generate audio with improved emotional tones and extensive multilingual support. Ideal for high-volume processing and efficient workflows.

fal-ai/bria/text-to-image/hd

Bria's Text-to-Image model for HD images. Trained exclusively on licensed data for safe and risk-free commercial use. Available also as source code and weights. For access to weights: https://bria.ai/contact-us

image generation

fal-ai/bria/product-shot

Place any product in any scenery with just a prompt or reference image while maintaining high integrity of the product. Trained exclusively on licensed data for safe and risk-free commercial use and optimized for eCommerce.

product photography

fal-ai/flux-lora-fill

FLUX.1 [dev] Fill is a high-performance endpoint for the FLUX.1 [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.

fal-ai/bria/eraser

Bria Eraser enables precise removal of unwanted objects from images while maintaining high-quality outputs. Trained exclusively on licensed data for safe and risk-free commercial use. Access the model's source code and weights: https://bria.ai/contact-us

fal-ai/cat-vton

Image based high quality Virtual Try-On

fal-ai/leffa/pose-transfer

Leffa Pose Transfer is an endpoint for changing the pose in an image using a reference image.

fal-ai/leffa/virtual-tryon

Leffa Virtual TryOn is a high quality image based Try-On endpoint which can be used for commercial try on.

fal-ai/minimax-music

Generate music from text prompts using the MiniMax model, which leverages advanced AI techniques to create high-quality, diverse musical compositions.

fal-ai/hyper3d/rodin

Rodin by Hyper3D generates realistic and production ready 3D models from text or images.

fal-ai/minimax/video-01-live

Generate video clips from your prompts using MiniMax model

fal-ai/recraft-20b

Recraft 20b is a new and affordable text-to-image model.

image generation

fal-ai/minimax/video-01-live/image-to-video

Generate video clips from your images using MiniMax Video model

fal-ai/ideogram/v2/edit

Transform existing images with Ideogram V2's editing capabilities. Modify, adjust, and refine images while maintaining high fidelity and realistic outputs with precise prompt control.

Generate 3D models from your images using Trellis. A native 3D generative model enabling versatile and high-quality 3D asset creation.

fal-ai/luma-dream-machine

Generate video clips from your prompts using Luma Dream Machine v1.5

fal-ai/ideogram/v2/turbo/remix

Rapidly create image variations with Ideogram V2 Turbo Remix. Fast and efficient reimagining of existing images while maintaining creative control through prompt guidance.

fal-ai/ideogram/v2/turbo

Accelerated image generation with Ideogram V2 Turbo. Create high-quality visuals, posters, and logos with enhanced speed while maintaining Ideogram's signature quality.

fal-ai/ideogram/v2/turbo/edit

Edit images faster with Ideogram V2 Turbo. Quick modifications and adjustments while preserving the high-quality standards and realistic outputs of Ideogram.

fal-ai/ideogram/v2/remix

Reimagine existing images with Ideogram V2's remix feature. Create variations and adaptations while preserving core elements and adding new creative directions through prompt guidance.

fal-ai/video-upscaler

The video upscaler endpoint uses RealESRGAN on each frame of the input video to upscale the video to a higher resolution.

video generation

high fidelity motion
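
The video upscaler description above says the model is applied to each frame of the input video. A minimal sketch of that per-frame pattern follows; nearest-neighbour scaling stands in for RealESRGAN here, and the `Frame` type and function names are assumptions made for illustration, not the endpoint's actual implementation.

```python
from typing import List

Frame = List[List[int]]  # one grayscale frame as rows of pixel values (illustrative)


def upscale_frame(frame: Frame, factor: int = 2) -> Frame:
    """Nearest-neighbour upscale; a stand-in for the learned RealESRGAN model."""
    out: Frame = []
    for row in frame:
        # Repeat every pixel horizontally, then repeat the widened row vertically.
        wide = [px for px in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def upscale_video(frames: List[Frame], factor: int = 2) -> List[Frame]:
    """Apply the frame upscaler independently to every frame, preserving order."""
    return [upscale_frame(f, factor) for f in frames]
```

Because each frame is processed independently, the frame count and ordering stay the same and only the resolution of each frame changes.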

fal-ai/luma-photon/flash

Generate images from your prompts using Luma Photon Flash. Photon Flash is the most creative, personalizable, and intelligent visual model for creatives, bringing a step-function change in the cost of high-quality image generation.

fal-ai/kling-video/v1/standard/text-to-video

Generate video clips from your prompts using Kling 1.0

fal-ai/aura-flow

AuraFlow v0.3 is an open-source flow-based text-to-image generation model that achieves state-of-the-art results on GenEval. The model is currently in beta.

fal-ai/omnigen-v1

OmniGen is a unified image generation model that can generate a wide range of images from multi-modal prompts. It can be used for various tasks such as Image Editing, Personalized Image Generation, Virtual Try-On, Multi Person Generation and more!

fal-ai/flux/schnell/redux

FLUX.1 [schnell] Redux is a high-performance endpoint for the FLUX.1 [schnell] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.

fal-ai/kling-video/v1.5/pro/text-to-video

Generate video clips from your prompts using Kling 1.5 (pro)

fal-ai/flux/schnell

FLUX.1 [schnell] is a 12 billion parameter flow transformer that generates high-quality images from text in 1 to 4 steps, suitable for personal and commercial use.

fal-ai/flux-pro/v1/fill

FLUX.1 [pro] Fill is a high-performance endpoint for the FLUX.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.

fal-ai/flux-pro/v1/redux

FLUX.1 [pro] Redux is a high-performance endpoint for the FLUX.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.

fal-ai/flux-lora-depth

Generate high-quality images from depth maps using Flux.1 [dev] depth estimation model. The model produces accurate depth representations for scene understanding and 3D visualization.

fal-ai/flux/dev/redux

FLUX.1 [dev] Redux is a high-performance endpoint for the FLUX.1 [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.

fal-ai/flux-pro/v1.1-ultra/redux

FLUX1.1 [pro] ultra Redux is a high-performance endpoint for the FLUX1.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.

fal-ai/flux-pro/v1.1/redux

FLUX1.1 [pro] Redux is a high-performance endpoint for the FLUX1.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.

fal-ai/ltx-video/image-to-video

Generate videos from images using LTX Video

fal-ai/kolors/image-to-image

Photorealistic Image-to-Image

fal-ai/iclight-v2

An endpoint for re-lighting photos and changing their backgrounds per a given description

fal-ai/mochi-v1

Mochi 1 preview is an open state-of-the-art video generation model with high-fidelity motion and strong prompt adherence in preliminary evaluation.

fal-ai/flux-differential-diffusion

FLUX.1 Differential Diffusion is a rapid endpoint that enables swift, granular control over image transformations through change maps, delivering fast and precise region-specific modifications while maintaining FLUX.1 [dev]'s high-quality output.

fal-ai/flux-pulid

An endpoint for personalized image generation using Flux, guided by a given description.

personalization

fal-ai/birefnet/v2

Bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS)

background removal

fal-ai/stable-diffusion-v35-medium

Stable Diffusion 3.5 Medium is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency.

fal-ai/hunyuan-video

Hunyuan Video is an open video generation model with high visual quality, motion diversity, text-video alignment, and generation stability. This endpoint generates videos from text descriptions.

fal-ai/cogvideox-5b/image-to-video

Generate videos from images and prompts using CogVideoX-5B

fal-ai/cogvideox-5b/video-to-video

Generate videos from videos and prompts using CogVideoX-5B

F5 TTS

fal-ai/any-llm/vision

Use any vision language model from our selected catalogue (powered by OpenRouter)

fal-ai/kling-video/v1.5/pro/image-to-video

Generate video clips from your images using Kling 1.5 (pro)

fal-ai/ltx-video

Generate videos from prompts using LTX Video

fal-ai/kling-video/v1/standard/image-to-video

Generate video clips from your images using Kling 1.0

fal-ai/kling-video/v1/pro/text-to-video

Generate video clips from your prompts using Kling 1.0 (pro)

fal-ai/kling-video/v1/pro/image-to-video

Generate video clips from your images using Kling 1.0 (pro)

fal-ai/flux-pro/new

FLUX.1 [pro] new is an accelerated version of FLUX.1 [pro], maintaining professional-grade image quality while delivering significantly faster generation speeds.

fal-ai/live-portrait/image

Transfer expression from a video to a portrait.

fal-ai/flux-general/rf-inversion

A general purpose endpoint for the FLUX.1 [dev] model, implementing the RF-Inversion pipeline. This can be used to edit a reference image based on a prompt.

fal-ai/stable-video

Generate short video clips from your images using SVD v1.1

fal-ai/image-preprocessors/zoe

ZoeDepth preprocessor.

fal-ai/image-preprocessors/mlsd

M-LSD line segment detection preprocessor.

fal-ai/image-preprocessors/midas

MiDaS depth estimation preprocessor.

fal-ai/image-preprocessors/scribble

Scribble preprocessor.

fal-ai/image-preprocessors/teed

TEED (Tiny and Efficient Edge Detector) preprocessor.

fal-ai/fast-svd/text-to-video

Generate short video clips from your prompts using SVD v1.1

fal-ai/image-preprocessors/lineart

Line art preprocessor.

fal-ai/image-preprocessors/hed

Holistically-Nested Edge Detection (HED) preprocessor.

fal-ai/image-preprocessors/sam

Segment Anything Model (SAM) preprocessor.

fal-ai/image-preprocessors/pidi

PIDI (PiDiNet, Pixel Difference Network) preprocessor.

fal-ai/image-preprocessors/depth-anything/v2

Depth Anything v2 preprocessor.

fal-ai/controlnext

Animate a reference image with a driving video using ControlNeXt.

fal-ai/stable-diffusion-v3-medium

Stable Diffusion 3 Medium (Text to Image) is a Multimodal Diffusion Transformer (MMDiT) model that improves image quality, typography, prompt understanding, and efficiency.

fal-ai/sam2/video

SAM 2 is a model for segmenting images and videos in real-time.

fal-ai/sam2/image

SAM 2 is a model for segmenting images and videos in real-time.

fal-ai/flux-general/inpainting

FLUX General Inpainting is a versatile endpoint that enables precise image editing and completion, supporting multiple AI extensions including LoRA, ControlNet, and IP-Adapter for enhanced control over inpainting results and sophisticated image modifications.

fal-ai/flux-general/image-to-image

FLUX General Image-to-Image is a versatile endpoint that transforms existing images with support for LoRA, ControlNet, and IP-Adapter extensions, enabling precise control over style transfer, modifications, and artistic variations through multiple guidance methods.

fal-ai/flux-lora/image-to-image

FLUX LoRA Image-to-Image is a high-performance endpoint that transforms existing images using FLUX models, leveraging LoRA adaptations to enable rapid and precise image style transfer, modifications, and artistic variations.

fal-ai/flux-general/differential-diffusion

A specialized FLUX endpoint combining differential diffusion control with LoRA, ControlNet, and IP-Adapter support, enabling precise, region-specific image transformations through customizable change maps.

fal-ai/fooocus/upscale-or-vary

Default parameters with automated optimizations and quality improvements.

fal-ai/pixart-sigma

Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

fal-ai/flux-subject

Super fast endpoint for the FLUX.1 [schnell] model with subject input capabilities, enabling rapid and high-quality image generation for personalization, specific styles, brand identities, and product-specific outputs.

personalization

Sana can synthesize high-resolution, high-quality images with strong text-image alignment at a remarkably fast speed, with the ability to generate 4K images in less than a second.

fal-ai/sdxl-controlnet-union

An efficient SDXL multi-controlnet text-to-image model.

fal-ai/sdxl-controlnet-union/image-to-image

An efficient SDXL multi-controlnet image-to-image model.

fal-ai/sdxl-controlnet-union/inpainting

An efficient SDXL multi-controlnet inpainting model.

Photorealistic Text-to-Image

fal-ai/amt-interpolation/frame-interpolation

Interpolate between image frames

fal-ai/live-portrait

Transfer expression from a video to a portrait.

A powerful image to novel multiview model with normals.

fal-ai/stable-cascade

Stable Cascade: Image generation on a smaller & cheaper latent space.

fal-ai/florence-2-large/open-vocabulary-detection

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks

fal-ai/florence-2-large/region-to-description

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks

fal-ai/florence-2-large/ocr

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks

fal-ai/florence-2-large/region-proposal

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks

fal-ai/florence-2-large/caption-to-phrase-grounding

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks

fal-ai/florence-2-large/region-to-category

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks

fal-ai/florence-2-large/detailed-caption

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks

fal-ai/florence-2-large/ocr-with-region

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks

fal-ai/florence-2-large/dense-region-caption

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks

fal-ai/florence-2-large/region-to-segmentation

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks

fal-ai/florence-2-large/referring-expression-segmentation

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks

fal-ai/florence-2-large/more-detailed-caption

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks

fal-ai/florence-2-large/object-detection

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks

fal-ai/florence-2-large/caption

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks

fal-ai/fast-sdxl

Run SDXL at the speed of light

fal-ai/stable-diffusion-v3-medium/image-to-image

Stable Diffusion 3 Medium (Image to Image) is a Multimodal Diffusion Transformer (MMDiT) model that improves image quality, typography, prompt understanding, and efficiency.

fal-ai/stable-cascade/sote-diffusion

Anime finetune of Würstchen V3.

fal-ai/luma-photon

Generate images from your prompts using Luma Photon. Photon is the most creative, personalizable, and intelligent visual model for creatives, bringing a step-function change in the cost of high-quality image generation.

fal-ai/fast-svd-lcm/text-to-video

Generate short video clips from your images using SVD v1.1 at Lightning Speed

fal-ai/luma-dream-machine/image-to-video

Generate video clips from your images using Luma Dream Machine v1.5

Predict poses.

fal-ai/sd15-depth-controlnet

SD 1.5 ControlNet

SOTA Image Upscaler

fal-ai/playground-v25

State-of-the-art open-source model in aesthetic quality

fal-ai/omni-zero

Any pose, any style, any identity

fal-ai/hyper-sdxl/image-to-image

Hyper-charge SDXL's performance and creativity.

fal-ai/realistic-vision

Generate realistic images.

fal-ai/lightning-models

Collection of SDXL Lightning models.

fal-ai/dreamshaper

Dreamshaper model.

fal-ai/hyper-sdxl/inpainting

Hyper-charge SDXL's performance and creativity.

fal-ai/ip-adapter-face-id

High quality zero-shot personalization

personalization

fal-ai/lora/inpaint

Run any Stable Diffusion model with customizable LoRA weights.

fal-ai/lora/image-to-image

Run any Stable Diffusion model with customizable LoRA weights.

fal-ai/fast-sdxl/inpainting

Run SDXL at the speed of light

fal-ai/fast-sdxl/image-to-image

Run SDXL at the speed of light

fal-ai/stable-diffusion-v15

Stable Diffusion v1.5

fal-ai/layer-diffusion

SDXL with an alpha channel.

fal-ai/fast-lightning-sdxl

Run SDXL at the speed of light

fal-ai/musetalk

MuseTalk is a real-time, high-quality, audio-driven lip-syncing model. Use MuseTalk to animate a face with your own audio.

fal-ai/sadtalker

Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

[Experimental] Whisper v3 Large -- but optimized by our inference wizards. Same WER, double the performance!

fal-ai/imageutils/nsfw

Predict the probability of an image being NSFW.

fal-ai/moondream/batched

Answer questions about images.

fal-ai/fast-fooocus-sdxl/image-to-image

Fooocus extreme speed mode as a standalone app.

fal-ai/face-to-sticker

Create stickers from faces.

fal-ai/photomaker

Customizing Realistic Human Photos via Stacked ID Embedding

personalization

fal-ai/t2v-turbo

Generate short video clips from your prompts

fal-ai/fast-sdxl-controlnet-canny

Generate Images with ControlNet.

fal-ai/creative-upscaler

Create creatively upscaled images.

fal-ai/animatediff-v2v/turbo

Re-animate your videos with evolved consistency!

fal-ai/birefnet

Bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS)

background removal

fal-ai/playground-v25/inpainting

State-of-the-art open-source model in aesthetic quality

fal-ai/amt-interpolation

Interpolate between video frames

fal-ai/playground-v25/image-to-image

State-of-the-art open-source model in aesthetic quality

fal-ai/fast-lightning-sdxl/image-to-image

Run SDXL at the speed of light

fal-ai/hyper-sdxl

Hyper-charge SDXL's performance and creativity.

fal-ai/fast-animatediff/text-to-video

Animate your ideas!

fal-ai/fast-lightning-sdxl/inpainting

Run SDXL at the speed of light

Whisper is a model for speech transcription and translation.

fal-ai/fast-lcm-diffusion

Run SDXL at the speed of light

fal-ai/fast-lcm-diffusion/inpainting

Run SDXL at the speed of light

fal-ai/fast-lcm-diffusion/image-to-image

Run SDXL at the speed of light

fal-ai/fast-fooocus-sdxl

Fooocus extreme speed mode as a standalone app.

fal-ai/llava-next

Vision

Use any large language model from our selected catalogue (powered by OpenRouter)

fal-ai/retoucher

Automatically retouches faces to smooth skin and remove blemishes.

fal-ai/illusion-diffusion

Create illusions conditioned on an image.

fal-ai/imageutils/depth

Create depth maps using MiDaS depth estimation.

fal-ai/minimax/video-01

Generate video clips from your prompts using the MiniMax model

fal-ai/fast-animatediff/turbo/text-to-video

Animate your ideas at lightning speed!

fal-ai/fast-svd-lcm

Generate short video clips from your images using SVD v1.1 at Lightning Speed

fal-ai/fast-animatediff/video-to-video

Re-animate your videos!

fal-ai/fooocus/image-prompt

Default parameters with automated optimizations and quality improvements.

fal-ai/animatediff-v2v

Re-animate your videos with evolved consistency!

fal-ai/fast-animatediff/turbo/video-to-video

Re-animate your videos at lightning speed!

fal-ai/fooocus/inpaint

Default parameters with automated optimizations and quality improvements.

Produce high-quality images with minimal inference steps.

State-of-the-art image-to-3D object generation

fal-ai/diffusion-edge

Diffusion-based, high-quality edge detection

fal-ai/stable-audio

Open source text-to-audio model.

fal-ai/imageutils/marigold-depth

Create depth maps using Marigold depth estimation.

Tuning-free ID customization.

personalization

fal-ai/fast-sdxl-controlnet-canny/image-to-image

Generate Images with ControlNet.

fal-ai/fast-sdxl-controlnet-canny/inpainting

Generate Images with ControlNet.

Default parameters with automated optimizations and quality improvements.

fal-ai/animatediff-sparsectrl-lcm

Animate Your Drawings with Latent Consistency Models!

fal-ai/lcm-sd15-i2i

Produce high-quality images with minimal inference steps. Optimized for 512x512 input image size.

Inpaint images with SD and SDXL

Upscale images by a given factor.

fal-ai/imageutils/rembg

Remove the background from an image.

background removal

Run any Stable Diffusion model with customizable LoRA weights.