ByteDance's most advanced text-to-video model. Cinematic output with native audio, multi-shot editing, real-world physics, and director-level camera control.
bytedance/seedance-2.0/text-to-video
text-to-video

ByteDance's most advanced text-to-video model. Cinematic output with native audio, multi-shot editing, real-world physics, and director-level camera control.

stylized
transform
lipsync
Image editing endpoint for the fast Lite version of Seedream 5.0, supporting high quality intelligent image editing with multiple inputs.
bytedance/seedream/v5/lite/edit
image-to-image

Image editing endpoint for the fast Lite version of Seedream 5.0, supporting high quality intelligent image editing with multiple inputs.

bytedance
seedream-5.0-lite
edit
Generate videos from your image prompts using Veo 3.1 fast.
veo3.1/fast/image-to-video
image-to-video

Generate videos from your image prompts using Veo 3.1 fast.

Gemini 3 Pro Image (a.k.a Nano Banana Pro) is Google's state-of-the-art high-fidelity image generation and editing model
gemini-3-pro-image-preview/edit
image-to-image

Gemini 3 Pro Image (a.k.a Nano Banana Pro) is Google's state-of-the-art high-fidelity image generation and editing model

realism
typography
bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS)
birefnet/v2
image-to-image

bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS)

background removal
segmentation
high-res
A new-generation image creation model ByteDance, Seedream 4.5 integrates image generation and image editing capabilities into a single, unified architecture.
bytedance/seedream/v4.5/text-to-image
text-to-image

A new-generation image creation model ByteDance, Seedream 4.5 integrates image generation and image editing capabilities into a single, unified architecture.

stylized
transform
Z-Image Turbo is a super fast text-to-image model of 6B parameters developed by Tongyi-MAI.
z-image/turbo
text-to-image

Z-Image Turbo is a super fast text-to-image model of 6B parameters developed by Tongyi-MAI.

turbo
z-image
fast
Use the powerful and accurate topaz image enhancer to enhance your images.
topaz/upscale/image
image-to-image

Use the powerful and accurate topaz image enhancer to enhance your images.

A new-generation image creation model ByteDance, Seedream 4.0 integrates image generation and image editing capabilities into a single, unified architecture.
bytedance/seedream/v4/edit
image-to-image

A new-generation image creation model ByteDance, Seedream 4.0 integrates image generation and image editing capabilities into a single, unified architecture.

stylized
transform
editing
Text-to-image generation with FLUX.2 [klein] 9B from Black Forest Labs. Enhanced realism, crisper text generation, and native editing capabilities.
flux-2/klein/9b
text-to-image

Text-to-image generation with FLUX.2 [klein] 9B from Black Forest Labs. Enhanced realism, crisper text generation, and native editing capabilities.

Veo 3.1 is the latest state-of-the art video generation model from Google DeepMind
veo3.1/image-to-video
image-to-video

Veo 3.1 is the latest state-of-the art video generation model from Google DeepMind

ByteDance's most advanced image-to-video model, fast tier. Lower latency and cost with synchronized audio, start and end frame control, and motion prompts.
bytedance/seedance-2.0/fast/image-to-video
image-to-video

ByteDance's most advanced image-to-video model, fast tier. Lower latency and cost with synchronized audio, start and end frame control, and motion prompts.

stylized
transform
lipsync
Generate videos with audio with Seedance 1.5 (supports start & end frame)
bytedance/seedance/v1.5/pro/image-to-video
image-to-video

Generate videos with audio with Seedance 1.5 (supports start & end frame)

bytedance
seedance
audio
Run any LLM with fal. Access Claude (Anthropic), ChatGPT / GPT-5 / GPT-4o (OpenAI), Gemini (Google), Grok (xAI), DeepSeek, Llama (Meta), Qwen (Alibaba), Mistral, and 200+ more models through a single API. Supports reasoning, structured output, and streaming. Powered by OpenRouter.
openrouter/router
llm

Run any LLM with fal. Access Claude (Anthropic), ChatGPT / GPT-5 / GPT-4o (OpenAI), Gemini (Google), Grok (xAI), DeepSeek, Llama (Meta), Qwen (Alibaba), Mistral, and 200+ more models through a single API. Supports reasoning, structured output, and streaming. Powered by OpenRouter.

ByteDance's most advanced reference-to-video model, fast tier. Lower latency and cost with up to 9 images, 3 videos, and 3 audio clips as inputs.
bytedance/seedance-2.0/fast/reference-to-video
image-to-video

ByteDance's most advanced reference-to-video model, fast tier. Lower latency and cost with up to 9 images, 3 videos, and 3 audio clips as inputs.

stylized
transform
lipsync
GPT Image 1.5 generates high-fidelity images with strong prompt adherence, preserving composition, lighting, and fine-grained detail.
gpt-image-1.5/edit
image-to-image

GPT Image 1.5 generates high-fidelity images with strong prompt adherence, preserving composition, lighting, and fine-grained detail.

openai
gpt-image
Generate high-quality images, posters, and logos with Ideogram V3. Features exceptional typography handling and realistic outputs optimized for commercial and creative use.
ideogram/v3
text-to-image

Generate high-quality images, posters, and logos with Ideogram V3. Features exceptional typography handling and realistic outputs optimized for commercial and creative use.

realism
typography
OpenAI-compatible chat completions API. Drop-in replacement for the OpenAI API — use any OpenAI SDK or client to access Claude, Gemini, Grok, DeepSeek, Llama, Qwen, Mistral, and all OpenAI models (GPT-5, GPT-4o, o3) through fal. Powered by OpenRouter.
openrouter/router/openai/v1/chat/completions
llm

OpenAI-compatible chat completions API. Drop-in replacement for the OpenAI API — use any OpenAI SDK or client to access Claude, Gemini, Grok, DeepSeek, Llama, Qwen, Mistral, and all OpenAI models (GPT-5, GPT-4o, o3) through fal. Powered by OpenRouter.

Kling 3.0 Pro: Top-tier text-to-video with cinematic visuals, fluid motion, and native audio generation, with multi-shot support.
kling-video/v3/pro/text-to-video
text-to-video

Kling 3.0 Pro: Top-tier text-to-video with cinematic visuals, fluid motion, and native audio generation, with multi-shot support.

Clarity upscaler for upscaling images with high very fidelity.
clarity-upscaler
image-to-image

Clarity upscaler for upscaling images with high very fidelity.

upscaling
bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS)
birefnet
image-to-image

bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS)

background removal
segmentation
high-res
FLUX.1 Image-to-Image is a high-performance endpoint for the FLUX.1 [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.
flux/dev/image-to-image
image-to-image

FLUX.1 Image-to-Image is a high-performance endpoint for the FLUX.1 [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.

style transfer
Alibaba's #1-ranked Happy Horse 1.0 — generate 1080p video with synchronized native audio and multilingual lip-sync from text prompts or images.
new
alibaba/happy-horse/image-to-video
image-to-video

Alibaba's #1-ranked Happy Horse 1.0 — generate 1080p video with synchronized native audio and multilingual lip-sync from text prompts or images.

video
happy-horse
Generate text-to-speech audio using Eleven-v3 from ElevenLabs.
elevenlabs/tts/eleven-v3
text-to-audio

Generate text-to-speech audio using Eleven-v3 from ElevenLabs.

audio
Google's famous original image generation and editing model, a.k.a Nano Banana
gemini-25-flash-image/edit
image-to-image

Google's famous original image generation and editing model, a.k.a Nano Banana

image-editing
Text to Image endpoint for the fast Lite version of Seedream 5.0, supporting high quality intelligent text-to-image generation.
bytedance/seedream/v5/lite/text-to-image
text-to-image

Text to Image endpoint for the fast Lite version of Seedream 5.0, supporting high quality intelligent text-to-image generation.

bytedance
seedream-5.0-lite
Generate a video by taking a start frame and an end frame, animating the transition between them while following text-driven style and scene guidance.
kling-video/o3/pro/image-to-video
image-to-video

Generate a video by taking a start frame and an end frame, animating the transition between them while following text-driven style and scene guidance.

Run any Vision Language Model with fal. Analyze and understand images using Claude (Anthropic), GPT-5 / GPT-4o (OpenAI), Gemini (Google), Grok (xAI), Llama (Meta), Qwen, Pixtral (Mistral), and more. Send one or multiple images for captioning, analysis, OCR, or visual Q&A. Powered by OpenRouter.
openrouter/router/vision
vision

Run any Vision Language Model with fal. Analyze and understand images using Claude (Anthropic), GPT-5 / GPT-4o (OpenAI), Gemini (Google), Grok (xAI), Llama (Meta), Qwen, Pixtral (Mistral), and more. Send one or multiple images for captioning, analysis, OCR, or visual Q&A. Powered by OpenRouter.

Showing 29 to 56 of 1355 results