Search Page 3

Showing 28 of 1400 results

Run any Vision Language Model with fal. Analyze and understand images using Claude (Anthropic), GPT-5 / GPT-4o (OpenAI), Gemini (Google), Grok (xAI), Llama (Meta), Qwen, Pixtral (Mistral), and more. Send one or multiple images for captioning, analysis, OCR, or visual Q&A. Powered by OpenRouter.

vision

Seedance 1.0 Pro, a high quality video generation model developed by Bytedance.

bytedance/seedance/v1/pro/image-to-video

image-to-video

elevenlabs/tts/multilingual-v2

Generate multilingual text-to-speech audio using ElevenLabs TTS Multilingual v2.

audio

text-to-audio

GPT Image 1.5 generates high-fidelity images with strong prompt adherence, preserving composition, lighting, and fine-grained detail.

gpt-image-1.5/edit

openai

gpt-image

image-to-image

Generate high-fidelity images from text with Krea 2 Large, supporting aspect ratio, creativity, seed controls, and optional style references.

krea/v2/large/text-to-image

flux/dev/image-to-image

FLUX.1 Image-to-Image is a high-performance endpoint for the FLUX.1 [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.

style transfer

image-to-image

Nano Banana Lite is the efficiency-focused model in the image generation family. It offers sub-2-second latency, cost-effective generation and editing, fast multi-turn local edits, and 14 supported aspect ratios.

new

google/nano-banana-lite/edit

image-to-image

new

google/nano-banana-2-lite

text-to-image

ByteDance's most advanced text-to-video model, fast tier. Lower latency and cost with cinematic output, native audio, multi-shot editing, and director-level camera control.

bytedance/seedance-2.0/fast/text-to-video

google/gemini-omni-flash/image-to-video

Animates a still image into video with audio. Extends a single frame into coherent motion, grounded in Gemini's physical understanding of how scenes and subjects behave.

Remove the background from an image.

Image-to-image editing with FLUX.2 [klein] 9B from Black Forest Labs. Precise modifications using natural language descriptions and hex color control.

image-to-image

Generate a video by taking a start frame and an end frame, animating the transition between them while following text-driven style and scene guidance.

kling-video/o3/pro/image-to-video

image-to-video

Generate videos from images with audio using xAI's Grok Imagine 1.5 Video model.

xai/grok-imagine-video/v1.5/image-to-video

FLUX.1 Kontext [max] is a model with greatly improved prompt adherence and typography generation, combined with premium consistency for editing without compromising on speed.

image-to-image

sam-3/image

SAM 3 is a unified foundation model for promptable segmentation in images and videos. It can detect, segment, and track objects using text or visual prompts such as points, boxes, and masks.

veo3.1/lite/image-to-video

Veo 3.1 Lite balances practical utility with professional capabilities, supporting both text-to-video and image-to-video.

[Experimental] Whisper v3 Large -- but optimized by our inference wizards. Same WER, double the performance!

transcription

speech

speech-to-text

kling-video/o3/standard/image-to-video

Generate a video by taking a start frame and an end frame, animating the transition between them while following text-driven style and scene guidance.

image-to-video

Generate video clips from your images using Kling 1.6 (Standard).

kling-video/v1.6/standard/image-to-video

image-to-video

elevenlabs/tts/turbo-v2.5

Generate high-speed text-to-speech audio using ElevenLabs TTS Turbo v2.5.

audio

text-to-speech

recraft/v3/text-to-image

Recraft V3 is a text-to-image model that can generate long texts, vector art, images in brand style, and much more. The listing describes it as state of the art (SOTA) in image generation, citing the Text-to-Image Benchmark by Artificial Analysis on Hugging Face.

A faster and more cost-effective version of Google's Veo 3.1.

text-to-video

minimax/speech-02-hd

Generate speech from text prompts and different voices using the MiniMax Speech-02 HD model, which leverages advanced AI techniques to create high-quality text-to-speech.

speech

text-to-speech

Kling 3.0 Standard: Top-tier text-to-video with cinematic visuals, fluid motion, and native audio generation, with multi-shot support.

kling-video/v3/standard/text-to-video

text-to-video

FLUX.1 [pro] Fill is a high-performance endpoint for the FLUX.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.

flux-pro/v1/fill

editing

image-to-image

gpt-image-1.5

GPT Image 1.5 generates high-fidelity images with strong prompt adherence, preserving composition, lighting, and fine-grained detail.

Image-to-image editing with FLUX.2 [dev] from Black Forest Labs. Precise modifications using natural language descriptions and hex color control.

image-to-image