![Text-to-image generation with FLUX.2 [dev] from Black Forest Labs. Enhanced realism, crisper text generation, and native editing capabilities—all at turbo speed.](https://refinery.fal.media/url/https%3A%2F%2Fv3b.fal.media%2Ffiles%2Fb%2F0a871494%2Fj8F-tmy_dz4TyImvIHj19_510cc93373ef451386734b7e05711de1.jpg/tr:w-1920,q-80/j8F-tmy_dz4TyImvIHj19_510cc93373ef451386734b7e05711de1.webp)
Text-to-image generation with FLUX.2 [dev] from Black Forest Labs. Enhanced realism, crisper text generation, and native editing capabilities—all at turbo speed.
Generate sound effects using ElevenLabs' advanced sound effects model.
![[Experimental] Whisper v3 Large -- but optimized by our inference wizards. Same WER, double the performance!](https://refinery.fal.media/url/https%3A%2F%2Fstorage.googleapis.com%2Ffalserverless%2Fgallery%2Fwizper.webp/tr:w-1920,q-80/wizper.webp)
[Experimental] Whisper v3 Large, optimized by our inference wizards. Same word error rate (WER), double the performance!

An endpoint for personalized image generation with FLUX, guided by a given description.

Generates the same scene from different angles (azimuth/elevation) with Qwen Image Edit 2511 and the Multiple Angles LoRA.
![FLUX.1 [pro] Fill is a high-performance endpoint for the FLUX.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.](https://refinery.fal.media/url/https%3A%2F%2Fstorage.googleapis.com%2Ffalserverless%2Fgallery%2Ffluxpro.jpg/tr:w-1920,q-80/fluxpro.webp)
FLUX.1 [pro] Fill is a high-performance endpoint for the FLUX.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.

Kling 2.5 Turbo Standard: Top-tier image-to-video generation with unparalleled motion fluidity, cinematic visuals, and exceptional prompt precision.

Kling 3.0 Standard: Top-tier text-to-video with cinematic visuals, fluid motion, native audio generation, and multi-shot support.

Generate 1080p video with synchronized native audio from a text prompt and references. Aspect ratios: 16:9, 9:16, 1:1, 4:3, 3:4. Duration: 3–15s.

Kling 2.1 Pro is an advanced endpoint for the Kling 2.1 model, offering professional-grade videos with enhanced visual fidelity, precise camera movements, and dynamic motion control, perfect for cinematic storytelling.
![Text-to-image generation with FLUX.2 [dev] from Black Forest Labs. Enhanced realism, crisper text generation, and native editing capabilities— in a flash.](https://refinery.fal.media/url/https%3A%2F%2Fv3b.fal.media%2Ffiles%2Fb%2F0a871486%2FtX7YdfQViGtCE7ZjxOCph_5f5262a21e9e426e8981ea9513d11999.jpg/tr:w-1920,q-80/tX7YdfQViGtCE7ZjxOCph_5f5262a21e9e426e8981ea9513d11999.webp)
Text-to-image generation with FLUX.2 [dev] from Black Forest Labs. Enhanced realism, crisper text generation, and native editing capabilities, in a flash.
Generate high-speed text-to-speech audio using ElevenLabs TTS Turbo v2.5.

Gemini 3 Pro Image (a.k.a. Nano Banana Pro) is Google's state-of-the-art high-fidelity image generation and editing model.
![Fastest inference in the world for the 12 billion parameter FLUX.1 [schnell] text-to-image model.](https://refinery.fal.media/url/https%3A%2F%2Fstorage.googleapis.com%2Ffal_cdn%2Ffal%2FUpscale-2.jpg/tr:w-1920,q-80/Upscale-2.webp)
Fastest inference in the world for the 12-billion-parameter FLUX.1 [schnell] text-to-image model.

Transform your photos into ultra-high-resolution 3D models in seconds. Film-quality geometry with PBR textures, ready for games, e-commerce, and 3D printing.

Edit videos using Kling O3 from Kling Team!

Gemini 3.1 Flash Image (a.k.a. Nano Banana 2) is Google's new state-of-the-art fast image generation and editing model.

Upscale your images with AuraSR.

Use ffmpeg to merge two or more videos.
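
The listing does not say how this endpoint drives ffmpeg internally. As a rough local illustration of merging videos with ffmpeg, the sketch below prepares input for ffmpeg's concat demuxer with stream copy, which assumes all inputs share the same codecs and stream parameters. The function names `build_concat_list` and `build_concat_command` are hypothetical helpers, not part of any API described here.

```python
from typing import List, Sequence


def build_concat_list(paths: Sequence[str]) -> str:
    """Return the text of a list file for ffmpeg's concat demuxer.

    Each input becomes a line of the form: file 'path'
    A single quote inside a path is written as '\\'' (close the quote,
    add an escaped quote, reopen the quote).
    """
    if len(paths) < 2:
        raise ValueError("need at least two videos to merge")
    lines = []
    for path in paths:
        escaped = path.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def build_concat_command(list_file: str, output: str) -> List[str]:
    """Return an ffmpeg argument list that joins the listed files.

    -safe 0 lets the list file reference arbitrary paths;
    -c copy joins the streams without re-encoding (assumes matching codecs).
    """
    return [
        "ffmpeg", "-f", "concat", "-safe", "0",
        "-i", list_file, "-c", "copy", output,
    ]


if __name__ == "__main__":
    # Print what would be written to the list file and the command to run.
    print(build_concat_list(["intro.mp4", "main.mp4"]), end="")
    print(" ".join(build_concat_command("inputs.txt", "merged.mp4")))
```

To perform the merge, write the returned text to the list file and pass the argument list to `subprocess.run`. If the inputs differ in codec or resolution, stream copy will not work and re-encoding is needed instead.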

Generate videos from a first and last frame using Google's Veo 3.1 Fast.

VEED Fabric 1.0 is an image-to-video API that turns any image into a talking video

Enhances a given raster image using the 'crisp upscale' tool, boosting resolution with a focus on refining small details and faces.

Recraft V3 is a text-to-image model that can generate long texts, vector art, images in brand style, and much more. At the time of writing, it ranks as state of the art (SOTA) in image generation on the Text-to-Image Benchmark by Artificial Analysis on Hugging Face.

Generate multilingual text-to-speech audio using ElevenLabs TTS Multilingual v2.

Kling 2.6 Pro: Top-tier text-to-video with cinematic visuals, fluid motion, and native audio generation.
![FLUX.2 [max] delivers state-of-the-art image generation and advanced image editing with exceptional realism, precision, and consistency.](https://refinery.fal.media/url/https%3A%2F%2Fv3b.fal.media%2Ffiles%2Fb%2F0a8689a8%2Fbbcmo6U5xg_RxDXijtxNA_55df705e1b1b4535a90bccd70887680e.jpg/tr:w-1920,q-80/bbcmo6U5xg_RxDXijtxNA_55df705e1b1b4535a90bccd70887680e.webp)
FLUX.2 [max] delivers state-of-the-art image generation and advanced image editing with exceptional realism, precision, and consistency.

Generate videos from a first and last frame using Google's Veo 3.1.
![Experimental version of FLUX.1 Kontext [max] with multi image handling capabilities](https://refinery.fal.media/url/https%3A%2F%2Ffal.media%2Ffiles%2Fkoala%2FK-CKzmh6JmZz5D0L9ar6u_ad445c7c4de54d3fb05b5f72305ffff3.jpg/tr:w-1920,q-80/K-CKzmh6JmZz5D0L9ar6u_ad445c7c4de54d3fb05b5f72305ffff3.webp)
Experimental version of FLUX.1 Kontext [max] with multi-image handling capabilities.