
Use SeedVR2 to upscale images while preserving seamless tiling.

Generate speech from text prompts in different voices using the MiniMax Speech-2.8 Turbo model, which produces high-quality text-to-speech output.

MuseTalk is a real-time, high-quality, audio-driven lip-syncing model. Use MuseTalk to animate a face with your own audio.

Smart image resize to arbitrary dimensions, powered by Nano Banana Pro with vision-LLM-guided prompting for composition-aware recomposition. Suited to cropping and resizing ads.
Image-to-image editing with FLUX.2 [klein] 9B from Black Forest Labs and custom LoRA. Precise modifications using natural language descriptions and hex color control.
FLUX.1 [dev] is a 12-billion-parameter flow transformer that generates high-quality images from text. It is suitable for personal and commercial use.

Remove the background from any video with people and objects. No green screen needed.

Generate video clips from your prompts using Kling 1.0.

Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. Use the first TTS model from Resemble AI.

Automatically remove backgrounds from videos, perfect for creating clean, professional content without a green screen.
FLUX.1 [dev] Fill is a high-performance endpoint for the FLUX.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.

Wan-2.2 image-to-video is a video model that generates videos with high visual quality and motion diversity from text prompts and images. This endpoint supports LoRAs made for Wan 2.2.

Finegrain Eraser removes any object selected with a mask—along with its shadows, reflections, and lighting artifacts—seamlessly reconstructing the scene with contextually accurate content.

Extend Veo-created videos up to 30 seconds.

Generate video clips from your prompts using the MiniMax model.

State-of-the-art stem separation model for voice, drums, bass, guitar, and more.

VEED Fabric 1.0 is an image-to-video API that turns any image into a talking video.

SAM 2 is a model for segmenting images automatically. It can return individual masks or a single mask for the entire image.
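To illustrate how individual masks relate to a single whole-image mask, here is a minimal sketch. It assumes the single mask is the pixel-wise union of the individual masks, which the description does not confirm, and it does not call the SAM 2 endpoint. The `merge_masks` helper and the list-of-lists mask representation are hypothetical, chosen only for illustration.

```python
from typing import List

# A mask is a grid of booleans: True where the pixel belongs to a segment.
Mask = List[List[bool]]


def merge_masks(masks: List[Mask]) -> Mask:
    """Combine same-sized masks into one mask via pixel-wise logical OR.

    Assumption: the "single mask" is the union of the individual masks.
    """
    if not masks:
        raise ValueError("at least one mask is required")
    height, width = len(masks[0]), len(masks[0][0])
    for mask in masks:
        if len(mask) != height or any(len(row) != width for row in mask):
            raise ValueError("all masks must share the same dimensions")
    return [
        [any(mask[y][x] for mask in masks) for x in range(width)]
        for y in range(height)
    ]
```

Keeping the individual masks lets a caller select one object at a time, while the merged mask is convenient when everything that was segmented should be treated together.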

Generate high-quality video clips from text and image prompts using PixVerse v5.

Inpainting endpoint for the Qwen Edit image editing model.

FLUX General Inpainting is a versatile endpoint that enables precise image editing and completion, supporting multiple AI extensions including LoRA, ControlNet, and IP-Adapter for enhanced control over inpainting results and sophisticated image modifications.
Image-to-image editing with FLUX.2 [klein] 4B Base from Black Forest Labs. Precise modifications using natural language descriptions and hex color control.

Unified image generation with HiDream-O1-Image. Create, edit, and personalize high-resolution images up to 2K. A single native model handles text-to-image, editing, and custom subjects without external components.

Generate high-quality video clips by swapping people, objects, and backgrounds using PixVerse Swap.

FFmpeg utility for trimming video.
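
As a rough idea of what a trim operation looks like with the FFmpeg command line, the sketch below builds an argument list using the standard `-ss` (start), `-t` (duration), and `-c copy` (stream copy) options. The `build_trim_command` helper is hypothetical, and nothing here is a statement about how the endpoint itself is implemented.

```python
from typing import List


def build_trim_command(src: str, dst: str, start: float, end: float) -> List[str]:
    """Return an ffmpeg argument list that keeps the span [start, end) in seconds."""
    if start < 0 or end <= start:
        raise ValueError("require 0 <= start < end")
    duration = end - start
    return [
        "ffmpeg",
        "-ss", f"{start:.3f}",   # seek to the start position in the input
        "-i", src,
        "-t", f"{duration:.3f}",  # keep this many seconds of output
        "-c", "copy",             # copy streams without re-encoding;
                                  # cuts may snap to the nearest keyframe
        dst,
    ]
```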
The video upscaler endpoint uses RealESRGAN on each frame of the input video to upscale the video to a higher resolution.
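
The per-frame approach described above can be sketched as a loop that applies an image upscaler to each frame independently. In this sketch a nearest-neighbor upscaler stands in for RealESRGAN, and frames are plain grayscale grids. Both choices, along with the names `upscale_video` and `nearest_neighbor_upscale`, are assumptions made to keep the example self-contained.

```python
from typing import Callable, Iterable, Iterator, List

# A grayscale frame as rows of pixel values (illustrative representation).
Frame = List[List[int]]


def nearest_neighbor_upscale(frame: Frame, scale: int) -> Frame:
    """Stand-in upscaler: repeat each pixel `scale` times in both directions."""
    if scale < 1:
        raise ValueError("scale must be at least 1")
    out: Frame = []
    for row in frame:
        wide_row = [pixel for pixel in row for _ in range(scale)]
        out.extend(list(wide_row) for _ in range(scale))
    return out


def upscale_video(
    frames: Iterable[Frame], upscale: Callable[[Frame], Frame]
) -> Iterator[Frame]:
    """Apply an image upscaler to every frame, one frame at a time."""
    for frame in frames:
        yield upscale(frame)
```

Because each frame is processed on its own, any single-image upscaler can be dropped in for `upscale`. The trade-off of this design is that nothing enforces consistency between neighboring frames.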

Generate music from text prompts using the MiniMax model, which leverages advanced AI techniques to create high-quality, diverse musical compositions.

FASHN v1.5 delivers precise virtual try-on capabilities, accurately rendering garment details like text and patterns at 576x864 resolution from both on-model and flat-lay photo references.
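
When preparing reference photos, it can help to know how an image would fit a 576x864 (2:3) frame. The sketch below computes the largest size that fits inside that frame while preserving aspect ratio. How the endpoint actually resizes or pads inputs is not stated in the description, so the `fit_within` helper is a hypothetical pre-processing step rather than the model's behavior.

```python
from typing import Tuple


def fit_within(
    width: int, height: int, target_w: int = 576, target_h: int = 864
) -> Tuple[int, int]:
    """Scale (width, height) to fit inside the target box, keeping aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    # The limiting side determines the scale factor.
    scale = min(target_w / width, target_h / height)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

An input that is already 2:3 fills the frame exactly, while a square input is limited by its width and leaves vertical space to pad or crop.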