
Extend and reframe images with Luma Photon Reframe. The tool expands an image beyond its original borders and blends the newly generated content into the existing scene.

Use Vidu Text-to-Image to turn your prompts into images.

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks.

Sana v1.5 1.6B is a lightweight text-to-image model that delivers 4K image generation with impressive efficiency.

Run SDXL at the speed of light

Generate videos from images and prompts using CogVideoX-5B

Train LTX-2.3 22B for custom styles and effects.

Interpolate images with FILM - Frame Interpolation for Large Motion

Blend products into backgrounds with automatic perspective and lighting correction

Clone any person's voice and have it speak any text using Zonos voice cloning.

Generate video with audio from audio, text and images using LTX-2 Distilled

Qwen Image LoRA training

Audio reasoning variant of NVIDIA's Nemotron 3 Nano Omni. 30B A3B hybrid Transformer-Mamba MoE - accepts audio plus a prompt and returns text.

Generate long videos from images using LongCat Video

Generate professional product photography with realistic lighting and backgrounds.

See how you or others might look at different ages, from younger to older, while preserving core facial features.

Transform your photos into cool plushies while keeping the original character's likeness.

PersonaPlex is a real-time, full-duplex speech-to-speech conversational model that enables persona control through text-based role prompts and audio-based voice conditioning.

Add details to faces, enhance face features, remove blur.

Choose the Nth image from an image URL list for workflows.
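
The selection step described above can be sketched in a few lines of Python. This is only an illustration, not the endpoint's actual implementation: the function name `select_image`, the zero-based index, and the example URLs are all assumptions, and the hosted utility may count from one or handle out-of-range values differently.

```python
from typing import Sequence


def select_image(image_urls: Sequence[str], index: int) -> str:
    """Return the URL at position `index` in `image_urls`.

    Assumption: `index` is zero-based. The real workflow utility
    may use a different convention.
    """
    if not 0 <= index < len(image_urls):
        raise IndexError(
            f"index {index} is out of range for {len(image_urls)} image URLs"
        )
    return image_urls[index]


# Hypothetical list of image URLs produced by an earlier workflow step.
urls = [
    "https://example.com/a.png",
    "https://example.com/b.png",
    "https://example.com/c.png",
]
selected = select_image(urls, 1)
print(selected)
```

Rejecting out-of-range indices explicitly, rather than allowing Python's negative indexing, keeps a misconfigured workflow from silently passing the wrong image downstream.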

Generate profiles using 30-50 images of a subject with Phota.

Sana v1.5 4.8B is a powerful text-to-image model that generates ultra-high quality 4K images with remarkable detail.

Generate video clips from your prompts using Kling 1.0

Audio separation with SAM Audio. Isolate any sound using natural language—professional-grade audio editing made simple for creators, researchers, and accessibility applications.

Removes objects and their visual effects using natural language, replacing them with contextually appropriate content

Pika Scenes v2.2 creates high-quality videos from images.

Framepack is an efficient image-to-video model that autoregressively generates videos.
Super fast endpoint for the FLUX.1 [dev] inpainting model with LoRA support, enabling rapid, high-quality image inpainting using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs.