
Ideogram Layerize takes an existing flat graphic, removes the text, and returns structured text containers that you can edit or recompose, in HTML or JSON format.

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks.

Create detailed, fully-textured 3D models with text

Replace or dub audio on an existing video with high-accuracy avatar-inference lip-sync.

Moondream2 is a highly efficient open-source vision language model that combines powerful image understanding capabilities with a remarkably small footprint.

Wan 2.5 text-to-image model.

Remove unwanted objects or people from your photos while seamlessly blending the background.

SAM 3 is a unified foundation model for promptable segmentation in images and videos. It can detect, segment, and track objects using text or visual prompts such as points, boxes, and masks.

Super fast endpoint for the FLUX.1 [dev] model with LoRA support, enabling rapid and high-quality image generation using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs.

Generate images from text and images using custom LoRA and Z-Image Turbo, Tongyi-MAI's super-fast 6B model.

Unified image generation with HiDream-O1-Image. Create, edit, and personalize high-resolution images up to 2K. A single native model handles text-to-image, editing, and custom subjects without external components.

Replace backgrounds in existing images with Ideogram V3's replace background feature. Create variations and adaptations while preserving core elements and adding new creative directions through prompt guidance.

Generate images from your prompts using Luma Photon. Photon is the most creative, personalizable, and intelligent visual model for creatives, bringing a step-function change in the cost of high-quality image generation.

Generate video with audio from audio, text and images using LTX-2

Create high-fidelity video with audio from images with LTX-2 Pro

Wan 2.2's 5B model produces up to 5 seconds of 720p video at 24 FPS with fluid motion and powerful prompt understanding.

SOTA Image Upscaler

FFmpeg Utilities to Scale Videos

HeyGen Translate Model with Extreme Precision

Generate character-consistent videos from reference images using PixVerse C1, with subject and background references.

Generate long, expressive multi-voice speech using Microsoft's powerful TTS

Orpheus TTS is a state-of-the-art, Llama-based Speech-LLM designed for high-quality, empathetic text-to-speech generation. This model has been fine-tuned to deliver human-level speech synthesis, achieving exceptional clarity, expressiveness, and real-time performance.

Transform images, elements, and text into consistent, high-quality video scenes, ensuring stable character identity, object details, and environments.

Rig humanoid 3D models from GLB URLs with Meshy, returning rigged GLB/FBX files plus basic animations.

Hunyuan Video is an open video generation model with high visual quality, motion diversity, text-video alignment, and generation stability. This endpoint generates videos from text descriptions.

Meshy-6 is the latest model from Meshy. It generates realistic and production-ready 3D models.

Wan Effects generates high-quality videos with popular effects from images

Juggernaut Base Flux LoRA by RunDiffusion is a drop-in replacement for Flux [Dev] that delivers sharper details, richer colors, and enhanced realism to all your LoRAs and LyCORIS with full compatibility.