
Easily adjust the perspective of any image to different angles.

RunDiffusion Photo Flux is a realism enhancer: textures and skin details come to life, turning your favorite prompts into vivid, lifelike creations. A weight of 0.65 to 0.80 is recommended. Supports resolutions up to 1536x1536.

Place products naturally in a person’s hands for realistic marketing visuals.

Adjust color temperature, brightness, contrast, saturation, and gamma values for color correction.

VACE Fun for Wan 2.2 A14B from Alibaba-PAI

Qwen Image 2512 LoRA training

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks.

An AI model that transforms input images into new ones based on text prompts, blending reference visuals with your creative directions.

Pixel-Aware Diffusion Model for Realistic Image Super-Resolution and Personalized Stylization

Clone your voice using the Qwen3-TTS Clone-Voice model, which supports zero-shot cloning, and use the cloned voice with text-to-speech models to generate speech that sounds like you.

Create seamless transitions between images using PixVerse v3.5.

Generate video clips that follow the initial image and natural language descriptions more accurately, with camera movement instructions for shot control.

Image-to-image editing with Step1X-Edit v2 from StepFun. Reasoning-enhanced modifications through a thinking–editing–reflection loop with MLLM world knowledge for abstract instruction comprehension.
FLUX.1 Krea [dev] is a 12 billion parameter flow transformer that generates high-quality images from text with incredible aesthetics. It is suitable for personal and commercial use.

Expressive facial performance, natural speech-expression coordination, realistic body motion, and accurate audio-video synchronization with the DaVinci-MagiHuman model.

Image colorization and color-grading model. Bring color to black-and-white photos or apply curated color treatments using simple style-based commands.

Extend videos with audio using LTX-2.

Generate character IDs to use with Sora 2 generations.

Retouch photos of faces. Remove blemishes and improve the skin.

Generate YouTube thumbnails with custom text

Quickly generate high-quality video clips from text and image prompts using PixVerse v4.5.

Enhance and refine portrait photos with improved clarity and detail.

VACE is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources.

EchoMimic V3 generates a talking avatar from a picture, an audio clip, and a text prompt.

High-fidelity mask-based video object removal with strong temporal consistency. Erase unwanted objects, people, or elements while preserving aesthetic quality. Trained on licensed data for risk-free commercial use.

Generate a video starting from an image as the first frame with Marey, a generative video model trained exclusively on fully licensed data.

MoonDreamNext Batch is a multimodal vision-language model for batch captioning.

Generate high-quality images from text prompts using CogView4. Longer text prompts tend to produce higher-quality images.