Create seamless transitions between images using PixVerse v4.5
pixverse/v4.5/transition
image-to-video

stylized
transform
Generate music from lyrics and example audio using ACE-Step
ace-step/audio-to-audio
audio-to-audio

audio-edit
Design a personalized voice from a text description, and generate speech from text prompts using the MiniMax model, which leverages advanced AI techniques to create high-quality text-to-speech.
minimax/voice-design
text-to-speech

speech
Adjust and enhance videos with Ray-2 Reframe. This advanced tool seamlessly reframes videos to your desired aspect ratio, intelligently inpainting missing regions to ensure realistic visuals and coherent motion, delivering exceptional quality and creative flexibility.
luma-dream-machine/ray-2-flash/reframe
video-to-video

reframe
outpaint
flash
Generate video with audio from text using LTX-2 Distilled
ltx-2-19b/distilled/text-to-video
text-to-video

Inpaint images with SD and SDXL
inpaint
image-to-image

editing
diffusion
Generate high-quality images from depth maps using the Flux.1 [dev] depth estimation model. The model produces accurate depth representations for scene understanding and 3D visualization.
flux-lora-depth
image-to-image

depth
lora
utility
Hunyuan Video 1.5 is Tencent's latest and best video model
hunyuan-video-v1.5/image-to-video
image-to-video

AuraFlow v0.3 is an open-source flow-based text-to-image generation model that achieves state-of-the-art results on GenEval. The model is currently in beta.
aura-flow
text-to-image

typography
style
Generate text embeddings using an OpenAI-compatible API. Access embedding models such as text-embedding-3-small and text-embedding-3-large (OpenAI), as well as other embedding models available through OpenRouter. Drop-in replacement for the OpenAI embeddings API. Powered by OpenRouter.
openrouter/router/openai/v1/embeddings
llm

Generate images from text, an image and a mask using Z-Image Turbo, Tongyi-MAI's super-fast 6B model.
z-image/turbo/inpaint
image-to-image

inpainting
Wan-2.1 flf2v generates dynamic videos by intelligently bridging a given first frame to a desired end frame through smooth, coherent motion sequences.
wan-flf2v
image-to-video

image to video
motion
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks
florence-2-large/caption-to-phrase-grounding
image-to-image

multimodal
vision
FLUX.1 [dev] Redux is a high-performance endpoint for the FLUX.1 [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.
flux/dev/redux
image-to-image

Accelerated image generation with Ideogram V2A Turbo. Create high-quality visuals, posters, and logos with enhanced speed while maintaining Ideogram's signature quality.
ideogram/v2a/turbo
text-to-image

realism
typography
Imagen3 Fast is a high-quality text-to-image model that generates realistic images from text prompts.
imagen3/fast
text-to-image

Run any Stable Diffusion model with customizable LoRA weights.
lora/image-to-image
image-to-image

diffusion
lora
customization
Kling O1 Omni generates new shots guided by an input reference video, preserving cinematic language such as motion and camera style to produce seamless scene continuity.
kling-video/o1/standard/video-to-video/reference
video-to-video

Meshy-6-Preview is the latest model from Meshy. It generates realistic, production-ready 3D models.
meshy/v6-preview/text-to-3d
text-to-3d

Enhance facial features with professional retouching while maintaining a natural, realistic look
image-editing/face-enhancement
image-to-image

stylized
transform
Pixverse Transition
pixverse/v5.5/transition
image-to-video

Juggernaut Pro Flux by RunDiffusion is the flagship Juggernaut model rivaling some of the most advanced image models available, often surpassing them in realism. It combines Juggernaut Base with RunDiffusion Photo and features enhancements like reduced background blurriness.
rundiffusion-fal/juggernaut-flux/pro
text-to-image

image generation
Adjust and enhance videos with Ray-2 Reframe. This advanced tool seamlessly reframes videos to your desired aspect ratio, intelligently inpainting missing regions to ensure realistic visuals and coherent motion, delivering exceptional quality and creative flexibility.
luma-dream-machine/ray-2/reframe
video-to-video

reframe
outpaint
FLUX Control LoRA Depth is a high-performance endpoint that uses a control image to transfer structure to the generated image, using a depth map.
flux-control-lora-depth
text-to-image

lora
style transfer
Stable Diffusion 3.5 Medium is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency.
stable-diffusion-v35-medium
text-to-image

diffusion
typography
style
Virtually furnishes an empty apartment
flux-2-lora-gallery/apartment-staging
image-to-image

stylized
transform
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks
florence-2-large/ocr-with-region
image-to-image

ocr
multimodal
vision
Change hairstyles and hair colors in photos realistically.
image-apps-v2/hair-change
image-to-image

hair-edit
style-change