Use SeedVR2 to upscale images, retaining seamless tiling
seedvr/upscale/image/seamless
image-to-image

upscale
seamless
tiling
Generate high-quality speech from text prompts, with a choice of voices, using the MiniMax Speech-2.8 Turbo text-to-speech model.
minimax/speech-2.8-turbo
text-to-speech

MuseTalk is a real-time, high-quality, audio-driven lip-syncing model. Use MuseTalk to animate a face with your own audio.
musetalk
image-to-video

animation
lip sync
real-time
Smart image resize to arbitrary dimensions, powered by Nano Banana Pro with vision-LLM-guided prompting for composition-aware recomposition. Crop, cropping, resize ads.
new
smart-resize
image-to-image

realism
typography
visual
Image-to-image editing with FLUX.2 [klein] 9B from Black Forest Labs and custom LoRA. Precise modifications using natural language descriptions and hex color control.
flux-2/klein/9b/edit/lora
image-to-image

FLUX.1 [dev] is a 12-billion-parameter flow transformer that generates high-quality images from text. It is suitable for personal and commercial use.
flux-1/dev
text-to-image

Remove background from any video with people and objects. No green screen needed.
veed/video-background-removal/fast
video-to-video

Generate video clips from your prompts using Kling 1.0
kling-video/v1/standard/text-to-video
text-to-video

motion
Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. Use the first TTS model from Resemble AI.
chatterbox/text-to-speech
text-to-speech

Automatically remove backgrounds from videos, perfect for creating clean, professional content without a green screen.
bria/video/background-removal
video-to-video

background-removal
FLUX.1 [dev] Fill is a high-performance endpoint for the FLUX.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.
flux-lora-fill
image-to-image

editing
lora
Wan-2.2 image-to-video is a video model that generates videos with high visual quality and motion diversity from text prompts and images. This endpoint supports LoRAs made for Wan 2.2.
wan/v2.2-a14b/image-to-video/lora
image-to-video

motion
lora
Finegrain Eraser removes any object selected with a mask—along with its shadows, reflections, and lighting artifacts—seamlessly reconstructing the scene with contextually accurate content.
finegrain-eraser/mask
image-to-image

utility
editing
Extend Veo-created videos up to 30 seconds
veo3.1/extend-video
video-to-video

extend-video
Generate video clips from your prompts using the MiniMax model
minimax/video-01
text-to-video

motion
transformation
State-of-the-art stem-separation model for voice, drums, bass, guitar, and more.
demucs
audio-to-audio

audio
VEED Fabric 1.0 is an image-to-video API that turns any image into a talking video
veed/fabric-1.0/fast
image-to-video

lipsync
avatar
SAM 2 is a model for segmenting images automatically. It can return individual masks or a single mask for the entire image.
sam2/auto-segment
image-to-image

segmentation
mask
Generate high-quality video clips from text and image prompts using PixVerse v5
pixverse/v5/image-to-video
image-to-video

stylized
transform
Inpainting endpoint for the Qwen Image Edit model.
qwen-image-edit/inpaint
image-to-image

inpainting
qwen-image
FLUX General Inpainting is a versatile endpoint that enables precise image editing and completion, supporting multiple AI extensions including LoRA, ControlNet, and IP-Adapter for enhanced control over inpainting results and sophisticated image modifications.
flux-general/inpainting
image-to-image

lora
controlnet
ip-adapter
Image-to-image editing with FLUX.2 [klein] 4B Base from Black Forest Labs. Precise modifications using natural language descriptions and hex color control.
flux-2/klein/4b/base/edit
image-to-image

Unified image generation with HiDream-O1-Image. Create, edit, and personalize high-resolution images up to 2K. A single native model handles text-to-image, editing, and custom subjects without external components.
new
hidream-o1-image
text-to-image

Generate high-quality video clips by swapping people, objects, and backgrounds using PixVerse Swap.
pixverse/swap
image-to-video

FFmpeg utility for trimming video
workflow-utilities/trim-video
video-to-video

The video upscaler endpoint uses RealESRGAN on each frame of the input video to upscale the video to a higher resolution.
video-upscaler
video-to-video

video generation
video to video
ai video
Generate music from text prompts using the MiniMax model, which leverages advanced AI techniques to create high-quality, diverse musical compositions.
minimax-music
text-to-audio

music
FASHN v1.5 delivers precise virtual try-on capabilities, accurately rendering garment details like text and patterns at 576x864 resolution from both on-model and flat-lay photo references.
fashn/tryon/v1.5
image-to-image

try-on
fashion
clothing