Framepack is an efficient image-to-video model that autoregressively generates videos.
framepack
image-to-video

image to video
motion
Get EBU R128 loudness normalization data from audio files using the FFmpeg API.
ffmpeg-api/loudnorm
json

ffmpeg
Generate 3D human motions from text prompts with Hunyuan Motion.
hunyuan-motion/fast
text-to-3d

motion
Turn images into pixel-perfect retro art
image2pixel
image-to-image

post-processing
pixel-art
Bagel is a 7B-parameter multimodal model from Bytedance-Seed that can generate both text and images.
bagel
text-to-image

multimodal
Wan-2.2 video-to-video is a video model that generates videos with high visual quality and motion diversity from text prompts and source videos.
wan/v2.2-a14b/video-to-video
video-to-video

Fine-tune FLUX.2 [klein] 9B from Black Forest Labs with custom datasets. Create specialized LoRA adaptations for specific editing tasks.
flux-2-klein-9b-base-trainer/edit
training

Generate professional, eCommerce-ready product shots by replacing backgrounds with realistic lighting and accurate perspective from a simple text prompt. Trained exclusively on licensed data for safe commercial use.
bria/replace-background
image-to-image

bria
replace-background
Realtime generation with FLUX.2 [klein] from Black Forest Labs.
flux-2/klein/realtime
image-to-image

realtime
Generate long videos from prompts and images using LTX Video-0.9.8 13B Distilled and a custom LoRA.
ltxv-13b-098-distilled/image-to-video
image-to-video

video
ltx-video
Vector font generation with VecGlypher. Create custom glyphs from text descriptions or reference images—outputs clean SVG paths directly without raster-to-vector conversion.
vecglypher/image-to-svg
image-to-image

Generate 3D models from text descriptions using Tripo H3.1.
tripo3d/h3.1/text-to-3d
text-to-3d

3d
3d-generation
tripo
FLUX.1 SRPO [dev] is a 12 billion parameter flow transformer that generates high-quality images from text with incredible aesthetics. It is suitable for personal and commercial use.
flux-1/srpo
text-to-image

AI vectorization model that transforms raster images into scalable SVG graphics, preserving visual details while enabling infinite scaling and easy editing capabilities.
star-vector
image-to-image

Reimagine and transform your ordinary photos into enchanting Studio Ghibli style artwork
ghiblify
image-to-image

stylized
transform
MoonDreamNext Detection is a multimodal vision-language model for gaze detection, bbox detection, point detection, and more.
moondream-next/detection
image-to-image

multimodal
Hunyuan World 1.0 turns a single image into a panorama or a 3D world. It creates realistic scenes from the image, allowing you to explore and view it from different angles.
hunyuan_world/image-to-world
image-to-3d

Experiment with different hairstyles, from bald to any style you can imagine, while maintaining natural lighting and realistic results.
image-editing/hair-change
image-to-image

stylized
transform
Generate images with ControlNet.
fast-sdxl-controlnet-canny
text-to-image

diffusion
controlnet
manipulation
Use the latest Vidu Q2 models, which offer much better quality and control over your videos.
vidu/q2/image-to-video/turbo
image-to-video

An efficient SDXL multi-controlnet text-to-image model.
sdxl-controlnet-union
text-to-image

diffusion
controlnet
composition
Optimize 3D mesh topology with Hunyuan 3D Smart Topology.
hunyuan-3d/v3.1/smart-topology
3d-to-3d

3d
hunyuan
topology
Generate synced sounds for any video and return the new soundtrack (like MMAudio).
mirelo-ai/sfx-v1/video-to-audio
video-to-audio

sfx
LongCat-Video-Avatar is an audio-driven video generation model that generates super-realistic, lip-synchronized long videos with natural dynamics and consistent identity.
longcat-single-avatar/image-audio-to-video
audio-to-video

image-to-video
All-in-one image AI with JoyAI-Image. Understand, create, and edit images through natural language—the model's deep visual understanding powers more accurate generation and precise editing in a unified system.
joyai-image-edit
image-to-image

image-editing
FLUX Control LoRA Canny is a high-performance endpoint that uses a Canny edge map as a control image to transfer structure to the generated image, and a separate initial image to guide color.
flux-control-lora-canny/image-to-image
image-to-image

lora
style transfer
Interpolate between video frames
amt-interpolation
video-to-video

interpolation
editing
Line art preprocessor.
image-preprocessors/lineart
image-to-image

preprocess
utility
sketch