Split 3D models into parts with Hunyuan 3D
hunyuan-3d/v3.1/part
3d-to-3d

3d
hunyuan
mesh
Qwen-Image-Layered is a model capable of decomposing an image into multiple RGBA layers. Use LoRAs to get custom outputs.
qwen-image-layered/lora
image-to-image

qwen
lora
Run SDXL at the speed of light
fast-sdxl/inpainting
image-to-image

diffusion
high-res
lora
Erase and replace any moment in your audio with AI-driven precision.
new
mirelo-ai/sfx1.6/inpaint-audio
audio-to-audio

sfx
Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
pixart-sigma
text-to-image

diffusion
HiDream-I1 full is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds.
hidream-i1-full/image-to-image
image-to-image

hidream
Heygen Text to Video Generation Model
heygen/v2/video-agent
text-to-video

Run SDXL at the speed of light
fast-lcm-diffusion/image-to-image
image-to-image

lcm
diffusion
turbo
Vidu Q1 Start-End to Video generates smooth transition 1080p videos between specified start and end images.
vidu/q1/start-end-to-video
image-to-video

stylized
transform
Cohere Transcribe turns your business audio into accurate text, ready for search, analytics, and automation
cohere-transcribe
speech-to-text

speech
transcribe
stt
Use the latest PixVerse v5.6 model to turn your text and images into amazing videos.
pixverse/v5.6/transition
image-to-video

Vidu Reference to Video creates videos by combining reference images with a prompt.
vidu/reference-to-video
image-to-video

motion
reference
Clone dialog voices from a sample audio clip and generate dialogs from text prompts with Dia TTS, which leverages advanced AI techniques to create high-quality text-to-speech.
dia-tts/voice-clone
audio-to-audio

speech
High-quality text-to-image model by Baidu. Supports English, Chinese, and Japanese prompts with built-in prompt expansion.
ernie-image/lora/turbo
text-to-image

stylized
transform
typography
Image generation with BitDance. Fast, high-resolution photorealistic images using an autoregressive LLM for efficient, high-quality results.
bitdance
text-to-image

Any pose, any style, any identity
omni-zero
image-to-image

style transfer
Stable Avatar generates audio-driven video avatars up to five minutes long
stable-avatar
audio-to-video

stable-avatar
talking-head
Generate long videos from prompts using LTX Video-0.9.8 13B Distilled and custom LoRA
ltxv-13b-098-distilled
text-to-video

video
ltx-video
A model for high-quality, smooth background removal in videos.
ben/v2/video
video-to-video

segmentation
background removal
Open, efficient reasoning model from NVIDIA. 30B A3B hybrid Transformer-Mamba MoE, built for enterprise agentic workflows.
new
nvidia/nemotron-3-nano-omni
llm

nemotron
nvidia
reasoning
Remove unwanted objects from images with a text prompt - fast, precise editing that seamlessly blends results. Built for production scale and trained on licensed data for safe commercial use.
bria/fibo-edit/erase_by_text
image-to-image

bria
fibo-edit
prompt-eraser
Edit videos with instruction-based prompting using the Editto model.
editto
video-to-video

video-edit
wan-vace
Generate 3D models from your images using Trellis 2. A native 3D generative model enabling versatile and high-quality 3D asset creation.
trellis-2/retexture
image-to-3d

image-to-3d
Add a background to images that have a white or clean background
flux-2-lora-gallery/add-background
image-to-image

stylized
transform
Generate video with audio from images using LTX-2.3 and custom LoRA
ltx-2.3-22b/image-to-video/lora
image-to-video

Leffa Pose Transfer is an endpoint for changing the pose in an image using a reference image.
leffa/pose-transfer
image-to-image

pose
utility
Use USO to perform subject-driven generation from a reference image.
uso
image-to-image

A unified paradigm for audio-video generation
ovi
text-to-video
