Recraft V3 is a text-to-image model that can generate long text, vector art, images in a brand style, and more. As of this listing, it ranks as state of the art in image generation on the Text-to-Image Benchmark by Artificial Analysis, hosted on Hugging Face.
recraft/v3/image-to-image
image-to-image

vector
typography
style
Rodin by Hyper3D generates realistic, production-ready 3D models from text or images.
hyper3d/rodin/v2
image-to-3d

text-to-3d
Transfer motion from a video to characters in an image using Dreamactor v2. It performs well with non-human and multiple characters.
bytedance/dreamactor/v2
video-to-video

motion-control
dreamactor
MiniMax Hailuo-2.3 Text To Video API (Standard, 768p): Advanced text-to-video generation model with 768p resolution
minimax/hailuo-2.3/standard/text-to-video
text-to-video

Create natural HeyGen Avatar V digital twin videos from text or audio, with lip-sync, optional backgrounds, captions, and MP4/WebM output.
new
heygen/avatar5/digital-twin
text-to-video

avatar
digital-twin
talking-avatar
Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
sadtalker
image-to-video

animation
Text to Video endpoint for Seedance 1.0 Pro Fast, a next-generation video model designed to deliver maximum performance at minimal cost
bytedance/seedance/v1/pro/fast/text-to-video
text-to-video

bytedance
fast
motion
Create realistic sound effects in seconds. CassetteAI's Sound Effects Model generates high-quality SFX up to 30 seconds long in just 1 second of processing time.
cassetteai/sound-effects-generator
text-to-audio

sound
sfx
sound-effects
Generate video clips from your prompts using Kling 1.6 (pro)
kling-video/v1.6/pro/text-to-video
text-to-video

Z-Image is the foundation model of the Z-Image family, engineered for good quality, robust generative diversity, broad stylistic coverage, and precise prompt adherence.
z-image/base
text-to-image

z-image
base
Extend existing images with Ideogram V3's reframe feature. Create expanded versions and adaptations that preserve the main image while adding new creative directions through prompt guidance.
ideogram/v3/reframe
image-to-image

realism
typography
Rodin by Hyper3D generates realistic, production-ready 3D models from text or images.
hyper3d/rodin
image-to-3d

stylized
Super-fast endpoint for the FLUX.1 [dev] inpainting model with LoRA support, enabling rapid, high-quality image inpainting using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs.
flux-lora/inpainting
text-to-image

lora
personalization
Wan-Animate is a video model that generates high-fidelity character videos by replicating the expressions and movements of characters from reference videos.
wan/v2.2-14b/animate/move
video-to-video

video to video
motion
FLUX General Image-to-Image is a versatile endpoint that transforms existing images with support for LoRA, ControlNet, and IP-Adapter extensions, enabling precise control over style transfer, modifications, and artistic variations through multiple guidance methods.
flux-general/image-to-image
image-to-image

lora
controlnet
ip-adapter
Meshy-6-Preview is the latest model from Meshy. It generates realistic, production-ready 3D models.
meshy/v6-preview/image-to-3d
image-to-3d

An endpoint for relighting photos and changing their backgrounds according to a given description.
iclight-v2
image-to-image

relighting
editing
Recraft V4.1 Utility is a faster, lighter variant of V4.1 made for high-volume creative workflows. Ideal for ideation, A/B exploration, and content pipelines, it keeps Recraft's design sensibility while optimizing for throughput and cost.
new
recraft/v4.1/utility/text-to-image
text-to-image

stylized
transform
typography
Run SDXL at the speed of light
fast-sdxl/image-to-image
image-to-image

diffusion
high-res
lora
Generate video clips from your prompts using Kling 2.0 Master
kling-video/v2/master/text-to-video
text-to-video

Restore and enhance old or damaged photos by removing imperfections and adding color while preserving the original character and details of the image.
image-editing/photo-restoration
image-to-image

stylized
transform
Generate synced sounds for any video and return the video with its new soundtrack (like MMAudio). Now up to 60 seconds!
new
mirelo-ai/sfx1.6/video-to-video
video-to-video

sfx
Wan-2.2 turbo text-to-video is a video model that generates videos with high visual quality and motion diversity from text prompts.
wan/v2.2-a14b/text-to-video/turbo
text-to-video

text to video
motion
Stable Diffusion 3.5 Large is a Multimodal Diffusion Transformer (MMDiT) text-to-image model with improved image quality, typography, complex prompt understanding, and resource efficiency.
stable-diffusion-v35-large
text-to-image

diffusion
typography
style
Generate speech with expressive and realistic voices from xAI
xai/tts/v1
text-to-speech

Start with a simple text input to create dynamic generations that defy expectations in up to 1080p. Experience better image clarity and crisper, sharper visuals.
pika/v2.2/text-to-video
text-to-video

editing
effects
animation
Run any Stable Diffusion model with customizable LoRA weights.
lora
text-to-image

diffusion
lora
customization
FLUX1.1 [pro] ultra Redux is a high-performance endpoint for the FLUX1.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.
flux-pro/v1.1-ultra/redux
image-to-image

style transfer
high-res