Generate speech from text prompts and different voices using the Kling TTS model, which leverages advanced AI techniques to create high-quality text-to-speech.
kling-video/v1/tts
text-to-speech

audio
Extract seamless tiling textures with PBR attribute maps from images
patina/material/extract
image-to-image

material
pbr
extraction
Accelerated image generation with Ideogram V2 Turbo. Create high-quality visuals, posters, and logos with enhanced speed while maintaining Ideogram's signature quality.
ideogram/v2/turbo
text-to-image

realism
typography
Generate synced sounds for any video and return the new soundtrack (like MMAudio)
mirelo-ai/sfx-v1.5/video-to-audio
video-to-audio

sfx
State-of-the-art open-source model in aesthetic quality
playground-v25
text-to-image

artistic
style
FLUX.1 [schnell] Redux is a high-performance endpoint for the FLUX.1 [schnell] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.
flux/schnell/redux
image-to-image

style transfer
The MultiTalk model generates a talking avatar video from an image and an audio file. The avatar lip-syncs to the provided audio with natural facial expressions.
ai-avatar
image-to-video

stylized
transform
Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. Use the first TTS from Resemble AI.
chatterbox/speech-to-speech
speech-to-speech

Create high-fidelity video with audio from text with LTX-2 Pro.
ltx-2/text-to-video
text-to-video

Transform your photos into vibrant cartoons with bold outlines and rich colors.
image-editing/cartoonify
image-to-image

stylized
transform
LTX-2.3 is a high-quality, fast AI video model available in Pro and Fast variants for text-to-video, image-to-video, and audio-to-video.
ltx-2.3/extend-video
video-to-video

stylized
transform
lipsync
Wan-2.1 Pro is a premium image-to-video model that generates high-quality 1080p videos at 30 fps, up to 6 seconds long, with exceptional visual quality and motion diversity.
wan-pro/image-to-video
image-to-video

image to video
motion
Dia directly generates realistic dialogue from transcripts. Audio conditioning enables emotion control. Produces natural nonverbals like laughter and throat clearing.
dia-tts
text-to-speech

Generates the same object from different angles (azimuth/elevation)
flux-2-lora-gallery/multiple-angles
image-to-image

stylized
transform
Generate video with audio from text using LTX-2
ltx-2-19b/text-to-video
text-to-video

Makes images more photorealistic and natural
flux-2-lora-gallery/realism
text-to-image

stylized
transform
Generate video clips from multiple image references using Kling 1.6 (standard)
kling-video/v1.6/standard/elements
image-to-video

Add color to old or new black-and-white photos with DDColor.
ddcolor
image-to-image

image-recolorization
faces
utility
A fal.ai endpoint that stitches an ordered list of images into an MP4 video by holding each image for a specified number of frames at a configurable frame rate
new
ffmpeg-api/images-to-video
image-to-video

utility
editing
Generate video prompts using a variety of techniques including camera direction, style, pacing, special effects and more.
video-prompt-generator
llm

motion
transformation
chat
Use React-1 from SyncLabs to refine human emotions and do realistic lip-sync without losing details!
sync-lipsync/react-1
video-to-video

lipsync
Stable Diffusion v1.5
stable-diffusion-v15
text-to-image

diffusion
Ovi can generate videos with audio from image and text inputs.
ovi/image-to-video
image-to-video

image-to-audio-video
Qwen-Image (Image-to-Image) transforms and edits input images with high fidelity, enabling precise style transfer, enhancement, and creative modification.
qwen-image/image-to-image
image-to-image

Generate 3D human motions from text via the generation interface of Hunyuan Motion!
hunyuan-motion
text-to-3d

motion
Nemotron-Labs-Diffusion-VLM-8B is the vision-language extension of the Nemotron-Labs-Diffusion family.
new
nemotron-diffusion-vlm
vision

utility
editing
Generate high-quality video clips with different effects using PixVerse v5
pixverse/v5/effects
image-to-video

Leffa Virtual TryOn is a high-quality, image-based try-on endpoint that can be used for commercial try-on.
leffa/virtual-tryon
image-to-image

try-on
fashion
clothing