Image-to-image editing with LoRA support for FLUX.2 [klein] 9B Base from Black Forest Labs. Specialized style transfer and domain-specific modifications.
flux-2/klein/9b/base/edit/lora
image-to-image

Generate high-quality 3D models from a single image using Tripo H3.1.
tripo3d/h3.1/image-to-3d
image-to-3d

3d
3d-generation
tripo
Image-to-image editing with LoRA support for FLUX.2 [dev] from Black Forest Labs. Specialized style transfer and domain-specific modifications.
flux-2/lora/edit
image-to-image

Generate videos from prompts using LTX Video
ltx-video
text-to-video

Generate realistic lip-sync animations from audio with the PixVerse Lipsync model, which synchronizes mouth movement to the audio track.
pixverse/lipsync
video-to-video

animation
lip sync
MiniMax Hailuo-2.3-Fast Image To Video API (Pro, 1080p): Advanced fast image-to-video generation model with 1080p resolution
minimax/hailuo-2.3-fast/pro/image-to-video
image-to-video

Create blazing fast and economical videos with MiniMax Hailuo-02 Image To Video API at 512p resolution
minimax/hailuo-02-fast/image-to-video
image-to-video

stylized
transform
Generate a video by taking a start frame and an end frame, animating the transition between them while following text-driven style and scene guidance.
kling-video/o1/standard/image-to-video
image-to-video

Enhances a given raster image using the 'creative upscale' tool, increasing image resolution, making the image sharper and cleaner.
recraft/upscale/creative
image-to-image

upscaling
Generate high-quality images from text prompts using MiniMax Image-01. Longer text prompts result in better-quality images.
minimax/image-01
text-to-image

stylized
realism
Recraft V4 was developed with designers to bring true visual taste to AI image generation. Built for brand systems and production-ready workflows, it goes beyond prompt accuracy — delivering stronger composition, refined lighting, realistic materials, and a cohesive aesthetic. The result is imagery shaped by professional design judgment, ready for immediate real-world use without additional post-processing.
recraft/v4/text-to-vector
text-to-image

text-to-vector
Kling Kolors Virtual TryOn v1.5 is a high-quality, image-based try-on endpoint that can be used for commercial try-on.
kling/v1-5/kolors-virtual-try-on
image-to-image

try-on
fashion
clothing
Kokoro is a lightweight text-to-speech model that delivers quality comparable to larger models while being significantly faster and more cost-efficient.
kokoro/american-english
text-to-audio

speech
Kling's Native 4K is a video generation model that directly outputs professional-grade 4K video in one step, eliminating the need for post-production upscaling
kling-video/v3/4k/text-to-video
text-to-video

stylized
transform
lipsync
Generate consistent character appearances across multiple images. Maintain facial features, proportions, and distinctive traits for cohesive storytelling and branding
ideogram/character
image-to-image

character-consistency
Transform images, elements, and text into consistent, high-quality video scenes, ensuring stable character identity, object details, and environments.
kling-video/o1/reference-to-video
image-to-video

Tuning-free ID customization.
pulid
image-to-image

editing
customization
personalization
Kling's Native 4K is a video generation model that directly outputs professional-grade 4K video in one step, eliminating the need for post-production upscaling
kling-video/o3/4k/reference-to-video
image-to-video

stylized
transform
lipsync
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks
florence-2-large/more-detailed-caption
vision

captioning
multimodal
LTX-2.3 is a high-quality, fast AI video model available in Pro and Fast variants for text-to-video, image-to-video, and audio-to-video.
ltx-2.3/text-to-video
text-to-video

stylized
transform
lipsync
Depth Anything v2 preprocessor.
image-preprocessors/depth-anything/v2
image-to-image

depth
preprocess
utility
Convert text to speech with the Qwen3-TTS Custom-Voice model using pre-trained voices, or use your own voice with the Qwen3-TTS Clone Voice model.
qwen-3-tts/text-to-speech/1.7b
text-to-speech

GPT Image 1 mini combines OpenAI's advanced language capabilities, powered by GPT-5, with GPT Image 1 Mini for efficient image generation.
gpt-image-1-mini
text-to-image

State-of-the-art image-to-3D object generation. Generate a 3D model from a single image.
tripo3d/tripo/v2.5/image-to-3d
image-to-3d

stylized
EVF-SAM2 combines natural language understanding with advanced segmentation capabilities, allowing you to precisely mask image regions using intuitive positive and negative text prompts.
evf-sam
image-to-image

segmentation
mask
Leverage the state-of-the-art capabilities of Hunyuan Image 3.0 to generate visual content that effectively conveys the messaging of your written material.
hunyuan-image/v3/text-to-image
text-to-image

Generate text from speech using ElevenLabs' advanced speech-to-text model.
elevenlabs/speech-to-text
speech-to-text

speech
Fix distorted or blurred photos of people with CodeFormer.
codeformer
image-to-image

image-restoration
faces
utility