Search Page 18

Showing 28 of 1396 results

Generate music from text prompts using the MiniMax model, which leverages advanced AI techniques to create high-quality, diverse musical compositions.

music

text-to-audio

qwen-image-edit-plus-lora-gallery/multiple-angles

Precise camera position and angle control (rotation, zoom, vertical movement)

stylized

transform

image-to-image

heygen/v2/translate/precision

Heygen Translate Model with Extreme Precision

video-to-video

veo3.1/fast/extend-video

Extend Veo-Created Videos up to 30 seconds

extend-video

video-to-video

hidream-o1-image

Unified image generation with HiDream-O1-Image. Create, edit, and personalize high-resolution images up to 2K—single native model handles text-to-image, editing, and custom subjects without external components.

text-to-image

musetalk

MuseTalk is a real-time, high-quality, audio-driven lip-syncing model. Use MuseTalk to animate a face with your own audio.

image-editing/reframe

The reframe endpoint intelligently adjusts an image's aspect ratio while preserving the main subject's position, composition, pose, and perspective.

Create high-fidelity video with audio from images with LTX-2 Pro

image-to-video

retoucher

Automatically retouches faces to smooth skin and remove blemishes.

editing

image-to-image

moondream2/object-detection

Moondream2 is a highly efficient open-source vision language model that combines powerful image understanding capabilities with a remarkably small footprint.

image-to-image

vision

ltx-2.3-quality/image-to-video

Generate high-quality video with audio from images using LTX-2.3

image-to-video

qwen-image-2512/lora

LoRA inference endpoint for Qwen Image 2512, an improved version of Qwen Image with better text rendering, finer natural textures, and more realistic human generation.

ltx-2.3/audio-to-video

LTX-2.3 is a high-quality, fast AI video model available in Pro and Fast variants for text-to-video, image-to-video, and audio-to-video.

minimax/hailuo-2.3/pro/text-to-video

MiniMax Hailuo-2.3 text-to-video API (Pro): advanced text-to-video generation at 1080p resolution.

text-to-video

ltx-2-19b/image-to-video

Generate video with audio from images using LTX-2

image-to-video

florence-2-large/caption-to-phrase-grounding

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks.

multimodal

vision

image-to-image

kling-video/o1/standard/video-to-video/edit

Edit an existing video using natural-language instructions, transforming subjects, settings, and style while retaining the original motion structure.

video-to-video

drct-super-resolution

Upscale your images with DRCT-Super-Resolution.

upscaling

high-res

image-to-image

kling-video/v1/standard/text-to-video

Generate video clips from your prompts using Kling 1.0

motion

text-to-video

sam-3/video

SAM 3 is a unified foundation model for promptable segmentation in images and videos. It can detect, segment, and track objects using text or visual prompts such as points, boxes, and masks.

wan/v2.6/text-to-image

Wan 2.6 text-to-image model.

text-to-image

ben/v2/video

A model for high-quality, smooth background removal in videos.

segmentation

background removal

video-to-video

recraft/v4/pro/text-to-vector

Recraft V4 was developed with designers to bring true visual taste to AI image generation. Built for brand systems and production-ready workflows, it goes beyond prompt accuracy — delivering stronger composition, refined lighting, realistic materials, and a cohesive aesthetic. The result is imagery shaped by professional design judgment, ready for immediate real-world use without additional post-processing.

text-to-vector

text-to-image

hy-wu-edit

Image editing with HY-WU. Transfer outfits, swap faces, and blend textures instantly—no finetuning needed, just describe what you want and provide reference images.

image-to-image

rife/video

Interpolate videos with RIFE (Real-Time Intermediate Flow Estimation).

interpolation

video-to-video

sam-3/3d-body

SAM 3D allows for accurate 3D reconstruction of human body shape and position from a single image.

human

pose

image-to-3d

heygen/v3/lipsync/precision

Replace or dub audio on an existing video with high-accuracy avatar-inference lip-sync.

flux-pro/v1.1-ultra/redux

FLUX1.1 [pro] ultra Redux is a high-performance endpoint for the FLUX1.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.

style transfer

high-res

image-to-image
