Search Page 16

Showing 28 of 1395 results

pixverse/v4.5/image-to-video

Generate high-quality video clips from text and image prompts using PixVerse v4.5.

stylized

transform

image-to-video

image-apps-v2/virtual-try-on

Try on clothes virtually by combining person and clothing images.

flux-2-trainer

Fine-tune FLUX.2 [dev] from Black Forest Labs with custom datasets. Create specialized LoRA adaptations for specific styles and domains.

training

ddcolor

Bring color to old or new black-and-white photos with DDColor.

mmaudio-v2/text-to-audio

MMAudio generates synchronized audio given text inputs. It can generate sounds described by a prompt.

audio

fast

text-to-audio

stable-diffusion-v35-large

Stable Diffusion 3.5 Large is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency.

FASHN v1.5 delivers precise virtual try-on capabilities, accurately rendering garment details like text and patterns at 576x864 resolution from both on-model and flat-lay photo references.

wan/v2.7/pro/text-to-image

Generate premium-quality images from text prompts using the enhanced WAN 2.7 Pro model with superior detail and composition.

wan

pro

text-to-image

wan/v2.2-5b/image-to-video

Wan 2.2's 5B model produces up to 5 seconds of 720p video at 24 FPS with fluid motion and strong prompt understanding.

image-to-video

elevenlabs/audio-isolation

Isolate audio tracks using ElevenLabs' advanced audio isolation technology.

audio

audio-to-audio

veo3.1/extend-video

Extend Veo-created videos up to 30 seconds.

extend-video

video-to-video

wan-t2v

Wan-2.1 is a text-to-video model that generates videos with high visual quality and motion diversity from text prompts.

text to video

motion

text-to-video

flux-2/klein/9b/edit/lora

Image-to-image editing with FLUX.2 [klein] 9B from Black Forest Labs and custom LoRA. Precise modifications using natural language descriptions and hex color control.

image-to-image

tripo3d/triposplat

TripoSplat is an open-source model from TripoAI / VAST AI Research that converts a single 2D image into high-quality 3D Gaussians using a novel learned density-control approach.

gaussian-splat

image-to-3d

pixverse/v5/transition

Create seamless transitions between images using PixVerse v5.

stylized

transform

image-to-video

moondream3-preview/detect

Moondream 3 is a vision language model that brings frontier-level visual reasoning with native object detection, pointing, and OCR capabilities to real-world applications requiring fast, inexpensive inference at scale.

vision

meshy/rigging/multi-animation

Meshy auto-rigs a humanoid 3D model by fitting a skeleton and binding the mesh, then applies several motion presets from its animation library.

stylized

transform

3d-to-3d

ltxv-13b-098-distilled/image-to-video

Generate long videos from prompts and images using LTX Video-0.9.8 13B Distilled and custom LoRA

video

ltx-video

image-to-video

flux-2/klein/4b/base/edit

Image-to-image editing with FLUX.2 [klein] 4B Base from Black Forest Labs. Precise modifications using natural language descriptions and hex color control.

image-to-image

luma/agent/uni-1/v1/text-to-image

Luma Uni-1 turns a text prompt into a single high-fidelity image, with control over aspect ratio and visual style, plus optional web-sourced and reference-image guidance for sharper grounding.

wan/v2.2-a14b/image-to-video/lora

Wan-2.2 image-to-video generates videos with high visual quality and motion diversity from text prompts and images. This endpoint supports LoRAs made for Wan 2.2.

motion

lora

image-to-video

smoretalk-ai/rembg-enhance

Rembg-enhance is optimized for 2D vector images, 3D graphics, and photos by leveraging matting technology.

Generate 3D models from multiple images using Trellis, a native 3D generative model enabling versatile, high-quality 3D asset creation.

stylized

image-to-3d

wan-flf2v

Wan-2.1 flf2v generates dynamic videos by intelligently bridging a given first frame to a desired end frame through smooth, coherent motion sequences.

image to video

motion

image-to-video