Model Gallery
Featured Models
Check out some of our most popular models
LatentSync is a video-to-video model that generates lip sync animations from audio using advanced algorithms for high-quality synchronization.
Generate music from text prompts using the MiniMax model, which leverages advanced AI techniques to create high-quality, diverse musical compositions.
Search Results
12 models found
MMAudio generates synchronized audio given video and/or text inputs. It can be combined with video models to get videos with audio.
MMAudio generates synchronized audio given text inputs. It can generate sounds described by a prompt.
Open source text-to-audio model.
Blazing-fast text-to-speech. Generate audio with improved emotional tones and extensive multilingual support. Ideal for high-volume processing and efficient workflows.
Generate realistic lip-sync animations from audio using advanced algorithms for high-quality synchronization.
Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
MuseTalk is a real-time, high-quality, audio-driven lip-syncing model. Use MuseTalk to animate a face with your own audio.
Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Automatically generates text captions for your videos from the audio, styled according to your text colour and font specifications.
F5 TTS