fal-ai/stable-audio-3/medium/audio-to-audio

Stable Audio 3 Medium audio-to-audio is a 1.4 billion parameter latent diffusion model that transforms an input audio clip into new stereo variations up to 6 minutes guided by a text prompt.
Inference
Commercial use

Input

Additional Settings

Customize your input with more control.

Result

Idle

What would you like to do next?

Your request will cost $0.0417 per audio.

Logs