fal-ai/stable-audio-3/medium/audio-to-audio

Stable Audio 3 Medium audio-to-audio is a 1.4-billion-parameter latent diffusion model that transforms an input audio clip into new stereo variations of up to 6 minutes, guided by a text prompt.
Inference
Commercial use

Prompt examples

These examples were generated with Stable Audio 3 Medium Audio to Audio. To customize one, open it in the Playground.

arcade funk slap bass sparkle
num_inference_steps: 8
guidance_scale: 1
seed: 730910
Transform the source into bright arcade funk instrumental with slap bass, talkbox-style synth lead, and tight disco claps; preserve the main rhythmic contour while changing instrumentation, space, and groove. No vocals.
num_inference_steps: 8
guidance_scale: 1
seed: 730910
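
The example settings above could be sent to the model programmatically. The following is a minimal sketch, not the documented request schema: it assumes the `fal-client` Python package and a `FAL_KEY` environment variable, and it assumes the input fields are named `prompt` and `audio_url` (neither name appears on this page; only `num_inference_steps`, `guidance_scale`, and `seed` do). The helper names `build_arguments` and `run` are hypothetical, and the source URL is a placeholder.

```python
from typing import Any, Dict, Optional

# Model ID as shown at the top of this page.
MODEL_ID = "fal-ai/stable-audio-3/medium/audio-to-audio"


def build_arguments(
    prompt: str,
    audio_url: str,
    num_inference_steps: int = 8,
    guidance_scale: float = 1,
    seed: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble a request payload for the audio-to-audio model.

    ASSUMPTION: the field names "prompt" and "audio_url" are guesses;
    check the model's API schema before relying on them.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if num_inference_steps < 1:
        raise ValueError("num_inference_steps must be at least 1")

    arguments: Dict[str, Any] = {
        "prompt": prompt,
        "audio_url": audio_url,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
    }
    # Only send a seed when one is given, so the service can pick its own otherwise.
    if seed is not None:
        arguments["seed"] = seed
    return arguments


def run(arguments: Dict[str, Any]) -> Any:
    """Submit the request and wait for the result (needs network access and FAL_KEY)."""
    import fal_client  # pip install fal-client

    return fal_client.subscribe(MODEL_ID, arguments=arguments)


if __name__ == "__main__":
    # Settings copied from the example on this page; the URL is a placeholder.
    args = build_arguments(
        prompt=(
            "Transform the source into bright arcade funk instrumental with "
            "slap bass, talkbox-style synth lead, and tight disco claps; "
            "preserve the main rhythmic contour while changing "
            "instrumentation, space, and groove. No vocals."
        ),
        audio_url="https://example.com/source.wav",
        seed=730910,
    )
    print(args)  # pass this dict to run(args) to submit the job
```

Fixing the seed, as the example does, is a common way to make a diffusion run repeatable when the prompt and other settings are unchanged; omit it to get a different variation on each call.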