EchoMimic V3 Audio to Video

fal-ai/echomimic-v3
EchoMimic V3 generates a talking-avatar video from a picture, an audio clip, and a text prompt.
Inference
Commercial use

Your request will cost $0.20 per generated second of video, based on the length of the input audio. For example, a 5-second video will cost $1.00.
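
The pricing rule above can be sketched as a small calculation. This is only an illustration: `estimate_cost` is a hypothetical helper, not part of any fal API, and it assumes billing scales linearly with the audio length, including fractional seconds. Actual billing may round durations differently.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rate quoted on this page: $0.20 per generated second of video.
PRICE_PER_SECOND = Decimal("0.20")


def estimate_cost(audio_seconds):
    """Hypothetical helper: estimated cost in USD for a given audio length.

    Assumes linear billing on the audio duration (an assumption, not
    something the page confirms for fractional seconds).
    """
    seconds = Decimal(str(audio_seconds))
    if seconds < 0:
        raise ValueError("audio length must be non-negative")
    # Round to whole cents.
    return (seconds * PRICE_PER_SECOND).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )


print(estimate_cost(5))  # 1.00, matching the 5-second example above
```

Decimal arithmetic is used here so that cent amounts stay exact rather than picking up floating-point error.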
