fal-ai/echomimic-v3

EchoMimic V3 generates a talking-avatar video from an image, an audio clip, and a text prompt.

Inference

Commercial use

Input

Image URL*

Accepted file types: jpg, jpeg, png, webp, gif, avif

Audio URL*

Accepted file types: mp3, ogg, wav, m4a, aac

Prompt*
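
The accepted file types listed above can be checked locally before a request is submitted. The following is a minimal sketch, not part of the service: the helper name `has_accepted_type` is hypothetical, and it assumes the file type can be read from the URL's extension, which the service itself may not rely on.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Accepted extensions as listed on this page.
IMAGE_TYPES = {"jpg", "jpeg", "png", "webp", "gif", "avif"}
AUDIO_TYPES = {"mp3", "ogg", "wav", "m4a", "aac"}


def has_accepted_type(url: str, accepted: set[str]) -> bool:
    """Return True if the URL path ends in one of the accepted extensions.

    Hypothetical local pre-check: it looks only at the extension in the
    URL path and ignores any query string or the actual file contents.
    """
    suffix = PurePosixPath(urlparse(url).path).suffix  # e.g. ".png"
    return suffix.lower().lstrip(".") in accepted
```

For example, `has_accepted_type("https://example.com/face.png", IMAGE_TYPES)` passes, while a URL without an extension is rejected by this check even if the file behind it is valid.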

Your request will cost $0.20 per generated second of video, based on the length of the input audio. For example, a 5-second video will cost $1.00.
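
The pricing rule above is a simple multiplication and can be sketched as a small estimator. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes fractional seconds are billed proportionally and rounded to the nearest cent; the page does not state how partial seconds are handled.

```python
from decimal import Decimal

# Rate quoted on this page: $0.20 per generated second of video.
PRICE_PER_SECOND_USD = Decimal("0.20")


def estimate_cost_usd(audio_seconds: float) -> Decimal:
    """Estimate the charge for an input audio clip of the given length.

    Assumption: cost scales linearly with audio length and is rounded
    to whole cents; actual billing may round differently.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    cost = Decimal(str(audio_seconds)) * PRICE_PER_SECOND_USD
    return cost.quantize(Decimal("0.01"))


print(estimate_cost_usd(5))  # prints 1.00, matching the example above
```

`Decimal` is used instead of floats so that currency amounts such as 0.20 are represented exactly.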
