fal-ai/ai-avatar

The MultiTalk model generates a talking avatar video from an image and an audio file. The avatar lip-syncs to the provided audio with natural facial expressions.

Inference

Commercial use

This endpoint is deprecated

This model is no longer supported.

Input

Image URL*

Provide a URL to the source image. Accepted file types: jpg, jpeg, png, webp, gif, avif

Audio URL*

Provide a URL to the audio track. Accepted file types: mp3, ogg, wav, m4a, aac

Prompt*

Additional Settings

Customize your input with more control.

Result

{
  "video": {
    "url": "https://v3.fal.media/files/kangaroo/z6VqUwNTwzuWa6YE1g7In_74af6c0bdd6041c3b1130d54885e3eee.mp4",
    "file_size": 515275,
    "file_name": "74af6c0bdd6041c3b1130d54885e3eee.mp4",
    "content_type": "application/octet-stream"
  }
}
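
A result of this shape can be read with the standard library alone. The sketch below parses the sample response shown above and pulls out the fields a caller would typically need; `summarize_video` is a hypothetical helper name. Note that the sample reports `content_type` as `application/octet-stream`, so the sketch takes the container format from the file name rather than from the content type.

```python
import json
from pathlib import PurePosixPath

# The sample result from this page, copied verbatim.
RESULT_JSON = """
{
  "video": {
    "url": "https://v3.fal.media/files/kangaroo/z6VqUwNTwzuWa6YE1g7In_74af6c0bdd6041c3b1130d54885e3eee.mp4",
    "file_size": 515275,
    "file_name": "74af6c0bdd6041c3b1130d54885e3eee.mp4",
    "content_type": "application/octet-stream"
  }
}
"""


def summarize_video(result: dict) -> dict:
    """Extract the download URL, file name, size in bytes, and extension."""
    video = result["video"]
    return {
        "url": video["url"],
        "file_name": video["file_name"],
        "file_size": video["file_size"],
        # content_type is generic in the sample, so use the file name instead.
        "extension": PurePosixPath(video["file_name"]).suffix.lstrip("."),
    }


summary = summarize_video(json.loads(RESULT_JSON))
print(summary["url"])
```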

Your request will cost $0.2 per second.

At 720p, the price is doubled.
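
The pricing rule can be written as a small calculation. This is a sketch under two assumptions that the page does not state explicitly: that the billed seconds are seconds of generated video, and that any resolution other than 720p is charged at the base rate. `estimate_cost` is a hypothetical helper, and `Decimal` is used so that currency amounts are not subject to floating-point rounding.

```python
from decimal import Decimal

BASE_RATE_PER_SECOND = Decimal("0.2")  # USD per second, from the pricing note
MULTIPLIER_720P = 2                    # the price is doubled at 720p


def estimate_cost(seconds: int, is_720p: bool = False) -> Decimal:
    """Estimate the request cost in USD for a given billed duration."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    cost = BASE_RATE_PER_SECOND * seconds
    return cost * MULTIPLIER_720P if is_720p else cost


# A 10-second clip at the base rate and at 720p.
base_cost = estimate_cost(10)
hd_cost = estimate_cost(10, is_720p=True)
```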

Ai Avatar (Image to Video) API on fal