Audio API

Overview

Models for audio processing, including text-to-speech, speech-to-text, voice cloning, and music generation.

Top Models

Whisper API

Whisper is a model for speech transcription and translation.

Kling Video Create Voice API

Create voices to use with the voice control feature of Kling models.

xAI Text to Speech API

Generate speech with expressive and realistic voices from xAI.

Explore all audio models on fal.ai/models.

Quick Start

Get started with Whisper:

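# Assumes the FAL_KEY environment variable holds your API key.
import fal_client

def on_queue_update(update):
    # Print log messages while the request is still being processed.
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])

# Submit the request and block until the transcription result is ready.
result = fal_client.subscribe(
    "fal-ai/whisper",
    arguments={
        "audio_url": "https://storage.googleapis.com/falserverless/model_tests/whisper/dinner_conversation.mp3"
    },
    with_logs=True,
    on_queue_update=on_queue_update,
)
print(result)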
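
The Quick Start prints the raw result. As a minimal sketch of post-processing it, the helper below turns a result into timestamped lines. It assumes, without confirmation from this page, that the result is a dict with a "text" string and an optional "chunks" list whose items look like {"timestamp": [start, end], "text": ...}. The function name format_transcript and the sample data are hypothetical, so check the Whisper model page for the actual output schema.

```python
from typing import Any


def format_transcript(result: dict[str, Any]) -> str:
    """Render a Whisper-style result as timestamped lines.

    Assumption (not confirmed by this page): `result` has a "text" string and
    optionally a "chunks" list of {"timestamp": [start, end], "text": str}.
    """
    chunks = result.get("chunks") or []
    if not chunks:
        # Fall back to the plain transcript when no segments are present.
        return str(result.get("text", "")).strip()

    lines = []
    for chunk in chunks:
        start, end = chunk.get("timestamp") or (None, None)
        text = str(chunk.get("text", "")).strip()
        if start is None or end is None:
            # Keep the text even when a segment has an open-ended timestamp.
            lines.append(text)
        else:
            lines.append(f"[{start:.2f}s - {end:.2f}s] {text}")
    return "\n".join(lines)


# Hypothetical sample shaped like the assumed response, for illustration only.
sample = {
    "text": "Hello there. How are you?",
    "chunks": [
        {"timestamp": [0.0, 1.5], "text": " Hello there."},
        {"timestamp": [1.5, 3.0], "text": " How are you?"},
    ],
}
print(format_transcript(sample))
```

With a live request, the same helper could be called as format_transcript(result) in place of print(result), provided the response matches the assumed shape.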

Pricing

For detailed pricing information, see the fal.ai pricing page or individual model pages.
