ElevenLabs Text-to-Speech — Eleven v3 live on fal
Generate lifelike, expressive speech directly through fal's serverless infrastructure using Eleven v3, ElevenLabs' most advanced speech synthesis model. Eleven v3 delivers natural, emotionally aware audio with broad dynamic range, inline audio tag control, and support for 70+ languages — all available through a single fal endpoint: `fal-ai/elevenlabs/tts/eleven-v3`.
What is Eleven v3?
Eleven v3 is ElevenLabs' flagship text-to-speech model, designed for the most expressive and emotionally nuanced AI-generated speech available today. Unlike earlier models, Eleven v3 offers a broad dynamic range controlled through inline audio tags — descriptive markers like `[whispers]`, `[laughs]`, `[excited]`, or `[sad]` that the model interprets to shape delivery, emotion, tempo, and tone.
Key capabilities of Eleven v3:
- 70+ languages supported, including English, Mandarin, Hindi, Arabic, Spanish, French, German, Japanese, Korean, and dozens more
- Audio tag control for fine-grained emotional direction (`[laughs]`, `[whispers]`, `[sighs]`, `[excited]`, `[sarcastically]`, and many more)
- Contextual emotional understanding — the model reads intent from descriptive cues like "she said excitedly" or exclamation marks
- Multi-speaker prosody matching when paired with the companion `text-to-dialogue/eleven-v3` endpoint
- Long-form narration quality suitable for audiobooks, documentaries, and feature-length productions
- Word-level timestamps available for subtitling, lip-sync, and alignment workflows
Eleven v3 is the clear choice when expressive delivery matters most. For ultra-low-latency conversational use cases, pair it with Flash or Turbo v2.5; for pure multilingual narration with classic stability, Multilingual v2 remains a solid alternative.
What is ElevenLabs?
ElevenLabs is an AI audio research and deployment company founded in 2022 by Piotr Dąbkowski (ex-Google ML engineer) and Mati Staniszewski (ex-Palantir). Headquartered in London and New York, the company has become the dominant platform in voice AI — according to recent market analysis, ElevenLabs accounts for approximately 98% of observed mid-market voice AI spend and is the entry point for 95% of first-time voice AI customers.
The company's platform spans text-to-speech, speech-to-text (Scribe), voice cloning (Instant and Professional), voice design, dubbing, music generation, and real-time conversational AI agents. ElevenLabs raised $500M at an $11B valuation in early 2026 and crossed $330M in annual recurring revenue in 2025.
Who uses ElevenLabs?
ElevenLabs is used by 41% of Fortune 500 companies and a diverse set of industry leaders:
- Media & Publishing: The Washington Post, TIME, HarperCollins
- Gaming: Paradox Interactive, Epic Games
- Technology & Platforms: Meta, Salesforce, Square, Revolut, IBM (via watsonx Orchestrate)
- Telecommunications: Deutsche Telekom, RingCentral
- Public Sector: The Ukrainian government
- Independent creators, podcasters, filmmakers, educators, and app developers
Enterprise customers choose ElevenLabs for SOC 2 Type II, ISO 27001, HIPAA, PCI DSS L1, and GDPR compliance, Zero Retention Mode, regional data residency (US, EU, India), and a library of 10,000+ voices.
Why use ElevenLabs on fal?
fal brings ElevenLabs' Eleven v3 model into a single unified serverless AI platform alongside hundreds of other generative AI models — image, video, audio, and LLMs. Running Eleven v3 on fal gives you:
- One API key, one client library for ElevenLabs plus every other model in your stack
- Serverless scaling with no infrastructure to manage
- Streaming support for real-time playback
- Queue-based async jobs with optional webhooks for long-running batches
- Commercial use rights included
- Simple, transparent pricing: $0.10 per 1,000 characters
Pricing
Pricing on fal is based on the length of the input text:
$0.10 per 1,000 characters
A 500-character paragraph costs $0.05. A 10,000-character short story costs $1.00. No seat licenses, no subscriptions, no minimums.
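Because pricing scales linearly with character count, you can estimate a job's cost before submitting it. A minimal sketch (the helper name is ours, not part of the fal API; note that inline audio tags count toward the character total):

```python
# Estimate fal's Eleven v3 cost from input length: $0.10 per 1,000 characters.
RATE_PER_1000_CHARS = 0.10

def estimate_cost(text: str) -> float:
    """Return the estimated cost in USD for synthesizing `text`."""
    return len(text) / 1000 * RATE_PER_1000_CHARS

print(estimate_cost("x" * 500))     # 500-character paragraph -> 0.05
print(estimate_cost("x" * 10_000))  # 10,000-character short story -> 1.0
```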
API Reference
Endpoint
POST https://fal.run/fal-ai/elevenlabs/tts/eleven-v3
Model ID: `fal-ai/elevenlabs/tts/eleven-v3`
Authentication
All requests require a fal API key. Set `FAL_KEY` as an environment variable in your runtime. Never expose your API key in client-side code — always proxy requests through a server you control.
```bash
export FAL_KEY="your-fal-api-key"
```
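In Python, you can fail fast if the key is missing before making any calls. A defensive sketch (our own convention, not required by the client, which reads `FAL_KEY` automatically):

```python
import os

def require_fal_key() -> str:
    """Return FAL_KEY from the environment, failing early with a clear message."""
    key = os.environ.get("FAL_KEY")
    if not key:
        raise RuntimeError("FAL_KEY is not set; export it before calling fal endpoints.")
    return key
```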
Input Schema
| Parameter | Type | Required | Description |
|---|---|---|---|
| `text` | string | ✅ | The text to convert to speech. Supports inline audio tags like `[laughs]`, `[whispers]`, `[excited]`. |
| `voice` | string | — | Voice name or ID. Default: `"Rachel"`. Examples: `Aria`, `Roger`, `Sarah`, `Laura`, `Charlie`, `George`, `Callum`, `River`, `Liam`, `Charlotte`, `Alice`, `Matilda`, `Will`, `Jessica`, `Eric`, `Chris`, `Brian`, `Daniel`, `Lily`, `Bill`. |
| `stability` | float | — | Voice stability, 0–1. Lower values = more expressive variation; higher = more consistent delivery. Default: `0.5`. |
| `similarity_boost` | float | — | How closely the output matches the reference voice, 0–1. Default: `0.75`. |
| `speed` | float | — | Playback speed multiplier. Default: `1`. |
| `language_code` | string | — | ISO 639-1 language code to enforce a specific language. |
| `apply_text_normalization` | enum | — | `"auto"`, `"on"`, or `"off"`. Controls spelling out of numbers, abbreviations, etc. Default: `"auto"`. |
| `seed` | int | — | Random seed for reproducibility. |
| `timestamps` | bool | — | When `true`, returns per-word timestamps in the response. |
| `output_format` | enum | — | Codec, sample rate, and bitrate. Default: `"mp3_44100_128"`. |
Supported output formats
`mp3_22050_32`, `mp3_44100_32`, `mp3_44100_64`, `mp3_44100_96`, `mp3_44100_128` (default), `mp3_44100_192`, `pcm_8000`, `pcm_16000`, `pcm_22050`, `pcm_24000`, `pcm_44100`, `pcm_48000`, `ulaw_8000`, `alaw_8000`, `opus_48000_32`, `opus_48000_64`, `opus_48000_96`, `opus_48000_128`, `opus_48000_192`.
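The format strings follow a `codec_samplerate[_bitrate]` pattern, which you can decompose programmatically when wiring the output into a downstream audio pipeline. A small illustrative parser (our own helper, not part of any fal SDK):

```python
def parse_output_format(fmt: str):
    """Split an output format string into (codec, sample_rate_hz, bitrate_kbps).

    PCM, u-law, and a-law formats carry no bitrate, so bitrate is None for them.
    """
    parts = fmt.split("_")
    codec = parts[0]
    sample_rate = int(parts[1])
    bitrate = int(parts[2]) if len(parts) > 2 else None
    return codec, sample_rate, bitrate

print(parse_output_format("mp3_44100_128"))  # ('mp3', 44100, 128)
print(parse_output_format("pcm_24000"))      # ('pcm', 24000, None)
```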
Output Schema
| Field | Type | Description |
|---|---|---|
| `audio` | File | The generated audio file, returned as `{ "url": "https://v3.fal.media/..." }`. |
| `timestamps` | array | Per-word timestamps. Only returned when `timestamps: true` in the request. |
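The per-word timestamps can drive subtitling and alignment workflows directly. The sketch below assumes each entry carries `word`, `start`, and `end` fields in seconds — verify the exact field names against a real response, since they are not documented here — and groups words into simple SRT cues:

```python
def to_srt(words, max_words_per_cue=7):
    """Convert word-level timestamp entries into SRT subtitle text.

    `words` is assumed to be a list of {"word": str, "start": float, "end": float}
    with times in seconds; adjust the keys to match the actual API response.
    """
    def fmt(t):
        # SRT timecode: HH:MM:SS,mmm
        ms = int(round(t * 1000))
        h, ms = divmod(ms, 3_600_000)
        m, ms = divmod(ms, 60_000)
        s, ms = divmod(ms, 1000)
        return f"{h:02}:{m:02}:{s:02},{ms:03}"

    cues = []
    for i in range(0, len(words), max_words_per_cue):
        chunk = words[i:i + max_words_per_cue]
        text = " ".join(w["word"] for w in chunk)
        cues.append(
            f"{len(cues) + 1}\n{fmt(chunk[0]['start'])} --> {fmt(chunk[-1]['end'])}\n{text}"
        )
    return "\n\n".join(cues)
```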
Example Request
```json
{
  "text": "Hello! [excited] This is a test of the text to speech system, powered by ElevenLabs. [whispers] How does it sound?",
  "voice": "Aria",
  "stability": 0.5,
  "similarity_boost": 0.75,
  "speed": 1,
  "apply_text_normalization": "auto"
}
```
Example Response
```json
{
  "audio": {
    "url": "https://v3.fal.media/files/zebra/zJL_oRY8h5RWwjoK1w7tx_output.mp3"
  }
}
```
Code Examples
JavaScript / TypeScript
Install the client:
```bash
npm install --save @fal-ai/client
```
Basic synchronous call:
```javascript
import { fal } from "@fal-ai/client";

const result = await fal.subscribe("fal-ai/elevenlabs/tts/eleven-v3", {
  input: {
    text: "Hello! This is a test of the text to speech system, powered by ElevenLabs. How does it sound?",
    voice: "Aria",
    stability: 0.5,
  },
  logs: true,
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      update.logs.map((log) => log.message).forEach(console.log);
    }
  },
});

console.log(result.data.audio.url);
console.log(result.requestId);
```
Python
Install the client:
```bash
pip install fal-client
```
```python
import fal_client

def on_queue_update(update):
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])

result = fal_client.subscribe(
    "fal-ai/elevenlabs/tts/eleven-v3",
    arguments={
        "text": "Hello! This is a test of the text to speech system, powered by ElevenLabs. How does it sound?",
        "voice": "Aria",
        "stability": 0.5,
    },
    with_logs=True,
    on_queue_update=on_queue_update,
)

print(result["audio"]["url"])
```
cURL
```bash
curl --request POST \
  --url https://fal.run/fal-ai/elevenlabs/tts/eleven-v3 \
  --header "Authorization: Key $FAL_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "text": "Hello! This is a test of the text to speech system, powered by ElevenLabs. How does it sound?",
    "voice": "Aria"
  }'
```
Streaming
Eleven v3 on fal supports streaming for real-time audio playback:
```javascript
import { fal } from "@fal-ai/client";

const stream = await fal.stream("fal-ai/elevenlabs/tts/eleven-v3", {
  input: {
    text: "Hello! This is a test of the text to speech system, powered by ElevenLabs.",
  },
});

for await (const event of stream) {
  console.log(event);
}

const result = await stream.done();
```
Async Queue + Webhooks
For long-form generation (audiobooks, podcasts, large batches), use the queue API with an optional webhook to receive completion notifications:
```javascript
import { fal } from "@fal-ai/client";

// Submit the job to the queue
const { request_id } = await fal.queue.submit("fal-ai/elevenlabs/tts/eleven-v3", {
  input: { text: "Long-form narration goes here..." },
  webhookUrl: "https://your-app.com/webhooks/tts-complete",
});

// Check status later
const status = await fal.queue.status("fal-ai/elevenlabs/tts/eleven-v3", {
  requestId: request_id,
  logs: true,
});

// Fetch the result
const result = await fal.queue.result("fal-ai/elevenlabs/tts/eleven-v3", {
  requestId: request_id,
});
```
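On the receiving side, your webhook handler just needs to parse the completion payload. A Python sketch under assumed field names (`request_id`, `status`, and a `payload` carrying the normal output schema) — confirm the exact shape against fal's webhook documentation before relying on it:

```python
import json

def handle_tts_webhook(body: bytes):
    """Extract the request ID and audio URL from a completion webhook payload.

    The payload shape used here (request_id / status / payload.audio.url) is an
    assumption for illustration; check fal's webhook docs for the real schema.
    """
    data = json.loads(body)
    if data.get("status") != "OK":
        raise RuntimeError(f"TTS job did not complete successfully: {data}")
    return data["request_id"], data["payload"]["audio"]["url"]
```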
Audio Tags
Eleven v3's standout feature is inline audio tag control. Add bracketed directives anywhere in your text to shape how it's performed:
Emotional tags
`[happy]` `[sad]` `[excited]` `[angry]` `[sarcastically]` `[nervous]` `[confident]`
Delivery tags
`[whispers]` `[shouting]` `[slowly]` `[quickly]` `[softly]`
Non-verbal sounds
`[laughs]` `[chuckles]` `[sighs]` `[gasps]` `[coughs]` `[gulps]` `[applause]`
Accent tags
`[strong canadian accent]` `[british accent]` `[southern accent]`
Example:
```text
[slowly] Back then... [chuckles] we had no phones. [whispers] Just dirt roads and [coughs] big dreams. [sad] Then it happened.
```
Audio tags are voice- and context-dependent — some voices interpret certain tags more reliably than others. Experiment to find what works best for your use case.
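Because tags are just bracketed plain text, you can compose them programmatically when generating scripts. A tiny illustrative helper (our own, not part of any SDK):

```python
def tagged(tag: str, text: str) -> str:
    """Prefix a text segment with an inline audio tag like [whispers]."""
    return f"[{tag}] {text}"

# Build a delivery-directed line segment by segment.
line = " ".join([
    tagged("slowly", "Back then..."),
    tagged("chuckles", "we had no phones."),
    tagged("whispers", "Just dirt roads and big dreams."),
])
print(line)
```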
Use Cases
Eleven v3's expressive range and multilingual reach make it well-suited for high-production-value audio work:
Content & Media
- Audiobooks & narration — long-form storytelling with emotional depth across chapters
- Podcasts — professional voiceover, ad reads, and show intros
- Video voiceover — YouTube, TikTok, corporate video, documentary narration
- Dubbing & localization — translate and voice content across 70+ languages while preserving emotional delivery
Gaming & Interactive
- Character voicing — generate dozens of distinct NPC voices without a voice-acting studio
- Dynamic dialogue — react to in-game state with on-the-fly expressive lines
- VR / XR experiences — immersive spatial audio with emotionally responsive characters
Enterprise & Agents
- Customer support voice agents — natural-sounding IVR and contact center automation
- Sales and outbound — personalized outreach at scale
- Training & onboarding — internal learning content narrated at production quality
- Accessibility — screen readers and assistive reading experiences with natural intonation
Product & Developer Experiences
- In-app voice features — notifications, summaries, news briefings
- AI assistants & companions — emotionally responsive conversational UI
- Accessibility tooling — reading apps, PDF-to-audio, article narration
Related Models on fal
If Eleven v3 isn't the right fit, fal also hosts other ElevenLabs endpoints:
- `fal-ai/elevenlabs/text-to-dialogue/eleven-v3` — multi-speaker dialogue generation with matched prosody
- `fal-ai/elevenlabs/tts/multilingual-v2` — stable, high-quality narration across 29 languages
- `fal-ai/elevenlabs/tts/turbo-v2.5` — lowest-latency option, ideal for real-time agents