ElevenLabs Text-to-Speech — Eleven v3 live on fal
Generate lifelike, expressive speech directly through fal's serverless infrastructure using Eleven v3, ElevenLabs' most advanced speech synthesis model. Eleven v3 delivers natural, emotionally aware audio with broad dynamic range, inline audio tag control, and support for 70+ languages — all available through a single fal endpoint: `fal-ai/elevenlabs/tts/eleven-v3`.
What is Eleven v3?
Eleven v3 is ElevenLabs' flagship text-to-speech model, designed for the most expressive and emotionally nuanced AI-generated speech available today. Unlike earlier models, Eleven v3 offers a broad dynamic range controlled through inline audio tags — descriptive markers like `[whispers]`, `[laughs]`, `[excited]`, or `[sad]` that the model interprets to shape delivery, emotion, tempo, and tone.
Key capabilities of Eleven v3:
- 70+ languages supported, including English, Mandarin, Hindi, Arabic, Spanish, French, German, Japanese, Korean, and dozens more
- Audio tag control for fine-grained emotional direction (`[laughs]`, `[whispers]`, `[sighs]`, `[excited]`, `[sarcastically]`, and many more)
- Contextual emotional understanding — the model reads intent from descriptive cues like "she said excitedly" or exclamation marks
- Multi-speaker prosody matching when paired with the companion `text-to-dialogue/eleven-v3` endpoint
- Long-form narration quality suitable for audiobooks, documentaries, and feature-length productions
- Word-level timestamps available for subtitling, lip-sync, and alignment workflows
Eleven v3 is the clear choice when expressive delivery matters most. For ultra-low-latency conversational use cases, pair it with Flash or Turbo v2.5; for pure multilingual narration with classic stability, Multilingual v2 remains a solid alternative.
What is ElevenLabs?
ElevenLabs is an AI audio research and deployment company founded in 2022 by Piotr Dąbkowski (ex-Google ML engineer) and Mati Staniszewski (ex-Palantir). Headquartered in London and New York, the company has become the dominant platform in voice AI — according to recent market analysis, ElevenLabs accounts for approximately 98% of observed mid-market voice AI spend and is the entry point for 95% of first-time voice AI customers.
The company's platform spans text-to-speech, speech-to-text (Scribe), voice cloning (Instant and Professional), voice design, dubbing, music generation, and real-time conversational AI agents. ElevenLabs raised $500M at an $11B valuation in early 2026 and crossed $330M in annual recurring revenue in 2025.
Who uses ElevenLabs?
ElevenLabs is used by 41% of Fortune 500 companies and a diverse set of industry leaders:
- Media & Publishing: The Washington Post, TIME, HarperCollins
- Gaming: Paradox Interactive, Epic Games
- Technology & Platforms: Meta, Salesforce, Square, Revolut, IBM (via watsonx Orchestrate)
- Telecommunications: Deutsche Telekom, RingCentral
- Public Sector: The Ukrainian government
- Independent creators, podcasters, filmmakers, educators, and app developers
Enterprise customers choose ElevenLabs for SOC 2 Type II, ISO 27001, HIPAA, PCI DSS L1, and GDPR compliance, Zero Retention Mode, regional data residency (US, EU, India), and a library of 10,000+ voices.
Why use ElevenLabs on fal?
fal brings ElevenLabs' Eleven v3 model into a single unified serverless AI platform alongside hundreds of other generative AI models — image, video, audio, and LLMs. Running Eleven v3 on fal gives you:
- One API key, one client library for ElevenLabs plus every other model in your stack
- Serverless scaling with no infrastructure to manage
- Streaming support for real-time playback
- Queue-based async jobs with optional webhooks for long-running batches
- Commercial use rights included
- Simple, transparent pricing: $0.10 per 1,000 characters
Pricing
Pricing on fal is based on the length of the input text:
$0.10 per 1,000 characters
A 500-character paragraph costs $0.05. A 10,000-character short story costs $1.00. No seat licenses, no subscriptions, no minimums.
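Because pricing scales linearly with character count, you can estimate a job's cost before submitting it. A minimal sketch (the helper name is ours, not part of the fal API; note that inline audio tags count toward the character total):

```python
# Estimate fal's Eleven v3 cost from input length: $0.10 per 1,000 characters.
RATE_PER_1000_CHARS = 0.10

def estimate_cost(text: str) -> float:
    """Return the estimated cost in USD for synthesizing `text`."""
    return len(text) / 1000 * RATE_PER_1000_CHARS

print(estimate_cost("x" * 500))     # 500-character paragraph -> 0.05
print(estimate_cost("x" * 10_000))  # 10,000-character short story -> 1.0
```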
API Reference
Endpoint
POST https://fal.run/fal-ai/elevenlabs/tts/eleven-v3
Model ID: `fal-ai/elevenlabs/tts/eleven-v3`
Authentication
All requests require a fal API key. Set `FAL_KEY` as an environment variable in your runtime. Never expose your API key in client-side code — always proxy requests through a server you control.
```bash
export FAL_KEY="your-fal-api-key"
```
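In Python, you can fail fast if the key is missing before making any calls. A defensive sketch (our own convention, not required by the client, which reads `FAL_KEY` automatically):

```python
import os

def require_fal_key() -> str:
    """Return FAL_KEY from the environment, failing early with a clear message."""
    key = os.environ.get("FAL_KEY")
    if not key:
        raise RuntimeError("FAL_KEY is not set; export it before calling fal endpoints.")
    return key
```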
Input Schema
| Parameter | Type | Required | Description |
|---|---|---|---|
| `text` | string | ✅ | The text to convert to speech. Supports inline audio tags like `[laughs]`, `[whispers]`, `[excited]`. |
| `voice` | string | — | Voice name or ID. Default: `"Rachel"`. Examples: `Aria`, `Roger`, `Sarah`, `Laura`, `Charlie`, `George`, `Callum`, `River`, `Liam`, `Charlotte`, `Alice`, `Matilda`, `Will`, `Jessica`, `Eric`, `Chris`, `Brian`, `Daniel`, `Lily`, `Bill`. |
| `stability` | float | — | Voice stability, 0–1. Lower values = more expressive variation; higher = more consistent delivery. Default: `0.5`. |
| `similarity_boost` | float | — | How closely the output matches the reference voice, 0–1. Default: `0.75`. |
| `speed` | float | — | Playback speed multiplier. Default: `1`. |
| `language_code` | string | — | ISO 639-1 language code to enforce a specific language. |
| `apply_text_normalization` | enum | — | `"auto"`, `"on"`, or `"off"`. Controls spelling out of numbers, abbreviations, etc. Default: `"auto"`. |
| `seed` | int | — | Random seed for reproducibility. |
| `timestamps` | bool | — | When `true`, returns per-word timestamps in the response. |
| `output_format` | enum | — | Codec, sample rate, and bitrate. Default: `"mp3_44100_128"`. |
Supported output formats
`mp3_22050_32`, `mp3_44100_32`, `mp3_44100_64`, `mp3_44100_96`, `mp3_44100_128` (default), `mp3_44100_192`, `pcm_8000`, `pcm_16000`, `pcm_22050`, `pcm_24000`, `pcm_44100`, `pcm_48000`, `ulaw_8000`, `alaw_8000`, `opus_48000_32`, `opus_48000_64`, `opus_48000_96`, `opus_48000_128`, `opus_48000_192`.
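The format strings follow a `codec_samplerate[_bitrate]` pattern, which you can decompose programmatically when wiring the output into a downstream audio pipeline. A small illustrative parser (our own helper, not part of any fal SDK):

```python
def parse_output_format(fmt: str):
    """Split an output format string into (codec, sample_rate_hz, bitrate_kbps).

    PCM, u-law, and a-law formats carry no bitrate, so bitrate is None for them.
    """
    parts = fmt.split("_")
    codec = parts[0]
    sample_rate = int(parts[1])
    bitrate = int(parts[2]) if len(parts) > 2 else None
    return codec, sample_rate, bitrate

print(parse_output_format("mp3_44100_128"))  # ('mp3', 44100, 128)
print(parse_output_format("pcm_24000"))      # ('pcm', 24000, None)
```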
Output Schema
| Field | Type | Description |
|---|---|---|
| `audio` | File | The generated audio file, returned as `{ "url": "https://v3.fal.media/..." }`. |
| `timestamps` | array | Per-word timestamps. Only returned when `timestamps: true` in the request. |
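The per-word timestamps can drive subtitling and alignment workflows directly. The sketch below assumes each entry carries `word`, `start`, and `end` fields in seconds — verify the exact field names against a real response, since they are not documented here — and groups words into simple SRT cues:

```python
def to_srt(words, max_words_per_cue=7):
    """Convert word-level timestamp entries into SRT subtitle text.

    `words` is assumed to be a list of {"word": str, "start": float, "end": float}
    with times in seconds; adjust the keys to match the actual API response.
    """
    def fmt(t):
        # SRT timecode: HH:MM:SS,mmm
        ms = int(round(t * 1000))
        h, ms = divmod(ms, 3_600_000)
        m, ms = divmod(ms, 60_000)
        s, ms = divmod(ms, 1000)
        return f"{h:02}:{m:02}:{s:02},{ms:03}"

    cues = []
    for i in range(0, len(words), max_words_per_cue):
        chunk = words[i:i + max_words_per_cue]
        text = " ".join(w["word"] for w in chunk)
        cues.append(
            f"{len(cues) + 1}\n{fmt(chunk[0]['start'])} --> {fmt(chunk[-1]['end'])}\n{text}"
        )
    return "\n\n".join(cues)
```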
Example Request
```json
{
  "text": "Hello! [excited] This is a test of the text to speech system, powered by ElevenLabs. [whispers] How does it sound?",
  "voice": "Aria",
  "stability": 0.5,
  "similarity_boost": 0.75,
  "speed": 1,
  "apply_text_normalization": "auto"
}
```
Example Response
```json
{
  "audio": {
    "url": "https://v3.fal.media/files/zebra/zJL_oRY8h5RWwjoK1w7tx_output.mp3"
  }
}
```
Code Examples
JavaScript / TypeScript
Install the client:
```bash
npm install --save @fal-ai/client
```
Basic synchronous call:
```javascript
import { fal } from "@fal-ai/client";

const result = await fal.subscribe("fal-ai/elevenlabs/tts/eleven-v3", {
  input: {
    text: "Hello! This is a test of the text to speech system, powered by ElevenLabs. How does it sound?",
    voice: "Aria",
    stability: 0.5,
  },
  logs: true,
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      update.logs.map((log) => log.message).forEach(console.log);
    }
  },
});

console.log(result.data.audio.url);
console.log(result.requestId);
```
Python
Install the client:
```bash
pip install fal-client
```
```python
import fal_client

def on_queue_update(update):
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])

result = fal_client.subscribe(
    "fal-ai/elevenlabs/tts/eleven-v3",
    arguments={
        "text": "Hello! This is a test of the text to speech system, powered by ElevenLabs. How does it sound?",
        "voice": "Aria",
        "stability": 0.5,
    },
    with_logs=True,
    on_queue_update=on_queue_update,
)

print(result["audio"]["url"])
```
cURL
```bash
curl --request POST \
  --url https://fal.run/fal-ai/elevenlabs/tts/eleven-v3 \
  --header "Authorization: Key $FAL_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "text": "Hello! This is a test of the text to speech system, powered by ElevenLabs. How does it sound?",
    "voice": "Aria"
  }'
```
Streaming
Eleven v3 on fal supports streaming for real-time audio playback:
```javascript
import { fal } from "@fal-ai/client";

const stream = await fal.stream("fal-ai/elevenlabs/tts/eleven-v3", {
  input: {
    text: "Hello! This is a test of the text to speech system, powered by ElevenLabs.",
  },
});

for await (const event of stream) {
  console.log(event);
}

const result = await stream.done();
```
Async Queue + Webhooks
For long-form generation (audiobooks, podcasts, large batches), use the queue API with an optional webhook to receive completion notifications:
```javascript
import { fal } from "@fal-ai/client";

// Submit the job to the queue
const { request_id } = await fal.queue.submit("fal-ai/elevenlabs/tts/eleven-v3", {
  input: { text: "Long-form narration goes here..." },
  webhookUrl: "https://your-app.com/webhooks/tts-complete",
});

// Check status later
const status = await fal.queue.status("fal-ai/elevenlabs/tts/eleven-v3", {
  requestId: request_id,
  logs: true,
});

// Fetch the result
const result = await fal.queue.result("fal-ai/elevenlabs/tts/eleven-v3", {
  requestId: request_id,
});
```
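On the receiving side, your webhook handler just needs to parse the completion payload. A Python sketch under assumed field names (`request_id`, `status`, and a `payload` carrying the normal output schema) — confirm the exact shape against fal's webhook documentation before relying on it:

```python
import json

def handle_tts_webhook(body: bytes):
    """Extract the request ID and audio URL from a completion webhook payload.

    The payload shape used here (request_id / status / payload.audio.url) is an
    assumption for illustration; check fal's webhook docs for the real schema.
    """
    data = json.loads(body)
    if data.get("status") != "OK":
        raise RuntimeError(f"TTS job did not complete successfully: {data}")
    return data["request_id"], data["payload"]["audio"]["url"]
```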
Audio Tags
Eleven v3's standout feature is inline audio tag control. Add bracketed directives anywhere in your text to shape how it's performed:
Emotional tags
`[happy]` `[sad]` `[excited]` `[angry]` `[sarcastically]` `[nervous]` `[confident]`
Delivery tags
`[whispers]` `[shouting]` `[slowly]` `[quickly]` `[softly]`
Non-verbal sounds
`[laughs]` `[chuckles]` `[sighs]` `[gasps]` `[coughs]` `[gulps]` `[applause]`
Accent tags
`[strong canadian accent]` `[british accent]` `[southern accent]`
Example:
```text
[slowly] Back then... [chuckles] we had no phones. [whispers] Just dirt roads and [coughs] big dreams. [sad] Then it happened.
```
Audio tags are voice- and context-dependent — some voices interpret certain tags more reliably than others. Experiment to find what works best for your use case.
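Because tags are just bracketed plain text, you can compose them programmatically when generating scripts. A tiny illustrative helper (our own, not part of any SDK):

```python
def tagged(tag: str, text: str) -> str:
    """Prefix a text segment with an inline audio tag like [whispers]."""
    return f"[{tag}] {text}"

# Build a delivery-directed line segment by segment.
line = " ".join([
    tagged("slowly", "Back then..."),
    tagged("chuckles", "we had no phones."),
    tagged("whispers", "Just dirt roads and big dreams."),
])
print(line)
```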
Use Cases
Eleven v3's expressive range and multilingual reach make it well-suited for high-production-value audio work:
Content & Media
- Audiobooks & narration — long-form storytelling with emotional depth across chapters
- Podcasts — professional voiceover, ad reads, and show intros
- Video voiceover — YouTube, TikTok, corporate video, documentary narration
- Dubbing & localization — translate and voice content across 70+ languages while preserving emotional delivery
Gaming & Interactive
- Character voicing — generate dozens of distinct NPC voices without a voice-acting studio
- Dynamic dialogue — react to in-game state with on-the-fly expressive lines
- VR / XR experiences — immersive spatial audio with emotionally responsive characters
Enterprise & Agents
- Customer support voice agents — natural-sounding IVR and contact center automation
- Sales and outbound — personalized outreach at scale
- Training & onboarding — internal learning content narrated at production quality
- Accessibility — screen readers and assistive reading experiences with natural intonation
Product & Developer Experiences
- In-app voice features — notifications, summaries, news briefings
- AI assistants & companions — emotionally responsive conversational UI
- Accessibility tooling — reading apps, PDF-to-audio, article narration
Related Models on fal
If Eleven v3 isn't the right fit, fal also hosts other ElevenLabs endpoints:
- `fal-ai/elevenlabs/text-to-dialogue/eleven-v3` — multi-speaker dialogue generation with matched prosody
- `fal-ai/elevenlabs/tts/multilingual-v2` — stable, high-quality narration across 29 languages
- `fal-ai/elevenlabs/tts/turbo-v2.5` — lowest-latency option, ideal for real-time agents