MiniMax Speech-02 HD Text to Speech
Your request will cost $0.1 per 1000 characters.
MiniMax TTS: Professional Text-to-Speech API
Transform your text into natural-sounding speech with MiniMax TTS, a powerful text-to-speech API designed for developers who need high-quality voice synthesis in their applications.
Overview
MiniMax TTS delivers studio-quality voice synthesis with advanced neural networks, offering natural prosody, clear articulation, and emotion-aware speech generation. It suits applications ranging from audiobook production to accessible content creation.
Key Capabilities
Generate lifelike speech with precise control over:
- Voice characteristics and speaking style
 - Speech pacing and emotional tone
 - Support for 30+ languages with natural accents
 - Real-time speech synthesis for interactive applications
 - 300+ authentic voices
 
Getting Started
Getting up and running with MiniMax TTS takes just a few minutes. Here's how to begin:
- Install the client library:
 
```bash
# Using npm
npm install --save @fal-ai/client

# Using pip
pip install fal-client
```
- Configure your API key:
 
```javascript
import { fal } from "@fal-ai/client";

fal.config({
  credentials: "YOUR_FAL_KEY_HERE"
});
```
- Make your first API call:
 
```javascript
const result = await fal.subscribe("fal-ai/minimax/speech-02-hd", {
  input: {
    text: "Welcome to MiniMax TTS. This is a demonstration of our natural speech synthesis."
  }
});
```
Technical Integration
MiniMax TTS supports both synchronous and queue-based processing. The API accepts plain text input and returns audio in multiple formats including MP3, WAV, FLAC, and PCM.
Advanced Usage
Control fine-grained speech parameters:
```javascript
const response = await fal.subscribe("fal-ai/minimax/speech-02-hd", {
  input: {
    text: "This is an important announcement.",
    voice_setting: {
      voice_id: "Wise_Woman",
      speed: 1.0,
      vol: 1.0,
      pitch: 0,
      english_normalization: false
    },
    output_format: "mp3"
  }
});
```
Queue-Based Processing
For handling multiple requests or asynchronous workflows:
```javascript
// Submit request
const { request_id } = await fal.queue.submit("fal-ai/minimax/speech-02-hd", {
  input: {
    text: "Your text here"
  },
  webhookUrl: "https://optional.webhook.url/for/results"
});

// Check status
const status = await fal.queue.status("fal-ai/minimax/speech-02-hd", {
  requestId: request_id,
  logs: true
});

// Get result
const result = await fal.queue.result("fal-ai/minimax/speech-02-hd", {
  requestId: request_id
});
```
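When no webhook is configured, the status check can be repeated until the request finishes. The following is a minimal sketch, not the client's own polling mechanism. It assumes the status object carries a `status` field that becomes `"COMPLETED"` when the job is done (confirm the exact values in the API documentation). `waitForResult`, `intervalMs`, and `maxAttempts` are hypothetical names, and the client is passed in as an argument so the helper can be exercised with a stub.

```javascript
// Hypothetical helper: poll the queue until the request completes.
// `client` is expected to expose queue.status and queue.result as used above.
async function waitForResult(client, endpoint, requestId, options = {}) {
  const { intervalMs = 1000, maxAttempts = 60 } = options;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await client.queue.status(endpoint, { requestId, logs: false });

    // Assumption: "COMPLETED" marks a finished request.
    if (status.status === "COMPLETED") {
      return client.queue.result(endpoint, { requestId });
    }

    // Pause before the next check so the queue is not polled in a tight loop.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }

  throw new Error(`Request ${requestId} did not complete after ${maxAttempts} checks`);
}
```

With the configured client this could be called as `await waitForResult(fal, "fal-ai/minimax/speech-02-hd", request_id)`. A fixed interval keeps the sketch short; a longer or growing interval may suit long asynchronous jobs better.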
Voice Settings
The API provides extensive control over voice parameters:
```javascript
{
  voice_setting: {
    voice_id: string,               // One of 300+ available voices
    speed: number,                  // Speed control (default: 1.0)
    vol: number,                    // Volume control (default: 1.0)
    pitch: number,                  // Pitch adjustment (default: 0)
    english_normalization: boolean  // Improves number reading
  }
}
```
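To avoid repeating the defaults in every request, the settings can be assembled by a small helper. This is a sketch under stated assumptions: `buildVoiceSetting` and `DEFAULT_VOICE_SETTING` are hypothetical names, the numeric defaults are the ones listed above, and `english_normalization: false` is assumed from the earlier example rather than from a documented default.

```javascript
// Defaults taken from the settings listed above. english_normalization
// defaulting to false is an assumption based on the earlier example.
const DEFAULT_VOICE_SETTING = Object.freeze({
  speed: 1.0,
  vol: 1.0,
  pitch: 0,
  english_normalization: false,
});

// Hypothetical helper: merge caller overrides over the defaults.
function buildVoiceSetting(voiceId, overrides = {}) {
  if (typeof voiceId !== "string" || voiceId.length === 0) {
    throw new TypeError("voiceId must be a non-empty string");
  }
  // voice_id is applied last so an override cannot replace it by accident.
  return { ...DEFAULT_VOICE_SETTING, ...overrides, voice_id: voiceId };
}

const voiceSetting = buildVoiceSetting("Wise_Woman", { speed: 1.2 });
console.log(voiceSetting);
```

The returned object can be passed as `voice_setting` in the request input. The helper does not check value ranges, since the accepted ranges are not stated here.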
Supported Languages
MiniMax TTS supports 30+ languages including:
- Chinese, Chinese (Yue/Cantonese)
 - English, Spanish, French, German, Italian, Portuguese
 - Japanese, Korean
 - Arabic, Russian, Turkish, Dutch, Ukrainian
 - Vietnamese, Indonesian, Thai, Hindi
 - Polish, Romanian, Greek, Czech, Finnish
 
Output Formats
Available audio output formats:
- MP3 - Default format, good compression
 - WAV - Uncompressed, high quality
 - FLAC - Lossless compression
 - PCM - Raw audio data
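
A request with a misspelled format name is easier to catch before it is sent. The sketch below checks a format against the four names listed above and falls back to MP3 when none is given. `normalizeOutputFormat` and `SUPPORTED_FORMATS` are hypothetical names, and the lowercase spelling follows the `output_format: "mp3"` example earlier in this document.

```javascript
// Formats named in the list above, in the lowercase form used by the
// earlier request example.
const SUPPORTED_FORMATS = ["mp3", "wav", "flac", "pcm"];

// Hypothetical helper: normalize a format name and reject unknown ones.
function normalizeOutputFormat(format = "mp3") {
  const normalized = String(format).trim().toLowerCase();
  if (!SUPPORTED_FORMATS.includes(normalized)) {
    throw new RangeError(
      `Unsupported format "${format}". Expected one of: ${SUPPORTED_FORMATS.join(", ")}`
    );
  }
  return normalized;
}
```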
 
Best Practices
For optimal results with MiniMax TTS:
- Include proper punctuation in your input text for natural pausing and intonation
 - Use the `english_normalization` flag for better number reading performance
 - Process up to 5,000 characters in real-time or 1 million characters asynchronously
 - Cache frequently used audio outputs to optimize performance and costs
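
Text longer than the 5,000-character real-time limit has to be split before it is sent. One possible approach, sketched below, breaks the text after sentence-ending punctuation and only cuts mid-sentence when a single sentence exceeds the limit. `splitIntoChunks` is a hypothetical helper, the sentence detection is deliberately simple, and it is an assumption that JavaScript string length matches how the service counts characters.

```javascript
// Hypothetical helper: split text into chunks of at most maxChars,
// preferring to break after sentence-ending punctuation.
// Length is measured with String.length (UTF-16 code units), which is
// assumed to be close enough to the service's character count.
function splitIntoChunks(text, maxChars = 5000) {
  if (!Number.isInteger(maxChars) || maxChars <= 0) {
    throw new RangeError("maxChars must be a positive integer");
  }

  // A sentence is text up to a run of . ! or ?, plus trailing whitespace.
  // The second alternative picks up a final fragment without punctuation.
  const sentences = text.match(/[^.!?]*[.!?]+\s*|[^.!?]+$/g) || [];
  const chunks = [];
  let current = "";

  for (const sentence of sentences) {
    if (current.length + sentence.length <= maxChars) {
      current += sentence;
      continue;
    }

    // The sentence does not fit: close the current chunk first.
    if (current.trim()) chunks.push(current.trim());

    // A single sentence longer than the limit is cut into fixed-size pieces.
    let rest = sentence;
    while (rest.length > maxChars) {
      chunks.push(rest.slice(0, maxChars));
      rest = rest.slice(maxChars);
    }
    current = rest;
  }

  if (current.trim()) chunks.push(current.trim());
  return chunks;
}
```

Each chunk can then be sent as its own request and the resulting audio played or joined in order. Breaking at sentence boundaries keeps pauses and intonation natural at the seams.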
 
Error Handling
Implement robust error handling to manage API responses:
```javascript
try {
  const result = await fal.subscribe("fal-ai/minimax/speech-02-hd", {
    input: {
      text: inputText
    }
  });
} catch (error) {
  console.error("Speech generation failed:", error.message);
  // Implement fallback behavior
}
```
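One common fallback is to retry a failed request after a short, growing delay. The sketch below is a generic pattern, not a feature of the fal client. `backoffDelays` and `withRetry` are hypothetical helpers, and the delay values are arbitrary starting points.

```javascript
// Hypothetical helper: delay in milliseconds before each retry,
// doubling from baseMs and capped at maxMs.
function backoffDelays(retries, baseMs = 500, maxMs = 8000) {
  return Array.from({ length: retries }, (_, i) => Math.min(baseMs * 2 ** i, maxMs));
}

// Hypothetical helper: run an async task, retrying once per entry in `delays`.
async function withRetry(task, delays = backoffDelays(3)) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      // No delays left means no retries left: surface the last error.
      if (attempt >= delays.length) throw error;
      console.warn(`Attempt ${attempt + 1} failed: ${error.message}`);
      await new Promise((resolve) => setTimeout(resolve, delays[attempt]));
    }
  }
}
```

The request is wrapped in a function so it can be re-run, for example `await withRetry(() => fal.subscribe("fal-ai/minimax/speech-02-hd", { input: { text: inputText } }))`. Retrying only helps with transient failures; an error caused by invalid input will fail again on every attempt, so it is usually worth checking the error before retrying.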
Available Models
MiniMax offers multiple TTS models:
- speech-02-hd: High-definition quality, best for production use
 - speech-02-turbo: Optimized for real-time applications with low latency
 - speech-01-hd: Previous generation HD model
 - speech-01-turbo: Previous generation turbo model
 
Performance and Scaling
MiniMax TTS is built for production workloads with:
- Low latency response times for real-time applications
 - High-throughput capability for batch processing
 - Automatic scaling to handle varying loads
 - Global CDN distribution for consistent performance
 
Pricing and Usage
- Cost: $0.05 per 1000 characters
 - Transparent, usage-based pricing
 - No subscription necessary
 - No hidden fees or minimum commitments
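
Because pricing is per character, the cost of a request can be estimated from the text length before it is sent. A minimal sketch: `estimateCostUsd` is a hypothetical helper, the rate is a parameter that defaults to the $0.05 figure quoted above, and it assumes billing is proportional to length with no per-request rounding.

```javascript
// Hypothetical helper: estimate the cost in USD of synthesizing `text`.
// ratePer1000 defaults to the per-1000-character price quoted above.
function estimateCostUsd(text, ratePer1000 = 0.05) {
  if (ratePer1000 < 0) {
    throw new RangeError("ratePer1000 must not be negative");
  }
  return (text.length / 1000) * ratePer1000;
}

// Example: a 2,500-character script at the default rate.
const sampleScript = "x".repeat(2500);
console.log(`Estimated cost: $${estimateCostUsd(sampleScript).toFixed(4)}`);
```

Pass the rate shown for your request as the second argument if it differs from the default.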
 
View detailed pricing or contact sales for enterprise solutions.
Support Resources
We're here to help you succeed:
- API Documentation
 - General fal.ai Documentation
 - Technical support via email
 - Regular updates and improvements
 
Start building with MiniMax TTS today and bring natural speech to your applications.