MiniMax Speech-02 HD Text to Speech
Your request will cost $0.1 per 1,000 characters.
MiniMax TTS: Professional Text-to-Speech API
Transform your text into natural-sounding speech with MiniMax TTS, a powerful text-to-speech API designed for developers who need high-quality voice synthesis in their applications.
Overview
MiniMax TTS delivers studio-quality voice synthesis with advanced neural networks, offering natural prosody, clear articulation, and emotion-aware speech generation. It suits applications ranging from audiobook production to accessible content creation.
Key Capabilities
Generate lifelike speech with:
- Precise control over voice characteristics and speaking style
- Adjustable speech pacing and emotional tone
- Support for 30+ languages with natural accents
- Real-time speech synthesis for interactive applications
- More than 300 authentic voices
Getting Started
Getting up and running with MiniMax TTS takes just a few minutes. Here's how to begin:
- Install the client library:
bash
# Using npm
npm install --save @fal-ai/client

# Using pip
pip install fal-client
- Configure your API key:
javascript
import { fal } from "@fal-ai/client";

fal.config({
  credentials: "YOUR_FAL_KEY_HERE"
});
- Make your first API call:
javascript
const result = await fal.subscribe("fal-ai/minimax/speech-02-hd", {
  input: {
    text: "Welcome to MiniMax TTS. This is a demonstration of our natural speech synthesis."
  }
});
Technical Integration
MiniMax TTS supports both synchronous and queue-based processing. The API accepts plain text input and returns audio in multiple formats including MP3, WAV, FLAC, and PCM.
Advanced Usage
Control fine-grained speech parameters:
javascript
const response = await fal.subscribe("fal-ai/minimax/speech-02-hd", {
  input: {
    text: "This is an important announcement.",
    voice_setting: {
      voice_id: "Wise_Woman",
      speed: 1.0,
      vol: 1.0,
      pitch: 0,
      english_normalization: false
    },
    output_format: "mp3"
  }
});
Queue-Based Processing
For handling multiple requests or asynchronous workflows:
javascript
// Submit request
const { request_id } = await fal.queue.submit("fal-ai/minimax/speech-02-hd", {
  input: {
    text: "Your text here"
  },
  webhookUrl: "https://optional.webhook.url/for/results"
});

// Check status
const status = await fal.queue.status("fal-ai/minimax/speech-02-hd", {
  requestId: request_id,
  logs: true
});

// Get result
const result = await fal.queue.result("fal-ai/minimax/speech-02-hd", {
  requestId: request_id
});
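When no webhook is configured, the status check above is typically repeated until the request finishes. The sketch below shows one way to wrap that in a polling loop. It rests on assumptions: `pollUntilComplete` and its options are hypothetical names, the status fetcher is passed in as a function so the loop can be exercised without network access, and the `"COMPLETED"` string is assumed to match the status value the queue reports.

```javascript
// Hypothetical helper: poll a status function until the request completes.
// `getStatus` is any async function resolving to an object with a `status`
// field. The "COMPLETED" value is an assumption about the queue's status strings.
async function pollUntilComplete(getStatus, { intervalMs = 1000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await getStatus();
    if (status.status === "COMPLETED") {
      return status;
    }
    if (attempt < maxAttempts) {
      // Wait before asking again so the queue is not polled in a tight loop.
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Request did not complete after ${maxAttempts} status checks`);
}
```

In an application, `getStatus` could be `() => fal.queue.status("fal-ai/minimax/speech-02-hd", { requestId: request_id })`, with `fal.queue.result` called once the loop returns.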
Voice Settings
The API provides extensive control over voice parameters:
javascript
{
  voice_setting: {
    voice_id: string,               // One of 300+ available voices
    speed: number,                  // Speed control (default: 1.0)
    vol: number,                    // Volume control (default: 1.0)
    pitch: number,                  // Pitch adjustment (default: 0)
    english_normalization: boolean  // Improves number reading
  }
}
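A small helper can fill in the documented defaults and catch type mistakes before a request is sent. This is a minimal sketch, not part of the client library: `buildVoiceSetting` and `VOICE_DEFAULTS` are hypothetical names, the defaults are the ones listed above, and requiring `voice_id` is a choice made for this example because no default is stated for it. Valid numeric ranges are not given here, so the helper checks types only.

```javascript
// Defaults taken from the settings listed above. voice_id has no stated
// default, so this sketch requires the caller to supply it (an assumption).
const VOICE_DEFAULTS = { speed: 1.0, vol: 1.0, pitch: 0 };

// Hypothetical helper: merge caller overrides onto the defaults and
// verify that each field has the expected type.
function buildVoiceSetting(overrides) {
  if (!overrides || typeof overrides.voice_id !== "string" || overrides.voice_id.length === 0) {
    throw new TypeError("voice_id must be a non-empty string");
  }
  const setting = { ...VOICE_DEFAULTS, ...overrides };
  for (const key of ["speed", "vol", "pitch"]) {
    if (typeof setting[key] !== "number" || !Number.isFinite(setting[key])) {
      throw new TypeError(`${key} must be a finite number`);
    }
  }
  if ("english_normalization" in setting && typeof setting.english_normalization !== "boolean") {
    throw new TypeError("english_normalization must be a boolean");
  }
  return setting;
}

// Only the overridden fields need to be spelled out.
const voiceSetting = buildVoiceSetting({ voice_id: "Wise_Woman", speed: 1.2 });
console.log(voiceSetting);
```

The returned object can be passed as `voice_setting` inside `input`.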
Supported Languages
MiniMax TTS supports 30+ languages including:
- Chinese, Chinese (Yue/Cantonese)
- English, Spanish, French, German, Italian, Portuguese
- Japanese, Korean
- Arabic, Russian, Turkish, Dutch, Ukrainian
- Vietnamese, Indonesian, Thai, Hindi
- Polish, Romanian, Greek, Czech, Finnish
Output Formats
Available audio output formats:
- MP3 - Default format, good compression
- WAV - Uncompressed, high quality
- FLAC - Lossless compression
- PCM - Raw audio data
Best Practices
For optimal results with MiniMax TTS:
- Include proper punctuation in your input text for natural pausing and intonation
- Use the `english_normalization` flag for better number reading performance
- Process up to 5,000 characters in real time or 1 million characters asynchronously
- Cache frequently used audio outputs to optimize performance and costs
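Text longer than the real-time limit has to be split before it can be sent in real-time requests. The sketch below shows one possible approach that prefers sentence boundaries, which also keeps the punctuation that drives pausing. `splitForRealtime` is a hypothetical helper, and it measures length with JavaScript's `String.length` (UTF-16 code units), which is assumed, not confirmed, to be close to how the API counts characters.

```javascript
// Hypothetical helper: split text into chunks of at most `maxChars`,
// cutting after sentence-ending punctuation when possible, otherwise at the
// last space, and only as a last resort in the middle of a word.
// Note: length is measured in UTF-16 code units, and a hard cut could split
// a surrogate pair in text that contains such characters.
function splitForRealtime(text, maxChars = 5000) {
  if (!Number.isInteger(maxChars) || maxChars < 1) {
    throw new RangeError("maxChars must be a positive integer");
  }
  const chunks = [];
  let rest = text.trim();
  while (rest.length > maxChars) {
    const head = rest.slice(0, maxChars);
    // Position of the last sentence terminator that is followed by a space.
    const lastStop = Math.max(
      head.lastIndexOf(". "),
      head.lastIndexOf("! "),
      head.lastIndexOf("? ")
    );
    let cut;
    if (lastStop >= 0) {
      cut = lastStop + 1; // keep the punctuation mark in this chunk
    } else {
      const lastSpace = head.lastIndexOf(" ");
      cut = lastSpace > 0 ? lastSpace : maxChars;
    }
    chunks.push(rest.slice(0, cut).trim());
    rest = rest.slice(cut).trim();
  }
  if (rest.length > 0) {
    chunks.push(rest);
  }
  return chunks;
}

const parts = splitForRealtime("One two. Three four. Five six.", 12);
console.log(parts); // prints [ 'One two.', 'Three four.', 'Five six.' ]
```

Each chunk can then be sent as its own request and the resulting audio played or concatenated in order.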
Error Handling
Implement robust error handling to manage API responses:
javascript
try {
  const result = await fal.subscribe("fal-ai/minimax/speech-02-hd", {
    input: {
      text: inputText
    }
  });
} catch (error) {
  console.error("Speech generation failed:", error.message);
  // Implement fallback behavior
}
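One common form of fallback behavior is to retry a failed call a few times with increasing delays and then return a substitute value, such as previously cached audio. The following is a sketch under stated assumptions: `withRetry` and its options are hypothetical, the operation is passed in as a function so the logic can be tested without the API, and it retries on every error, which may not be appropriate for errors caused by invalid input.

```javascript
// Hypothetical helper: retry an async operation with exponential backoff,
// then return a caller-supplied fallback if every attempt fails.
async function withRetry(operation, { retries = 3, baseDelayMs = 500, fallback } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        // Delay doubles after each failure: base, 2x base, 4x base, ...
        const delay = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  if (fallback !== undefined) {
    console.error("Speech generation failed:", lastError.message);
    return fallback;
  }
  throw lastError;
}
```

The API call would be wrapped as `withRetry(() => fal.subscribe("fal-ai/minimax/speech-02-hd", { input: { text: inputText } }), { fallback: null })`, with the caller checking for the fallback value afterwards.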
Available Models
MiniMax offers multiple TTS models:
- speech-02-hd: High-definition quality, best for production use
- speech-02-turbo: Optimized for real-time applications with low latency
- speech-01-hd: Previous generation HD model
- speech-01-turbo: Previous generation turbo model
Performance and Scaling
MiniMax TTS is built for production workloads with:
- Low latency response times for real-time applications
- High-throughput capability for batch processing
- Automatic scaling to handle varying loads
- Global CDN distribution for consistent performance
Pricing and Usage
- Cost: $0.05 per 1000 characters
- Transparent, usage-based pricing
- No subscription necessary
- No hidden fees or minimum commitments
View detailed pricing or contact sales for enterprise solutions.
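Because billing is per character, the cost of a job can be estimated before any request is sent. The sketch below is a rough estimate only: `estimateCostUsd` is a hypothetical helper, the default rate is the per-1,000-character figure listed above, and `String.length` counts UTF-16 code units, which is assumed to approximate billed characters.

```javascript
// Hypothetical helper: estimate the charge for a piece of text from a
// per-1,000-character rate. The rate is a parameter so it can be updated
// if pricing changes.
function estimateCostUsd(text, ratePerThousandChars = 0.05) {
  return (text.length / 1000) * ratePerThousandChars;
}

// A 12,000-character script at the default rate.
const cost = estimateCostUsd("a".repeat(12000));
console.log(cost.toFixed(2)); // prints "0.60"
```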
Support Resources
We're here to help you succeed:
- API Documentation
- General fal.ai Documentation
- Technical support via email
- Regular updates and improvements
Start building with MiniMax TTS today and bring natural speech to your applications.