MiniMax Speech-02 HD Text to Speech

fal-ai/minimax-tts/text-to-speech
Generate speech from text in a choice of voices using the MiniMax Speech-02 HD model, which produces high-quality synthesized audio.
Inference
Commercial use
Partner

Your request will cost $0.1 per 1000 characters.

MiniMax TTS: Professional Text-to-Speech API

Transform your text into natural-sounding speech with MiniMax TTS, a powerful text-to-speech API designed for developers who need high-quality voice synthesis in their applications.

Overview

MiniMax TTS delivers studio-quality voice synthesis with advanced neural networks, offering natural prosody, clear articulation, and emotion-aware speech generation. It suits applications ranging from audiobook production to accessible content creation.

Key Capabilities

Generate lifelike speech with precise control over:

  • Voice characteristics and speaking style
  • Speech pacing and emotional tone
  • Multiple language support with natural accents (30+ languages)
  • Real-time speech synthesis for interactive applications
  • 300+ authentic voices
Getting Started

Getting up and running with MiniMax TTS takes just a few minutes. Here's how to begin:

  1. Install the client library:
bash
# Using npm
npm install --save @fal-ai/client

# Using pip
pip install fal-client
  2. Configure your API key:
javascript
import { fal } from "@fal-ai/client";

fal.config({
  credentials: "YOUR_FAL_KEY_HERE"
});
  3. Make your first API call:
javascript
const result = await fal.subscribe("fal-ai/minimax/speech-02-hd", {
  input: {
    text: "Welcome to MiniMax TTS. This is a demonstration of our natural speech synthesis."
  }
});
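
Once the call resolves, the generated audio still has to be read out of the response. The helper below is a minimal sketch that assumes the resolved value has the shape `{ data: { audio: { url } } }`; that shape and the name `extractAudioUrl` are assumptions for illustration, so check the endpoint's output schema before relying on them.

```javascript
// Hypothetical helper: pulls the audio URL out of a subscribe() result.
// Assumption: the result looks like { data: { audio: { url: "..." } } }.
function extractAudioUrl(result) {
  const url = result?.data?.audio?.url;
  if (typeof url !== "string" || url.length === 0) {
    throw new Error("No audio URL found in the response");
  }
  return url;
}
```

With the `result` from the call above, `extractAudioUrl(result)` would return the URL to download or stream, or throw if the response does not match the assumed shape.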
Technical Integration

MiniMax TTS supports both synchronous and queue-based processing. The API accepts plain text input and returns audio in multiple formats including MP3, WAV, FLAC, and PCM.

Advanced Usage

Control fine-grained speech parameters:

javascript
const response = await fal.subscribe("fal-ai/minimax/speech-02-hd", {
  input: {
    text: "This is an important announcement.",
    voice_setting: {
      voice_id: "Wise_Woman",
      speed: 1.0,
      vol: 1.0,
      pitch: 0,
      english_normalization: false
    },
    output_format: "mp3"
  }
});
Queue-Based Processing

For handling multiple requests or asynchronous workflows:

javascript
// Submit request
const { request_id } = await fal.queue.submit("fal-ai/minimax/speech-02-hd", {
  input: {
    text: "Your text here"
  },
  webhookUrl: "https://optional.webhook.url/for/results"
});

// Check status
const status = await fal.queue.status("fal-ai/minimax/speech-02-hd", {
  requestId: request_id,
  logs: true
});

// Get result
const result = await fal.queue.result("fal-ai/minimax/speech-02-hd", {
  requestId: request_id
});
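
If no webhook is configured, the status check usually has to be repeated until the request finishes. The sketch below wraps the status and result calls shown above in a simple polling loop. The function name `waitForResult`, its options, and the `"COMPLETED"` status string are assumptions made for this example; `client` is any object exposing `queue.status` and `queue.result` in the form used above (for instance the `fal` client).

```javascript
// Small promise-based delay helper.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Polls the queue until the request completes, then fetches the result.
// Assumption: a finished request reports status "COMPLETED".
async function waitForResult(client, endpoint, requestId, options = {}) {
  const { intervalMs = 1000, maxAttempts = 60 } = options;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await client.queue.status(endpoint, { requestId, logs: false });
    if (status.status === "COMPLETED") {
      return client.queue.result(endpoint, { requestId });
    }
    // Not finished yet: wait before checking again.
    await sleep(intervalMs);
  }
  throw new Error(`Request ${requestId} did not complete after ${maxAttempts} checks`);
}
```

Passing the client in as a parameter keeps the loop easy to test with a stub and avoids tying it to one configuration. The attempt cap prevents the loop from running forever if a request stalls.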
Voice Settings

The API provides extensive control over voice parameters:

javascript
{
  voice_setting: {
    voice_id: string,               // One of 300+ available voices
    speed: number,                  // Speed control (default: 1.0)
    vol: number,                    // Volume control (default: 1.0)
    pitch: number,                  // Pitch adjustment (default: 0)
    english_normalization: boolean  // Normalizes English text so numbers are read more naturally
  }
}
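
A small builder can fill in the defaults listed above and reject obviously invalid values before a request is sent. This is a sketch under stated assumptions: the numeric ranges in `ASSUMED_RANGES` are illustrative limits that the text above does not confirm, and `buildVoiceSetting` is a hypothetical helper, not part of the client library.

```javascript
// Defaults for speed, vol, and pitch come from the parameter list above;
// english_normalization: false mirrors the Advanced Usage example.
const VOICE_DEFAULTS = { speed: 1.0, vol: 1.0, pitch: 0, english_normalization: false };

// Assumed limits for illustration only; the real API may accept other ranges.
const ASSUMED_RANGES = { speed: [0.5, 2.0], vol: [0, 10], pitch: [-12, 12] };

function buildVoiceSetting(voiceId, overrides = {}) {
  if (typeof voiceId !== "string" || voiceId.length === 0) {
    throw new Error("voice_id must be a non-empty string");
  }
  // Later spreads win, so caller overrides replace the defaults.
  const setting = { voice_id: voiceId, ...VOICE_DEFAULTS, ...overrides };
  for (const [name, [min, max]] of Object.entries(ASSUMED_RANGES)) {
    const value = setting[name];
    if (typeof value !== "number" || Number.isNaN(value) || value < min || value > max) {
      throw new RangeError(`${name} must be a number between ${min} and ${max}`);
    }
  }
  return setting;
}
```

For example, `buildVoiceSetting("Wise_Woman", { speed: 1.2 })` returns an object that can be passed as `voice_setting`, with `vol` and `pitch` left at their defaults.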
Supported Languages

MiniMax TTS supports 30+ languages including:

  • Chinese, Chinese (Yue/Cantonese)
  • English, Spanish, French, German, Italian, Portuguese
  • Japanese, Korean
  • Arabic, Russian, Turkish, Dutch, Ukrainian
  • Vietnamese, Indonesian, Thai, Hindi
  • Polish, Romanian, Greek, Czech, Finnish
Output Formats

Available audio output formats:

  • MP3 - Default format, good compression
  • WAV - Uncompressed, high quality
  • FLAC - Lossless compression
  • PCM - Raw audio data
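
When saving or serving the generated audio, each format needs a file extension and a content type. The lookup below is a minimal sketch. The MIME types are common conventions rather than values confirmed by the API, and raw PCM has no container, so a generic binary type is assumed. `describeFormat` is a hypothetical helper name.

```javascript
// Extension and MIME type for each output format listed above.
// Assumption: these MIME types are conventional choices, not API-documented values.
const FORMAT_INFO = {
  mp3: { extension: ".mp3", mimeType: "audio/mpeg" },
  wav: { extension: ".wav", mimeType: "audio/wav" },
  flac: { extension: ".flac", mimeType: "audio/flac" },
  pcm: { extension: ".pcm", mimeType: "application/octet-stream" },
};

function describeFormat(format) {
  const key = String(format).toLowerCase();
  // Own-property check so inherited names like "constructor" are rejected.
  if (!Object.prototype.hasOwnProperty.call(FORMAT_INFO, key)) {
    throw new Error(`Unsupported output format: ${format}`);
  }
  return FORMAT_INFO[key];
}
```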
Best Practices

For optimal results with MiniMax TTS:

  • Include proper punctuation in your input text for natural pausing and intonation
  • Enable the `english_normalization` flag to improve how numbers in English text are read aloud
  • Keep real-time requests to 5,000 characters or fewer; asynchronous processing handles up to 1 million characters
  • Cache frequently used audio outputs to optimize performance and costs
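
Text longer than the real-time limit can be split before it is sent. The function below is one possible approach, not a documented part of the API: it breaks text at sentence-ending punctuation and packs sentences into chunks no longer than a given limit, hard-splitting only when a single sentence exceeds it. The name `chunkText` and the use of JavaScript string length as the character count are assumptions.

```javascript
// Splits text into chunks of at most maxChars, preferring sentence boundaries.
function chunkText(text, maxChars = 5000) {
  // A sentence ends with ., !, or ? plus trailing whitespace;
  // the second alternative catches a final fragment without punctuation.
  const sentences = text.match(/[^.!?]+[.!?]+\s*|[^.!?]+$/g) ?? [];
  const chunks = [];
  let current = "";

  // Pushes the chunk being built, if it has any visible content.
  const flush = () => {
    const trimmed = current.trim();
    if (trimmed) chunks.push(trimmed);
    current = "";
  };

  for (let sentence of sentences) {
    // Hard-split any single sentence that is longer than the limit.
    while (sentence.length > maxChars) {
      flush();
      chunks.push(sentence.slice(0, maxChars));
      sentence = sentence.slice(maxChars);
    }
    // Start a new chunk when this sentence would overflow the current one.
    if (current.length + sentence.length > maxChars) flush();
    current += sentence;
  }
  flush();
  return chunks;
}
```

Each chunk can then be submitted as its own request. Splitting at sentence boundaries helps keep pauses and intonation natural where chunks join.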
Error Handling

Implement robust error handling to manage API responses:

javascript
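// Example input; replace with your own text
const inputText = "Hello from MiniMax TTS.";

try {
  const result = await fal.subscribe("fal-ai/minimax/speech-02-hd", {
    input: { text: inputText }
  });
  // Use the result here; it is only in scope inside this try block
} catch (error) {
  console.error("Speech generation failed:", error.message);
  // Implement fallback behavior
}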
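
One common fallback is to retry a failed request a few times with increasing delays. The wrapper below is a generic sketch of retry with exponential backoff. `withRetry` and its options are hypothetical names, and the wrapper retries every error alike, which may not suit failures that will never succeed (for example, invalid input).

```javascript
// Retries an async operation with exponential backoff.
// `operation` is any function returning a promise,
// e.g. () => fal.subscribe("fal-ai/minimax/speech-02-hd", { input: { text } }).
async function withRetry(operation, { retries = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt === retries) break;
      // Delay doubles on each failure: baseDelayMs, then 2x, then 4x, ...
      const delayMs = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  // All attempts failed: surface the most recent error.
  throw lastError;
}
```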
Available Models

MiniMax offers multiple TTS models:

  • speech-02-hd: High-definition quality, best for production use
  • speech-02-turbo: Optimized for real-time applications with low latency
  • speech-01-hd: Previous generation HD model
  • speech-01-turbo: Previous generation turbo model
Performance and Scaling

MiniMax TTS is built for production workloads with:

  • Low latency response times for real-time applications
  • High-throughput capability for batch processing
  • Automatic scaling to handle varying loads
  • Global CDN distribution for consistent performance
Pricing and Usage
  • Cost: $0.05 per 1000 characters
  • Transparent, usage-based pricing
  • No subscription necessary
  • No hidden fees or minimum commitments

View detailed pricing or contact sales for enterprise solutions.
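
Because pricing is per character, the cost of a request can be estimated before it is sent. The sketch below takes the rate as a parameter, defaulting to the $0.05 per 1000 characters listed above. It assumes characters are counted the way JavaScript measures string length, which may differ from how the service counts them. `estimateCostUsd` is a hypothetical helper.

```javascript
// Estimates the cost in USD of synthesizing `text`.
// Assumption: billing counts characters the same way as String.prototype.length.
function estimateCostUsd(text, ratePerThousandChars = 0.05) {
  if (typeof text !== "string") {
    throw new TypeError("text must be a string");
  }
  return (text.length / 1000) * ratePerThousandChars;
}
```

Under these assumptions, a 2,000-character script at the default rate comes to about $0.10. Pass a different rate as the second argument if the current price differs.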

Support Resources

We're here to help you succeed:

Start building with MiniMax TTS today and bring natural speech to your applications.