Kling Video v2.6 Text to Video

fal-ai/kling-video/v2.6/pro/text-to-video
Kling 2.6 Pro: Top-tier text-to-video with cinematic visuals, fluid motion, and native audio generation.


You are charged $0.07 per second of generated video with audio off, or $0.14 per second with audio on. For example, a 5s video with audio on costs $0.70.

Kling Video v2.6 Text to Video [text-to-video]

Kuaishou's Kling 2.6 Pro delivers cinematic text-to-video generation with native audio synthesis at $0.07 per second (audio off) or $0.14 per second (audio on). Trading speed for production quality, this model prioritizes fluid motion and visual fidelity over rapid iteration. Built for creators who need broadcast-ready video with synchronized soundscapes - no post-production audio layering required.

Built for: Marketing campaigns with voiceover | Social media content with dialogue | Cinematic storytelling with ambient audio


Cinematic Quality With Native Audio Generation

Kling 2.6 Pro breaks from the standard text-to-video workflow by generating synchronized audio directly alongside video - eliminating the separate audio production step that typically follows video generation. The model supports both 5-second and 10-second outputs with configurable aspect ratios (16:9, 9:16, 1:1) and handles bilingual voice output natively.

What this means for you:

  • Native audio synthesis: Generate video with dialogue, sound effects, and ambient audio in a single pass - supports English and Chinese voice output with automatic translation for other languages
  • Cinematic motion control: CFG scale from 0 to 1 lets you dial in how closely the model adheres to your prompt versus allowing creative interpretation for more natural motion
  • Flexible output formats: Choose 5- or 10-second durations across three aspect ratios (16:9 for landscape, 9:16 for vertical social, 1:1 for square formats)
  • Detailed prompt interpretation: Handles complex narrative prompts with multiple scene elements, character dialogue, and layered audio cues in a single generation
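
The options listed above map naturally onto a request payload. What follows is a minimal sketch, not the documented schema: the field names (`duration`, `aspect_ratio`, `cfg_scale`, `negative_prompt`, `generate_audio`), their types, and the defaults are assumptions, and `build_request` is a hypothetical helper that makes no network call. Only the endpoint ID, the allowed durations and aspect ratios, and the 0 to 1 CFG range come from this page.

```python
from typing import Optional

# Endpoint ID as shown on this page.
ENDPOINT = "fal-ai/kling-video/v2.6/pro/text-to-video"

# Allowed values as stated on this page.
ALLOWED_DURATIONS = (5, 10)
ALLOWED_ASPECT_RATIOS = ("16:9", "9:16", "1:1")


def build_request(
    prompt: str,
    duration: int = 5,
    aspect_ratio: str = "16:9",
    cfg_scale: float = 0.5,          # assumed default, not taken from the page
    generate_audio: bool = True,     # assumed field name for the audio toggle
    negative_prompt: Optional[str] = None,
) -> dict:
    """Validate the options and return a payload dict (hypothetical field names)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {ALLOWED_DURATIONS}")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {ALLOWED_ASPECT_RATIOS}")
    if not 0.0 <= cfg_scale <= 1.0:
        raise ValueError("cfg_scale must be between 0 and 1")

    payload = {
        "prompt": prompt,
        "duration": duration,
        "aspect_ratio": aspect_ratio,
        "cfg_scale": cfg_scale,
        "generate_audio": generate_audio,
    }
    # Only send a negative prompt when one was actually provided.
    if negative_prompt:
        payload["negative_prompt"] = negative_prompt
    return payload
```

The returned dictionary could then be passed as the arguments of whichever client call you use to reach `ENDPOINT`. Validating locally first avoids paying for a request that the service would reject.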

Performance That Scales

Kling 2.6 Pro's pricing scales linearly with video length, and the per-second rate doubles when audio is enabled - straightforward cost control for production budgets.

Metric | Result | Context
Cost per Second (Audio Off) | $0.07 per second | 5s video = $0.35; 10s video = $0.70
Cost per Second (Audio On) | $0.14 per second | 5s video with audio = $0.70; 10s video with audio = $1.40
Duration Options | 5s or 10s | Configurable via duration parameter
Aspect Ratios | 16:9, 9:16, 1:1 | Native support for landscape, vertical, and square formats
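
The pricing arithmetic above is simple enough to check in a few lines. This sketch uses the two per-second rates and the two durations stated on this page; the function names are illustrative, and amounts are kept in whole cents to avoid floating-point rounding.

```python
# Per-second rates from this page, in cents: audio off -> 7, audio on -> 14.
RATE_CENTS_PER_SECOND = {False: 7, True: 14}
ALLOWED_DURATIONS = (5, 10)


def estimate_cost_cents(duration_s: int, audio: bool) -> int:
    """Return the estimated charge in cents for one generated video."""
    if duration_s not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {ALLOWED_DURATIONS}")
    return duration_s * RATE_CENTS_PER_SECOND[bool(audio)]


def format_usd(cents: int) -> str:
    """Format a whole number of cents as a dollar string."""
    return f"${cents // 100}.{cents % 100:02d}"


if __name__ == "__main__":
    for seconds in ALLOWED_DURATIONS:
        for audio in (False, True):
            label = "audio on" if audio else "audio off"
            print(f"{seconds}s, {label}: {format_usd(estimate_cost_cents(seconds, audio))}")
```

Multiplying the result by the number of clips in a batch gives a quick budget estimate before submitting any requests.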

Technical Specifications

Spec | Details
Architecture | Kling 2.6 Pro
Input Formats | Text prompts with optional negative prompts
Output Formats | MP4 video with optional native audio
Duration | 5 or 10 seconds
Aspect Ratios | 16:9, 9:16, 1:1
Audio Support | Native audio generation (English/Chinese voice, automatic translation)
License | Commercial use via fal


How It Stacks Up

Kling v2.5 Text to Video - Kling 2.6 Pro builds on v2.5's foundation with enhanced cinematic quality and refined motion dynamics, making it ideal for production-grade content where visual fidelity matters. Kling v2.5 prioritizes faster iteration cycles for rapid prototyping workflows.

Hunyuan Video V1.5 - Kling 2.6 Pro emphasizes native audio generation and bilingual voice support for complete narrative sequences. Hunyuan Video V1.5 focuses on visual generation without integrated audio, suitable for workflows where sound design happens separately.

Kling 2.1 Master - Kling 2.6 Pro represents a significant architectural evolution from the 2.1 Master generation, trading broader parameter control for refined output quality and streamlined audio integration. The 2.1 Master remains available for workflows requiring maximum customization flexibility.