Kling Video | Text to Video

Run Kling Video O3 4K Text To Video API on fal

Kling's Native 4K is billed as the world's first AI video model with native 4K output: cinema-grade visuals generated in a single step, with no post-production upscaling or third-party tools required. The O3 4K text-to-video endpoint is tuned for stylized and anime-leaning generation with crisp 4K clarity straight from a prompt.

Built for: stylized and anime-style 4K video, expressive character action, poster-quality key frames, and production-ready clips that skip the upscaling pipeline.

Pricing

Kling V3-Omni in 4K mode is billed per second of generated video.

Configuration	Price per second
4K mode, without video input, without native audio generation	$0.42
4K mode, without video input, with native audio generation	$0.42

A 5-second clip at 4K therefore costs $2.10; a 10-second clip costs $4.20.
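
The per-second billing above can be sketched as a small estimator. This is a minimal illustration that assumes the $0.42 rate from the table and the 3 to 15 second duration range; `estimate_cost_usd` is a hypothetical helper name, not part of any fal client.

```python
from decimal import Decimal

# Per-second 4K price from the pricing table (same with or without native audio).
PRICE_PER_SECOND_USD = Decimal("0.42")


def estimate_cost_usd(duration_seconds: int) -> Decimal:
    """Return the estimated USD cost of one clip of the given length."""
    if not 3 <= duration_seconds <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    return PRICE_PER_SECOND_USD * duration_seconds


for seconds in (5, 10, 15):
    print(f"{seconds:>2} s -> ${estimate_cost_usd(seconds):.2f}")
```

Using `Decimal` rather than floats keeps the cents exact when totals are summed across many clips.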

Features

Kling O3 4K Text-to-Video produces cinema-grade 4K footage directly from a text prompt, with a bias toward stylized and anime-style output. Frames are rendered with detailed lighting and atmosphere, so output is ready for high-end delivery without a post pipeline. The model maintains stable reference consistency during 4K generation: stylistic expression, color, lighting, and overall mood stay faithful throughout the clip.

Durations run from 3 to 15 seconds, aspect ratios cover 16:9, 9:16, and 1:1, and multi-shot storytelling is available through the `multi_prompt` interface. Audio is opt-in via `generate_audio`. To learn more, visit the Kling O3 page.

Default prompt template

Scene: [where this happens, time of day, background, environment, style cues — e.g. anime, cel-shaded, painterly]

Subject: [who or what is the main focus, action, motion]

Important details: [camera movement, framing, lighting, color palette, atmosphere, pacing]

In-scene line: [optional spoken or shouted line, in quotes]

Use case: [anime sequence / stylized trailer / poster frame / concept reel / music video]

Constraints: [no watermark / no logos / preserve subject identity / steady camera]
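
One way to fill in the template programmatically is sketched below. It assumes the model accepts the labeled fields as a single newline-separated prompt string, which the page does not state explicitly; `build_prompt` and its parameter names are hypothetical.

```python
def build_prompt(scene, subject, details="", line="", use_case="", constraints=""):
    """Join the filled-in template fields into one prompt string, skipping empty ones."""
    fields = [
        ("Scene", scene),
        ("Subject", subject),
        ("Important details", details),
        # The spoken line is wrapped in quotes, as the template suggests.
        ("In-scene line", f'"{line}"' if line else ""),
        ("Use case", use_case),
        ("Constraints", constraints),
    ]
    return "\n".join(f"{label}: {value}" for label, value in fields if value)


prompt = build_prompt(
    scene="neon-lit city at dusk, cel-shaded anime style",
    subject="a mecha lands on a rooftop to save the city",
    details="low-angle hero shot, dust kicking up, steady camera",
    line="I'm here",
    use_case="anime sequence",
    constraints="no watermark, no logos",
)
print(prompt)
```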

Technical Specifications

Spec	Details
Architecture	Kling Video O3 (Native 4K)
Input Formats	Text prompt, or a list of prompts for multi-shot generation
Output Format	MP4 video via URL
Resolution	Native 4K, no post-processing upscale
Duration Range	3 to 15 seconds
Aspect Ratios	16:9, 9:16, 1:1
Audio	Optional native audio generation
License	Commercial use via fal Partner agreement

What's New in Kling O3 4K

Industry-First Native 4K

One-click export for professional-grade 4K video. Output goes straight from the model at commercial 4K resolution — no separate upscaling pipeline, no quality degradation from chained models, and no third-party tools.

Stylized and Anime-Ready

Tuned for expressive, stylized output. Anime, cel-shaded, painterly, and illustrative looks hold together at 4K without losing line clarity or flattening to a photoreal bias.

Cinema-Grade Clarity

Ultra-clear visuals that faithfully capture every intricate detail. Sharpness, atmosphere, and lighting hit the bar for large-screen display and professional production workflows out of the box.

Greater Refinement

Richer color gradations and smoother transitions give footage a deeper sense of dimension. Fewer banding artifacts and cleaner highlight-to-shadow rolloff make the model suitable for cinematic grading.

Stable Reference Consistency

Throughout 4K generation, stylistic expression, color, lighting, and overall mood remain faithful — so a clip holds together as one continuous look rather than drifting shot to shot.

Multi-Shot Composition

Pass a list of prompts via `multi_prompt` to build a sequenced clip with distinct shots. `shot_type` controls whether cuts are user-defined (`customize`) or planned by the model.

Opt-In Native Audio

`generate_audio` defaults to `false` for O3 — turn it on when you want speech or ambient sound rendered with the video. Supports Chinese and English; other languages are translated to English automatically.

Efficient Workflow

Skip the render → upscale → export loop. A single API call produces a delivery-ready 4K asset.

Quick Start

Install the client

```bash
npm install --save @fal-ai/client
```

Set your API key

```bash
export FAL_KEY="YOUR_API_KEY"
```

Text to video

```javascript
import { fal } from "@fal-ai/client";

// Submit the request and wait until the video is ready.
const result = await fal.subscribe("fal-ai/kling-video/o3/4k/text-to-video", {
  input: {
    prompt: "A mecha lands on the ground to save the city, and says \"I'm here\", in anime style",
    duration: "5",
    aspect_ratio: "16:9",
    generate_audio: true,
  },
  logs: true,
  // Print model logs while the request is in progress.
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      update.logs.map((log) => log.message).forEach(console.log);
    }
  },
});

// URL of the generated MP4.
console.log(result.data.video.url);
```

Multi-shot generation

```javascript
import { fal } from "@fal-ai/client";

// Each entry in multi_prompt is one shot with its own prompt and duration.
const result = await fal.subscribe("fal-ai/kling-video/o3/4k/text-to-video", {
  input: {
    multi_prompt: [
      { prompt: "Establishing wide shot of a neon-lit city skyline at dusk, anime style.", duration: "4" },
      { prompt: "Low-angle hero shot of a mecha landing on a rooftop, dust kicking up.", duration: "4" },
      { prompt: "Close-up: the pilot narrows their eyes and says \"I'm here.\"", duration: "3" },
    ],
    // "customize" keeps the cuts exactly as listed above.
    shot_type: "customize",
    aspect_ratio: "16:9",
    generate_audio: true,
  },
});

console.log(result.data.video.url);
```
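
The multi-shot input above can also be assembled from a plain list of shots. The sketch below is illustrative only: `build_multi_shot_input` is a hypothetical helper, and the check that the per-shot durations sum to between 3 and 15 seconds is an assumption based on the documented duration range, not a confirmed server-side rule.

```python
def build_multi_shot_input(shots, aspect_ratio="16:9", generate_audio=False):
    """Build an input dict from a list of (prompt, seconds) tuples."""
    if not shots:
        raise ValueError("at least one shot is required")
    total = sum(seconds for _, seconds in shots)
    # ASSUMPTION: the whole sequence must fit the documented 3-15 second range.
    if not 3 <= total <= 15:
        raise ValueError(f"total duration {total}s is outside 3-15 seconds")
    return {
        # Durations are sent as strings, matching the examples on this page.
        "multi_prompt": [{"prompt": p, "duration": str(s)} for p, s in shots],
        "shot_type": "customize",
        "aspect_ratio": aspect_ratio,
        "generate_audio": generate_audio,
    }


payload = build_multi_shot_input(
    [
        ("Establishing wide shot of a neon-lit city skyline at dusk, anime style.", 4),
        ("Low-angle hero shot of a mecha landing on a rooftop, dust kicking up.", 4),
        ('Close-up: the pilot narrows their eyes and says "I\'m here."', 3),
    ],
    generate_audio=True,
)
print(len(payload["multi_prompt"]), "shots")
```

The resulting dict can be passed as the `arguments` value of a `fal_client.subscribe` call like the one in the Python example on this page.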

API Reference

Input

Parameter	Type	Default	Description
`prompt`	string	optional	Text prompt for video generation. Required unless `multi_prompt` is provided
`multi_prompt`	array	optional	List of per-shot prompts for multi-shot generation
`duration`	enum	`"5"`	Video duration in seconds. One of `"3"`–`"15"`
`aspect_ratio`	enum	`"16:9"`	`16:9`, `9:16`, or `1:1`
`generate_audio`	boolean	`false`	Generate native audio alongside the video
`shot_type`	string	`"customize"`	Multi-shot mode, used with `multi_prompt`
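
A client-side pre-flight check against the table above might look like the following. It is a sketch, not the API's own validation: `validate_input` is a hypothetical name, it assumes every whole number of seconds from 3 to 15 is an accepted `duration` value, and the server remains the final authority on what is valid.

```python
# Allowed values taken from the input table; durations are string enums.
ALLOWED_DURATIONS = {str(n) for n in range(3, 16)}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}


def validate_input(payload: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors = []
    if not payload.get("prompt") and not payload.get("multi_prompt"):
        errors.append("provide either 'prompt' or 'multi_prompt'")
    if "duration" in payload and payload["duration"] not in ALLOWED_DURATIONS:
        errors.append("'duration' must be a string from \"3\" to \"15\"")
    if "aspect_ratio" in payload and payload["aspect_ratio"] not in ALLOWED_ASPECT_RATIOS:
        errors.append("'aspect_ratio' must be 16:9, 9:16, or 1:1")
    if "generate_audio" in payload and not isinstance(payload["generate_audio"], bool):
        errors.append("'generate_audio' must be a boolean")
    return errors


print(validate_input({"prompt": "A mecha lands in the city", "duration": "5"}))
print(validate_input({"duration": 5, "aspect_ratio": "4:3"}))
```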

Output

```json
{
  "video": {
    "file_name": "output.mp4",
    "content_type": "video/mp4",
    "url": "https://v3b.fal.media/files/...",
    "file_size": 13096952
  }
}
```
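
As an illustration of reading this response, the sketch below parses the sample output shown above and summarizes it. `describe_video` is a hypothetical helper, and the elided URL is kept exactly as it appears in the sample.

```python
import json

# Sample response copied from the Output section; the URL is elided in the source.
sample = json.loads("""
{
  "video": {
    "file_name": "output.mp4",
    "content_type": "video/mp4",
    "url": "https://v3b.fal.media/files/...",
    "file_size": 13096952
  }
}
""")


def describe_video(result: dict) -> str:
    """Summarize the video entry: name, MIME type, size in MiB, and URL."""
    video = result["video"]
    size_mib = video["file_size"] / (1024 * 1024)
    return f'{video["file_name"]} ({video["content_type"]}, {size_mib:.1f} MiB): {video["url"]}'


print(describe_video(sample))
```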

Use Cases

Anime and stylized shorts -- Episode clips, key-scene previsualization, and fan trailers rendered at delivery-grade 4K.

Poster-quality key frames -- High-impact stylized shots usable as hero frames, thumbnails, or campaign stills pulled from video.

Game and concept trailers -- Stylized action sequences and reveal spots without a separate upscaling stage.

Multi-shot storytelling -- Sequence a cold open, hero beat, and tag line with `multi_prompt` plus `shot_type`.

Music and social video -- 9:16 and 1:1 formats for vertical platforms, with optional native audio.

Large-screen and broadcast -- Stylized content mastered for high-definition playback and professional production pipelines.

Long-Running Requests

Video generation is a long-running job. Use the Queue API to submit asynchronously and retrieve results via webhook or polling.

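```javascript
import { fal } from "@fal-ai/client";

// Submit without waiting; the webhook is called when the request finishes.
const { request_id } = await fal.queue.submit("fal-ai/kling-video/o3/4k/text-to-video", {
  input: { prompt: "..." },
  webhookUrl: "https://your-server.com/webhook",
});

// Poll the current status (and logs) of the request.
const status = await fal.queue.status("fal-ai/kling-video/o3/4k/text-to-video", {
  requestId: request_id,
  logs: true,
});

// Fetch the result once the request has completed.
const result = await fal.queue.result("fal-ai/kling-video/o3/4k/text-to-video", {
  requestId: request_id,
});
```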
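
If polling is preferred over a webhook, the loop can be sketched as below. This is a generic helper under stated assumptions: `wait_for_completion` and `fetch_status` are hypothetical names, the status strings `IN_QUEUE` and `COMPLETED` are assumed alongside the `IN_PROGRESS` value used earlier on this page, and the demo feeds in canned statuses instead of calling the API.

```python
import time


def wait_for_completion(fetch_status, interval_s=2.0, timeout_s=600.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns "COMPLETED" or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status == "COMPLETED":
            return status
        if clock() >= deadline:
            raise TimeoutError(f"request still {status} after {timeout_s}s")
        sleep(interval_s)


# Demo with canned statuses; a real fetch_status would query the queue instead.
fake_statuses = iter(["IN_QUEUE", "IN_PROGRESS", "IN_PROGRESS", "COMPLETED"])
final = wait_for_completion(lambda: next(fake_statuses), sleep=lambda s: None)
print(final)
```

Injecting `sleep` and `clock` keeps the helper testable without real waiting; in production the defaults apply.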

Notes

Provide either `prompt` or `multi_prompt`; `prompt` is required only when `multi_prompt` is omitted.
`generate_audio` is off by default on O3; set it to `true` to enable speech and ambient sound.
For English speech, use lowercase for regular words and uppercase for acronyms and proper nouns.
Audio prompts in languages other than English or Chinese are translated to English automatically.
Never expose your `FAL_KEY` in client-side code; route requests through a server-side proxy instead.

cURL

```bash
curl --request POST \
  --url https://fal.run/fal-ai/kling-video/o3/4k/text-to-video \
  --header "Authorization: Key $FAL_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "prompt": "A mecha lands on the ground to save the city, and says \"I'\''m here\", in anime style",
    "duration": "5",
    "aspect_ratio": "16:9",
    "generate_audio": true
  }'
```

Python

```python
import fal_client


def on_queue_update(update):
    # Print model logs while the request is in progress.
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])


# Submit the request and block until the video is ready.
result = fal_client.subscribe(
    "fal-ai/kling-video/o3/4k/text-to-video",
    arguments={
        "prompt": "A mecha lands on the ground to save the city, and says \"I'm here\", in anime style",
        "duration": "5",
        "aspect_ratio": "16:9",
        "generate_audio": True,
    },
    with_logs=True,
    on_queue_update=on_queue_update,
)
print(result)
```
