fal-ai/kling-video/o3/4k/text-to-video
You are charged $0.42 for every second of video generated, regardless of whether audio is on or off. For example, a 5s video costs $2.10.
Run Kling Video O3 4K Text To Video API on fal
Kling's Native 4K is the world's first AI video model with native 4K output: cinema-grade visuals generated in a single step, with no post-production upscaling or third-party tools required. The O3 4K text-to-video endpoint is tuned for stylized and anime-leaning generation with crisp 4K clarity straight from a prompt.

Built for: stylized and anime-style 4K video, expressive character action, poster-quality key frames, and production-ready clips that skip the upscaling pipeline.
Pricing
Kling V3-Omni in 4K mode is billed per second of generated video.
| Configuration | Price per second |
|---|---|
| 4K mode, without video input, without native audio generation | $0.42 |
| 4K mode, without video input, with native audio generation | $0.42 |
A 5-second clip at 4K therefore costs $2.10; a 10-second clip costs $4.20.
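The per-second billing above reduces to a single multiplication. The following is a minimal sketch, assuming the $0.42 rate from the table and the 3 to 15 second duration range listed for this endpoint; `estimate_cost` is a hypothetical helper for illustration, not part of any fal client.

```python
from decimal import Decimal

# Per-second rate from the pricing table; it is the same with or without audio.
PRICE_PER_SECOND_USD = Decimal("0.42")


def estimate_cost(duration_seconds: int) -> Decimal:
    """Return the estimated charge in USD for a clip of the given length."""
    # The 3-15 second bound mirrors the documented duration range.
    if not 3 <= duration_seconds <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    return PRICE_PER_SECOND_USD * duration_seconds


if __name__ == "__main__":
    for seconds in (5, 10):
        print(f"{seconds}s -> ${estimate_cost(seconds)}")
```

`Decimal` is used instead of floats so that the estimate keeps exact cents.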
Features
Kling O3 4K Text-to-Video produces cinema-grade 4K footage directly from a text prompt, with a bias toward stylized and anime-style output. Every frame is rendered with sophisticated lighting, atmosphere, and exceptional clarity, so output is ready for high-end delivery without a post pipeline. The model maintains stable reference consistency during 4K generation: stylistic expression, color, lighting, and overall mood stay faithful throughout the clip.

Durations run from 3 to 15 seconds, aspect ratios cover 16:9, 9:16, and 1:1, and multi-shot storytelling is available through the `multi_prompt` interface. Audio is opt-in via `generate_audio`. To learn more, visit our Kling O3 page.
Default prompt template
Scene: [where this happens, time of day, background, environment, style cues — e.g. anime, cel-shaded, painterly]
Subject: [who or what is the main focus, action, motion]
Important details: [camera movement, framing, lighting, color palette, atmosphere, pacing]
In-scene line: [optional spoken or shouted line, in quotes]
Use case: [anime sequence / stylized trailer / poster frame / concept reel / music video]
Constraints: [no watermark / no logos / preserve subject identity / steady camera]
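One way to use the template programmatically is to fill in only the fields you need and join them into a single prompt string. The sketch below assumes the filled template is sent as one `prompt` value with its labels intact; the source does not say whether the labels themselves influence generation, and `build_prompt` is a hypothetical helper.

```python
# Field labels taken from the prompt template, in the same order.
TEMPLATE_FIELDS = (
    ("scene", "Scene"),
    ("subject", "Subject"),
    ("important_details", "Important details"),
    ("in_scene_line", "In-scene line"),
    ("use_case", "Use case"),
    ("constraints", "Constraints"),
)


def build_prompt(**fields: str) -> str:
    """Join the filled-in template fields into one prompt string, skipping blanks."""
    unknown = set(fields) - {key for key, _ in TEMPLATE_FIELDS}
    if unknown:
        raise KeyError(f"unknown template fields: {sorted(unknown)}")
    lines = []
    for key, label in TEMPLATE_FIELDS:
        value = fields.get(key, "").strip()
        if value:
            lines.append(f"{label}: {value}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt(
        scene="Neon-lit city skyline at dusk, anime style",
        subject="A mecha lands on a rooftop, dust kicking up",
        constraints="no watermark, no logos",
    ))
```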
Technical Specifications
| Spec | Details |
|---|---|
| Architecture | Kling Video O3 (Native 4K) |
| Input Formats | Text prompt, or a list of prompts for multi-shot generation |
| Output Format | MP4 video via URL |
| Resolution | Native 4K, no post-processing upscale |
| Duration Range | 3 to 15 seconds |
| Aspect Ratios | 16:9, 9:16, 1:1 |
| Audio | Optional native audio generation |
| License | Commercial use via fal Partner agreement |
What's New in Kling O3 4K
Industry-First Native 4K
One-click export for professional-grade 4K video. Output goes straight from the model at commercial 4K resolution — no separate upscaling pipeline, no quality degradation from chained models, and no third-party tools.
Stylized and Anime-Ready
Tuned for expressive, stylized output. Anime, cel-shaded, painterly, and illustrative looks hold together at 4K without losing line clarity or flattening to a photoreal bias.
Cinema-Grade Clarity
Ultra-clear visuals that faithfully capture every intricate detail. Sharpness, atmosphere, and lighting hit the bar for large-screen display and professional production workflows out of the box.
Greater Refinement
Richer color gradations and smoother transitions give footage a deeper sense of dimension. Fewer banding artifacts and cleaner highlight-to-shadow rolloff make the model suitable for cinematic grading.
Stable Reference Consistency
Throughout 4K generation, stylistic expression, color, lighting, and overall mood remain faithful — so a clip holds together as one continuous look rather than drifting shot to shot.
Multi-Shot Composition
Pass a list of prompts via `multi_prompt` to build a sequenced clip with distinct shots. `shot_type` controls whether cuts are user-defined (`customize`) or planned by the model.
Opt-In Native Audio
`generate_audio` defaults to `false` for O3 — turn it on when you want speech or ambient sound rendered with the video. Supports Chinese and English; other languages are translated to English automatically.
Efficient Workflow
Skip the render → upscale → export loop. A single API call produces a delivery-ready 4K asset.
Quick Start
Install the client
```bash
npm install --save @fal-ai/client
```
Set your API key
```bash
export FAL_KEY="YOUR_API_KEY"
```
Text to video
```javascript
import { fal } from "@fal-ai/client";

const result = await fal.subscribe("fal-ai/kling-video/o3/4k/text-to-video", {
  input: {
    prompt: "A mecha lands on the ground to save the city, and says \"I'm here\", in anime style",
    duration: "5",
    aspect_ratio: "16:9",
    generate_audio: true,
  },
  logs: true,
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      update.logs.map((log) => log.message).forEach(console.log);
    }
  },
});

console.log(result.data.video.url);
```
Multi-shot generation
```javascript
const result = await fal.subscribe("fal-ai/kling-video/o3/4k/text-to-video", {
  input: {
    multi_prompt: [
      { prompt: "Establishing wide shot of a neon-lit city skyline at dusk, anime style.", duration: "4" },
      { prompt: "Low-angle hero shot of a mecha landing on a rooftop, dust kicking up.", duration: "4" },
      { prompt: "Close-up: the pilot narrows their eyes — \"I'm here.\"", duration: "3" },
    ],
    shot_type: "customize",
    aspect_ratio: "16:9",
    generate_audio: true,
  },
});
```
API Reference
Input
| Parameter | Type | Default | Description |
|---|---|---|---|
| `prompt` | string | optional | Text prompt for video generation. Required unless `multi_prompt` is provided |
| `multi_prompt` | array | optional | List of per-shot prompts for multi-shot generation |
| `duration` | enum | `"5"` | Video duration in seconds. One of `"3"`–`"15"` |
| `aspect_ratio` | enum | `"16:9"` | `16:9`, `9:16`, or `1:1` |
| `generate_audio` | boolean | `false` | Generate native audio alongside the video |
| `shot_type` | string | `"customize"` | Multi-shot mode, used with `multi_prompt` |
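Checking a payload locally before submitting can catch obvious mistakes without spending a request. The sketch below encodes only the rules stated in the input table; `validate_input` is a hypothetical helper, the service's own validation remains authoritative, and rules the table does not spell out (for example how per-shot durations in `multi_prompt` interact with `duration`) are deliberately not checked.

```python
ALLOWED_DURATIONS = {str(n) for n in range(3, 16)}  # "3" through "15"
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}


def validate_input(payload: dict) -> list[str]:
    """Return a list of problems found in the payload; an empty list means none were found."""
    problems = []

    # A prompt is required unless multi_prompt is provided.
    if not payload.get("prompt") and not payload.get("multi_prompt"):
        problems.append("either 'prompt' or 'multi_prompt' is required")

    # duration is an enum of strings, defaulting to "5".
    duration = payload.get("duration", "5")
    if not isinstance(duration, str) or duration not in ALLOWED_DURATIONS:
        problems.append(f"'duration' must be a string from '3' to '15', got {duration!r}")

    # aspect_ratio defaults to "16:9".
    aspect_ratio = payload.get("aspect_ratio", "16:9")
    if not isinstance(aspect_ratio, str) or aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        problems.append(f"'aspect_ratio' must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")

    # generate_audio is a boolean, defaulting to false.
    if not isinstance(payload.get("generate_audio", False), bool):
        problems.append("'generate_audio' must be a boolean")

    return problems


if __name__ == "__main__":
    print(validate_input({"prompt": "A mecha lands on a rooftop", "duration": 5}))
```

Note that `duration` is passed as a string in every example on this page, so the integer `5` in the usage line is reported as a problem.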
Output
```json
{
  "video": {
    "file_name": "output.mp4",
    "content_type": "video/mp4",
    "url": "https://v3b.fal.media/files/...",
    "file_size": 13096952
  }
}
```
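Downstream code usually needs just the URL and a little metadata from this response. The following is a minimal sketch, assuming the response has already been parsed into a dict shaped like the sample output and that `file_size` is a byte count; `extract_video` is a hypothetical helper, and the elided URL is kept exactly as it appears in the sample.

```python
def extract_video(result: dict) -> dict:
    """Pull the video URL and basic metadata out of a response shaped like the sample."""
    video = result.get("video")
    if not isinstance(video, dict) or not video.get("url"):
        raise ValueError("response does not contain a video URL")
    return {
        "url": video["url"],
        "file_name": video.get("file_name"),
        "content_type": video.get("content_type"),
        # Assumes file_size is in bytes; reported here in megabytes (10^6 bytes).
        "size_mb": round(video.get("file_size", 0) / 1_000_000, 2),
    }


if __name__ == "__main__":
    sample = {
        "video": {
            "file_name": "output.mp4",
            "content_type": "video/mp4",
            "url": "https://v3b.fal.media/files/...",
            "file_size": 13096952,
        }
    }
    print(extract_video(sample))
```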
Use Cases
- Anime and stylized shorts: Episode clips, key-scene previsualization, and fan trailers rendered at delivery-grade 4K.
- Poster-quality key frames: High-impact stylized shots usable as hero frames, thumbnails, or campaign stills pulled from video.
- Game and concept trailers: Stylized action sequences and reveal spots without a separate upscaling stage.
- Multi-shot storytelling: Sequence a cold open, hero beat, and tag line with `multi_prompt` plus `shot_type`.
- Music and social video: 9:16 and 1:1 formats for vertical platforms, with optional native audio.
- Large-screen and broadcast: Stylized content mastered for high-definition playback and professional production pipelines.
Long-Running Requests
Video generation is a long-running job. Use the Queue API to submit asynchronously and retrieve results via webhook or polling.
```javascript
// Submit the job; the webhook is called when the request finishes.
const { request_id } = await fal.queue.submit("fal-ai/kling-video/o3/4k/text-to-video", {
  input: { prompt: "..." },
  webhookUrl: "https://your-server.com/webhook",
});

// Poll the current status (optionally with logs) while the job runs.
const status = await fal.queue.status("fal-ai/kling-video/o3/4k/text-to-video", {
  requestId: request_id,
  logs: true,
});

// Fetch the result once the request has completed.
const result = await fal.queue.result("fal-ai/kling-video/o3/4k/text-to-video", {
  requestId: request_id,
});
```
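When polling rather than using a webhook, the status call is typically wrapped in a loop with an interval and a timeout. The sketch below shows that loop in isolation, assuming the status is reported as a string and that `"COMPLETED"` marks a finished request (only `"IN_PROGRESS"` appears on this page, so treat the other status names as assumptions). `poll_until_complete` is a hypothetical helper, and the status source is injected as a callable so the example runs without network access.

```python
import time
from typing import Callable


def poll_until_complete(
    get_status: Callable[[], str],
    interval_seconds: float = 5.0,
    timeout_seconds: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_status repeatedly until it returns "COMPLETED" or the timeout passes."""
    waited = 0.0
    while True:
        status = get_status()
        if status == "COMPLETED":
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"request still {status} after {waited:.0f}s")
        # Wait before asking again so the queue is not hammered with requests.
        sleep(interval_seconds)
        waited += interval_seconds


if __name__ == "__main__":
    # Stand-in for a real status call; no network is used here.
    fake_statuses = iter(["IN_QUEUE", "IN_PROGRESS", "COMPLETED"])
    print(poll_until_complete(lambda: next(fake_statuses), interval_seconds=0.0))
```

In real use, `get_status` would wrap the client's status call for the submitted request ID, and the result would be fetched only after the loop returns.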
Notes
- Provide either `prompt` or `multi_prompt`; a `prompt` is required unless `multi_prompt` is supplied
- `generate_audio` is off by default on O3; set it to `true` to enable speech and ambient sound
- For English speech, use lowercase for regular words and uppercase for acronyms and proper nouns
- Non-English / non-Chinese audio prompts are translated to English automatically
- When running client-side code, never expose your `FAL_KEY`. Use a server-side proxy instead
cURL
```bash
curl --request POST \
  --url https://fal.run/fal-ai/kling-video/o3/4k/text-to-video \
  --header "Authorization: Key $FAL_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "prompt": "A mecha lands on the ground to save the city, and says \"I'\''m here\", in anime style",
    "duration": "5",
    "aspect_ratio": "16:9",
    "generate_audio": true
  }'
```
Python
```python
import fal_client


def on_queue_update(update):
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])


result = fal_client.subscribe(
    "fal-ai/kling-video/o3/4k/text-to-video",
    arguments={
        "prompt": "A mecha lands on the ground to save the city, and says \"I'm here\", in anime style",
        "duration": "5",
        "aspect_ratio": "16:9",
        "generate_audio": True,
    },
    with_logs=True,
    on_queue_update=on_queue_update,
)
print(result)
```