Kling O1 Reference Video to Video [Pro] Video to Video
Kling O1: Reference Video to Video [video generation]
Kuaishou's Kling O1 Omni delivers reference-guided video generation at $0.168 per second of output, preserving cinematic language across scene transitions. It trades general-purpose video generation for shot-continuity control: the model carries over motion patterns and camera style from the source footage. It is built for filmmakers extending existing sequences, content creators maintaining visual consistency across cuts, and production teams generating coherent multi-shot narratives.
Built for: Sequential shot generation | Cinematic style preservation | Reference-guided video continuity
Reference-Driven Continuity Architecture
Kling O1 Omni approaches video-to-video generation through reference preservation rather than style transfer alone. Where standard video generation models interpret prompts independently, this architecture anchors new frames to input video characteristics. Camera movement, motion dynamics, and visual language remain consistent across generated shots.
What this means for you:
- Cinematic continuity: Generate follow-up shots that maintain the motion style and camera language of your reference video, not just its visual aesthetics. Use @Video1 in prompts to reference your input video: "Based on @Video1, generate the next shot. Keep the style of the video."
- Multi-modal input: Combine the reference video with up to 4 additional references: elements (characters or objects, each given as a frontal_image_url plus a reference_image_urls array) and style reference images, for precise control over scene composition
- Flexible duration: Output 5-second ($0.84) or 10-second ($1.68) clips at resolutions from 720px to 2160px. Output duration is independent of the input video's duration, which must be 3-10 seconds
- Automatic aspect ratio: The default "auto" mode derives the aspect ratio from the input video; alternatively, select 16:9, 9:16, or 1:1 manually
- Audio preservation: Optionally retain the original audio from the reference video via the keep_audio parameter (default: false), maintaining soundtrack continuity across generated sequences
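The options listed above can be gathered into a single request payload. The following is a minimal sketch, not the official client: only `keep_audio`, `frontal_image_url`, and `reference_image_urls` are named on this page, while the function `build_request` and the field names `prompt`, `video_url`, `elements`, `image_urls`, `duration`, and `aspect_ratio` are assumptions made for illustration. It also assumes that the limit of 4 applies to elements and style images combined.

```python
from __future__ import annotations

from typing import Any, Optional

ALLOWED_ASPECT_RATIOS = {"auto", "16:9", "9:16", "1:1"}
ALLOWED_DURATIONS = {5, 10}   # seconds; assumption taken from the pricing bullet
MAX_EXTRA_REFERENCES = 4      # assumption: elements + style images combined


def build_request(
    prompt: str,
    video_url: str,
    elements: Optional[list[dict[str, Any]]] = None,
    image_urls: Optional[list[str]] = None,
    duration: int = 5,
    aspect_ratio: str = "auto",
    keep_audio: bool = False,
) -> dict[str, Any]:
    """Validate the documented options and return a plain payload dict.

    Field names other than keep_audio, frontal_image_url and
    reference_image_urls are hypothetical.
    """
    elements = list(elements or [])
    image_urls = list(image_urls or [])

    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {sorted(ALLOWED_DURATIONS)}")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio!r}")
    if len(elements) + len(image_urls) > MAX_EXTRA_REFERENCES:
        raise ValueError("at most 4 additional references are allowed")
    for element in elements:
        # Each element needs one frontal view plus a list of other angles.
        if "frontal_image_url" not in element:
            raise ValueError("each element needs a frontal_image_url")
        element.setdefault("reference_image_urls", [])

    return {
        "prompt": prompt,
        "video_url": video_url,
        "elements": elements,
        "image_urls": image_urls,
        "duration": duration,
        "aspect_ratio": aspect_ratio,
        "keep_audio": keep_audio,
    }
```

Validating locally before submitting avoids paying for a request that would be rejected; the resulting dict could then be handed to whichever client you use, after checking the real field names against the API schema.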
Performance That Scales
Kling O1 Omni is positioned as a specialized tool for narrative video work, trading speed for reference-guided precision.
| Metric | Result | Context |
|---|---|---|
| Cost per Video | $0.84 (5s) or $1.68 (10s) | Based on $0.168 per second rate |
| Duration Options | 5-10 seconds output | Independent of 3-10 second input duration |
| Resolution Range | 720-2160px | Supports HD to 4K output with aspect ratio control |
| Reference Capacity | Video + 4 elements/images | Combine source video with character/object references |
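The per-video prices in the table follow directly from the $0.168 per-second rate. As a small sketch for budgeting a multi-shot sequence (the helper names `clip_cost` and `sequence_cost` are illustrative, not part of any API), using `Decimal` to avoid floating-point rounding in currency values:

```python
from decimal import Decimal

RATE_PER_SECOND = Decimal("0.168")  # USD per second of output, from the table


def clip_cost(seconds: int) -> Decimal:
    """Cost in USD of one generated clip of the given output length."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return RATE_PER_SECOND * seconds


def sequence_cost(durations: list[int]) -> Decimal:
    """Total cost in USD of several clips, e.g. a multi-shot sequence."""
    return sum((clip_cost(s) for s in durations), Decimal("0"))


print(f"${clip_cost(5):.2f}")                   # $0.84
print(f"${clip_cost(10):.2f}")                  # $1.68
print(f"${sequence_cost([5, 10, 10]):.2f}")     # $4.20
```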
Technical Specifications
| Spec | Details |
|---|---|
| Architecture | Kling O1 Omni |
| Input Formats | Video: .mp4, .mov (3-10 seconds, max 200MB); Reference images: .jpg, .jpeg, .png, .webp, .gif, .avif |
| Output Formats | .mp4 video with optional audio |
| Aspect Ratios | auto (default), 16:9, 9:16, 1:1 |
| Audio Handling | Optional audio preservation via keep_audio parameter (default: false) |
| Reference System | @Video1 for reference video, @Element1-4 for characters/objects (frontal + reference angles), @Image1-4 for style references |
| License | Commercial use via fal partner agreement |
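Because prompts address media through @Video1, @Element1-4, and @Image1-4 tokens, it can help to check a prompt against the media actually supplied before submitting. The sketch below is one way to do that with a regular expression; the function names and the exact token grammar are assumptions inferred from the Reference System row, not a documented parser.

```python
from __future__ import annotations

import re

# Tokens as described in the Reference System row: @Video1, @Element1-4, @Image1-4.
TOKEN_PATTERN = re.compile(r"@(Video|Element|Image)(\d+)")
LIMITS = {"Video": 1, "Element": 4, "Image": 4}


def find_references(prompt: str) -> dict[str, set[int]]:
    """Collect the numeric indices used for each token kind in the prompt."""
    found: dict[str, set[int]] = {"Video": set(), "Element": set(), "Image": set()}
    for kind, index in TOKEN_PATTERN.findall(prompt):
        found[kind].add(int(index))
    return found


def unresolved_references(
    prompt: str, n_elements: int = 0, n_images: int = 0, has_video: bool = True
) -> list[str]:
    """Return tokens that point at media which was not supplied."""
    supplied = {
        "Video": 1 if has_video else 0,
        "Element": n_elements,
        "Image": n_images,
    }
    problems = []
    for kind, indices in find_references(prompt).items():
        highest_valid = min(supplied[kind], LIMITS[kind])
        for i in sorted(indices):
            if i < 1 or i > highest_valid:
                problems.append(f"@{kind}{i}")
    return problems


prompt = "Based on @Video1, place @Element1 and @Element2 in the style of @Image1"
print(unresolved_references(prompt, n_elements=1, n_images=1))  # ['@Element2']
```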
How It Stacks Up
Sora 2 Video to Video - Kling O1 Omni prioritizes reference-based continuity for maintaining cinematic language across shots, ideal for sequential narrative work where motion style consistency matters. Sora 2 Video to Video emphasizes remix capabilities with broader creative transformation options for exploratory video generation workflows.
Wan Video to Video - Kling O1 Omni's multi-element reference system (video + 4 additional inputs) enables precise character and object control for narrative sequences. Wan Video to Video offers alternative approaches to video transformation with different architectural priorities for style and motion handling.
AnimateDiff Video to Video - Kling O1 Omni maintains camera language and motion dynamics from reference footage for professional continuity requirements. AnimateDiff Video to Video provides animation-focused video generation with distinct motion synthesis capabilities suited for stylized content creation.

