Kling O1 Image: Image to Image

fal-ai/kling-image/o1
Perform precise image edits using strong reference control, transforming subjects, styles, and local details while preserving visual consistency.
Inference
Commercial use
Partner

Kling O1 Image [image-to-image]

Kwai's Kling O1 Image delivers precise multi-reference image editing at $0.028 per image, accepting up to 10 reference images and producing output at up to 2K resolution. It trades raw speed for semantic accuracy through reference-based composition, so it suits complex transformations such as "Put @Image1 to the back seat of the car in @Image2", where preserving context matters more than generation time. It is built for developers who need consistent visual logic across multiple input sources without manual masking or regional controls.

Built for: Multi-image composition workflows | Subject transplantation with context preservation | Style transfer across reference sets


Reference-Based Editing Without Manual Masks

Kling O1 Image uses a numbered reference system (@Image1, @Image2, and so on) to interpret relationships between up to 10 source images, which removes the need for the manual masking or regional prompting common in traditional image-to-image models. The architecture prioritizes semantic understanding: it interprets what "put @Image1 in @Image2" means in context rather than requiring pixel-level editing instructions.

What this means for you:

  • Multi-reference composition: Process up to 10 input images in a single prompt with explicit @Image syntax for precise source control
  • Contextual transplantation: Move subjects between scenes while maintaining lighting, perspective, and visual consistency without layer management
  • Resolution flexibility: Generate at 1K (standard) or 2K (high-resolution) with intelligent aspect ratio detection across 9 preset ratios
  • Batch generation: Output 1-9 variations per request with consistent reference interpretation across all outputs
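The @Image syntax described above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of any official SDK: the helper names `referenced_indices` and `check_references` are hypothetical, and it assumes that @ImageN refers to the N-th image (1-based) in the list of supplied URLs.

```python
import re

MAX_REFERENCE_IMAGES = 10  # limit stated on this page

# Matches @Image1, @Image2, ... and captures the number.
_REF_PATTERN = re.compile(r"@Image(\d+)")


def referenced_indices(prompt: str) -> list[int]:
    """Return the distinct 1-based @ImageN indices, in order of first appearance."""
    seen: list[int] = []
    for match in _REF_PATTERN.finditer(prompt):
        index = int(match.group(1))
        if index not in seen:
            seen.append(index)
    return seen


def check_references(prompt: str, image_urls: list[str]) -> list[int]:
    """Raise ValueError if the prompt references an image that was not supplied.

    Assumption: @ImageN maps to image_urls[N - 1].
    """
    if len(image_urls) > MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"at most {MAX_REFERENCE_IMAGES} reference images are supported, "
            f"got {len(image_urls)}"
        )
    indices = referenced_indices(prompt)
    for index in indices:
        if not 1 <= index <= len(image_urls):
            raise ValueError(
                f"prompt references @Image{index}, "
                f"but only {len(image_urls)} image(s) were supplied"
            )
    return indices
```

Calling `check_references` with the example prompt from this page and two image URLs returns the indices the prompt uses, while a prompt that mentions @Image3 with only two URLs raises an error before any paid request is made.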

Performance That Scales

Kling O1 Image is positioned as a precision editing tool rather than a speed-optimized generator, and its pricing reflects the overhead of multi-reference processing.

Metric | Result | Context
Cost per Image | $0.028 | About 35 generations per $1.00 on fal (1 / 0.028 ≈ 35.7)
Max Input Images | 10 images | Highest multi-reference capacity in fal's image-to-image category
Resolution Options | 1K / 2K | 2K mode for high-resolution outputs up to 4 megapixels
Output Formats | JPEG, PNG, WebP | Format selection controls file size vs quality tradeoff
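The per-image price makes budgeting simple arithmetic: 1 / 0.028 ≈ 35.7, so one dollar covers 35 whole images (roughly 36 if rounded to the nearest integer). The sketch below works this out with exact decimals. The function names are hypothetical, and it assumes every output image in a batch is billed at the same $0.028 rate.

```python
from decimal import Decimal

PRICE_PER_IMAGE = Decimal("0.028")  # USD, as listed on this page
MAX_IMAGES_PER_REQUEST = 9          # batch limit stated on this page


def request_cost(num_images: int) -> Decimal:
    """Cost in USD of one request, assuming each output image is billed equally."""
    if not 1 <= num_images <= MAX_IMAGES_PER_REQUEST:
        raise ValueError(
            f"num_images must be between 1 and {MAX_IMAGES_PER_REQUEST}"
        )
    return PRICE_PER_IMAGE * num_images


def images_per_budget(budget_usd: Decimal) -> int:
    """Number of whole images a budget covers (rounded down)."""
    return int(budget_usd // PRICE_PER_IMAGE)
```

Using `Decimal` instead of floats avoids rounding surprises when costs are summed over many requests.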

Technical Specifications

Spec | Details
Architecture | Kling O1 Image
Input Formats | Up to 10 reference images via URL (HTTPS), referenced as @Image1-@Image10 in prompts
Output Formats | JPEG, PNG, WebP
Max Resolution | 2K (high-resolution mode)
Prompt Length | Up to 2,500 characters with inline @Image references
License | Commercial use (Partner)
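The limits in this table can be enforced locally before a request is submitted. The following is a rough sketch under stated assumptions: the field names (`prompt`, `image_urls`, `resolution`, `output_format`, `num_images`) and the default values are illustrative guesses, not the confirmed request schema for fal-ai/kling-image/o1, so check them against the API documentation before use.

```python
from urllib.parse import urlparse

# Limits taken from this page.
MAX_IMAGES = 10
MAX_PROMPT_CHARS = 2500
MAX_OUTPUTS = 9
RESOLUTIONS = ("1K", "2K")
OUTPUT_FORMATS = ("jpeg", "png", "webp")


def build_request(
    prompt: str,
    image_urls: list[str],
    resolution: str = "1K",       # illustrative default
    output_format: str = "png",   # illustrative default
    num_images: int = 1,
) -> dict:
    """Validate inputs against the documented limits and return a payload dict.

    The dict keys are hypothetical field names, not a confirmed schema.
    """
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if not 1 <= len(image_urls) <= MAX_IMAGES:
        raise ValueError(f"supply between 1 and {MAX_IMAGES} reference images")
    for url in image_urls:
        if urlparse(url).scheme != "https":
            raise ValueError(f"reference image must be an HTTPS URL: {url}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {RESOLUTIONS}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"output_format must be one of {OUTPUT_FORMATS}")
    if not 1 <= num_images <= MAX_OUTPUTS:
        raise ValueError(f"num_images must be between 1 and {MAX_OUTPUTS}")
    return {
        "prompt": prompt,
        "image_urls": list(image_urls),
        "resolution": resolution,
        "output_format": output_format,
        "num_images": num_images,
    }
```

Failing fast on an over-long prompt or a non-HTTPS URL avoids spending a request on input the service would likely reject.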

How It Stacks Up

NAFNet-deblur (Image to Image): Kling O1 Image prioritizes multi-reference semantic editing for compositional transformations, making it well suited to workflows that require context-aware subject transplantation across multiple source images. NAFNet-deblur focuses on single-image restoration tasks such as motion blur removal and noise reduction, where reference-based editing isn't required and restoration quality is the primary metric.