Nano Banana 2 is here 🍌 4x faster, lower cost, better quality

Nano Banana 2 Image to Image

fal-ai/nano-banana-2/edit
Nano Banana 2 is Google's new state-of-the-art image generation and editing model

Your request will cost $0.08 per image. For $1.00, you can run this model 12 times. 2K and 4K outputs will be charged at 1.5 times and 2 times the standard rate, respectively. 0.5K (512px) resolution outputs will be charged at 0.75 times the standard rate. If web search is used, an additional $0.015 will be charged. Note: Pricing is subject to change.
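The pricing rules above can be sketched as a small estimator. This is a rough illustration, not an official billing formula: the function and constant names are hypothetical, and it assumes the $0.015 web search charge applies once per request, which the pricing note does not spell out.

```python
from decimal import Decimal

# Figures taken from the pricing note above; subject to change.
BASE_PRICE = Decimal("0.08")  # USD per image at the standard rate
RESOLUTION_MULTIPLIER = {
    "0.5K": Decimal("0.75"),
    "1K": Decimal("1"),
    "2K": Decimal("1.5"),
    "4K": Decimal("2"),
}
WEB_SEARCH_FEE = Decimal("0.015")  # ASSUMPTION: charged once per request


def estimate_cost(num_images: int = 1, resolution: str = "1K",
                  web_search: bool = False) -> Decimal:
    """Estimate the USD cost of one request under the stated rates."""
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    if resolution not in RESOLUTION_MULTIPLIER:
        raise ValueError(f"unknown resolution: {resolution!r}")
    cost = BASE_PRICE * RESOLUTION_MULTIPLIER[resolution] * num_images
    if web_search:
        cost += WEB_SEARCH_FEE
    return cost


def images_per_budget(budget: str, resolution: str = "1K") -> int:
    """Whole number of images a budget covers, ignoring web search fees."""
    per_image = BASE_PRICE * RESOLUTION_MULTIPLIER[resolution]
    return int(Decimal(budget) // per_image)
```

Under these assumptions, `images_per_budget("1.00")` gives 12, matching the figure quoted above, and the same budget covers 6 images at 4K.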

Nano Banana 2 [image-to-image]

Google's Gemini 3.1 Flash Image architecture edits and transforms images with multimodal understanding, combining up to 14 reference images with natural language instructions to deliver precise, context-aware modifications at Flash-tier speed. It reasons about what you want changed and what should stay intact, producing edits that respect composition, lighting, and style coherence.

Built for: Product photo retouching and variation | Style transfer and creative remixing | Multi-image compositing and scene assembly | Iterative design workflows requiring fast turnaround

Edit with Intent, Not Masks

Built on Google's Gemini 3.1 Flash Image foundation, Nano Banana 2 Edit understands your editing instructions semantically. Instead of requiring manual masks or region selection, describe what you want changed in plain language and the model reasons about which elements to modify while preserving the rest. Supply up to 14 reference images for compositing, style guidance, or multi-subject scenes.

What this means for you:

  • Multi-image input: Combine up to 14 reference images in a single request for compositing, style matching, or subject transfer
  • Natural language editing: Describe edits conversationally - no masks, layers, or region coordinates needed
  • Context-aware preservation: The model understands what to change and what to leave untouched, maintaining coherence across the edit
  • Vibrant output: Rich color, punchy contrast, and visual fidelity carried through from source images
  • Web-grounded editing: Optionally ground edits in real-time web information via `enable_web_search` or `enable_google_search`
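
As a minimal sketch of how the points above might translate into a request, the helper below assembles an input payload for `fal-ai/nano-banana-2/edit` and enforces the 14-image limit. Only `enable_web_search` is named on this page; the `prompt` and `image_urls` field names, the helper itself, and the example URLs are assumptions for illustration, so check the API schema before relying on them.

```python
MAX_REFERENCE_IMAGES = 14  # limit stated on this page


def build_edit_request(prompt: str, image_urls: list[str],
                       web_search: bool = False) -> dict:
    """Build an input payload for an edit request (hypothetical helper).

    ASSUMPTION: the field names "prompt" and "image_urls" are illustrative;
    only "enable_web_search" appears in the text above.
    """
    prompt = prompt.strip()
    if not prompt:
        raise ValueError("a text prompt is required")
    urls = list(image_urls)
    if not 1 <= len(urls) <= MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"expected 1 to {MAX_REFERENCE_IMAGES} reference images, got {len(urls)}"
        )
    payload = {"prompt": prompt, "image_urls": urls}
    if web_search:
        payload["enable_web_search"] = True
    return payload


# Example: a plain-language edit with two reference images and no mask.
request = build_edit_request(
    "Place the sneaker from the first image on the beach in the second image, "
    "keeping the original lighting on the shoe.",
    ["https://example.com/sneaker.png", "https://example.com/beach.jpg"],
)
```

The payload would then be submitted to the endpoint with whichever client you use; that step is left out here so the sketch stays free of network calls.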

Technical Specifications

| Spec | Details |
| --- | --- |
| Architecture | Gemini 3.1 Flash Image (Nano Banana 2) |
| Input | Text prompt (required) + up to 14 reference images (required) |
| Output Formats | PNG, JPEG, WebP |
| Resolution | 0.5K (512px, 0.75x rate), 1K (default), 2K (1.5x rate), 4K (2x rate) |
| Aspect Ratios | auto, 21:9, 16:9, 3:2, 4:3, 5:4, 1:1, 4:5, 3:4, 2:3, 9:16 |
| Batch | 1-4 images per request |
| Watermarking | SynthID digital watermarking on all outputs |
| Web Search | Optional grounding via `enable_web_search` or `enable_google_search` |
| License | Commercial use enabled through fal.ai |
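
The limits in the specifications can be checked client-side before a request is sent. The sketch below is one possible way to do that: the setting names (`aspect_ratio`, `resolution`, `output_format`, `num_images`), the lowercase format strings, and every default other than the 1K resolution are assumptions rather than the documented schema. The resolution set also includes the 0.5K tier mentioned in the pricing note.

```python
ASPECT_RATIOS = {"auto", "21:9", "16:9", "3:2", "4:3", "5:4",
                 "1:1", "4:5", "3:4", "2:3", "9:16"}
RESOLUTIONS = {"0.5K", "1K", "2K", "4K"}
OUTPUT_FORMATS = {"png", "jpeg", "webp"}  # ASSUMPTION: lowercase identifiers
MIN_BATCH, MAX_BATCH = 1, 4


def validate_settings(aspect_ratio: str = "auto", resolution: str = "1K",
                      output_format: str = "png", num_images: int = 1) -> dict:
    """Check settings against the listed specs and return them as a dict.

    ASSUMPTION: the key names are illustrative, not the confirmed API schema.
    """
    problems = []
    if aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {aspect_ratio!r}")
    if resolution not in RESOLUTIONS:
        problems.append(f"unsupported resolution: {resolution!r}")
    if output_format not in OUTPUT_FORMATS:
        problems.append(f"unsupported output format: {output_format!r}")
    if not isinstance(num_images, int) or not MIN_BATCH <= num_images <= MAX_BATCH:
        problems.append(f"num_images must be between {MIN_BATCH} and {MAX_BATCH}")
    if problems:
        # Report every violation at once instead of stopping at the first.
        raise ValueError("; ".join(problems))
    return {
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
        "output_format": output_format,
        "num_images": num_images,
    }
```

Collecting all violations before raising makes it easier to fix a bad configuration in one pass.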

How It Stacks Up

vs. Nano Banana 2 Text-to-Image: The edit endpoint takes existing images as input alongside a text prompt, enabling modifications, compositing, and style transfer rather than generation from scratch. Use text-to-image for creating new visuals, edit for transforming existing ones.

vs. Nano Banana Pro Edit (Gemini 3 Pro Image): Nano Banana 2 Edit prioritizes speed and vibrant output on the Flash architecture, delivering edits in seconds where Pro optimizes for maximum reasoning depth. Choose Nano Banana 2 for fast iteration, Pro for complex multi-step edits requiring deeper compositional reasoning.

vs. FLUX.2 [dev] Image-to-Image: Nano Banana 2 Edit accepts up to 14 reference images with semantic understanding of edit instructions through Gemini's multimodal reasoning. FLUX.2 [dev] offers strength-based image-to-image with fine control over how much of the original to preserve.