Nano Banana 2 is here 🍌 4x faster, lower cost, better quality

Nano Banana 2 Text to Image

fal-ai/nano-banana-2
Nano Banana 2 is Google's new state-of-the-art model for fast image generation and editing.

Pricing: each request costs $0.08 per image at the standard rate, so $1.00 covers 12 images. 2K and 4K outputs are charged at 1.5 times and 2 times the standard rate, respectively, and 0.5K (512px) outputs at 0.75 times the standard rate. If web search is used, an additional $0.015 is charged. Note: Pricing is subject to change.
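The pricing rules above can be sketched as a small estimator. This is an illustration only: `estimate_cost` and `images_per_budget` are hypothetical helpers, not part of any fal.ai library. It assumes that the standard rate corresponds to the default 1K resolution and that the web search surcharge applies once per request rather than once per image, which the pricing note does not spell out.

```python
from decimal import Decimal

# Figures quoted in the pricing note; they are subject to change.
BASE_PRICE = Decimal("0.08")  # USD per image at the standard rate
RESOLUTION_MULTIPLIER = {
    "0.5K": Decimal("0.75"),
    "1K": Decimal("1"),  # assumed to be the standard rate
    "2K": Decimal("1.5"),
    "4K": Decimal("2"),
}
WEB_SEARCH_FEE = Decimal("0.015")  # assumed to apply once per request


def estimate_cost(num_images: int, resolution: str = "1K",
                  web_search: bool = False) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    if resolution not in RESOLUTION_MULTIPLIER:
        raise ValueError(f"unknown resolution: {resolution!r}")
    cost = BASE_PRICE * RESOLUTION_MULTIPLIER[resolution] * num_images
    if web_search:
        cost += WEB_SEARCH_FEE
    return cost


def images_per_budget(budget: Decimal, resolution: str = "1K") -> int:
    """Whole images a budget covers at a resolution, without web search."""
    per_image = BASE_PRICE * RESOLUTION_MULTIPLIER[resolution]
    return int(budget // per_image)
```

For example, `estimate_cost(4, "4K")` multiplies $0.08 by 2 and by 4 images, and `images_per_budget(Decimal("1.00"))` reproduces the 12 runs per dollar quoted above. `Decimal` is used instead of `float` so that cent-level sums stay exact.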

Nano Banana 2 [text-to-image]

Google's Gemini 3.1 Flash Image architecture generates vibrant, high-fidelity visuals at speed, combining the reasoning capabilities of a multimodal foundation model with the efficiency of Flash-optimized inference. It understands creative intent holistically rather than matching keywords, producing images with accurate text rendering, character consistency, and coherent spatial composition in seconds.

Built for: Marketing campaigns and social media assets | Product photography and visualization | Designs requiring accurate in-image typography | Storyboarding with consistent characters across frames

Reasoning-Guided, Flash-Fast

Built on Google's Gemini 3.1 Flash Image foundation, Nano Banana 2 reasons about composition, lighting, and spatial relationships before rendering. Unlike traditional diffusion models that treat prompts as weighted tokens, this architecture interprets creative direction as a multimodal language model would, capturing nuance and context that single-modality systems miss, then executes at Flash-tier speed.

What this means for you:

  • Vibrant output: Rich color, punchy contrast, and visual coherence out of the box without post-processing
  • Accurate text rendering: Character-by-character validated typography in multiple languages, directly in generated images
  • Character consistency: Maintain identity for up to 5 people across generations for storyboarding and campaign work
  • Natural language control: Describe mood, style, and context conversationally without mastering prompt engineering syntax
  • Web-grounded generation: Optionally ground outputs in real-time web information for factually current visuals

Technical Specifications

| Spec | Details |
| --- | --- |
| Architecture | Gemini 3.1 Flash Image (Nano Banana 2) |
| Input | Text prompts (natural language); up to 14 reference images for editing |
| Output Formats | PNG, JPEG, WebP |
| Resolution | 0.5K (512px, 0.75x rate), 1K (default), 2K (1.5x rate), 4K (2x rate) |
| Aspect Ratios | auto, 21:9, 16:9, 3:2, 4:3, 5:4, 1:1, 4:5, 3:4, 2:3, 9:16 |
| Batch | 1-4 images per request |
| Character Consistency | Up to 5 people across generations |
| Watermarking | SynthID digital watermarking on all outputs |
| Web Search | Optional grounding via `enable_web_search` or `enable_google_search` |
| License | Commercial use enabled through fal.ai |
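The limits in the specifications above can be checked client-side before a request is sent. The following is a minimal sketch under stated assumptions. Only the model ID `fal-ai/nano-banana-2` and the `enable_web_search` flag come from this page. The other argument names (`prompt`, `num_images`, `aspect_ratio`, `resolution`, `output_format`), their lowercase or `K`-suffixed values, and the defaults are assumptions that should be checked against the model's API schema. `build_arguments` and `generate` are hypothetical helpers. `generate` relies on the `fal_client` Python package and a `FAL_KEY` environment variable, and is imported lazily so that validation works without the package installed.

```python
MODEL_ID = "fal-ai/nano-banana-2"

# Allowed values taken from the specifications; spellings are assumptions.
ASPECT_RATIOS = {"auto", "21:9", "16:9", "3:2", "4:3", "5:4",
                 "1:1", "4:5", "3:4", "2:3", "9:16"}
OUTPUT_FORMATS = {"png", "jpeg", "webp"}
RESOLUTIONS = {"0.5K", "1K", "2K", "4K"}


def build_arguments(prompt, *, num_images=1, aspect_ratio="auto",
                    resolution="1K", output_format="png",
                    enable_web_search=False):
    """Validate inputs against the published limits and return a payload."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt must not be empty")
    if not 1 <= num_images <= 4:
        raise ValueError("num_images must be between 1 and 4")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio!r}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {output_format!r}")
    return {
        "prompt": prompt,
        "num_images": num_images,
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
        "output_format": output_format,
        "enable_web_search": enable_web_search,
    }


def generate(arguments):
    """Submit the payload; needs the fal-client package and FAL_KEY set."""
    import fal_client  # imported lazily so validation works without it
    return fal_client.subscribe(MODEL_ID, arguments=arguments)


if __name__ == "__main__":
    args = build_arguments(
        "A storefront at dusk with a neon sign that reads OPEN",
        num_images=2,
        aspect_ratio="16:9",
    )
    print(args)  # pass to generate(args) once credentials are configured
```

Validating locally keeps obviously invalid requests, such as a batch of five images, from reaching the API at all.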

How It Stacks Up

vs. Nano Banana Pro (Gemini 3 Pro Image): Nano Banana 2 prioritizes speed and vibrant output on the Flash architecture and generates in seconds, while Pro optimizes for maximum reasoning depth at $0.15/image. Choose Nano Banana 2 for fast iteration and production volume, and Pro for maximum compositional complexity.
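To make the production-volume trade-off concrete, the two per-image prices quoted on this page can be compared directly. This is a rough sketch: `compare_cost` is a hypothetical helper, and it assumes both prices are standard-rate figures with no resolution multipliers or web search surcharge applied. Pro's multipliers are not stated here, so only the standard rate is compared.

```python
from decimal import Decimal

# Per-image prices as quoted on this page; both are subject to change.
NANO_BANANA_2_PRICE = Decimal("0.08")
NANO_BANANA_PRO_PRICE = Decimal("0.15")


def compare_cost(num_images: int):
    """Return (Nano Banana 2 cost, Pro cost, savings) in USD."""
    if num_images < 0:
        raise ValueError("num_images must not be negative")
    nb2 = NANO_BANANA_2_PRICE * num_images
    pro = NANO_BANANA_PRO_PRICE * num_images
    return nb2, pro, pro - nb2


if __name__ == "__main__":
    nb2, pro, saved = compare_cost(1000)
    print(f"Nano Banana 2: ${nb2}, Pro: ${pro}, difference: ${saved}")
```

At 1,000 standard-rate images, this works out to $80 for Nano Banana 2 against $150 for Pro.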

vs. FLUX.2 [dev]: Nano Banana 2 delivers semantic-aware generation with native text rendering and character consistency through Gemini's multimodal reasoning. FLUX.2 [dev] prioritizes resolution control and fine detail preservation for technical illustration workflows.

vs. Original Nano Banana (Gemini 2.5 Flash Image): Nano Banana 2 adds reasoning-guided generation, dramatically improved text rendering, native multi-resolution output (1K/2K/4K), character consistency, multi-image compositing, and web search grounding. The result is a generational leap in quality at the same Flash-tier speed.