Nano Banana 2 is now live!

Nano Banana Pro Text to Image

fal-ai/nano-banana-pro
Nano Banana Pro (a.k.a. Nano Banana 2) is Google's new state-of-the-art image generation and editing model.

Your request will cost $0.15 per image, so $1.00 covers roughly 7 runs (6.67 at the listed rate). 4K outputs are charged at double the standard rate ($0.30 per image). Note: Pricing may change in the future.
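The pricing arithmetic can be sketched in a few lines. This is a minimal illustration that assumes the rates quoted above still apply; the function names and the "1K"/"4K" resolution labels used as arguments are hypothetical helpers, not part of any fal.ai client.

```python
from decimal import Decimal

# Rates copied from the pricing note above; they may change in the future.
PRICE_PER_IMAGE = Decimal("0.15")  # standard rate in USD
FOUR_K_MULTIPLIER = 2              # 4K outputs are billed at double the standard rate


def image_cost(num_images: int, resolution: str = "1K") -> Decimal:
    """Estimate the cost in USD of a batch of images (hypothetical helper)."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    rate = PRICE_PER_IMAGE
    if resolution == "4K":
        rate = rate * FOUR_K_MULTIPLIER
    return rate * num_images


def images_within_budget(budget_usd: str, resolution: str = "1K") -> int:
    """Return how many whole images fit in a budget given as a decimal string."""
    rate = image_cost(1, resolution)
    return int(Decimal(budget_usd) // rate)


if __name__ == "__main__":
    print(image_cost(10))                 # ten standard images
    print(image_cost(10, "4K"))           # ten 4K images
    print(images_within_budget("1.00"))   # whole standard images per dollar
```

Using `Decimal` rather than floats keeps the cent values exact. Note that only 6 whole standard images fit in $1.00; the "7 times" figure is 6.67 rounded up.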

Nano Banana 2 [text-to-image]

Google's Gemini 3 Pro Image architecture delivers production-quality visuals at $0.15 per image, interpreting context like a multimodal foundation model rather than matching keywords like traditional diffusion systems. It trades raw speed for sophisticated semantic interpretation and enhanced reasoning, and turns complex creative direction into accurate visuals without prompt engineering gymnastics. That makes it a fit for teams that need studio-quality results with advanced text rendering and character consistency.

Built for: Marketing campaign generation | Product visualization workflows | Creative content production requiring text accuracy | Infographic and diagram creation at scale

Beyond CLIP: Multimodal Understanding

Built on Google's Gemini 3 Pro foundation, Nano Banana Pro processes prompts through the same multimodal architecture that powers conversational AI, so it picks up nuance, context, and creative intent rather than relying on simple keyword matching. Where traditional diffusion models treat prompts as collections of weighted tokens, this approach interprets your creative direction holistically, capturing relationships between concepts that single-modality systems miss.

What this means for you:

  • Semantic accuracy: Generates images that match creative intent, not just literal prompt keywords; for example, it understands that "1960s aesthetic" implies grain, color palette, and composition choices
  • Reduced iteration cycles: First-generation outputs align with complex briefs, cutting revision rounds compared to keyword-dependent models
  • Batch efficiency: Process approximately 7 generations per dollar with consistent quality across variations, making A/B testing and campaign asset creation economically viable
  • Natural language control: Direct the model with conversational prompts describing mood, style, and context without mastering prompt engineering syntax
  • Advanced text rendering: Industry-leading text generation capabilities for creating legible text in multiple languages, fonts, and calligraphy styles directly within images

Performance Optimized for Quality

Google's multimodal foundation prioritizes quality and reasoning depth over raw speed, optimized for production workflows requiring sophisticated outputs.

| Metric | Result | Context |
| --- | --- | --- |
| Cost per Image | $0.15 | ~7 generations per $1.00 on fal.ai; 4K outputs are charged at double the standard rate |
| Architecture | Gemini 3 Pro Image | Multimodal foundation model with enhanced reasoning |
| Generation Philosophy | Quality-first | Prioritizes complex compositions and accuracy over speed |
| Batch Processing | Multiple images supported | Via `num_images` parameter in API |
| Resolution Options | 1K, 2K, 4K | Configurable via API |

Note: Google has not published generation-time benchmarks; the model is optimized for quality rather than speed.
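As a rough illustration of the batch and resolution options above, the sketch below assembles and validates a request payload as a plain dictionary. Only `num_images` is named on this page; the `prompt` and `resolution` key names and the `build_request` helper are assumptions made for illustration, so check the API reference before relying on them.

```python
from typing import Any, Dict

# Resolution options listed in the table above.
RESOLUTIONS = ("1K", "2K", "4K")


def build_request(prompt: str, num_images: int = 1, resolution: str = "1K") -> Dict[str, Any]:
    """Assemble a request payload as a plain dict (hypothetical helper).

    `num_images` is the parameter named on this page; the `prompt` and
    `resolution` key names are assumptions for the sake of the example.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {RESOLUTIONS}")
    return {"prompt": prompt, "num_images": num_images, "resolution": resolution}


if __name__ == "__main__":
    payload = build_request(
        "A 1960s-style travel poster with the headline 'Visit Lisbon'",
        num_images=4,
        resolution="2K",
    )
    print(payload)
```

Validating locally before sending a request avoids paying for a call that would be rejected, or for a 4K batch requested by mistake.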

Technical Specifications

| Spec | Details |
| --- | --- |
| Architecture | Gemini 3 Pro Image (Nano Banana Pro) |
| Input Formats | Text prompts with natural language support; multi-image blending (up to 14 images) |
| Output Formats | PNG, JPEG, WebP image files |
| Aspect Ratios | Multiple aspect ratios including 21:9, 16:9, 3:2, 4:3, 5:4, 1:1, 4:5, 3:4, 2:3, 9:16 |
| Character Consistency | Maintains consistency and resemblance for up to 5 people across generations |
| Watermarking | SynthID digital watermarking on all outputs; visible watermark for non-Ultra subscribers |
| License | Commercial use enabled through fal.ai |
| Launch Date | November 20, 2025 |
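When a target layout has arbitrary pixel dimensions, one option is to snap it to the closest aspect ratio from the list in the specifications. The sketch below does this by comparing ratios in log space, so landscape and portrait shapes are treated symmetrically. The helper names are hypothetical, and the list may not be exhaustive, since the specification says "including".

```python
import math

# Aspect ratios copied from the specifications above (possibly not exhaustive).
SUPPORTED_ASPECT_RATIOS = (
    "21:9", "16:9", "3:2", "4:3", "5:4", "1:1", "4:5", "3:4", "2:3", "9:16",
)


def ratio_value(label: str) -> float:
    """Convert a label such as '16:9' to its width/height value."""
    width, height = label.split(":")
    return int(width) / int(height)


def nearest_aspect_ratio(width: int, height: int) -> str:
    """Return the listed aspect ratio closest to width x height (hypothetical helper)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)
    # Distance in log space: 2:1 and 1:2 are equally far from 1:1.
    return min(
        SUPPORTED_ASPECT_RATIOS,
        key=lambda label: abs(math.log(ratio_value(label)) - target),
    )


if __name__ == "__main__":
    print(nearest_aspect_ratio(1920, 1080))  # a widescreen frame
    print(nearest_aspect_ratio(1080, 1920))  # a vertical story format
    print(nearest_aspect_ratio(1200, 628))   # a common social banner size
```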

API Documentation

How It Stacks Up

vs. FLUX.1 [dev]: Nano Banana Pro achieves semantic-aware generation with industry-leading text rendering through Gemini 3 Pro's multimodal reasoning, making it ideal for marketing materials requiring accurate typography. FLUX.1 [dev] prioritizes maximum resolution control and fine detail preservation for technical illustration workflows.

vs. Stable Diffusion 3.5: Nano Banana Pro achieves natural language interpretation and real-world knowledge integration through Gemini architecture, making it ideal for teams creating infographics and data visualizations without prompt engineering expertise. Stable Diffusion 3.5 prioritizes open-source flexibility for custom fine-tuning and on-premise deployment scenarios.

vs. Original Nano Banana (Gemini 2.5 Flash Image): Nano Banana Pro trades speed for quality, offering enhanced reasoning, superior text rendering, better character consistency, and advanced composition capabilities. Original Nano Banana remains available for rapid iterations and simple edits at lower cost ($0.039/image).
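The price gap between the two models suggests a mixed workflow: iterate on the original Nano Banana, then render the final images on Nano Banana Pro. The sketch below compares that with running everything on Pro. It assumes the two per-image prices quoted on this page and standard (non-4K) outputs; the function names are hypothetical.

```python
from decimal import Decimal

# Per-image prices quoted on this page; both may change.
PRICE_PRO = Decimal("0.15")        # Nano Banana Pro
PRICE_ORIGINAL = Decimal("0.039")  # original Nano Banana (Gemini 2.5 Flash Image)


def mixed_workflow_cost(draft_images: int, final_images: int) -> Decimal:
    """Cost of drafting on the original model and finishing on Pro."""
    return PRICE_ORIGINAL * draft_images + PRICE_PRO * final_images


def all_pro_cost(total_images: int) -> Decimal:
    """Cost of generating every image on Nano Banana Pro."""
    return PRICE_PRO * total_images


if __name__ == "__main__":
    drafts, finals = 20, 4
    print(mixed_workflow_cost(drafts, finals))  # 20 drafts plus 4 finals
    print(all_pro_cost(drafts + finals))        # the same 24 images, all on Pro
```

Whether the saving is worth it depends on how well drafts from the faster model predict the Pro output, which this page does not quantify.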