Nano Banana Pro Image to Image
Each request costs $0.15 per image, so $1.00 covers 6 full runs at the standard rate ($1.00 / $0.15 ≈ 6.7). 4K outputs are charged at double the standard rate. If web search is used, an additional $0.015 is charged. Note: Pricing may change in the future.
Nano Banana 2 [image-editing]
Google's Gemini 3 Pro Image architecture transforms existing visuals at $0.15 per edit, bringing advanced reasoning and multimodal understanding to pixel-level manipulation. It trades raw speed for semantic awareness and studio-quality output. Complex editing instructions such as "make the sunset more dramatic while preserving the original mood" need no masks and no layers: natural language directs the transformation, with enhanced text rendering and character consistency.
Built for: Product iteration workflows | Creative asset refinement | Context-aware photo editing | Multi-image composition
Semantic Editing Without Masks
Nano Banana 2 (also known as Nano Banana Pro, officially Gemini 3 Pro Image) applies Google's advanced reasoning foundation model to image editing. It models the relationships between objects, lighting, and composition rather than treating pixels as isolated data points. Compared with the original Nano Banana (Gemini 2.5 Flash Image), it handles complex compositions and text rendering better.
What this means for you:
- Natural language precision: "Change the car color to midnight blue while maintaining reflections" executes without manual selection. The model understands what "the car" means in context and preserves scene coherence
- Composition-aware transforms: Edits respect depth, perspective, and lighting automatically. No manual masking of shadows or reflections required
- Batch processing capability: Generate up to 4 variations simultaneously to explore creative directions
- Reference image support: Provide multiple reference images (up to 14 combined in one composition) for style guidance or target aesthetics. The model interprets visual intent alongside text instructions
- Character consistency: Maintain resemblance and consistency for up to 5 people across edits
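As a rough illustration of how the limits in this list could be enforced before a request is sent, the sketch below assembles an edit request as a plain dictionary. The function name `build_edit_request` and the keys `prompt`, `image_urls`, and `num_images` are assumptions made for this example, not a confirmed API schema; only the limits (up to 14 images, 1 to 4 variations) come from the text above.

```python
MAX_REFERENCE_IMAGES = 14  # "combine up to 14 images" (from the list above)
MAX_VARIATIONS = 4         # "up to 4 variations" (from the list above)


def build_edit_request(prompt, image_urls, num_variations=1):
    """Return a dict describing one edit request.

    Hypothetical helper: the key names are illustrative assumptions and
    may not match the real endpoint's parameter names.
    """
    if not prompt or not prompt.strip():
        raise ValueError("a text prompt is required")
    if not image_urls:
        raise ValueError("at least one image URL is required")
    if len(image_urls) > MAX_REFERENCE_IMAGES:
        raise ValueError(f"at most {MAX_REFERENCE_IMAGES} images are supported")
    if not 1 <= num_variations <= MAX_VARIATIONS:
        raise ValueError(f"num_variations must be between 1 and {MAX_VARIATIONS}")
    return {
        "prompt": prompt.strip(),
        "image_urls": list(image_urls),
        "num_images": num_variations,
    }


request = build_edit_request(
    "Change the car color to midnight blue while maintaining reflections",
    ["https://example.com/car.jpg"],  # placeholder URL
    num_variations=2,
)
```

Validating on the client side keeps an over-limit request from being billed or rejected remotely; the resulting dictionary would still need to be mapped onto whatever parameter names the actual endpoint expects.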
Performance Optimized for Quality
The model is built on Google's Gemini 3 Pro architecture and prioritizes quality and reasoning depth over raw speed.
| Metric | Result | Context |
|---|---|---|
| Generation Philosophy | Quality-first | Prioritizes complex compositions and accuracy over speed metrics |
| Cost per Image | $0.15 | ~7 edits per $1.00 on fal.ai; 4K outputs charged at 2x rate |
| Resolution Options | 1K, 2K, 4K | Configurable via API; higher resolutions increase token usage |
| Batch Size | 1-4 images | Via parameter |
| Architecture | Gemini 3 Pro Image | Multimodal foundation model with advanced reasoning |
Note: Generation times are not publicly benchmarked; the model is optimized for quality rather than speed.
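The pricing rules on this page ($0.15 per image, 4K at double the rate, $0.015 extra when web search is used) can be turned into a small estimator. This is a sketch under stated assumptions: the function names are hypothetical, and the web search surcharge is assumed to apply once per request, which the page does not spell out.

```python
from decimal import Decimal

BASE_PRICE = Decimal("0.15")            # per image, from the pricing above
FOUR_K_MULTIPLIER = Decimal("2")        # 4K outputs cost double
WEB_SEARCH_SURCHARGE = Decimal("0.015")  # assumed to apply once per request


def estimate_cost(num_images, resolution="1K", web_search=False):
    """Estimate the cost in USD of one request (hypothetical helper)."""
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    per_image = BASE_PRICE * (FOUR_K_MULTIPLIER if resolution == "4K" else 1)
    total = per_image * num_images
    if web_search:
        total += WEB_SEARCH_SURCHARGE
    return total


def images_per_budget(budget, resolution="1K"):
    """Return how many whole images fit in a budget given in USD."""
    per_image = BASE_PRICE * (FOUR_K_MULTIPLIER if resolution == "4K" else 1)
    return int(Decimal(str(budget)) // per_image)


print(estimate_cost(4, resolution="4K"))   # four 4K images
print(images_per_budget("1.00"))           # whole images per dollar
```

`Decimal` is used instead of floats so that sums of cent fractions stay exact. Since $1.00 / $0.15 is about 6.67, six whole standard-resolution images fit in one dollar, which is where the "~7 edits" figure comes from after rounding.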
Technical Specifications
| Spec | Details |
|---|---|
| Architecture | Gemini 3 Pro Image (Nano Banana 2) |
| Model Identifier | |
| Input Formats | Image URLs (required) + text prompt (required) |
| Output Formats | PNG, JPEG, WebP (configurable) |
| Resolution Options | 1K (1024px), 2K (2048px), 4K (charged at 2x the standard rate) |
| Aspect Ratios | Auto, 21:9, 16:9, 3:2, 4:3, 5:4, 1:1, 4:5, 3:4, 2:3, 9:16 |
| Multi-Image Support | Combine up to 14 images in single composition |
| Character Consistency | Maintains resemblance for up to 5 people |
| Watermarking | SynthID digital watermarking on all outputs |
| License | Commercial use permitted |
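When an explicit aspect ratio is preferred over Auto, one option is to pick the supported ratio closest to the source image's dimensions. The helper below is a minimal sketch; `closest_aspect_ratio` is a hypothetical name, and only the list of ratios is taken from the table above.

```python
import math

# Ratios listed in the specification table ("Auto" is omitted on purpose).
SUPPORTED_ASPECT_RATIOS = [
    "21:9", "16:9", "3:2", "4:3", "5:4", "1:1", "4:5", "3:4", "2:3", "9:16",
]


def closest_aspect_ratio(width, height):
    """Return the supported ratio label nearest to width:height."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)

    def distance(label):
        w, h = (int(part) for part in label.split(":"))
        # Comparing in log space treats 2:1 and 1:2 as equally far from 1:1.
        return abs(math.log(w / h) - target)

    return min(SUPPORTED_ASPECT_RATIOS, key=distance)


print(closest_aspect_ratio(1920, 1080))  # landscape HD frame
print(closest_aspect_ratio(1080, 1920))  # portrait phone frame
```

Measuring distance on the logarithm of the ratio, rather than on the ratio itself, avoids biasing the choice toward portrait formats, whose raw ratios are packed closer together than landscape ones.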
How It Stacks Up
Compare Nano Banana 2 with:
FLUX.1 [dev] - Nano Banana 2 prioritizes semantic understanding through Gemini 3 Pro's reasoning architecture, making it ideal for complex transformations described in natural language without manual masking. FLUX.1 [dev] emphasizes maximum resolution control and fine detail preservation for precision editing workflows requiring technical control.
Stable Diffusion 3.5 - Nano Banana 2 leverages Google's production-scale multimodal training and advanced reasoning for context-aware edits that understand object relationships, composition, and maintain character consistency. Stable Diffusion 3.5 offers open-source flexibility for custom fine-tuning and local deployment in specialized editing pipelines.
DALL-E 3 - Nano Banana 2 provides superior text rendering capabilities and multi-image composition (up to 14 images) with professional creative controls. DALL-E 3 prioritizes safety filtering and artistic coherence for consumer-facing creative applications with stricter content guidelines.
Original Nano Banana (Gemini 2.5 Flash Image) - Nano Banana 2 trades speed for quality, offering enhanced reasoning, superior text rendering, better character consistency, professional-grade creative controls, and advanced composition capabilities at higher cost ($0.15 vs $0.039). Original Nano Banana remains available for rapid iterations and simple edits.


