Text To Image APIs
How does a text-to-image API call work on fal?
Text-to-image endpoints on fal return an array of image objects, each with a `url` field. The call shape stays the same across all models on this page.
```js
import { fal } from "@fal-ai/client";

const result = await fal.subscribe("fal-ai/nano-banana-2", {
  input: { prompt: "A photorealistic Tokyo cafe at golden hour" },
});

console.log(result.data.images[0].url);
```
Install `@fal-ai/client`, set the `FAL_KEY` environment variable, and call any endpoint by its model string. Swapping to `openai/gpt-image-2` or `bytedance/seedream/v4.5/text-to-image` is then a one-line change.
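To make the one-line swap concrete, here is a minimal sketch that wraps the call in a helper taking the endpoint ID as a parameter. `generateFirstImageUrl` is a hypothetical name, not part of the client library. The client is passed in as an argument so the sketch does not depend on how you import it, and it assumes the response exposes `data.images[0].url` as in the call above and that the target model accepts a plain `prompt` input.

```javascript
// Hypothetical helper: same call shape for every model, endpoint ID as a parameter.
// `client` is expected to be the `fal` object imported from "@fal-ai/client".
async function generateFirstImageUrl(client, endpoint, prompt) {
  const result = await client.subscribe(endpoint, { input: { prompt } });

  // Assumption: the response carries an `images` array under `data`.
  const images = result?.data?.images ?? [];
  if (images.length === 0) {
    throw new Error(`No images returned by ${endpoint}`);
  }
  return images[0].url;
}
```

With this in place, moving from `generateFirstImageUrl(fal, "fal-ai/nano-banana-2", prompt)` to `generateFirstImageUrl(fal, "openai/gpt-image-2", prompt)` changes only the endpoint string. Model-specific inputs beyond `prompt` may still differ, so check each model's schema before relying on extra parameters.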
Which fal models render reliable typography inside images?
In-image typography is a real differentiator for a few models on fal.
- GPT Image 2 was built around accurate text rendering, with character-by-character precision across Latin scripts and CJK.
- Nano Banana 2 validates typography through Gemini 3.1 Flash Image’s reasoning step, with text generated as part of the scene.
- Nano Banana Pro extends that capability through the full Gemini 3 Pro Image foundation.
Use these models when you need readable text inside the image itself, such as posters, ads, product mockups, UI screens, packaging, logos, or multilingual creative assets.
Which models on fal are tuned for photoreal output?
The FLUX 1.x and FLUX.2 families anchor most photoreal workflows on fal.
- FLUX 1.1 [pro] ultra reaches 2K resolution with refined photorealism, which suits editorial and print work where surface texture matters.
- FLUX.2 [dev] delivers enhanced realism and detail control through Black Forest Labs’ second-generation architecture.
- GPT Image 2 reasons about lighting and material properties before rendering, which holds up in scenes with reflective surfaces and complex shadow geometry.
Pick these models when realism, lighting, surface detail, camera feel, and material accuracy matter more than raw generation speed.
Which fal models handle high-volume and fast generation?
Several models on fal are tuned for fast, low-cost text-to-image generation.
- FLUX.1 [schnell] runs in 1 to 4 inference steps and is priced per megapixel, which favors batches of small images.
- FLUX.2 [klein] 9B is a smaller FLUX.2 variant with the same editing capabilities at a lower parameter count.
- Z-Image Turbo is a 6B-parameter model from Tongyi-MAI built for very fast text-to-image generation.
- Seedream 5.0 Lite is ByteDance’s fast tier of Seedream 5.0, with generation and editing optimized for speed.
- Nano Banana 2 sits in the middle ground when both speed and image quality matter.
A common workflow is to use faster models for prompt exploration and high-volume drafts, then send only the best candidates to higher-fidelity models for final output.
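The draft-then-finalize workflow could be sketched roughly as follows. `draftThenFinalize` and `pickBest` are hypothetical names, the endpoint IDs are supplied by the caller, and the response shape (`data.images[0].url`) is assumed to match the basic call shown earlier. The client is passed in as an argument, so any object with a compatible `subscribe` method works.

```javascript
// Sketch of a two-stage pipeline: cheap drafts first, expensive finals for winners only.
// `client` is expected to be the `fal` object imported from "@fal-ai/client".
async function draftThenFinalize(client, options) {
  const { draftEndpoint, finalEndpoint, prompts, pickBest } = options;

  // Stage 1: render every prompt variation on the fast model, in parallel.
  const drafts = await Promise.all(
    prompts.map(async (prompt) => {
      const result = await client.subscribe(draftEndpoint, { input: { prompt } });
      return { prompt, url: result.data.images[0].url };
    })
  );

  // Stage 2: the caller decides which drafts deserve a final render
  // (human review, a scoring model, or any other selection rule).
  const winners = await pickBest(drafts);

  // Stage 3: re-run only the winning prompts on the higher-fidelity model.
  const finals = await Promise.all(
    winners.map(async ({ prompt }) => {
      const result = await client.subscribe(finalEndpoint, { input: { prompt } });
      return { prompt, url: result.data.images[0].url };
    })
  );

  return { drafts, finals };
}
```

Note that the final stage regenerates each winning prompt on a different model, so the result is a new image guided by the same prompt, not an upscaled copy of the draft.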
How is text-to-image pricing structured on fal?
Pricing on text-to-image models is per output, with rates set by the model and resolution. Some models charge per megapixel, while others charge per image based on quality or size.
| Model | Price |
|---|---|
| FLUX.1 [schnell] | $0.003 / megapixel |
| Nano Banana 2 | $0.08 / image at 1K |
| Nano Banana Pro | $0.15 / image |
Nano Banana 2 pricing scales with resolution: 2K outputs cost 1.5x the 1K rate and 4K outputs cost 2x, which works out to $0.12 and $0.16 per image.
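To compare the two billing styles, here is a small estimator built only from the rates listed above. The function names are illustrative, and the per-megapixel calculation assumes megapixels are computed as width times height divided by 1,000,000 with no rounding, which may differ from how fal actually meters usage.

```javascript
// Rates copied from the pricing table above (USD).
const SCHNELL_USD_PER_MEGAPIXEL = 0.003;
const NANO_BANANA_2_USD_AT_1K = 0.08;
const NANO_BANANA_2_MULTIPLIER = { "1K": 1, "2K": 1.5, "4K": 2 };

// Per-megapixel billing: cost grows with the pixel area of each image.
// Assumption: megapixels = width * height / 1e6, with no rounding applied.
function schnellCost(width, height, count = 1) {
  const megapixels = (width * height) / 1e6;
  return megapixels * SCHNELL_USD_PER_MEGAPIXEL * count;
}

// Per-image billing: a flat rate per image, scaled by a resolution multiplier.
function nanoBanana2Cost(resolution, count = 1) {
  const multiplier = NANO_BANANA_2_MULTIPLIER[resolution];
  if (multiplier === undefined) {
    throw new Error(`Unknown resolution: ${resolution}`);
  }
  return NANO_BANANA_2_USD_AT_1K * multiplier * count;
}

// Print rounded estimates for two sample batches.
console.log(schnellCost(512, 512, 1000).toFixed(2));
console.log(nanoBanana2Cost("2K", 100).toFixed(2));
```

Under these assumptions, 1,000 images at 512x512 on FLUX.1 [schnell] come to roughly $0.79, while 100 Nano Banana 2 images at 2K come to $12.00.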
Include add-ons in your estimates as well. Nano Banana 2 charges an extra $0.015 per generation for web search grounding and $0.002 for high-tier thinking, while GPT Image 2 bills by quality tier across low, medium, and high settings.
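The add-on arithmetic for Nano Banana 2 can be sketched as below. The helper names are hypothetical, and the sketch assumes each generation produces one image and that each add-on is billed once per generation. GPT Image 2 is left out because its per-tier rates are not listed here.

```javascript
// Nano Banana 2 add-on surcharges quoted above (USD per generation).
const WEB_SEARCH_USD = 0.015;
const HIGH_THINKING_USD = 0.002;

// Cost of one generation: the base image price plus any enabled add-ons.
// Assumption: one image per generation, each add-on billed once per generation.
function costPerGeneration(baseImageUsd, { webSearch = false, highThinking = false } = {}) {
  let total = baseImageUsd;
  if (webSearch) total += WEB_SEARCH_USD;
  if (highThinking) total += HIGH_THINKING_USD;
  return total;
}

// Cost of a batch of identical generations.
function batchCost(generations, baseImageUsd, addOns) {
  return generations * costPerGeneration(baseImageUsd, addOns);
}

// 500 generations at the 1K rate with web search grounding enabled.
console.log(batchCost(500, 0.08, { webSearch: true }).toFixed(2));
```

With both add-ons enabled, a 1K image comes to $0.08 + $0.015 + $0.002 = $0.097 per generation, so the surcharges add about 21% to the base rate.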