Z-Image Turbo Text to Image
Z-Image Turbo [text-to-image]
Tongyi-MAI's Z-Image Turbo delivers 6B-parameter text-to-image generation at $0.005 per megapixel. Trading model size for raw speed through an 8-step inference pipeline, it optimizes for high-volume applications where cost efficiency and throughput matter more than maximum detail fidelity.
Built for: Rapid prototyping | Content variation testing | High-volume asset generation
Speed-First Architecture for Volume Workflows
Z-Image Turbo caps inference at 8 steps (configurable down to 1), whereas standard diffusion models typically require 20-50 steps for comparable output quality. The 6B parameter count keeps the memory footprint lean while maintaining prompt adherence.
What this means for you:
- Flexible resolution up to 4 megapixels: Generate images in portrait, square, or landscape orientations at any size within the 4-megapixel limit
- Batch generation up to 4 images: Test multiple variations in a single API call for faster iteration cycles
- Configurable inference steps (1-8): Trade quality for speed based on your use case. Use 1 step for thumbnails and 8 steps for final assets
- Optional prompt expansion: Automatically enhance brief prompts with descriptive detail when you need richer outputs (adds $0.0025 per request)
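The limits listed above can be checked before a request is sent. The following is a minimal sketch, not the official client: the function name `build_request` and the payload field names (`image_size`, `num_inference_steps`, `num_images`, `enable_prompt_expansion`, `seed`) are assumptions for illustration and should be verified against the API documentation. It also assumes one megapixel means 1,000,000 pixels.

```python
from typing import Optional

# Limits taken from this page; the exact enforcement rules are an assumption.
MAX_MEGAPIXELS = 4.0
MIN_STEPS, MAX_STEPS = 1, 8
MIN_BATCH, MAX_BATCH = 1, 4


def build_request(
    prompt: str,
    width: int = 1024,
    height: int = 1024,
    steps: int = 8,
    num_images: int = 1,
    expand_prompt: bool = False,
    seed: Optional[int] = None,
) -> dict:
    """Validate inputs against the stated limits and return a request payload.

    The field names in the returned dict are hypothetical.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    # Assumes 1 MP = 1,000,000 pixels.
    megapixels = width * height / 1_000_000
    if megapixels > MAX_MEGAPIXELS:
        raise ValueError(f"{megapixels:.2f} MP exceeds the {MAX_MEGAPIXELS} MP limit")
    if not MIN_STEPS <= steps <= MAX_STEPS:
        raise ValueError(f"steps must be between {MIN_STEPS} and {MAX_STEPS}")
    if not MIN_BATCH <= num_images <= MAX_BATCH:
        raise ValueError(f"num_images must be between {MIN_BATCH} and {MAX_BATCH}")

    payload = {
        "prompt": prompt,
        "image_size": {"width": width, "height": height},
        "num_inference_steps": steps,
        "num_images": num_images,
        "enable_prompt_expansion": expand_prompt,
    }
    if seed is not None:
        payload["seed"] = seed  # optional seed control for reproducible output
    return payload


# A 1-step thumbnail batch and an 8-step final asset.
thumbnails = build_request("a red fox in snow", width=512, height=512, steps=1, num_images=4)
final_asset = build_request("a red fox in snow", width=2000, height=2000, steps=8, seed=42)
```

Validating on the client side avoids spending a round trip on a request that would be rejected, which matters most in high-volume pipelines.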
Performance That Scales
Z-Image Turbo optimizes for cost-per-image economics in production environments where you're generating hundreds or thousands of assets.
| Metric | Result | Context |
|---|---|---|
| Cost per Megapixel | $0.005 | 200 megapixels per $1.00 on fal |
| Inference Steps | 1-8 configurable | The default of 8 steps balances speed and quality |
| Max Resolution | Up to 4MP | Supports standard aspect ratios from square to ultrawide |
| Batch Size | 1-4 images | Generate variations in single request |
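To make the pricing concrete, here is a rough cost estimator built from the figures on this page ($0.005 per megapixel, plus $0.0025 per request when prompt expansion is enabled). It is a sketch under two assumptions that are not confirmed here: billing scales linearly with pixel count without rounding, and one megapixel means 1,000,000 pixels. The function name `estimate_cost` is illustrative only.

```python
PRICE_PER_MEGAPIXEL = 0.005    # USD, from the pricing table
PROMPT_EXPANSION_FEE = 0.0025  # USD per request, when prompt expansion is on


def estimate_cost(width: int, height: int, num_images: int = 1,
                  expand_prompt: bool = False) -> float:
    """Estimate the USD cost of one request.

    Assumes linear per-pixel billing with no rounding (unverified).
    """
    megapixels = width * height / 1_000_000
    cost = megapixels * PRICE_PER_MEGAPIXEL * num_images
    if expand_prompt:
        cost += PROMPT_EXPANSION_FEE  # charged once per request, not per image
    return cost


# One 1 MP image, and a batch of four 1 MP images with prompt expansion.
single = estimate_cost(1000, 1000)
batch = estimate_cost(1000, 1000, num_images=4, expand_prompt=True)
print(f"single image: ${single:.4f}")
print(f"batch of four with expansion: ${batch:.4f}")
```

Because the expansion fee is per request, it is amortized across the batch: spreading it over four images costs less per image than enabling it on four separate single-image requests.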
Technical Specifications
| Spec | Details |
|---|---|
| Architecture | Z-Image Turbo (6B parameters) |
| Input Formats | Text prompts with optional seed control |
| Output Formats | JPEG, PNG, WebP |
| Max Resolution | 4 megapixels (configurable aspect ratios) |
| Training | Z-Image Trainer available for LoRA fine-tuning |
| License | Commercial use permitted |
How It Stacks Up
AuraFlow – Z-Image Turbo prioritizes inference speed and cost efficiency through its 8-step maximum pipeline, making it ideal for high-volume generation workflows where throughput matters. AuraFlow emphasizes prompt interpretation depth for complex creative briefs requiring nuanced understanding.
FLUX.2 [dev] – Z-Image Turbo trades parameter count (6B vs FLUX.2's larger architecture) for faster inference and lower cost per megapixel ($0.005 vs $0.012), positioning it for production environments generating thousands of assets. FLUX.2 [dev] offers higher detail fidelity and more sophisticated prompt following for applications where output quality justifies higher per-image costs.
FLUX.2 [pro] – Z-Image Turbo delivers 8-step generation at $0.005/MP for rapid iteration workflows. FLUX.2 [pro] provides enhanced photorealism and detail preservation at $0.03 for the first megapixel, designed for final production assets where maximum quality matters more than generation speed.
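The price gap in these comparisons can be put into numbers for a fixed workload. The sketch below uses only the per-megapixel figures quoted above and restricts itself to images of exactly 1 megapixel, because FLUX.2 [pro] pricing beyond the first megapixel is not stated here. The dictionary keys and the function name `cost_for_images` are illustrative, and AuraFlow is omitted because no price is given for it.

```python
# USD price for one 1-megapixel image, as quoted in the comparisons above.
PRICE_PER_1MP_IMAGE = {
    "Z-Image Turbo": 0.005,
    "FLUX.2 [dev]": 0.012,
    "FLUX.2 [pro]": 0.03,  # price of the first megapixel only
}


def cost_for_images(model: str, image_count: int) -> float:
    """Estimated USD cost of generating `image_count` images of exactly 1 MP."""
    if image_count < 0:
        raise ValueError("image_count must not be negative")
    return PRICE_PER_1MP_IMAGE[model] * image_count


# Compare the cost of a 1,000-image job across the three models.
for model in PRICE_PER_1MP_IMAGE:
    print(f"{model}: ${cost_for_images(model, 1000):.2f}")
```

This compares cost only. It says nothing about output quality, which is the trade-off the comparisons above describe.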
