Z Image Trainer: LoRA Training for Z-Image Turbo

Z-Image Turbo LoRA trainer | [text-to-image]

Tongyi-MAI's Z-Image Turbo LoRA trainer delivers custom model fine-tuning at $2.26 per 1,000 training steps on a 6B-parameter base. Trading generalist breadth for specialized precision, this training endpoint lets you encode specific visual styles or content patterns into reusable LoRA weights. It is built for teams that need repeatable brand aesthetics, content creators maintaining visual consistency, and developers deploying style-controlled image generation at scale.

Use Cases: Brand-consistent image generation | Custom style deployment | Production visual workflows

Performance

Training cost scales linearly with step count, so short exploratory runs stay cheap and iterative experimentation is far more affordable than training a larger foundation model from scratch.

Metric	Result	Context
Base Model Size	6B parameters	Z-Image Turbo foundation optimized for speed
Training Cost	$2.26 per 1,000 steps	Scales linearly: 2,000 steps = $4.52, 5,000 steps = $11.30
Minimum Training	100 steps ($0.226)	Enables rapid prototyping iterations
Step Range	100-10,000 steps	Configurable in 100-step increments
Related Endpoints	Z Image Text to Image, Z Image with LoRA	Base inference and LoRA-enhanced generation variants
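
The pricing rows above reduce to simple arithmetic. The following is a minimal sketch, assuming the rate and step limits in the table are current; the function name `training_cost_usd` is illustrative and not part of any fal API.

```python
from decimal import Decimal

RATE_PER_1000_STEPS = Decimal("2.26")  # USD, taken from the table above
MIN_STEPS, MAX_STEPS, STEP_INCREMENT = 100, 10_000, 100  # step range from the table


def training_cost_usd(steps: int) -> Decimal:
    """Return the linear training cost in USD for a given step count."""
    if not MIN_STEPS <= steps <= MAX_STEPS:
        raise ValueError(f"steps must be between {MIN_STEPS} and {MAX_STEPS}")
    if steps % STEP_INCREMENT != 0:
        raise ValueError(f"steps must be a multiple of {STEP_INCREMENT}")
    return RATE_PER_1000_STEPS * steps / 1000


# Print the cost for the step counts quoted in the table.
for steps in (100, 2_000, 5_000):
    print(steps, training_cost_usd(steps))
```

Decimal arithmetic is used here so that dollar amounts do not pick up binary floating-point rounding noise.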

Training Control That Adapts to Your Use Case

Z Image Trainer exposes three distinct training modes: content, style, and balanced. This lets you bias the LoRA toward subject matter preservation or artistic treatment depending on your application requirements.

What this means for you:

Flexible caption handling: Supply one caption text file per image (the ROOT.txt naming convention, where the caption file shares its image's base name) or fall back to a single default caption for the entire dataset, so no separate captioning pass is required
Configurable learning rate: Adjust the 0.0001 default to control training aggressiveness, balancing convergence speed against overfitting risk
Training mode selection: Choose content focus for subject consistency, style focus for artistic transfer, or balanced for general-purpose adaptation
Production-ready outputs: Receive diffusers-compatible LoRA weights and configuration files ready for immediate deployment in inference workflows
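
The options above can be gathered into a single validated settings object before submitting a job. This is a rough sketch only: the field names (`dataset_zip_url`, `training_mode`, `learning_rate`, `default_caption`, `steps`) and the function `build_training_request` are hypothetical placeholders, not the endpoint's confirmed schema, so check the API documentation for the real parameter names.

```python
from typing import Optional

TRAINING_MODES = {"content", "style", "balanced"}  # the three modes named above
DEFAULT_LEARNING_RATE = 0.0001  # default stated above


def build_training_request(
    dataset_zip_url: str,
    steps: int,
    mode: str = "balanced",
    learning_rate: float = DEFAULT_LEARNING_RATE,
    default_caption: Optional[str] = None,
) -> dict:
    """Validate settings and return a request body (field names are hypothetical)."""
    if mode not in TRAINING_MODES:
        raise ValueError(f"mode must be one of {sorted(TRAINING_MODES)}")
    if learning_rate <= 0:
        raise ValueError("learning_rate must be positive")
    # Step limits as listed in the specifications: 100-10,000 in 100-step increments.
    if not 100 <= steps <= 10_000 or steps % 100 != 0:
        raise ValueError("steps must be 100-10,000 in multiples of 100")

    request = {
        "dataset_zip_url": dataset_zip_url,
        "steps": steps,
        "training_mode": mode,
        "learning_rate": learning_rate,
    }
    # Only send a fallback caption when one was supplied.
    if default_caption is not None:
        request["default_caption"] = default_caption
    return request


print(build_training_request("https://example.com/dataset.zip", steps=1000, mode="style"))
```

Validating locally before submission avoids paying for a run that was configured with an unsupported mode or step count.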

Technical Specifications

Spec	Details
Architecture	Z-Image Turbo
Input Formats	ZIP archive (images + optional .txt captions per image)
Output Formats	Diffusers LoRA weights, JSON config file
Training Steps	100-10,000 (configurable in 100-step increments)
License	Commercial use permitted
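
The input format above (a ZIP of images with optional per-image .txt captions) can be prepared with the Python standard library. A minimal sketch, assuming captions share the image's base name (for example 001.jpg and 001.txt) and that the listed image extensions are accepted; both are assumptions, and `pack_dataset` is an illustrative helper rather than part of any SDK.

```python
import zipfile
from pathlib import Path

# Assumption: these image types are accepted; confirm against the API documentation.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def pack_dataset(image_dir: Path, archive_path: Path) -> list[str]:
    """Zip every image in image_dir plus its same-named .txt caption, if present.

    Returns the archive member names in the order they were written.
    """
    packed = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for image in sorted(image_dir.iterdir()):
            if image.suffix.lower() not in IMAGE_SUFFIXES:
                continue  # skip captions, notes, and other non-image files
            archive.write(image, arcname=image.name)
            packed.append(image.name)
            caption = image.with_suffix(".txt")
            if caption.is_file():
                archive.write(caption, arcname=caption.name)
                packed.append(caption.name)
    return packed
```

Images without a matching caption file are still packed; per the caption handling described earlier, they would rely on the default caption.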


How It Stacks Up

Z Image Text to Image ($0.039/image) – Z Image Trainer produces reusable LoRA weights for style consistency across unlimited generations, trading upfront training cost ($2.26 per 1,000 steps) for downstream inference efficiency. The base Z Image inference endpoint handles one-off generations where custom style encoding isn't required.

AuraFlow Text to Image ($0.055/image) – Z Image Trainer prioritizes speed-optimized training on a 6B-parameter base for rapid iteration cycles. AuraFlow targets maximum output quality through a larger architecture, which suits final production renders where time investment isn't a constraint.

fal-ai/z-image-trainer
