Z Image Turbo Image To Image API

Image To Image
Lora

Endpoint: POST https://fal.run/fal-ai/z-image/turbo/image-to-image
Endpoint ID: fal-ai/z-image/turbo/image-to-image

Try it in the Playground

Run this model interactively with your own prompts.

Quick Start

import fal_client

def on_queue_update(update):
    # Print log messages while the request is still in progress.
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])

result = fal_client.subscribe(
    "fal-ai/z-image/turbo/image-to-image",
    arguments={
        "prompt": "A young Asian woman with long, vibrant purple hair stands on a sunlit sandy beach, posing confidently with her left hand resting on her hip. She gazes directly at the camera with a neutral expression. A sleek black ribbon bow is tied neatly on the right side of her head, just above her ear. She wears a flowing white cotton dress with a fitted bodice and a flared skirt that reaches mid-calf, slightly lifted by a gentle sea breeze. The beach behind her features fine, pale golden sand with subtle footprints, leading to calm turquoise waves under a clear blue sky with soft, wispy clouds. The lighting is natural daylight, casting soft shadows to her left, indicating late afternoon sun. The horizon line is visible in the background, with a faint silhouette of distant dunes. Her skin tone is fair with a natural glow, and her facial features are delicately defined. The composition is centered on her figure, framed from mid-thigh up, with shallow depth of field blurring the distant waves slightly.",
        "image_url": "https://storage.googleapis.com/falserverless/example_inputs/z-image-turbo-i2i-input.png"
    },
    with_logs=True,
    on_queue_update=on_queue_update,
)
print(result)

Input Schema

prompt

string

required

The prompt to generate an image from.

image_size

ImageSize | Enum

default:"auto"

The size of the generated image. Default value: auto. Possible values: square_hd, square, portrait_4_3, portrait_16_9, landscape_4_3, landscape_16_9, auto

num_inference_steps

integer

default:"8"

The number of inference steps to perform. Default value: 8. Range: 1 to 8

seed

integer

The same seed and the same prompt given to the same version of the model will output the same image every time.

sync_mode

boolean

default:"false"

If True, the media will be returned as a data URI and the output data won’t be available in the request history.

num_images

integer

default:"1"

The number of images to generate. Default value: 1. Range: 1 to 4

enable_safety_checker

boolean

default:"true"

If set to true, the safety checker will be enabled. Default value: true

output_format

OutputFormatEnum

default:"png"

The format of the generated image. Default value: "png". Possible values: jpeg, png, webp

acceleration

AccelerationEnum

default:"regular"

The acceleration level to use. Default value: "regular". Possible values: none, regular, high

enable_prompt_expansion

boolean

default:"false"

Whether to enable prompt expansion. Note: this will increase the price by 0.0025 credits per request.

image_url

string

required

URL of the input image for image-to-image generation.

strength

float

default:"0.6"

The strength of the image-to-image conditioning. Default value: 0.6
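With `sync_mode` set to true, the schema above says the media comes back as a data URI instead of a hosted file. The following is a minimal sketch of decoding such a URI with the standard library. It assumes the data URI appears where the image URL normally would and follows the usual `data:<media type>;base64,<payload>` layout; `decode_data_uri` is a hypothetical helper name, not part of `fal_client`.

```python
import base64
from urllib.parse import unquote_to_bytes


def decode_data_uri(uri: str) -> tuple[str, bytes]:
    """Split a data URI into (media type, raw bytes)."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URI has no comma separator")
    parts = header.split(";")
    media_type = parts[0] or "text/plain"
    if "base64" in parts[1:]:
        return media_type, base64.b64decode(payload)
    # Payloads without the base64 marker are percent-encoded.
    return media_type, unquote_to_bytes(payload)


# Tiny stand-in payload; a real response would carry the full image bytes.
sample = "data:image/png;base64," + base64.b64encode(b"\x89PNG").decode("ascii")
media_type, raw = decode_data_uri(sample)
print(media_type, len(raw))  # image/png 4
```

The returned bytes could then be written to disk with a file extension chosen from the media type.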

Output Schema

images

list<ImageFile>

required

The generated image files info.

timings

Timings

required

The timings of the generation process.

seed

integer

required

Seed used for the generated image. This is the value passed in the input or, if none was passed, the randomly generated seed that was used.

has_nsfw_concepts

list<boolean>

required

Whether the generated images contain NSFW concepts.

prompt

string

required

The prompt used for generating the image.
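Since `has_nsfw_concepts` is a list of booleans, one way to use it is to pair each flag with an image. The sketch below assumes one flag per image, in the same order as `images`; the schema does not state this explicitly, so the function refuses to guess when the counts differ. `safe_images` is a hypothetical helper and the sample result is made up.

```python
def safe_images(result: dict) -> list[dict]:
    """Return the images whose NSFW flag is False."""
    images = result.get("images", [])
    flags = result.get("has_nsfw_concepts", [])
    if len(flags) != len(images):
        # The one-flag-per-image assumption does not hold; refuse to guess.
        raise ValueError("flag count does not match image count")
    return [image for image, flagged in zip(images, flags) if not flagged]


# Made-up result for illustration only.
example = {
    "images": [
        {"url": "https://example.com/a.png"},
        {"url": "https://example.com/b.png"},
    ],
    "has_nsfw_concepts": [False, True],
}
print([image["url"] for image in safe_images(example)])
```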

Input Example

{
  "prompt": "A young Asian woman with long, vibrant purple hair stands on a sunlit sandy beach, posing confidently with her left hand resting on her hip. She gazes directly at the camera with a neutral expression. A sleek black ribbon bow is tied neatly on the right side of her head, just above her ear. She wears a flowing white cotton dress with a fitted bodice and a flared skirt that reaches mid-calf, slightly lifted by a gentle sea breeze. The beach behind her features fine, pale golden sand with subtle footprints, leading to calm turquoise waves under a clear blue sky with soft, wispy clouds. The lighting is natural daylight, casting soft shadows to her left, indicating late afternoon sun. The horizon line is visible in the background, with a faint silhouette of distant dunes. Her skin tone is fair with a natural glow, and her facial features are delicately defined. The composition is centered on her figure, framed from mid-thigh up, with shallow depth of field blurring the distant waves slightly.",
  "image_size": "auto",
  "num_inference_steps": 8,
  "sync_mode": false,
  "num_images": 1,
  "enable_safety_checker": true,
  "output_format": "png",
  "acceleration": "regular",
  "enable_prompt_expansion": false,
  "image_url": "https://storage.googleapis.com/falserverless/example_inputs/z-image-turbo-i2i-input.png",
  "strength": 0.6
}

Output Example

{
  "images": [
    {
      "content_type": "image/png",
      "height": 1728,
      "url": "https://storage.googleapis.com/falserverless/example_outputs/z-image-turbo-i2i-output.png",
      "width": 992
    }
  ],
  "prompt": ""
}
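A response shaped like the Output Example can be summarized with a few lines of standard-library Python. This is a sketch only: `describe_images` is a hypothetical helper, and it assumes every image entry carries a `url` key as in the example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def describe_images(result: dict) -> list[dict]:
    """Collect file name, dimensions, and content type for each image."""
    described = []
    for image in result.get("images", []):
        described.append({
            "filename": PurePosixPath(urlparse(image["url"]).path).name,
            "size": (image.get("width"), image.get("height")),
            "content_type": image.get("content_type"),
        })
    return described


result = {  # copied from the Output Example
    "images": [{
        "content_type": "image/png",
        "height": 1728,
        "url": "https://storage.googleapis.com/falserverless/example_outputs/z-image-turbo-i2i-output.png",
        "width": 992,
    }],
    "prompt": "",
}
print(describe_images(result))
```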

Endpoint: POST https://fal.run/fal-ai/z-image/turbo/image-to-image/lora
Endpoint ID: fal-ai/z-image/turbo/image-to-image/lora

Try it in the Playground

Run this model interactively with your own prompts.

Quick Start

import fal_client

def on_queue_update(update):
    # Print log messages while the request is still in progress.
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])

result = fal_client.subscribe(
    "fal-ai/z-image/turbo/image-to-image/lora",
    arguments={
        "prompt": "A young Asian woman with long, vibrant purple hair stands on a sunlit sandy beach, posing confidently with her left hand resting on her hip. She gazes directly at the camera with a neutral expression. A sleek black ribbon bow is tied neatly on the right side of her head, just above her ear. She wears a flowing white cotton dress with a fitted bodice and a flared skirt that reaches mid-calf, slightly lifted by a gentle sea breeze. The beach behind her features fine, pale golden sand with subtle footprints, leading to calm turquoise waves under a clear blue sky with soft, wispy clouds. The lighting is natural daylight, casting soft shadows to her left, indicating late afternoon sun. The horizon line is visible in the background, with a faint silhouette of distant dunes. Her skin tone is fair with a natural glow, and her facial features are delicately defined. The composition is centered on her figure, framed from mid-thigh up, with shallow depth of field blurring the distant waves slightly.",
        "image_url": "https://storage.googleapis.com/falserverless/example_inputs/z-image-turbo-i2i-input.png"
    },
    with_logs=True,
    on_queue_update=on_queue_update,
)
print(result)

Input Schema

prompt

string

required

The prompt to generate an image from.

image_size

ImageSize | Enum

default:"auto"

The size of the generated image. Default value: auto. Possible values: square_hd, square, portrait_4_3, portrait_16_9, landscape_4_3, landscape_16_9, auto

num_inference_steps

integer

default:"8"

The number of inference steps to perform. Default value: 8. Range: 1 to 8

seed

integer

The same seed and the same prompt given to the same version of the model will output the same image every time.

sync_mode

boolean

default:"false"

If True, the media will be returned as a data URI and the output data won’t be available in the request history.

num_images

integer

default:"1"

The number of images to generate. Default value: 1. Range: 1 to 4

enable_safety_checker

boolean

default:"true"

If set to true, the safety checker will be enabled. Default value: true

output_format

OutputFormatEnum

default:"png"

The format of the generated image. Default value: "png". Possible values: jpeg, png, webp

acceleration

AccelerationEnum

default:"regular"

The acceleration level to use. Default value: "regular". Possible values: none, regular, high

enable_prompt_expansion

boolean

default:"false"

Whether to enable prompt expansion. Note: this will increase the price by 0.0025 credits per request.

image_url

string

required

URL of the input image for image-to-image generation.

strength

float

default:"0.6"

The strength of the image-to-image conditioning. Default value: 0.6

loras

list<LoRAInput>

default:""

List of LoRA weights to apply (maximum 3).
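The three-LoRA limit can be checked on the client before a request is sent. The sketch below treats each list entry as opaque, because the fields of `LoRAInput` are not listed in this schema; `with_loras` is a hypothetical helper and the empty dicts are stand-ins, not valid LoRA entries.

```python
MAX_LORAS = 3  # "maximum 3" per the schema description


def with_loras(arguments: dict, loras: list) -> dict:
    """Return a copy of `arguments` with a length-checked `loras` list attached."""
    if len(loras) > MAX_LORAS:
        raise ValueError(f"at most {MAX_LORAS} LoRAs are allowed, got {len(loras)}")
    return {**arguments, "loras": list(loras)}


base = {"prompt": "a watercolor portrait", "image_url": "https://example.com/input.png"}
payload = with_loras(base, [])  # matches the empty list in the Input Example
print(payload["loras"])

try:
    with_loras(base, [{}] * 4)  # four stand-in entries exceed the limit
except ValueError as err:
    print(err)
```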

Output Schema

images

list<ImageFile>

required

The generated image files info.

timings

Timings

required

The timings of the generation process.

seed

integer

required

Seed used for the generated image. This is the value passed in the input or, if none was passed, the randomly generated seed that was used.

has_nsfw_concepts

list<boolean>

required

Whether the generated images contain NSFW concepts.

prompt

string

required

The prompt used for generating the image.

Input Example

{
  "prompt": "A young Asian woman with long, vibrant purple hair stands on a sunlit sandy beach, posing confidently with her left hand resting on her hip. She gazes directly at the camera with a neutral expression. A sleek black ribbon bow is tied neatly on the right side of her head, just above her ear. She wears a flowing white cotton dress with a fitted bodice and a flared skirt that reaches mid-calf, slightly lifted by a gentle sea breeze. The beach behind her features fine, pale golden sand with subtle footprints, leading to calm turquoise waves under a clear blue sky with soft, wispy clouds. The lighting is natural daylight, casting soft shadows to her left, indicating late afternoon sun. The horizon line is visible in the background, with a faint silhouette of distant dunes. Her skin tone is fair with a natural glow, and her facial features are delicately defined. The composition is centered on her figure, framed from mid-thigh up, with shallow depth of field blurring the distant waves slightly.",
  "image_size": "auto",
  "num_inference_steps": 8,
  "sync_mode": false,
  "num_images": 1,
  "enable_safety_checker": true,
  "output_format": "png",
  "acceleration": "regular",
  "enable_prompt_expansion": false,
  "image_url": "https://storage.googleapis.com/falserverless/example_inputs/z-image-turbo-i2i-input.png",
  "strength": 0.6,
  "loras": []
}

Output Example

{
  "images": [
    {
      "content_type": "image/png",
      "height": 1728,
      "url": "https://storage.googleapis.com/falserverless/example_outputs/z-image-turbo-i2i-output.png",
      "width": 992
    }
  ],
  "prompt": ""
}

Tongyi-MAI’s Z-Image Turbo delivers image-to-image generation at $0.005 per megapixel through a 6-billion-parameter architecture. Trading raw parameter count for inference optimization, this model processes transformations in 8 steps or fewer while maintaining commercial-grade output quality. It targets developers who need cost-effective image modification at scale without sacrificing control over the transformation process.

Built for: Product variant generation | Style transfer workflows | Rapid prototyping iterations

Performance That Scales

At $0.005 per megapixel, Z-Image Turbo is positioned as 10-15x more cost-effective than premium image generation alternatives, while keeping the flexibility of adjustable inference steps.

Metric	Result	Context
Inference Steps	1-8 steps	Configurable via API, default 8 steps balances quality and speed
Cost per Megapixel	$0.005	200 megapixels per $1.00 on fal
Batch Size	Up to 4 images	Per request via `num_images` parameter
Strength Range	0.0-1.0	Default 0.6, lower values preserve more source structure
Related Endpoints	Z-Image Turbo LoRA	LoRA variant for custom style training
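The per-megapixel price in the table translates into a per-request estimate with simple arithmetic. The sketch below assumes 1 megapixel means 1,000,000 pixels, that the price applies to each output image, and that no rounding is applied; the actual billing rules are not stated here, so treat the result as a rough estimate. `estimate_cost_usd` is a hypothetical helper.

```python
PRICE_PER_MEGAPIXEL_USD = 0.005  # from the table above


def estimate_cost_usd(width: int, height: int, num_images: int = 1) -> float:
    """Estimate cost assuming 1 MP = 1,000,000 pixels and no rounding."""
    megapixels = width * height / 1_000_000
    return megapixels * PRICE_PER_MEGAPIXEL_USD * num_images


# Dimensions taken from the Output Example (992 x 1728, about 1.71 MP).
print(f"${estimate_cost_usd(992, 1728):.5f}")                 # $0.00857
print(f"${estimate_cost_usd(1000, 1000, num_images=4):.5f}")  # $0.02000
```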

Image Transformation With Strength Control

Z-Image Turbo uses a diffusion-based architecture optimized for image-to-image conditioning, where you provide both a reference image and a text prompt to guide the transformation. Unlike pure text-to-image models that start from noise, this approach preserves structural elements from your input while applying the changes you specify.

What this means for you:

Adjustable transformation intensity: Control how much the output diverges from your source image via the strength parameter (0-1 range), letting you dial in anything from subtle refinements to dramatic reimaginings
Multi-image batch processing: Generate up to 4 variations per request, useful for A/B testing different prompt variations or exploring creative options without separate API calls
Flexible resolution handling: Auto-sizing adapts to your input dimensions, with support for custom image sizes to match your workflow requirements
Accelerated inference options: Three acceleration levels (none, regular, high) let you trade generation time for cost based on your use case: prototype fast, then refine at full quality
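The strength control described above can be explored by sending the same prompt and source image at several strength values. The sketch below only builds the argument dictionaries; it assumes a fixed `seed` is a sensible way to keep strength as the only variable, and `strength_sweep` and the prompt text are hypothetical.

```python
def strength_sweep(prompt: str, image_url: str, strengths: list[float],
                   seed: int = 42) -> list[dict]:
    """Build one arguments dict per strength value, sharing prompt, image, and seed."""
    payloads = []
    for strength in strengths:
        if not 0.0 <= strength <= 1.0:
            raise ValueError(f"strength must be within 0-1, got {strength}")
        payloads.append({
            "prompt": prompt,
            "image_url": image_url,
            "strength": strength,
            "seed": seed,  # fixed so that strength is the only variable
        })
    return payloads


payloads = strength_sweep(
    "the same scene repainted as a watercolor",
    "https://storage.googleapis.com/falserverless/example_inputs/z-image-turbo-i2i-input.png",
    [0.3, 0.6, 0.9],
)
for arguments in payloads:
    print(arguments["strength"])
```

Each dictionary could then be passed as `arguments=` to `fal_client.subscribe("fal-ai/z-image/turbo/image-to-image", ...)` as in the Quick Start.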

Technical Specifications

Spec	Details
Architecture	Z-Image Turbo
Input Formats	Image URL (JPEG, PNG, WebP, GIF, AVIF) + text prompt
Output Formats	JPEG, PNG, WebP (configurable via `output_format`)
Model Size	6B parameters
License	Commercial use enabled

API Documentation | Quickstart Guide | Pricing

How It Stacks Up

Z-Image Turbo LoRA: Z-Image Turbo provides the base transformation engine at $0.005/megapixel, while the LoRA variant adds custom style training capabilities for specialized visual treatments. The LoRA endpoint trades base model simplicity for fine-tuned control when you need consistent brand aesthetics or specific artistic styles across multiple generations.

FASHN Virtual Try-On V1.5: Z-Image Turbo handles general-purpose image transformation with flexible prompt control for $0.005/megapixel. FASHN specializes in garment placement and fit visualization for e-commerce workflows where product accuracy matters more than creative flexibility.

Image Editing endpoints (Age Progression, Wojak Style): Z-Image Turbo offers broad transformation flexibility through natural language prompts, while specialized editing endpoints provide single-function transformations optimized for specific use cases. Choose Z-Image Turbo when you need multi-purpose image modification without switching between task-specific models.

Related

Z-Image Turbo — Image Generation
Z Image Base — Image Generation
Z Image Base (LoRA) — Image Generation
Z-Image Turbo Seamless Tiling — Image Generation

Limitations

image_size restricted to: square_hd, square, portrait_4_3, portrait_16_9, landscape_4_3, landscape_16_9, auto
num_inference_steps range: 1 to 8
num_images range: 1 to 4
output_format restricted to: jpeg, png, webp
acceleration restricted to: none, regular, high
Content moderation via safety checker
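The constraints listed above can be checked on the client before a request is sent. The following is a minimal sketch, not the service's own validation: `check_arguments` is a hypothetical helper, it applies the schema defaults for missing fields, and it checks only string presets for `image_size`, since the schema type `ImageSize | Enum` suggests other forms are accepted.

```python
IMAGE_SIZES = {"square_hd", "square", "portrait_4_3", "portrait_16_9",
               "landscape_4_3", "landscape_16_9", "auto"}
OUTPUT_FORMATS = {"jpeg", "png", "webp"}
ACCELERATIONS = {"none", "regular", "high"}


def check_arguments(arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    for field in ("prompt", "image_url"):
        if not arguments.get(field):
            problems.append(f"{field} is required")
    size = arguments.get("image_size", "auto")
    # Only string presets are checked; other ImageSize forms pass through.
    if isinstance(size, str) and size not in IMAGE_SIZES:
        problems.append(f"unknown image_size preset: {size}")
    if not 1 <= arguments.get("num_inference_steps", 8) <= 8:
        problems.append("num_inference_steps must be between 1 and 8")
    if not 1 <= arguments.get("num_images", 1) <= 4:
        problems.append("num_images must be between 1 and 4")
    if arguments.get("output_format", "png") not in OUTPUT_FORMATS:
        problems.append("output_format must be jpeg, png, or webp")
    if arguments.get("acceleration", "regular") not in ACCELERATIONS:
        problems.append("acceleration must be none, regular, or high")
    return problems


good = {"prompt": "a cat", "image_url": "https://example.com/cat.png"}
bad = {**good, "num_inference_steps": 12, "output_format": "gif"}
print(check_arguments(good))
print(check_arguments(bad))
```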
