Train custom LoRAs for the Qwen-Image-Layered model using structured zip archives containing base images with corresponding transparent layers. The API accepts learning rate, step count, and caption parameters, returning LoRA weights and config files.
Training Custom Layer Decomposition Models
Image layer decomposition separates a composite image into distinct, independently editable components while preserving visual coherence. Recent advances in diffusion-based generative models have enabled sophisticated approaches to this problem, with layered representations proving essential for precise content creation workflows [1]. The Qwen Image Layered Trainer on fal provides API access to train specialized LoRA weights for custom layer separation tasks.
The trainer produces LoRA (Low-Rank Adaptation) weights that modify how the Qwen-Image-Layered model performs decomposition. By training on domain-specific examples, developers can teach the model custom separation patterns for architectural elements, product isolation, or design component extraction. This guide covers the complete workflow from data preparation through inference integration.
Prerequisites
Before starting, ensure you have:
- A fal API key from your dashboard
- Training images in PNG or WebP format with transparency
- Python 3.8+ with `fal_client`, or Node.js with `@fal-ai/client`
Python Setup:

```python
import fal_client
import os

# The client reads the API key from the FAL_KEY environment variable
os.environ["FAL_KEY"] = "your-api-key-here"
```
JavaScript Setup:

```javascript
import { fal } from "@fal-ai/client";

fal.config({
  credentials: "your-api-key-here",
});
```
For detailed authentication options, see the quickstart documentation.
Training Data Structure
The trainer requires a zip archive with specific naming conventions:
| File Pattern | Purpose | Required |
|---|---|---|
| `ROOT_start.EXT` | Base composite image | Yes |
| `ROOT_end.EXT` | First decomposed layer | Yes |
| `ROOT_end2.EXT` through `ROOT_end8.EXT` | Additional layers (up to 8 total) | No |
| `ROOT.txt` | Caption describing the decomposition | No |
All images within a group must share the same root name, use matching layer counts, and be PNG or WebP format to support alpha channel transparency.
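For example, a two-group archive where each image decomposes into three layers might look like this (the file names are illustrative):

```
training-data.zip
├── sneaker_start.png   # composite image
├── sneaker_end.png     # layer 1: primary subject
├── sneaker_end2.png    # layer 2: shadow
├── sneaker_end3.png    # layer 3: background
├── sneaker.txt         # caption for this group
├── lamp_start.png
├── lamp_end.png
├── lamp_end2.png
├── lamp_end3.png
└── lamp.txt
```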
Pre-flight Validation Checklist:
Before submitting a training job, verify your zip archive meets these requirements (a validation sketch follows the list):
- All image groups have consistent layer counts
- File names follow the exact `ROOT_start`/`ROOT_end` pattern
- Images use PNG or WebP format only
- Either caption files exist for each group or you provide a `default_caption`
- Zip file is publicly accessible via URL
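Most of these checks can be automated locally before upload. The following is a minimal sketch written for this guide, not part of fal's tooling; the naming pattern mirrors the table above:

```python
import re
import zipfile
from collections import defaultdict
from typing import List

ALLOWED_PARTS = re.compile(r"^(?P<root>.+?)_(?P<part>start|end[2-8]?)\.(?:png|webp)$")

def validate_archive(path: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means the zip looks valid."""
    problems = []
    groups = defaultdict(set)  # root name -> set of parts ("start", "end", "end2", ...)
    captions = set()

    with zipfile.ZipFile(path) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            if name.endswith(".txt"):
                captions.add(name[:-4])
                continue
            match = ALLOWED_PARTS.match(name)
            if match is None:
                problems.append(f"unrecognized file name or format: {name}")
                continue
            groups[match.group("root")].add(match.group("part"))

    layer_counts = set()
    for root, parts in groups.items():
        if "start" not in parts:
            problems.append(f"group '{root}' is missing its _start image")
        if "end" not in parts:
            problems.append(f"group '{root}' is missing its _end image")
        layer_counts.add(len(parts) - 1)  # layers = everything except _start

    if len(layer_counts) > 1:
        problems.append(f"inconsistent layer counts across groups: {sorted(layer_counts)}")

    missing_captions = set(groups) - captions
    if missing_captions:
        problems.append(
            f"{len(missing_captions)} group(s) lack .txt captions; "
            "provide default_caption if this is intentional"
        )
    return problems

for problem in validate_archive("training-data.zip"):
    print("WARNING:", problem)
```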
Python Integration
```python
import time
from typing import Any, Dict, Optional

import fal_client


def train_layered_lora(
    data_url: str,
    learning_rate: float = 0.0001,
    steps: int = 1000,
    default_caption: Optional[str] = None,
) -> Dict[str, Any]:
    arguments = {
        "image_data_url": data_url,
        "learning_rate": learning_rate,
        "steps": steps,
    }
    if default_caption:
        arguments["default_caption"] = default_caption

    handler = fal_client.submit(
        "fal-ai/qwen-image-layered-trainer",
        arguments=arguments,
    )
    print(f"Training job submitted: {handler.request_id}")

    # Poll until the job completes; status() returns Queued, InProgress,
    # or Completed objects. A failed job raises when get() fetches the result.
    while True:
        status = handler.status()
        if isinstance(status, fal_client.Completed):
            return handler.get()
        time.sleep(30)


result = train_layered_lora(
    data_url="https://your-storage.com/training-data.zip",
    steps=1000,
    default_caption="Product with transparent background layers",
)
lora_url = result["diffusers_lora_file"]["url"]
```
JavaScript Integration
```javascript
import { fal } from "@fal-ai/client";

async function trainLayeredLoRA(config) {
  const {
    dataUrl,
    learningRate = 0.0001,
    steps = 1000,
    defaultCaption = null,
  } = config;

  const input = {
    image_data_url: dataUrl,
    learning_rate: learningRate,
    steps: steps,
  };
  if (defaultCaption) {
    input.default_caption = defaultCaption;
  }

  const result = await fal.subscribe("fal-ai/qwen-image-layered-trainer", {
    input: input,
    logs: true,
    onQueueUpdate: (update) => {
      if (update.status === "IN_PROGRESS") {
        console.log("Training in progress...");
      }
    },
  });

  // @fal-ai/client wraps the model output in result.data
  return {
    loraUrl: result.data.diffusers_lora_file.url,
    configUrl: result.data.config_file.url,
  };
}
```
API Parameters
| Parameter | Default | Range | Description |
|---|---|---|---|
| `image_data_url` | Required | URL | Publicly accessible zip archive containing training data |
| `learning_rate` | 0.0001 | 0.00005 to 0.0002 | Lower values produce conservative adaptations; higher values suit dramatic separation patterns |
| `steps` | 1000 | 100 to 10000 | Training iterations; more steps improve quality but increase time linearly |
| `default_caption` | None | String | Fallback description when individual `.txt` files are missing |
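Putting these together, a complete request payload for a harder, many-layer task might look like this (the specific values are illustrative, not recommendations from fal):

```python
arguments = {
    "image_data_url": "https://your-storage.com/training-data.zip",
    "learning_rate": 0.00008,  # toward the low end: conservative adaptation
    "steps": 2000,             # longer run for a more complex decomposition
    "default_caption": "Product photo decomposed into subject, shadow, and background",
}
```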
For current pricing, check the model page directly as rates may change.
Using Your Trained LoRA
After training completes, apply your LoRA weights using the inference endpoint at fal-ai/qwen-image-layered/lora:
```python
import fal_client

result = fal_client.subscribe(
    "fal-ai/qwen-image-layered/lora",
    arguments={
        "image_url": "https://your-image.png",
        "num_layers": 4,
        "loras": [{"path": lora_url}],
    },
)
layers = result["images"]  # Array of decomposed RGBA layer images
```
The inference endpoint accepts up to 3 LoRAs simultaneously, which are merged to produce the final decomposition. See the inference API documentation for complete parameter details.
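To blend adaptations, pass multiple entries in the `loras` array. The sketch below assumes a per-LoRA `scale` field, which is common on fal LoRA inference endpoints but should be confirmed against the inference docs; the two URL variables are hypothetical outputs of earlier training runs:

```python
# Blend two trained adapters. The `scale` weighting field is an assumption —
# verify the exact field name in the inference API documentation.
result = fal_client.subscribe(
    "fal-ai/qwen-image-layered/lora",
    arguments={
        "image_url": "https://your-image.png",
        "num_layers": 4,
        "loras": [
            {"path": product_lora_url, "scale": 1.0},
            {"path": shadow_lora_url, "scale": 0.6},
        ],
    },
)
```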
Error Handling
```python
import logging
import time
from typing import Dict, Optional

import fal_client


def safe_train_lora(data_url: str, max_retries: int = 3, **kwargs) -> Optional[Dict]:
    for attempt in range(max_retries):
        try:
            return train_layered_lora(data_url, **kwargs)
        except fal_client.exceptions.ValidationError as e:
            # Malformed input will not succeed on retry; fail fast
            logging.error(f"Invalid input: {e}")
            return None
        except fal_client.exceptions.RateLimitError:
            # Exponential backoff: 60s, 120s, 240s, ...
            wait_time = 2 ** attempt * 60
            logging.warning(f"Rate limited, waiting {wait_time}s")
            time.sleep(wait_time)
        except Exception:
            if attempt == max_retries - 1:
                raise
            time.sleep(30)
    return None
```
Common Errors:
| Error | Cause | Solution |
|---|---|---|
| `ValidationError` | Malformed zip structure | Verify file naming follows the `ROOT_start`/`ROOT_end` pattern |
| `ValidationError` | Missing captions | Add `.txt` files or provide `default_caption` |
| `ValidationError` | Unsupported format | Convert images to PNG or WebP |
| `RateLimitError` | Too many concurrent requests | Implement exponential backoff |
For additional error patterns, see the FAQ documentation.
Production Considerations
Asynchronous Processing: For training jobs, avoid blocking on completion. Use the Queue API to submit jobs and webhooks to receive results:
```python
handler = fal_client.submit(
    "fal-ai/qwen-image-layered-trainer",
    arguments=arguments,
    webhook_url="https://your-server.com/webhook",
)
# submit() returns immediately; results are delivered to the webhook
```
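On the receiving side, a minimal handler might look like the sketch below. It uses FastAPI purely for illustration, and the payload field names (`request_id`, `payload`) are assumptions rather than the documented fal webhook schema; verify them against the webhook documentation before relying on them:

```python
# Hypothetical webhook receiver — the body field names are assumptions,
# not the confirmed fal payload schema.
from fastapi import FastAPI, Request

app = FastAPI()

@app.post("/webhook")
async def handle_training_result(request: Request):
    body = await request.json()
    request_id = body.get("request_id")   # assumed field name
    result = body.get("payload") or {}    # assumed field name
    lora_file = result.get("diffusers_lora_file", {})
    print(f"Training {request_id} finished: {lora_file.get('url')}")
    return {"ok": True}
```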
Storage: Trained LoRA weights are hosted on fal infrastructure and accessible via the returned URL. For production deployments requiring persistent storage, download and host the weights in your own infrastructure.
Layer Complexity: Models trained with 2 to 3 layers converge faster than those handling 6 to 8 layer decompositions. Start with simpler structures when your use case permits.
Dataset Size: Keep individual training groups under 50 images for optimal performance. Larger datasets should be distributed across multiple training runs. Training time scales linearly with step count, so a 2000-step job takes approximately twice as long as a 1000-step job.
Training Data Best Practices
Effective layer decomposition training depends on high-quality input data. The model learns decomposition patterns from the relationships between your base images and their corresponding layers.
Caption Strategy: Descriptive, task-oriented captions outperform content-specific descriptions. "Product photography with transparent background and shadow layer" teaches decomposition better than "red sneaker on white." When training across diverse content, use the default_caption parameter to provide consistent task framing.
Layer Consistency: Maintain consistent semantic meaning for each layer position across your training set. If _end.png represents the primary subject in one image group, it should represent the primary subject in all groups. This consistency helps the model learn predictable decomposition behavior.
Resolution Considerations: The model accepts various input resolutions. While higher resolutions preserve more detail, they also increase training time. A resolution of 768x768 provides reasonable quality for most use cases. Match your training resolution to your expected inference resolution for best results.
Validation Set: Consider holding out 10-15% of your data to evaluate trained LoRA quality before production deployment. Compare decomposition results against your held-out ground truth to assess whether additional training steps would improve output.
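A simple way to carve out that hold-out set is to split at the group level before zipping, so no group's layers leak across the split. A minimal sketch (the root names are placeholders matching the naming convention above):

```python
import random

def split_groups(group_roots, holdout_fraction=0.15, seed=42):
    """Split image-group root names into (train, validation) lists."""
    roots = sorted(group_roots)
    random.Random(seed).shuffle(roots)  # deterministic shuffle for reproducibility
    cutoff = max(1, int(len(roots) * holdout_fraction))
    return roots[cutoff:], roots[:cutoff]

train_roots, val_roots = split_groups(["sneaker", "lamp", "chair", "mug", "table"])
print(f"train: {train_roots}, validation: {val_roots}")
```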
References
1. Yang, J., Liu, Q., Li, Y., et al. "Generative Image Layer Decomposition with Visual Effects." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025. https://openaccess.thecvf.com/content/CVPR2025/papers/Yang_Generative_Image_Layer_Decomposition_with_Visual_Effects_CVPR_2025_paper.pdf
