fal-ai/trellis-2-lora-trainer
Each request is billed a one-time dataset-preparation charge of $0.10023 per input object (10-object minimum, waived when you reuse a prepared dataset) plus a per-denoiser training charge of $8.34 / $3.25 / $3.38 per 1,000 steps for sparse_structure / geometry / texture, scaled by your rank (×0.5 / ×1 / ×2 for 16 / 32 / 64) and resolution (×1 / ×2 for 512 / 1024).
TRELLIS.2 LoRA Trainer
Overview
`trellis-2-lora-trainer` trains one or more TRELLIS.2 LoRA adapters from a dataset of 3D assets. The trained adapters can be used with `trellis-2-lora` to customize image-to-3D generation.
Each request can train any combination of the three TRELLIS.2 stages:
- `sparse_structure` - coarse 3D structure and silhouette.
- `geometry` - detailed shape and surface form.
- `texture` - color, material, and surface appearance.
When more than one stage is requested, the dataset is preprocessed once and then every selected stage is trained in parallel from that same processed data.
The trainer accepts either raw 3D assets or a reusable preprocessed dataset returned by a previous run. Every successful run returns the trained adapters and the preprocessed dataset used for training.
Key features:
- Trains stage-specific TRELLIS.2 LoRA adapters.
- Trains 1-3 stages per request, in parallel, from a single preprocessing pass.
- Accepts raw `.zip` archives of 3D files.
- Automatically preprocesses raw assets into a reusable TRELLIS.2 training dataset.
- Returns the preprocessed dataset so later runs can skip the expensive first pass.
- Lets you train sparse-structure, geometry, and texture adapters together or independently from the same processed data.
Dataset Format
Provide a URL to a `.zip` archive through `data_url`.
Raw datasets may contain supported 3D assets in the archive root or nested folders. Supported file types:
- `.glb`
- `.gltf`
- `.obj`
- `.fbx`
- `.stl`
- `.ply`
- `.blend`
The archive may include an optional `manifest.csv`. This endpoint does not expose captions, prompts, or trigger phrases; training is based on the 3D assets and the image-condition views generated during preprocessing.
You may also pass a TRELLIS.2 preprocessed dataset zip returned as `preprocessed_data_file` by an earlier run.
Do not mix raw 3D assets and a preprocessed dataset in the same archive. The request expects one or the other.
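As a rough illustration of the raw-dataset layout, the sketch below packs every supported file under a local folder into a `.zip`, preserving nested folders. The function name `build_raw_archive` and the folder paths are hypothetical, and only Python's standard library is used. It copies only files with the listed extensions, so sidecar files that a `.gltf` or `.obj` may reference (textures, `.bin`, `.mtl`) are not included; whether the trainer needs them is not stated here, so adjust the filter if your assets depend on them.

```python
import zipfile
from pathlib import Path

# Extensions listed as supported in the Dataset Format section.
SUPPORTED_EXTENSIONS = {".glb", ".gltf", ".obj", ".fbx", ".stl", ".ply", ".blend"}


def build_raw_archive(asset_dir: str, zip_path: str) -> list[str]:
    """Zip every supported 3D file under asset_dir, keeping nested folders.

    Returns the archive-relative names that were written.
    """
    root = Path(asset_dir)
    written = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            # Extension match is case-insensitive (an assumption of this sketch).
            if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS:
                name = path.relative_to(root).as_posix()
                archive.write(path, arcname=name)
                written.append(name)
    if not written:
        raise ValueError(f"no supported 3D assets found under {asset_dir}")
    return written
```

The resulting archive still has to be uploaded somewhere reachable so that its URL can be passed as `data_url`.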
Input Parameters Reference
Dataset
`data_url` (required)
Type: `string`
URL to a `.zip` archive containing raw 3D assets or a TRELLIS.2 preprocessed dataset zip returned by this trainer.
On the first run with raw assets, the trainer preprocesses the dataset and returns `preprocessed_data_file`. Use that returned file as `data_url` for later runs to train other LoRA types without repeating the raw-asset processing pass.
The archive download limit is 8192 MB, and the extracted dataset is capped at 8 GB.
LoRA Target
`denoisers`
Type: `array of string`
Default: `["texture"]`
Allowed values (each entry): `"sparse_structure"`, `"geometry"`, `"texture"`
Constraints: 1-3 unique entries.
TRELLIS.2 stages to fine-tune. One LoRA adapter is trained per entry. When you list more than one stage, all of them are trained in parallel from the same preprocessed dataset, and every shared training parameter below applies to each.
| Value | What it learns | Use the result in |
|---|---|---|
| `sparse_structure` | Coarse 3D occupancy, silhouette, object proportions | `sparse_structure_lora_url` |
| `geometry` | Detailed shape, surface form, structural details | `geometry_lora_url` |
| `texture` | Colors, materials, surface appearance | `texture_lora_url` |
Training Parameters
These parameters are shared across every selected denoiser.
`resolution`
Type: `integer`
Default: `512`
Allowed values: `512`, `1024`
Training resolution. `1024` produces higher-detail geometry and texture adapters but is slower and more expensive. Sparse-structure training is resolution-independent (it uses the same config at both values); geometry and texture have dedicated `512`/`1024` configs.
The dataset is preprocessed at this resolution, so when you reuse a `preprocessed_data_file` you must request the same `resolution` it was prepared at. Each returned adapter echoes its `resolution`; use the same value in `trellis-2-lora` inference.
`rank`
Type: `integer`
Default: `32`
Allowed values: `16`, `32`, `64`
LoRA capacity. Higher rank can learn more detail, but it can also make the adapter more dataset-specific.
| Value | Use Case |
|---|---|
| `16` | Smaller adapter capacity; useful for simple or narrow changes |
| `32` | Balanced default |
| `64` | More capacity for complex geometry or appearance patterns |
`learning_rate`
Type: `number`
Default: `0.0001`
Range: `0.000001` to `0.01`
Optimization step size for LoRA training. The default is the recommended starting point unless you have a reason to tune it.
`training_steps`
Type: `integer`
Default: `1000`
Range: `100` to `10000`
Number of training steps (applied to each selected denoiser).
Increase steps if the adapter effect is too weak. Reduce steps if the adapter overfits and forces outputs to look too much like individual training assets.
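The constraints in this reference can be checked locally before submitting a request. The following is a minimal sketch, not the endpoint's own validation: the function name `validate_request` is hypothetical, and it assumes the documented ranges are inclusive at both ends.

```python
ALLOWED_DENOISERS = {"sparse_structure", "geometry", "texture"}
ALLOWED_RESOLUTIONS = {512, 1024}
ALLOWED_RANKS = {16, 32, 64}


def validate_request(request: dict) -> list[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []

    data_url = request.get("data_url")
    if not isinstance(data_url, str) or not data_url:
        problems.append("data_url is required and must be a non-empty string")

    # Defaults below mirror the documented defaults for omitted fields.
    denoisers = request.get("denoisers", ["texture"])
    unknown = [d for d in denoisers if d not in ALLOWED_DENOISERS]
    if unknown:
        problems.append(f"unknown denoisers: {unknown}")
    if not 1 <= len(denoisers) <= 3 or len(set(denoisers)) != len(denoisers):
        problems.append("denoisers must contain 1-3 unique entries")

    if request.get("resolution", 512) not in ALLOWED_RESOLUTIONS:
        problems.append("resolution must be 512 or 1024")
    if request.get("rank", 32) not in ALLOWED_RANKS:
        problems.append("rank must be 16, 32, or 64")

    # Ranges are assumed inclusive.
    if not 0.000001 <= request.get("learning_rate", 0.0001) <= 0.01:
        problems.append("learning_rate must be between 0.000001 and 0.01")
    steps = request.get("training_steps", 1000)
    if not isinstance(steps, int) or not 100 <= steps <= 10000:
        problems.append("training_steps must be an integer between 100 and 10000")

    return problems
```

Returning a list of messages rather than raising on the first problem lets a caller report everything that needs fixing in one pass.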
Outputs
`adapters`
Type: `array of object`
One entry per denoiser that trained successfully, in the order they were requested. Each entry has:
| Field | Type | Description |
|---|---|---|
| `denoiser` | `string` | The stage this adapter was trained for (`sparse_structure`, `geometry`, or `texture`). Use it to pick the matching `trellis-2-lora` field. |
| `resolution` | `integer` | Resolution this adapter was trained at (`512` or `1024`). Use the same value in `trellis-2-lora` inference. |
| `lora_file` | `file` | Trained `.safetensors` LoRA adapter for this stage. |
| `rank` | `integer` | LoRA rank used for training, echoed from the request. |
| `training_steps` | `integer` | Number of training steps run, echoed from the request. |
| `learning_rate` | `number` | Learning rate used, echoed from the request. |
`failed`
Type: `array of object`
Denoisers whose training did not complete. Empty when every requested denoiser succeeded. Each entry has `denoiser` and a short `error` message. Because successful adapters are still returned, check this list to see which stages need to be retried. If every requested denoiser fails, the request itself fails.
`preprocessed_data_file`
Type: `file`
Reusable TRELLIS.2 preprocessed dataset zip used for training.
For a first pass with raw assets, this is the processed dataset created from your input archive. Save this URL and reuse it as `data_url` when training additional LoRA types from the same data.
For a run that already used a preprocessed dataset, this output points to the reusable dataset for that run.
No validation previews, metrics, or comparison panels are returned by this endpoint.
Example output for a run that trained two stages:
```json
{
  "adapters": [
    {
      "denoiser": "geometry",
      "resolution": 512,
      "lora_file": { "url": "https://example.com/geometry_lora.safetensors" },
      "rank": 32,
      "training_steps": 1000,
      "learning_rate": 0.0001
    },
    {
      "denoiser": "texture",
      "resolution": 512,
      "lora_file": { "url": "https://example.com/texture_lora.safetensors" },
      "rank": 32,
      "training_steps": 1000,
      "learning_rate": 0.0001
    }
  ],
  "failed": [],
  "preprocessed_data_file": { "url": "https://example.com/trellis_2_preprocessed_dataset.zip" }
}
```
How the Training Works
Pipeline Overview
- Dataset detection - the trainer downloads and extracts the `.zip`. If it finds a valid TRELLIS.2 preprocessed dataset, it reuses it. Otherwise, it treats the archive as raw 3D assets.
- Preprocessing - raw 3D assets are converted into a TRELLIS.2 training dataset at the requested `resolution` (512 or 1024), once per request regardless of how many stages are requested.
- Training - the trainer validates the files needed for each selected `denoisers` entry and trains one LoRA adapter per stage, all in parallel.
- Output - the trained `.safetensors` adapters and the reusable preprocessed dataset zip are uploaded and returned.
What Happens to Raw Assets During Preprocessing
Archive extraction: The `.zip` is unpacked safely. Nested folders are allowed. Hidden files and macOS metadata folders are ignored during asset discovery.
Asset discovery: Supported 3D asset files are sorted and registered in `metadata.csv`. Each asset is identified by a content hash so the processed files can be matched back to the source object.
Mesh processing: Blender loads each supported asset, extracts mesh geometry, normalizes the object into a common coordinate space, and converts the mesh into TRELLIS.2-ready geometry data.
Material processing: When material data is present, Blender extracts PBR-style appearance information such as color and material attributes so the texture LoRA can learn surface appearance.
Condition-view rendering: The processor renders 16 image-condition views per object. These rendered views are used during training so the LoRA learns from image-conditioned examples, matching how `trellis-2-lora` is used at inference time.
Model-ready training features: The processor writes the feature files needed by each LoRA stage:
- sparse-structure training data
- geometry training data
- texture/material training data
- rendered condition views
- dataset metadata and summary files
The returned `preprocessed_data_file` contains these processed files, including `metadata.csv` and `prepared_dataset.json`.
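To preview which files asset discovery is likely to pick up, and how many objects a dataset contains, something along the following lines can be run on a locally extracted archive. This is only an approximation of the behavior described above, not the trainer's code: the function name `discover_assets` is hypothetical, SHA-256 is an assumed stand-in for the unspecified content hash, and treating `__MACOSX` and dot-prefixed names as the ignored entries is likewise an assumption.

```python
import hashlib
from pathlib import Path

SUPPORTED_EXTENSIONS = {".glb", ".gltf", ".obj", ".fbx", ".stl", ".ply", ".blend"}


def discover_assets(extracted_dir: str) -> list[tuple[str, str]]:
    """Return (relative_path, sha256_hex) for each supported asset, sorted by path."""
    root = Path(extracted_dir)
    found = []
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        # Skip hidden files/folders and the __MACOSX folder that macOS adds to zips.
        if any(part.startswith(".") or part == "__MACOSX" for part in rel.parts):
            continue
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS:
            # SHA-256 is an assumption; the actual hash function is not documented.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            found.append((rel.as_posix(), digest))
    return sorted(found)
```

Two files with identical bytes produce the same digest, which makes accidental duplicates in an archive easy to spot before uploading.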
Reusing the Preprocessed Dataset
The first raw-asset run is both a preprocessing pass and a training run. The returned `preprocessed_data_file` is the key artifact for efficient follow-up training.
Example first pass (train all three stages at once):
```json
{
  "data_url": "https://example.com/my_3d_assets.zip",
  "denoisers": ["sparse_structure", "geometry", "texture"],
  "rank": 32,
  "learning_rate": 0.0001,
  "training_steps": 1000
}
```
Example follow-up run that trains an additional stage from the returned preprocessed dataset:
```json
{
  "data_url": "https://example.com/trellis_2_preprocessed_dataset.zip",
  "denoisers": ["geometry"],
  "rank": 32,
  "learning_rate": 0.0001,
  "training_steps": 1000
}
```
You can train any subset of `sparse_structure`, `geometry`, and `texture` in a single request, or split them across requests that reuse the same `preprocessed_data_file`.
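A follow-up request can be derived from a previous result instead of being written by hand. The sketch below uses a hypothetical helper, `follow_up_request`, that reuses `preprocessed_data_file.url`, carries over the `resolution` echoed by the returned adapters (since a reused dataset must be trained at the resolution it was prepared at), and by default retries whatever is listed under `failed`. It assumes the result has the shape documented under Outputs.

```python
from typing import Optional


def follow_up_request(
    result: dict,
    denoisers: Optional[list] = None,
    resolution: Optional[int] = None,
    **training_params,
) -> dict:
    """Build a request that reuses result["preprocessed_data_file"].

    If denoisers is None, the stages listed under result["failed"] are retried.
    """
    if denoisers is None:
        denoisers = [entry["denoiser"] for entry in result.get("failed", [])]
    if not denoisers:
        raise ValueError("nothing to train: no denoisers given and none failed")

    if resolution is None:
        # Read the resolution echoed by the adapters that did train.
        echoed = {a["resolution"] for a in result.get("adapters", []) if "resolution" in a}
        if len(echoed) != 1:
            raise ValueError("pass resolution explicitly; it could not be read from the adapters")
        resolution = echoed.pop()

    return {
        "data_url": result["preprocessed_data_file"]["url"],
        "denoisers": list(denoisers),
        "resolution": resolution,
        **training_params,  # e.g. rank, learning_rate, training_steps
    }
```

For example, `follow_up_request(result, denoisers=["sparse_structure"], training_steps=2000)` yields a request that trains only the sparse-structure stage from the already-prepared data.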
The Three LoRA Types
Sparse-Structure LoRA
Train with:
```json
{
  "data_url": "https://example.com/trellis_2_preprocessed_dataset.zip",
  "denoisers": ["sparse_structure"]
}
```
Use when the broad shape, object category, or silhouette should be customized. This is the earliest stage, so it can strongly influence whether the generated model has the right overall proportions.
At inference, pass the adapter's `lora_file.url` as `sparse_structure_lora_url`.
Geometry LoRA
Train with:
```json
{
  "data_url": "https://example.com/trellis_2_preprocessed_dataset.zip",
  "denoisers": ["geometry"]
}
```
Use when the important change is detailed 3D form: structural traits, surface relief, object-specific geometry, or recurring shape details.
At inference, pass the adapter's `lora_file.url` as `geometry_lora_url`.
Texture LoRA
Train with:
```json
{
  "data_url": "https://example.com/trellis_2_preprocessed_dataset.zip",
  "denoisers": ["texture"]
}
```
Use when the important change is appearance: color palette, material finish, product texture, stylized surface treatment, or branded visual details.
At inference, pass the adapter's `lora_file.url` as `texture_lora_url`.
Using Trained LoRAs During Inference
You can use trained adapters independently or together with `trellis-2-lora`. Match each adapter's `denoiser` to the inference field with the same prefix: `sparse_structure_lora_url`, `geometry_lora_url`, or `texture_lora_url`.
Texture only:
```json
{
  "image_url": "https://example.com/input.png",
  "texture_lora_url": "https://example.com/texture_lora.safetensors"
}
```
Geometry and texture:
```json
{
  "image_url": "https://example.com/input.png",
  "geometry_lora_url": "https://example.com/geometry_lora.safetensors",
  "texture_lora_url": "https://example.com/texture_lora.safetensors"
}
```
Full stack:
```json
{
  "image_url": "https://example.com/input.png",
  "sparse_structure_lora_url": "https://example.com/sparse_structure_lora.safetensors",
  "geometry_lora_url": "https://example.com/geometry_lora.safetensors",
  "texture_lora_url": "https://example.com/texture_lora.safetensors"
}
```
If you trained all three from the same `preprocessed_data_file` (or in one multi-stage request), they are usually easier to combine because every adapter learned from the same processed objects and condition views.
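Because each returned adapter names its own `denoiser`, the inference payload can be assembled mechanically, which avoids passing an adapter to the wrong field. The helper below, `build_inference_args`, is a hypothetical sketch that maps each entry of the trainer's `adapters` output to its `<denoiser>_lora_url` field. It leaves out the resolution setting because the name of that inference field is not given here.

```python
INFERENCE_FIELDS = {
    "sparse_structure_lora_url",
    "geometry_lora_url",
    "texture_lora_url",
}


def build_inference_args(image_url: str, adapters: list) -> dict:
    """Map trainer `adapters` entries onto trellis-2-lora input fields."""
    args = {"image_url": image_url}
    for adapter in adapters:
        field = f"{adapter['denoiser']}_lora_url"
        if field not in INFERENCE_FIELDS:
            raise ValueError(f"unexpected denoiser: {adapter['denoiser']!r}")
        if field in args:
            raise ValueError(f"more than one adapter for {field}")
        args[field] = adapter["lora_file"]["url"]
    return args
```

Passing only a subset of the adapters (for example, just the texture entry) produces the single-stage payloads shown earlier in this section.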
Recommended Starting Configuration
Train the full adapter set in a single request:
```json
{
  "data_url": "https://example.com/my_3d_assets.zip",
  "denoisers": ["sparse_structure", "geometry", "texture"],
  "rank": 32,
  "learning_rate": 0.0001,
  "training_steps": 1000
}
```
Keep `rank`, `learning_rate`, and `training_steps` the same across stages at first so you can compare the effect of each stage cleanly. If you prefer to iterate on one stage at a time, run single-entry `denoisers` requests that reuse the first run's `preprocessed_data_file.url`.
Tips for Good Results
- Use clean assets that represent the same product family, object type, or visual style you want the adapter to learn.
- Remove broken files, empty meshes, and unrelated objects from the archive.
- For texture LoRAs, include assets with useful material and texture information. Geometry-only files with little appearance data will not teach strong texture behavior.
- For sparse-structure and geometry LoRAs, prioritize complete meshes with reliable proportions and shape detail.
- If outputs barely change at inference, try more training steps or a higher rank.
- If outputs copy the training set too aggressively, try fewer training steps, a lower rank, or a more varied dataset.
- Use the same inference image and seed when comparing adapters trained with different settings.
- Each returned adapter carries its own `denoiser`; the `lora_file` must be used in the matching inference field.
Common Pitfalls
- Training only `texture` and expecting a new silhouette or object structure.
- Passing a `texture` LoRA to `geometry_lora_url`, or any other mismatched inference field.
- Re-uploading raw assets for every LoRA type instead of reusing `preprocessed_data_file` (or training the stages together in one request).
- Mixing raw assets and a preprocessed dataset in the same archive.
- Expecting prompt captions or trigger phrases to affect training. This endpoint does not expose text-caption training controls.
- Training texture adapters from assets that have missing or low-quality material information.
Billing
Quick look: you pay a one-time dataset-preparation charge of $0.10023 per input object (minimum 10 objects), plus a training charge for each denoiser you train of about $8.34 / $3.25 / $3.38 per 1,000 steps for `sparse_structure` / `geometry` / `texture` — then multiplied by your `rank` (×0.5 / ×1 / ×2 for 16 / 32 / 64) and `resolution` (×1 / ×2 for 512 / 1024). Reusing a prepared dataset removes the preparation charge, and only denoisers that finish successfully are billed. Example: preparing 20 objects and training all three denoisers for 2,000 steps each at rank 32 / resolution 512 is about $31.93 (the same job at rank 64 / resolution 1024 is about $121.69).
Your total is the dataset-preparation cost plus the training cost of every denoiser you request:
```text
total = preparation + training
```
Dataset preparation (charged once per request)
```text
preparation = max(10, number_of_objects) × $0.10023
```
- Billed on the number of 3D objects in your dataset, with a 10-object minimum — a 4-object dataset is billed as 10 objects (= $1.0023).
- $0.00 when you pass an already-prepared `preprocessed_data_file`, because no preparation is redone.
Training (charged per denoiser)
For each denoiser you train:
```text
training_denoiser = training_steps × rate_per_step × rank_multiplier × resolution_multiplier
```
Base rate per step (at rank 32, resolution 512):
| Denoiser | Rate per step | Per 1,000 steps |
|---|---|---|
| `sparse_structure` | $0.008337 | $8.337 |
| `geometry` | $0.003246 | $3.246 |
| `texture` | $0.003378 | $3.378 |
The training cost is then scaled by a rank multiplier and a resolution multiplier:
| `rank` | Multiplier |
|---|---|
| 16 | ×0.5 |
| 32 | ×1.0 |
| 64 | ×2.0 |
| `resolution` | Multiplier |
|---|---|
| 512 | ×1.0 |
| 1024 | ×2.0 |
- The multipliers stack: training at rank 64 and resolution 1024 costs ×4.0 the base rate.
- A request with several denoisers sums their individual costs; `sparse_structure` is the most expensive per step (~2.5× `texture`).
- `training_steps` ranges from 100 to 10,000, and only denoisers that complete successfully are billed; any listed under `failed` are not charged.
Example
Prepare 20 objects, then train all three denoisers for 2,000 steps each at rank 32 and resolution 512 (both multipliers ×1.0):
| Item | Calculation | Cost |
|---|---|---|
| Preparation | 20 × $0.10023 | $2.0046 |
| `sparse_structure` | 2,000 × $0.008337 × 1.0 × 1.0 | $16.6740 |
| `geometry` | 2,000 × $0.003246 × 1.0 × 1.0 | $6.4920 |
| `texture` | 2,000 × $0.003378 × 1.0 × 1.0 | $6.7560 |
| Total | | $31.93 |
The same job at rank 64 and resolution 1024 scales each training row by ×4.0 — training becomes $119.69 and the total is about $121.69.
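The billing formulas translate directly into a small estimator. The sketch below encodes the rates and multipliers from the tables in this section; the function name `estimate_cost` and its parameters are hypothetical, and it assumes every requested denoiser completes, since failed denoisers are not billed.

```python
PREP_RATE_PER_OBJECT = 0.10023
MIN_BILLED_OBJECTS = 10
# Base rate per step at rank 32, resolution 512.
RATE_PER_STEP = {
    "sparse_structure": 0.008337,
    "geometry": 0.003246,
    "texture": 0.003378,
}
RANK_MULTIPLIER = {16: 0.5, 32: 1.0, 64: 2.0}
RESOLUTION_MULTIPLIER = {512: 1.0, 1024: 2.0}


def estimate_cost(
    denoisers: list,
    training_steps: int,
    rank: int = 32,
    resolution: int = 512,
    num_objects: int = 0,
    reuse_prepared: bool = False,
) -> float:
    """Estimate the total charge in USD, assuming every denoiser succeeds."""
    # Preparation is charged once per request and waived for a reused dataset.
    if reuse_prepared:
        preparation = 0.0
    else:
        preparation = max(MIN_BILLED_OBJECTS, num_objects) * PREP_RATE_PER_OBJECT

    scale = RANK_MULTIPLIER[rank] * RESOLUTION_MULTIPLIER[resolution]
    training = sum(training_steps * RATE_PER_STEP[d] * scale for d in denoisers)
    return preparation + training


stages = ["sparse_structure", "geometry", "texture"]
print(round(estimate_cost(stages, 2000, num_objects=20), 2))  # 31.93
print(round(estimate_cost(stages, 2000, rank=64, resolution=1024, num_objects=20), 2))  # 121.69
```

The two printed values reproduce the worked example: $31.93 at rank 32 / resolution 512 and $121.69 at rank 64 / resolution 1024.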