Trellis 2 Image to 3D Parameter Guide


Trellis 2's 3D model quality depends on parameter configuration. This guide covers the two-stage generation pipeline, verified API parameters, and input image optimization.

Last updated: 12/20/2025 · Edited by: Zachary Roth · Read time: 4 minutes

Configuring Trellis 2 for Production 3D Assets

Trellis 2 on fal transforms single images into production-ready 3D models through a parameter-driven pipeline. Unlike text-to-image systems, Trellis 2 requires no prompts. The model interprets visual information from your input image and reconstructs geometry and texture through a two-stage generation process.

The underlying architecture employs Structured LATent (SLAT) representation, a unified framework that encodes both geometric structure and surface appearance in a sparse 3D grid [1]. This representation enables the model to capture structural geometry and textural appearance while maintaining flexibility during decoding. Understanding how each parameter influences output quality allows you to optimize for specific use cases rather than accepting default behavior.

Understanding the Two-Stage Pipeline

Trellis 2 processes images through two sequential generation stages, each with configurable guidance and sampling parameters:

Sparse Structure (SS) Generation establishes the initial 3D structure from your 2D input. The model analyzes the image to create a sparse voxel grid representing the object's basic geometry. Parameters at this stage control how strictly the 3D structure adheres to the input image.

Structured Latent (SLAT) Generation refines the sparse structure by adding geometric detail and surface appearance. This stage populates the voxel grid with latent vectors that encode both shape refinement and texture information. The SLAT representation integrates features extracted from a vision foundation model (DINOv2), which enables the system to infer occluded geometry and surface properties not directly visible in the input image [1].

The output is a GLB file containing mesh geometry and baked textures, ready for use in standard 3D applications including Three.js, Unity, Unreal Engine, and Blender. The GLB format ensures broad compatibility without requiring format conversion.
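
Once a generation completes, it can help to verify the GLB programmatically before handing it to an art pipeline. The sketch below is one way to do that in Python; it assumes the trimesh and requests libraries are installed (neither is part of the fal API), and uses the model_glb URL from the API response shown later in this guide.

import requests
import trimesh

glb_url = "https://..."  # model_glb URL from the API response (placeholder)

# Download the generated asset to disk
with open("asset.glb", "wb") as f:
    f.write(requests.get(glb_url).content)

# trimesh loads a GLB as a Scene containing one or more meshes
scene = trimesh.load("asset.glb", force="scene")
for name, geom in scene.geometry.items():
    print(name, "vertices:", len(geom.vertices), "faces:", len(geom.faces))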


API Parameters

The fal Trellis API exposes the following parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| image_url | string | required | URL of the input image |
| seed | integer | random | Random seed for reproducibility |
| ss_guidance_strength | float | 7.5 | Guidance strength for sparse structure generation |
| ss_sampling_steps | integer | 12 | Sampling steps for sparse structure generation |
| slat_guidance_strength | float | 3.0 | Guidance strength for structured latent generation |
| slat_sampling_steps | integer | 12 | Sampling steps for structured latent generation |
| mesh_simplify | float | 0.95 | Mesh simplification factor |
| texture_size | enum | 1024 | Texture resolution (512, 1024, or 2048) |

Resolution tiers affect both output quality and pricing: 512p ($0.25), 1024p ($0.30), and 1536p ($0.35) per generation.
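
Every field in the table can be set through the arguments dictionary. The example below restates the documented defaults explicitly (plus a fixed seed) so the full request shape is visible; treat the values as a starting point, not a recommendation.

import fal_client

result = fal_client.subscribe(
    "fal-ai/trellis-2",
    arguments={
        "image_url": "https://example.com/product-image.png",
        "seed": 42,                     # fixed for reproducibility
        "ss_guidance_strength": 7.5,    # sparse structure stage
        "ss_sampling_steps": 12,
        "slat_guidance_strength": 3.0,  # structured latent stage
        "slat_sampling_steps": 12,
        "mesh_simplify": 0.95,
        "texture_size": 1024,
    },
)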

Parameter Tuning Guidelines

Guidance Strength

The two guidance parameters control how closely each stage adheres to its inputs:

SS Guidance Strength (default: 7.5) affects how strictly the initial 3D structure follows the input image. Higher values enforce closer adherence to visible geometry, while lower values allow more interpretive freedom. For objects with clear, well-defined shapes, values of 7-9 work well. For ambiguous inputs, lower values (5-7) may produce more coherent results.

SLAT Guidance Strength (default: 3.0) controls refinement intensity during the second stage. This parameter has a lower default because excessive guidance at this stage can introduce artifacts. Increase cautiously to 4-5 for objects requiring fine geometric detail.

Sampling Steps

Each sampling step refines the generation, but with diminishing returns. The defaults (12 steps for each stage) balance quality and generation time. Increasing to 16-20 steps may improve complex geometry but extends processing time proportionally.

Mesh Simplification

The mesh_simplify parameter (default: 0.95) controls polygon reduction in the final mesh. Values closer to 1.0 preserve more geometric detail; lower values reduce polygon count for real-time applications. For game assets targeting mobile, consider values of 0.85-0.90. Desktop and console applications can typically use the default or higher.

Note that aggressive simplification (values below 0.8) may collapse fine details like fingers, facial features, or thin structural elements. Test with your specific asset types to find acceptable thresholds.
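
To measure what a given mesh_simplify value actually costs in geometry, compare face counts across two runs that differ only in that parameter. A minimal sketch, again assuming trimesh and requests are available:

import fal_client
import requests
import trimesh

def face_count(simplify):
    # Identical inputs except mesh_simplify; the fixed seed isolates the change
    result = fal_client.subscribe(
        "fal-ai/trellis-2",
        arguments={
            "image_url": "https://example.com/product-image.png",
            "seed": 42,
            "mesh_simplify": simplify,
        },
    )
    with open("tmp.glb", "wb") as f:
        f.write(requests.get(result["model_glb"]["url"]).content)
    scene = trimesh.load("tmp.glb", force="scene")
    return sum(len(g.faces) for g in scene.geometry.values())

print("mesh_simplify=0.95:", face_count(0.95))
print("mesh_simplify=0.85:", face_count(0.85))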

Texture Resolution

Available options are 512, 1024, and 2048 pixels. Higher resolutions capture finer surface detail but increase file size and generation time. For preview work, 512 suffices. Production assets typically use 1024 or 2048.

Input Image Preparation

Trellis 2 results correlate directly with input image quality. The model performs best with images optimized for 3D reconstruction.

Optimal Input Characteristics:

  • Clean, well-lit subjects against neutral or solid backgrounds
  • Objects centered in frame with minimal perspective distortion
  • Resolution of at least 512x512 pixels (1024x1024 recommended)
  • Clear separation between subject and background
  • Consistent lighting without harsh shadows or specular highlights

Problematic Input Patterns:

  • Cluttered backgrounds that confuse edge detection
  • Extreme perspective angles or fish-eye distortion
  • Low resolution or heavily compressed images
  • Subjects with transparent or highly reflective surfaces
  • Multiple overlapping objects in frame

For product photography, a simple white or light gray background improves results significantly. The model extracts depth cues from shading and edge information, so images with clear tonal separation produce more accurate geometry.
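
If your source photos fall short of these criteria, a small preprocessing pass often pays for itself. The sketch below is one possible pipeline, assuming Pillow for image handling and the rembg library for background removal; neither is part of the Trellis API.

from PIL import Image
from rembg import remove  # pip install rembg

img = Image.open("raw-photo.jpg").convert("RGBA")

# Remove the background so the subject is cleanly separated
img = remove(img)

# Upscale small images toward the recommended 1024x1024 working size
if min(img.size) < 1024:
    scale = 1024 / min(img.size)
    img = img.resize((round(img.width * scale), round(img.height * scale)),
                     Image.Resampling.LANCZOS)

# Composite onto a neutral light-gray background for clear tonal separation
canvas = Image.new("RGBA", img.size, (235, 235, 235, 255))
canvas.alpha_composite(img)
canvas.convert("RGB").save("prepared.png")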

Basic API Usage

import fal_client

# Minimal request: all tuning parameters fall back to their defaults
result = fal_client.subscribe(
    "fal-ai/trellis-2",
    arguments={
        "image_url": "https://example.com/product-image.png"
    }
)

# The response includes a URL to the generated GLB file
print(result["model_glb"]["url"])

For production systems, submit jobs through the queue API with webhooks (fal_client.submit() in the Python client) rather than blocking on subscribe(). See the fal queue documentation for implementation details.
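
A queue-based sketch with the Python client might look like the following. It assumes fal_client.submit() accepts a webhook_url and that the callback endpoint (here a hypothetical your-app.example.com URL) is publicly reachable; confirm the current signature against the queue documentation.

import fal_client

# Submit without blocking; fal calls the webhook when generation finishes
handler = fal_client.submit(
    "fal-ai/trellis-2",
    arguments={"image_url": "https://example.com/product-image.png"},
    webhook_url="https://your-app.example.com/fal/trellis-callback",
)
print("queued request:", handler.request_id)

# Later (e.g. inside the webhook handler), fetch the result by request ID
result = fal_client.result("fal-ai/trellis-2", handler.request_id)
print(result["model_glb"]["url"])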

Troubleshooting

Image Processing Errors:

  • Verify image URL is publicly accessible
  • Ensure image format is supported (PNG, JPG, WEBP)
  • Check image dimensions meet minimum requirements

Quality Issues:

  • Missing geometry typically indicates poor background separation in the input
  • Distorted meshes suggest perspective distortion in input image
  • Low-quality textures may result from insufficient texture_size parameter or low-resolution input images

Generation Failures:

  • Parameter values outside valid ranges cause validation errors
  • Complex scenes with multiple objects often produce inconsistent geometry

When to Adjust Defaults

The default parameters work well for most inputs. Consider adjustments in these scenarios, captured as presets in the sketch after this list:

  • Hard-surface objects (furniture, vehicles, architecture): Increase ss_guidance_strength to 8-9 for stricter geometric adherence
  • Organic forms (characters, plants, fabric): Keep guidance values at defaults or slightly lower
  • Rapid prototyping: Reduce texture_size to 512 and keep default sampling steps
  • Final production assets: Increase texture_size to 2048 and consider 16+ sampling steps
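
One convenient pattern is to capture these scenarios as named presets and merge the chosen preset into the request arguments. The values below restate the guidance from the list above; they are starting points to test against your own assets, not verified optima.

import fal_client

PRESETS = {
    # Stricter geometric adherence for furniture, vehicles, architecture
    "hard_surface": {"ss_guidance_strength": 8.5},
    # Defaults already suit characters, plants, and fabric
    "organic": {},
    # Fast previews: lowest texture tier, default steps
    "prototype": {"texture_size": 512},
    # Final assets: highest texture tier, more refinement steps
    "production": {
        "texture_size": 2048,
        "ss_sampling_steps": 16,
        "slat_sampling_steps": 16,
    },
}

result = fal_client.subscribe(
    "fal-ai/trellis-2",
    arguments={"image_url": "https://example.com/chair.png",
               **PRESETS["hard_surface"]},
)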

The seed parameter enables reproducibility. When you find settings that work well for a particular object type, record the seed value to generate consistent results across similar inputs.

For iterative workflows, generate an initial result with default parameters, then adjust one parameter at a time while keeping the seed constant. This isolates the effect of each change and helps you build intuition for how the model responds to different inputs.
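
A sketch of that one-variable-at-a-time loop, holding the seed constant while sweeping a single parameter:

import fal_client

SEED = 1234  # fixed so differences come only from the swept parameter

for strength in (5.0, 7.5, 9.0):
    result = fal_client.subscribe(
        "fal-ai/trellis-2",
        arguments={
            "image_url": "https://example.com/product-image.png",
            "seed": SEED,
            "ss_guidance_strength": strength,
        },
    )
    print(f"ss_guidance_strength={strength}:", result["model_glb"]["url"])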


References

  1. Xiang, J., et al. "Structured 3D Latents for Scalable and Versatile 3D Generation." arXiv:2412.01506, 2024. https://arxiv.org/abs/2412.01506

About the author
Zachary Roth
A generative media engineer with a focus on growth, Zach has deep expertise in building RAG architecture for complex content systems.
