SAM 3 Image to Image

fal-ai/sam-3/image-rle
SAM 3 is a unified foundation model for promptable segmentation in images and videos. It can detect, segment, and track objects using text or visual prompts such as points, boxes, and masks.
Inference
Commercial use

Your request will cost $0.005 per unit.

SAM 3 Image RLE | [image-to-3d]

Meta's SAM 3D delivers state-of-the-art 3D reconstruction from single images at $0.005 per generation. Trading traditional multi-view capture workflows for instant single-image processing, it generates detailed 3D meshes and human body models in under a second. Built for developers shipping AR/VR experiences, e-commerce product visualization, and character animation pipelines where speed and cost efficiency matter.

Use Cases: Product Visualization | Character Rigging | AR/VR Asset Creation


Performance

At $0.005 per unit, SAM 3D costs 8-30x less per generation than alternatives priced at $0.04-0.15, while maintaining production-ready quality for both object reconstruction and human body estimation.

Metric | Result | Context
Inference Speed | ~0.5 seconds | Comparable to TripoSR, faster than multi-view alternatives
Cost per Generation | $0.005 | 200 generations per $1.00 on fal
Input Requirements | Single image | Eliminates need for multi-view capture or depth sensors
Related Endpoints | SAM 3D Objects, SAM 3D Body, SAM 3 Image | Objects vs Body vs Segmentation variants for different reconstruction needs

From Segmentation to 3D Reconstruction

SAM 3D extends Meta's Segment Anything foundation with unified 3D reconstruction capabilities. Where traditional photogrammetry demands dozens of images and minutes of processing, SAM 3D generates textured meshes from a single input through learned priors about object geometry and human anatomy.

What this means for you:

  • Instant Asset Generation: Single-image input eliminates complex capture rigs; upload a product photo and receive animation-ready 3D models in under a second for rapid prototyping workflows

  • Dual Specialization: Separate SAM 3D Objects and SAM 3D Body models optimize for either general object reconstruction or human body/shape estimation with SMPL-compatible output

  • Production-Ready Output: Generates textured meshes compatible with standard 3D pipelines, eliminating manual cleanup for Unity/Unreal Engine integration

  • Cost-Efficient Scaling: At $0.005 per generation versus $0.04-0.15+ for alternatives, process thousands of assets for e-commerce catalogs or game development at a fraction of traditional costs
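The pricing figures quoted on this page can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated here ($0.005 per unit, $0.04-0.15 for alternatives); the 10,000-asset catalog size and the helper names are hypothetical, chosen purely for illustration. `Decimal` is used so the ratios come out exact rather than as floating-point approximations.

```python
from decimal import Decimal

# Prices quoted on this page (USD per generation).
PRICE_PER_UNIT = Decimal("0.005")
ALT_LOW = Decimal("0.04")
ALT_HIGH = Decimal("0.15")


def units_per_budget(budget: Decimal, price: Decimal = PRICE_PER_UNIT) -> int:
    """Number of whole generations a budget covers at the given unit price."""
    return int(budget // price)


def cost_ratio(alternative: Decimal, price: Decimal = PRICE_PER_UNIT) -> Decimal:
    """How many times more expensive an alternative is per generation."""
    return alternative / price


units_per_dollar = units_per_budget(Decimal("1.00"))
ratio_low = cost_ratio(ALT_LOW)
ratio_high = cost_ratio(ALT_HIGH)

# Hypothetical catalog size, not a figure from this page.
catalog_cost = PRICE_PER_UNIT * 10_000

print(units_per_dollar, ratio_low, ratio_high, catalog_cost)
```

Under these assumptions, the "200 generations per $1.00" and "8-30x" figures both follow directly from the quoted unit prices.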


Technical Specifications

Spec | Details
Architecture | SAM 3D
Input Formats | Single RGB image (JPEG, PNG, WebP)
Output Formats | 3D meshes (textured), SMPL body models (SAM 3D Body variant)
Reconstruction Type | Object geometry + Human body/shape estimation
License | Commercial use enabled



How It Stacks Up

SAM 3D Objects – The SAM 3 Image RLE endpoint prioritizes segmentation workflows with Run Length Encoding output for mask-based applications at identical cost. SAM 3D Objects specializes in full 3D object reconstruction with textured mesh output for asset generation pipelines.
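For readers unfamiliar with Run Length Encoding, the sketch below shows how an RLE mask can be expanded back into a 2D binary mask. This page does not specify the exact RLE layout the endpoint returns, so the code assumes an uncompressed COCO-style layout: a list of run lengths that alternate between background (0) and foreground (1), starting with background, over pixels in column-major order. The function name `decode_rle` and the sample values are hypothetical; check the endpoint's actual response schema before relying on this.

```python
from typing import List, Sequence


def decode_rle(counts: Sequence[int], height: int, width: int) -> List[List[int]]:
    """Expand alternating run lengths into a height x width mask of 0s and 1s.

    Assumed layout (not confirmed by this page): runs start with background (0)
    and pixels are ordered column by column, as in uncompressed COCO RLE.
    """
    if sum(counts) != height * width:
        raise ValueError("run lengths do not cover height * width pixels")

    flat: List[int] = []
    value = 0  # first run is background under the assumed layout
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value  # flip between background and foreground

    # Column-major order: pixel (row, col) is stored at index col * height + row.
    return [[flat[col * height + row] for col in range(width)] for row in range(height)]


# Hypothetical 2x3 mask: 2 background pixels, 3 foreground, 1 background.
mask = decode_rle([2, 3, 1], height=2, width=3)
for row in mask:
    print(row)
```

If the endpoint instead returns a row-major or string-encoded RLE, only the indexing step or the parsing of `counts` would need to change; the run-expansion loop stays the same.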

SAM 3 Image – SAM 3D's reconstruction endpoint trades 2D segmentation for full 3D geometry generation at the same per-inference cost. The image-to-image variant remains ideal for mask generation, video frame segmentation, and annotation workflows where 2D outputs suffice.

SAM 3D Body – SAM 3D emphasizes human body reconstruction through its specialized Body variant with SMPL compatibility for character animation. This endpoint prioritizes anatomically accurate human models for gaming, virtual try-on, and motion capture applications at identical pricing.

Tripo3D Image to 3D – SAM 3D delivers comparable sub-second reconstruction speeds while offering dual specialization through Objects and Body variants. Tripo3D prioritizes general object reconstruction with emphasis on texture quality and geometric detail for product visualization at competitive speeds.