PhotoMaker Image to Image

fal-ai/photomaker
Customizing Realistic Human Photos via Stacked ID Embedding
Inference
Commercial use

PhotoMaker | [text/image-to-image]

PhotoMaker's stacked ID embedding architecture generates personalized characters with a consistent identity. It trades freeform creative flexibility for identity consistency: the model accepts up to 4 reference images in a ZIP archive and keeps facial features stable across different scenes and styles. It is built for creators who need reliable character consistency across multiple images without manual prompt engineering.
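As a minimal sketch of the input preparation described above, the following packs reference images into a single ZIP archive and enforces the 4-image limit. The helper name `build_reference_archive` and the in-memory approach are illustrative assumptions, not part of the PhotoMaker API; how the archive is then uploaded is not covered here.

```python
import io
import zipfile

MAX_REFERENCE_IMAGES = 4  # limit stated in the model description


def build_reference_archive(images: dict[str, bytes]) -> bytes:
    """Pack reference images (filename -> raw bytes) into an in-memory ZIP.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if not images:
        raise ValueError("at least one reference image is required")
    if len(images) > MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"expected at most {MAX_REFERENCE_IMAGES} images, got {len(images)}"
        )
    buffer = io.BytesIO()
    # JPEG/PNG data is already compressed, so entries are stored as-is.
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
        for name, data in images.items():
            archive.writestr(name, data)
    return buffer.getvalue()
```

The returned bytes can be written to disk or uploaded to whatever storage hosts the archive for the request.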

Use Cases: Portrait Photography Variations | Character Consistency for Content Creation | Brand Ambassador Image Generation


Performance

PhotoMaker operates at $0.001 per compute second, making it accessible for high-volume character generation workflows where identity preservation matters more than generation speed.
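To make the per-second pricing concrete, here is a small sketch that estimates batch cost from the quoted rate. The function name and the example run time are assumptions for illustration; actual compute time per request is not stated on this page and will vary with settings such as inference steps.

```python
RATE_PER_COMPUTE_SECOND = 0.001  # USD, the rate quoted above


def estimate_cost(compute_seconds: float, requests: int = 1) -> float:
    """Estimate total cost in USD for a number of requests.

    compute_seconds is the assumed run time of ONE request; it is a
    hypothetical input, not a figure published for this model.
    """
    if compute_seconds < 0 or requests < 0:
        raise ValueError("compute_seconds and requests must be non-negative")
    return RATE_PER_COMPUTE_SECOND * compute_seconds * requests
```

Under the assumption of 20 compute seconds per request, 1,000 requests would come to roughly $20.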

| Metric | Result | Context |
| --- | --- | --- |
| Resolution | 1024x1024 | Standard square format for social media and portrait applications |
| Inference Steps | 20-100 (default 50) | Configurable quality-speed tradeoff |
| Cost | $0.001 per compute second | Volume-friendly pricing for batch generation |
| Input Capacity | Up to 4 reference images | ZIP archive format for multi-image identity learning |
| Batch Generation | 1-4 images per request | Single API call for multiple variations |

Identity Preservation Through Multi-Image Learning

PhotoMaker uses stacked ID embedding to learn facial characteristics from multiple reference images simultaneously, contrasting with single-image conditioning approaches that struggle with angle variations and lighting changes.

What this means for you:

  • Consistent Character Generation: Upload 3-4 reference photos in a ZIP archive and generate unlimited variations maintaining the same facial features across different prompts, styles, and scenarios

  • Style Flexibility: 11 preset styles, from Photographic to Comic Book to Neonpunk, each applied while preserving the core identity features of your reference images

  • Img2Img Workflow Support: An optional initial image with adjustable strength (0-1) lets you guide composition while maintaining identity consistency from the reference archive

  • Fine-Grained Control: Guidance scale (0.1-10), style strength (15-50), and inference steps (20-100) provide granular control over how closely the output matches your prompt versus preserving reference identity
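The ranges listed above can be checked before a request is sent. The sketch below is one way to do that; the field names (`guidance_scale`, `style_strength`, and so on) and the `num_images` default of 1 are assumptions for illustration and may not match the actual API schema. Only the numeric bounds and the inference-steps default of 50 come from this page.

```python
from typing import Optional

# (low, high) bounds quoted on this page; the key names are hypothetical.
RANGES = {
    "guidance_scale": (0.1, 10),
    "style_strength": (15, 50),
    "num_inference_steps": (20, 100),
    "num_images": (1, 4),
    "initial_image_strength": (0, 1),
}


def build_settings(
    prompt: str,
    guidance_scale: float,
    style_strength: int,
    num_inference_steps: int = 50,  # default stated on this page
    num_images: int = 1,  # assumed default
    initial_image_strength: Optional[float] = None,
) -> dict:
    """Return a settings dict after validating every bounded field."""
    settings = {
        "prompt": prompt,
        "guidance_scale": guidance_scale,
        "style_strength": style_strength,
        "num_inference_steps": num_inference_steps,
        "num_images": num_images,
    }
    # The img2img strength only applies when an initial image is supplied.
    if initial_image_strength is not None:
        settings["initial_image_strength"] = initial_image_strength
    for name, value in settings.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    return settings
```

Validating locally keeps out-of-range values from consuming a request; the authoritative limits remain whatever the API itself enforces.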


Technical Specifications

| Spec | Details |
| --- | --- |
| Architecture | PhotoMaker Stacked ID Embedding |
| Input Formats | ZIP archive of reference images, optional initial image URL |
| Output Formats | PNG (1024x1024) |
| Reference Capacity | Multiple images per ZIP archive |
| License | Commercial use permitted |


How It Stacks Up

FASHN Virtual Try-On V1.5 – PhotoMaker prioritizes identity consistency across multiple generated images through stacked embedding architecture, ideal for content creators building character libraries. FASHN Virtual Try-On focuses specifically on clothing visualization workflows, trading character flexibility for garment-specific realism and fit accuracy in fashion e-commerce applications.