PhotoMaker Image to Image
PhotoMaker | [text/image-to-image]
PhotoMaker's stacked ID embedding architecture delivers personalized character generation. Trading freeform creative flexibility for identity consistency, the model accepts up to 4 reference images via ZIP archive to maintain facial features across different scenes and styles. Built for creators who need reliable character consistency across multiple images without manual prompt engineering.
Use Cases: Portrait Photography Variations | Character Consistency for Content Creation | Brand Ambassador Image Generation
Performance
PhotoMaker operates at $0.001 per compute second, making it accessible for high-volume character generation workflows where identity preservation matters more than generation speed.
| Metric | Result | Context |
|---|---|---|
| Resolution | 1024x1024 | Standard square format for social media and portrait applications |
| Inference Steps | 20-100 (default 50) | Configurable quality-speed tradeoff |
| Cost | $0.001 per compute second | Volume-friendly pricing for batch generation |
| Input Capacity | Up to 4 reference images | ZIP archive format for multi-image identity learning |
| Batch Generation | 1-4 images per request | Single API call for multiple variations |
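Because pricing is per compute second rather than per image, the cost of a batch depends on how long each request runs. The sketch below shows the arithmetic only; the 30-second runtime in the example is a hypothetical figure, not a measured one, and the function name is illustrative.

```python
PRICE_PER_COMPUTE_SECOND = 0.001  # USD, the rate quoted on this page


def estimate_cost(compute_seconds: float, requests: int = 1) -> float:
    """Estimate the USD cost of `requests` calls that each use `compute_seconds`."""
    if compute_seconds < 0 or requests < 0:
        raise ValueError("compute_seconds and requests must be non-negative")
    return round(PRICE_PER_COMPUTE_SECOND * compute_seconds * requests, 6)


# Assumption: each request takes about 30 compute seconds (not a measured value).
print(estimate_cost(30, requests=100))
```

Actual runtime will vary with the number of inference steps and images per request, so measure a few real requests before budgeting.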
Identity Preservation Through Multi-Image Learning
PhotoMaker uses stacked ID embedding to learn facial characteristics from multiple reference images simultaneously, contrasting with single-image conditioning approaches that struggle with angle variations and lighting changes.
What this means for you:
- Consistent Character Generation: Upload 3-4 reference photos in a ZIP archive and generate unlimited variations that keep the same facial features across different prompts, styles, and scenarios
- Style Flexibility: 11 preset styles, from Photographic to Comic Book to Neonpunk, each applied while preserving the core identity features from your reference images
- Img2Img Workflow Support: An optional initial image with adjustable strength (0-1 range) lets you guide composition while maintaining identity consistency from the reference archive
- Fine-Grained Control: Guidance scale (0.1-10), style strength (15-50), and inference steps (20-100) give granular control over how closely the output follows your prompt versus preserving the reference identity
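The parameter ranges above can be checked on the client before a request is sent. This is a minimal sketch: the numeric ranges come from this page, but the field names (`guidance_scale`, `style_strength`, `num_inference_steps`, `num_images`, `strength`) and the defaults other than 50 inference steps are assumptions, so check them against the actual API schema.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# (min, max) ranges quoted on this page; the field names are hypothetical.
RANGES = {
    "guidance_scale": (0.1, 10),
    "style_strength": (15, 50),
    "num_inference_steps": (20, 100),
    "num_images": (1, 4),
    "strength": (0, 1),
}


@dataclass
class GenerationParams:
    prompt: str
    guidance_scale: float = 5.0      # assumed default
    style_strength: float = 20       # assumed default
    num_inference_steps: int = 50    # default stated on this page
    num_images: int = 1
    strength: Optional[float] = None  # only relevant for img2img requests

    def to_payload(self) -> dict:
        """Validate every ranged field, then return a dict without unset values."""
        for name, (low, high) in RANGES.items():
            value = getattr(self, name)
            if value is not None and not low <= value <= high:
                raise ValueError(f"{name}={value} is outside [{low}, {high}]")
        return {k: v for k, v in asdict(self).items() if v is not None}


print(GenerationParams(prompt="portrait photo, studio lighting").to_payload())
```

Validating locally avoids paying for a request that the service would reject or clamp.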
Technical Specifications
| Spec | Details |
|---|---|
| Architecture | PhotoMaker Stacked ID Embedding |
| Input Formats | ZIP archive of reference images, optional initial image URL |
| Output Formats | PNG (1024x1024) |
| Reference Capacity | Up to 4 images per ZIP archive |
| License | Commercial use permitted |
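Since reference images are supplied as a ZIP archive, a small helper can package them and enforce the 4-image limit before upload. The sketch below uses only Python's standard `zipfile` module; the function name and the flat archive layout (files at the archive root) are assumptions, not documented requirements.

```python
import zipfile
from pathlib import Path

MAX_REFERENCE_IMAGES = 4  # limit stated on this page


def build_reference_zip(image_paths, zip_path):
    """Pack 1-4 reference images into a ZIP archive and return its path."""
    paths = [Path(p) for p in image_paths]
    if not 1 <= len(paths) <= MAX_REFERENCE_IMAGES:
        raise ValueError(f"expected 1-{MAX_REFERENCE_IMAGES} images, got {len(paths)}")
    missing = [str(p) for p in paths if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing reference images: {missing}")

    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for p in paths:
            # Store each file at the archive root (assumed layout).
            archive.write(p, arcname=p.name)
    return zip_path
```

A call such as `build_reference_zip(["front.jpg", "side.jpg", "smile.jpg"], "refs.zip")` would produce the archive to upload; the file names here are placeholders.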
How It Stacks Up
FASHN Virtual Try-On V1.5 – PhotoMaker prioritizes identity consistency across multiple generated images through stacked embedding architecture, ideal for content creators building character libraries. FASHN Virtual Try-On focuses specifically on clothing visualization workflows, trading character flexibility for garment-specific realism and fit accuracy in fashion e-commerce applications.