Fooocus Text to Image
Fooocus | [text-to-image]
Fooocus delivers automated SDXL optimization with built-in quality enhancements. With intelligent defaults, this text-to-image model targets developers who need production-ready outputs without extensive prompt engineering. It suits rapid iteration cycles where consistent quality matters more than granular control.
Use Cases: Marketing Asset Generation | Product Visualization | Concept Art Development
Performance
Under preview pricing, Fooocus costs $0 per image, which removes the cost barrier for high-volume image generation workflows while maintaining SDXL-level quality through automated optimization layers.
| Metric | Result | Context |
|---|---|---|
| Resolution | Up to 1024x1024 | Configurable aspect ratios with 8-pixel alignment requirement |
| Batch Generation | 1-4 images per request | Parallel generation within single API call |
| Cost per Image | $0 | Preview pricing, production rates TBD |
| Performance Modes | 4 speed tiers | Extreme Speed, Lightning, Speed, Quality presets |
| Related Endpoints | Image Prompt, Upscale or Vary | Reference-guided generation and resolution enhancement variants |
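The resolution and batch limits in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official client: the helper names `align_dimension` and `validate_batch_size` are hypothetical, and rounding down (rather than rejecting) a misaligned dimension is an assumption made for illustration.

```python
ALIGNMENT = 8    # dimensions must be multiples of 8 (from the table above)
MAX_IMAGES = 4   # 1-4 images per request (from the table above)


def align_dimension(value: int, multiple: int = ALIGNMENT) -> int:
    """Round a width or height down to the nearest multiple (hypothetical helper)."""
    if value < multiple:
        raise ValueError(f"dimension must be at least {multiple}, got {value}")
    return (value // multiple) * multiple


def validate_batch_size(num_images: int) -> int:
    """Return num_images unchanged if it is within the 1-4 batch range."""
    if not 1 <= num_images <= MAX_IMAGES:
        raise ValueError(f"num_images must be between 1 and {MAX_IMAGES}")
    return num_images
```

Rounding down keeps a requested size within any upper bound the caller has already respected; a stricter client could raise an error instead.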
Automated Optimization Without Parameter Overhead
Fooocus wraps the SDXL architecture with pre-configured enhancement layers, eliminating the parameter tuning typically required for production-quality outputs. Where standard SDXL implementations require manual CFG scale adjustment, negative prompt crafting, and style preset selection, Fooocus ships with battle-tested defaults.
What this means for you:
- Intelligent Style Layering: Fooocus Enhance + V2 + Sharp styles stack automatically, delivering detail preservation without manual LoRA weight balancing
- Control Image Integration: Four control modes (ImagePrompt, PyraCanny, CPDS, FaceSwap) via a single parameter, with no separate preprocessing pipelines required
- LoRA Merging: Up to 5 LoRA models combine in a single request with weight control, replacing multi-step generation workflows
- Performance Scaling: Four speed presets (Extreme Speed through Quality) trade inference time for detail density based on use case priority
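The options listed above can be collected into a single request payload. The sketch below is illustrative only: the field names (`prompt`, `performance`, `loras`, `control_mode`, `control_image_url`), the `Lora` class, and `build_request` are all assumptions rather than the documented API schema. Only the preset names, control mode names, and the 5-LoRA limit come from this page.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Values taken from this page; everything else below is hypothetical.
PERFORMANCE_MODES = ("Extreme Speed", "Lightning", "Speed", "Quality")
CONTROL_MODES = ("ImagePrompt", "PyraCanny", "CPDS", "FaceSwap")
MAX_LORAS = 5


@dataclass(frozen=True)
class Lora:
    url: str             # hypothetical field: location of the LoRA weights
    weight: float = 1.0  # hypothetical field: blend strength


def build_request(
    prompt: str,
    performance: str,
    loras: Sequence[Lora] = (),
    control_mode: Optional[str] = None,
    control_image_url: Optional[str] = None,
) -> dict:
    """Validate the options and return a payload dict (assumed field names)."""
    if performance not in PERFORMANCE_MODES:
        raise ValueError(f"unknown performance preset: {performance!r}")
    if len(loras) > MAX_LORAS:
        raise ValueError(f"at most {MAX_LORAS} LoRA models per request")
    payload = {
        "prompt": prompt,
        "performance": performance,
        "loras": [{"url": l.url, "weight": l.weight} for l in loras],
    }
    if control_mode is not None:
        if control_mode not in CONTROL_MODES:
            raise ValueError(f"unknown control mode: {control_mode!r}")
        if not control_image_url:
            raise ValueError("a control mode needs a control image URL")
        payload["control_mode"] = control_mode
        payload["control_image_url"] = control_image_url
    return payload
```

A call such as `build_request("a red bicycle", performance="Quality")` would return a dict with an empty `loras` list and no control fields. Styles are left out of the payload because the page describes them as stacking automatically.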
Technical Specifications
| Spec | Details |
|---|---|
| Architecture | Stable Diffusion XL |
| Input Formats | Text prompts, control images (URL), LoRA models (up to 5) |
| Output Formats | PNG, JPEG, WebP |
| Resolution Range | Custom dimensions (8-pixel multiples) with 1024x1024 default |
| License | Commercial use enabled |
How It Stacks Up
Fooocus Image Prompt – The Image Prompt variant extends base Fooocus with reference image conditioning for style transfer workflows. Base Fooocus handles pure text-to-image generation where reference images aren't required, eliminating the image preprocessing step.
Fooocus Upscale or Vary – The Upscale or Vary endpoint trades generation flexibility for resolution enhancement and variation control. Base Fooocus remains ideal for initial generation where upscaling isn't part of the immediate workflow, reducing API call overhead.
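The comparison above amounts to a small routing decision. The sketch below expresses it in code under stated assumptions: the `Endpoint` enum and `choose_endpoint` function are hypothetical, the enum values are the display names used on this page rather than real endpoint identifiers, and the precedence (an existing image to enhance wins over a reference image) is an illustrative choice.

```python
from enum import Enum


class Endpoint(Enum):
    # Values are the display names from this page, not real endpoint IDs.
    TEXT_TO_IMAGE = "Fooocus"
    IMAGE_PROMPT = "Fooocus Image Prompt"
    UPSCALE_OR_VARY = "Fooocus Upscale or Vary"


def choose_endpoint(has_reference_image: bool, has_image_to_enhance: bool) -> Endpoint:
    """Pick a variant based on the inputs available (assumed precedence)."""
    if has_image_to_enhance:
        # Resolution enhancement or variation of an existing image.
        return Endpoint.UPSCALE_OR_VARY
    if has_reference_image:
        # Reference-guided generation, for example style transfer.
        return Endpoint.IMAGE_PROMPT
    # Pure text-to-image: no image preprocessing step needed.
    return Endpoint.TEXT_TO_IMAGE
```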