Vidu Text to Image
Vidu Q2 | [text-to-image]
Vidu's Q2 text-to-image model generates one image per request at $0.10 per image, with three aspect ratio presets. The endpoint does not generate video; it focuses on static image creation from text prompts of up to 1500 characters. It is aimed at rapid-prototyping teams that need predictable per-image costs without video overhead.
Use Cases: Marketing asset creation | Concept visualization | Static content workflows
Performance
At $0.10 per image, Vidu Q2 is positioned as a mid-tier image generation option that favors straightforward execution over advanced features.
| Metric | Result | Context |
|---|---|---|
| Prompt Length | 1500 characters max | Sufficient for detailed scene descriptions |
| Aspect Ratios | 3 presets (16:9, 9:16, 1:1) | Standard social/web formats |
| Cost per Image | $0.10 | 10 generations per $1.00 on fal |
| Output Format | PNG via URL | Single static image delivery |
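The flat per-image price makes budgeting simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the only figure taken from this page is the $0.10 price. It uses `Decimal` so currency amounts do not pick up floating-point error.

```python
from decimal import Decimal

# Listed price per generated image (from the table above).
PRICE_PER_IMAGE = Decimal("0.10")


def images_for_budget(budget_usd: str) -> int:
    """Return how many whole images a budget covers at the listed price."""
    return int(Decimal(budget_usd) // PRICE_PER_IMAGE)


def cost_for_images(count: int) -> Decimal:
    """Return the total cost in USD for `count` images."""
    return PRICE_PER_IMAGE * count


if __name__ == "__main__":
    print(images_for_budget("1.00"))  # images covered by a $1.00 budget
    print(cost_for_images(250))       # total cost of 250 images
```

Passing the budget as a string avoids constructing a `Decimal` from a binary float.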
Text-to-Image Without the Complexity
Vidu's architecture strips video generation down to single-frame output, processing text prompts through a simplified inference path that outputs static images in 16:9, 9:16, or 1:1 aspect ratios.
What this means for you:
- Straightforward prompt handling: Process up to 1500 characters of descriptive text without multi-modal complexity or reference image requirements
- Fixed aspect ratio control: Three preset ratios eliminate resolution guesswork for social media, web, and square formats
- Deterministic generation: Optional seed parameter enables reproducible outputs for iterative refinement workflows
- Single-output simplicity: Returns one image per request with predictable PNG format output via URL delivery
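The constraints in this list can be checked on the client before a request is sent. The following is a minimal sketch, not the endpoint's actual schema: the field names `prompt`, `aspect_ratio`, and `seed` and the helper `build_request` are assumptions for illustration, so check the API documentation for the real parameter names. Only the 1500-character limit and the three ratios come from this page.

```python
from typing import Optional

MAX_PROMPT_CHARS = 1500                  # prompt limit stated on this page
ASPECT_RATIOS = ("16:9", "9:16", "1:1")  # presets stated on this page


def build_request(prompt: str, *, aspect_ratio: str,
                  seed: Optional[int] = None) -> dict:
    """Validate inputs and return a request payload.

    The field names used here are assumed, not confirmed by the source.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(
            f"prompt has {len(prompt)} characters; limit is {MAX_PROMPT_CHARS}"
        )
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")

    payload = {"prompt": prompt, "aspect_ratio": aspect_ratio}
    # Include the seed only when the caller wants reproducible output.
    if seed is not None:
        payload["seed"] = seed
    return payload


if __name__ == "__main__":
    print(build_request("A red fox in fresh snow", aspect_ratio="1:1", seed=42))
```

Reusing the same seed with the same prompt and aspect ratio is how a reproducible run would be requested; omitting it leaves the choice to the service.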
Technical Specifications
| Spec | Details |
|---|---|
| Architecture | Vidu Q2 text-to-image |
| Input Formats | Text prompt (string, max 1500 chars) |
| Output Formats | PNG image (URL delivery) |
| Aspect Ratio Options | 16:9, 9:16, 1:1 |
| License | Commercial use permitted |
How It Stacks Up
Bytedance Seedream v4.5 Text to Image ($0.039) – At $0.10 per image, Vidu Q2 costs roughly 2.6x as much ($0.10 / $0.039 ≈ 2.56) in exchange for a simpler implementation. Seedream v4.5 delivers comparable text-to-image capabilities at a lower price, which suits high-volume workflows where per-image cost matters.
Hunyuan Image v3 Text to Image ($0.039) – Vidu's $0.10 price is roughly 2.6x that of Hunyuan's v3 endpoint, which offers similar prompt-to-image generation with tighter cost control. Hunyuan prioritizes volume economics for production pipelines that require thousands of generations per day.
Recraft V3 Text to Image ($0.039) – At $0.10 versus Recraft's $0.039, Vidu costs roughly 2.6x as much and offers an alternative architectural approach. Recraft V3 emphasizes cost efficiency and performance optimization for teams that prioritize budget predictability over specific model characteristics.
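The price gap in these comparisons can be worked out directly from the two listed prices. This is a small sketch using only the $0.10 and $0.039 figures quoted on this page; the helper names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

VIDU_PRICE = Decimal("0.10")         # Vidu Q2, per image
ALTERNATIVE_PRICE = Decimal("0.039") # Seedream v4.5, Hunyuan v3, Recraft V3


def price_ratio() -> Decimal:
    """Vidu's price as a multiple of the alternative, to two decimals."""
    ratio = VIDU_PRICE / ALTERNATIVE_PRICE
    return ratio.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def extra_cost(images: int) -> Decimal:
    """Additional USD spent on Vidu for `images` generations."""
    return (VIDU_PRICE - ALTERNATIVE_PRICE) * images


if __name__ == "__main__":
    print(price_ratio())     # 2.56
    print(extra_cost(1000))  # 61.000
```

At 1,000 images the difference is $61, so the gap matters mainly for high-volume pipelines.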
