Vidu Text to Image

fal-ai/vidu/q2/text-to-image
Use Vidu Text-to-Image to turn your prompts into images.

Vidu Q2 | [text-to-image]

Vidu's Q2 text-to-image model generates one still image per request at $0.10 per image, with three aspect-ratio presets. The endpoint leaves out video generation and focuses on static images created from text prompts of up to 1500 characters. It suits rapid-prototyping teams that need predictable per-image costs without video overhead.

Use Cases: Marketing asset creation | Concept visualization | Static content workflows


Performance

At $0.10 per image, Vidu Q2 is positioned as a mid-tier image generation option that trades advanced features for straightforward execution.

Metric | Result | Context
Prompt Length | 1500 characters max | Sufficient for detailed scene descriptions
Aspect Ratios | 3 presets (16:9, 9:16, 1:1) | Standard social/web formats
Cost per Image | $0.10 | 10 generations per $1.00 on fal
Output Format | PNG via URL | Single static image delivery
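
The cost row above is simple arithmetic, and a small helper can make budgeting explicit. The following is a minimal sketch that assumes only the flat $0.10 per image price stated on this page; the function names are hypothetical and it does not account for any volume or enterprise pricing.

```python
# Flat per-image price from this page, kept in integer cents to avoid float rounding.
PRICE_PER_IMAGE_CENTS = 10


def images_for_budget(budget_cents: int) -> int:
    """Return how many whole images a budget (in cents) covers."""
    if budget_cents < 0:
        raise ValueError("budget must be non-negative")
    return budget_cents // PRICE_PER_IMAGE_CENTS


def cost_in_cents(image_count: int) -> int:
    """Return the total cost (in cents) for a number of images."""
    if image_count < 0:
        raise ValueError("image_count must be non-negative")
    return image_count * PRICE_PER_IMAGE_CENTS


if __name__ == "__main__":
    print(images_for_budget(100))  # $1.00 budget
    print(cost_in_cents(250))      # 250 images
```

Working in cents keeps every figure an integer, so a $1.00 budget maps to exactly 10 generations, matching the table.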

Text-to-Image Without the Complexity

Vidu's architecture strips video generation down to single-frame output, processing text prompts through a simplified inference path that outputs static images in 16:9, 9:16, or 1:1 aspect ratios.

What this means for you:

  • Straightforward prompt handling: Process up to 1500 characters of descriptive text without multi-modal complexity or reference image requirements
  • Fixed aspect ratio control: Three preset ratios eliminate resolution guesswork for social media, web, and square formats
  • Deterministic generation: Optional seed parameter enables reproducible outputs for iterative refinement workflows
  • Single-output simplicity: Returns one image per request with predictable PNG format output via URL delivery
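
The constraints listed above (1500-character prompt limit, three preset aspect ratios, optional seed) can be checked on the client before a request is sent. This is a sketch under assumptions: the field names `prompt`, `aspect_ratio`, and `seed` and the helper `build_arguments` are illustrative and are not confirmed by this page, so verify them against the endpoint's API schema before use.

```python
from typing import Optional

# Limits taken from this page.
MAX_PROMPT_CHARS = 1500
ASPECT_RATIOS = ("16:9", "9:16", "1:1")


def build_arguments(
    prompt: str,
    aspect_ratio: Optional[str] = None,
    seed: Optional[int] = None,
) -> dict:
    """Validate inputs and build a request payload (field names are assumed)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(
            f"prompt has {len(prompt)} characters; the limit is {MAX_PROMPT_CHARS}"
        )
    if aspect_ratio is not None and aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")

    arguments = {"prompt": prompt}
    # Optional fields are only included when set, so the endpoint's own
    # defaults apply otherwise.
    if aspect_ratio is not None:
        arguments["aspect_ratio"] = aspect_ratio
    if seed is not None:
        # Reusing the same seed is what makes a generation reproducible.
        arguments["seed"] = seed
    return arguments


if __name__ == "__main__":
    print(build_arguments("A lighthouse at dusk, oil painting", "16:9", seed=42))
```

With the fal Python client, a dictionary like this would typically be passed as the `arguments` of a call to the `fal-ai/vidu/q2/text-to-image` endpoint. Rejecting an over-long prompt locally avoids spending a round trip on a request that the endpoint would refuse.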

Technical Specifications

Spec | Details
Architecture | Vidu Q2 text-to-image
Input Formats | Text prompt (string, max 1500 chars)
Output Formats | PNG image (URL delivery)
Aspect Ratio Options | 16:9, 9:16, 1:1
License | Commercial use permitted



How It Stacks Up

Bytedance Seedream v4.5 Text to Image ($0.039) – At $0.10 per image, Vidu Q2 costs about 2.6x as much as Seedream v4.5. Seedream offers comparable text-to-image capabilities at a lower price, which matters most in high-volume workflows where per-image cost dominates.

Hunyuan Image v3 Text to Image ($0.039) – Vidu Q2's $0.10 price is about 2.6x that of Hunyuan's v3 endpoint, which offers similar prompt-to-image generation at a lower per-image cost. Hunyuan suits production pipelines that generate thousands of images per day.

Recraft V3 Text to Image ($0.039) – At $0.10 versus Recraft's $0.039, Vidu Q2 costs about 2.6x as much, so choosing it means giving up those savings in exchange for a different model architecture. Recraft V3 suits teams that put cost efficiency ahead of specific model characteristics.
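
The price multiples quoted in these comparisons can be reproduced directly from the listed per-image prices. The sketch below assumes the prices shown on this page are current and uses hypothetical helper names; it compares cost only and says nothing about output quality.

```python
from decimal import Decimal

# Per-image prices in USD as listed on this page.
PRICES = {
    "Vidu Q2": Decimal("0.10"),
    "Bytedance Seedream v4.5": Decimal("0.039"),
    "Hunyuan Image v3": Decimal("0.039"),
    "Recraft V3": Decimal("0.039"),
}


def price_ratio(model: str, baseline: str) -> Decimal:
    """Return how many times more `model` costs than `baseline`, to 2 decimals."""
    return (PRICES[model] / PRICES[baseline]).quantize(Decimal("0.01"))


def extra_cost(image_count: int, model: str, baseline: str) -> Decimal:
    """Return the extra USD spent on `model` versus `baseline` for a batch."""
    return (PRICES[model] - PRICES[baseline]) * image_count


if __name__ == "__main__":
    for name in PRICES:
        if name != "Vidu Q2":
            ratio = price_ratio("Vidu Q2", name)
            extra = extra_cost(1000, "Vidu Q2", name)
            print(f"{name}: {ratio}x, ${extra} extra per 1000 images")
```

Using `Decimal` rather than floats keeps prices such as $0.039 exact, which matters once the per-image difference is multiplied across thousands of generations.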