Wan 2.6
Star in Your Own AI Video
Your Character. Your Voice. Your Story.
Be the Star of Your Video
Upload a reference video of yourself, capturing appearance and voice, and generate new scenes starring you. Wan 2.6's reference-to-video mode captures both visual identity and vocal characteristics for consistent character performance across generations.
Cinematic Multi-Shot Narratives
Build structured multi-shot videos with scene continuity, stable characters, and controlled camera movement. Generate up to 15 seconds of HD video with natural pacing that feels deliberate, not fragmented.
Natural Dialogue & Sound
Produce stable multi-person dialogue scenes with natural voice expression, accurate lip sync, and enhanced music generation. Supports both Chinese and English prompts for global creators.
Video and image generation
Generate videos from text, images, or reference clips. Create and edit images with reference consistency.
See what Wan 2.6 can create
Copy any prompt below and try it yourself in the playground.
"A lone figure stands on an arctic ridge as the camera pulls back to reveal the Northern Lights dancing across the sky above jagged icebergs"
"A couple walks along a beach at sunset, waves crashing, one turns to the other and says 'this is exactly where I want to be', wind in their hair, natural light"
"A street vendor calls out to passersby in a busy market, sounds of crowds, sizzling food, and distant music, handheld camera"
"Aerial shot of a fishing boat on calm ocean waters at sunrise, ambient ocean sounds"
A few lines of code.
Cinematic video.
fal.ai handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPUs to manage.
- Serverless: scales to zero, scales to millions
- Pay per second, no minimums
- Python and JavaScript SDKs, plus REST API
import fal_client

result = fal_client.run(
    "wan/v2.6/text-to-video",
    arguments={
        "prompt": "A woman walks through neon-lit Tokyo streets",
        "resolution": "1080p",
        "duration": "10s",
    },
)
# result["video"]["url"] → your generated video

Common questions about Wan 2.6
What can I create with Wan 2.6?
Wan 2.6 supports text-to-video, image-to-video, reference-to-video (with character and voice consistency), text-to-image, and image editing. Video output supports 720p and 1080p at durations of 5, 10, or 15 seconds. Multiple aspect ratios are supported, including 16:9, 9:16, 1:1, 4:3, and 3:4.
How does reference-to-video work?
Upload a 3-8 second reference video of a character. Wan 2.6 extracts appearance and voice, then generates new scenes maintaining both visual and audio consistency. You can use up to 3 reference videos per generation. A Flash variant is available for faster results.
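As a rough sketch, a reference-to-video request could be assembled like this. The endpoint path and the `reference_video_urls` parameter name are assumptions extrapolated from the text-to-video example above, not confirmed API fields — check the endpoint's API documentation for the exact schema.

```python
# Hypothetical sketch of a reference-to-video request.
# Endpoint path and parameter names below are assumptions, not confirmed fields.
arguments = {
    "prompt": "The character gives a weather report in a bright TV studio",
    # Up to 3 reference videos, each 3-8 seconds long
    "reference_video_urls": [
        "https://example.com/my-reference-clip.mp4",
    ],
    "resolution": "1080p",
    "duration": "10s",
}
assert len(arguments["reference_video_urls"]) <= 3  # model accepts at most 3

# With the fal client installed and an API key configured, the call would be:
# import fal_client
# result = fal_client.run("wan/v2.6/reference-to-video", arguments=arguments)
```

Swapping the endpoint for a Flash variant would trade some quality for faster, cheaper generations, per the pricing section below.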
Does Wan 2.6 support image generation?
Yes. Wan 2.6 includes text-to-image and image editing endpoints. Text-to-image supports generating up to 5 images per request with optional reference images. Image editing supports style transfer and subject consistency using 1-3 reference images.
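A minimal text-to-image request might look like the following sketch. The endpoint path and parameter names (`num_images`, `image_urls`) are assumptions for illustration, not confirmed API fields — consult the endpoint documentation for the real schema.

```python
# Hypothetical sketch of a text-to-image request.
# Endpoint path and parameter names are assumptions; see the API docs.
arguments = {
    "prompt": "A watercolor lighthouse on a rocky coast at dawn",
    "num_images": 5,   # up to 5 images per request
    "image_urls": [],  # optional reference images (1-3 for editing/style transfer)
}
assert 1 <= arguments["num_images"] <= 5

# With the fal client installed:
# import fal_client
# result = fal_client.run("wan/v2.6/text-to-image", arguments=arguments)
```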
What languages does Wan 2.6 support?
Prompts support both Chinese and English with up to 2,000 characters. Audio generation supports natural dialogue in both languages.
How much does Wan 2.6 cost on fal.ai?
Video generation: $0.10/s at 720p, $0.15/s at 1080p. Flash variants from $0.05/s. A 10-second 1080p video costs $1.50 (standard) or $0.75 (Flash). Image generation: $0.03/image. Pay-per-use with no minimums or subscriptions.
How do I get started with the API?
Install the fal.ai SDK (Python or JavaScript), grab an API key from your dashboard, and make your first request in a few lines of code. The API is serverless, so no GPUs to manage, no infrastructure to set up. Check the API documentation for your chosen endpoint to see all available parameters.
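For the Python SDK, those setup steps amount to roughly the following. The `fal-client` package name and `FAL_KEY` environment variable follow fal.ai's published conventions, but verify both against the current documentation:

```shell
# Install the Python SDK
pip install fal-client

# Authenticate with the API key from your fal.ai dashboard
export FAL_KEY="your-api-key-here"

# Then run a script that calls fal_client, as in the example above
python generate_video.py
```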
Can I use Wan 2.6 for commercial projects?
Yes. Content generated through the fal.ai API can be used in commercial projects. Check fal.ai's terms of service for full details on usage rights and licensing.
Ready to create?
Start generating HD AI video with Wan 2.6 on fal.ai.

