Wan 2.6 introduces reference-to-video generation, expanded aspect ratios (16:9, 9:16, 1:1, 4:3, 3:4), enhanced multi-shot capabilities, and clips of up to 15 seconds. For creators who need character consistency and cross-platform content, it is a substantial upgrade over Wan 2.5.
What Changed with Wan 2.6
Wan 2.5 established the foundation with native audio generation capabilities [1], but Wan 2.6 expands the model's practical utility across three core generation paths: text-to-video with enhanced prompt handling and multi-shot segmentation, image-to-video with improved motion coherence, and the new reference-to-video path for subject consistency.
The version increment addresses specific production constraints:
- Limited aspect ratio support
- Inconsistent character identity across scenes
- Duration caps that restricted narrative complexity
Text-to-Video: Resolution and Format Options
Aspect Ratio Support
Wan 2.6 expands aspect ratio coverage to match platform-specific requirements:
| Feature | Wan 2.5 | Wan 2.6 |
|---|---|---|
| Aspect Ratios | 16:9, 9:16 | 16:9, 9:16, 1:1, 4:3, 3:4 |
| Resolutions | 720p, 1080p | 720p, 1080p |
| Max Duration | 10s | 15s |
With these options, content for YouTube (16:9), Instagram Reels (9:16), and square social formats (1:1) can be generated at its native aspect ratio; the square and 4:3/3:4 formats no longer need post-generation cropping.
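As a rough illustration of the table above, the following sketch picks an aspect ratio for a target platform and checks it against what each version supports. The version keys (`wan-2.5`, `wan-2.6`), the platform labels, and the function name are hypothetical names chosen for this example; only the ratios themselves come from the table.

```python
# Aspect ratios per version, copied from the comparison table.
# The dictionary keys are illustrative labels, not official model IDs.
SUPPORTED_ASPECT_RATIOS = {
    "wan-2.5": {"16:9", "9:16"},
    "wan-2.6": {"16:9", "9:16", "1:1", "4:3", "3:4"},
}

# Hypothetical platform labels; the ratios follow the examples in the text.
PLATFORM_ASPECT_RATIO = {
    "youtube": "16:9",
    "instagram_reels": "9:16",
    "square_social": "1:1",
}


def aspect_ratio_for(platform: str, version: str = "wan-2.6") -> str:
    """Return the platform's native ratio, or raise if this version
    cannot generate it directly (meaning a crop would be needed)."""
    ratio = PLATFORM_ASPECT_RATIO[platform]
    if ratio not in SUPPORTED_ASPECT_RATIOS[version]:
        raise ValueError(f"{version} does not support {ratio} natively")
    return ratio
```

Calling `aspect_ratio_for("square_social")` returns `"1:1"`, while the same call with `version="wan-2.5"` raises, which mirrors the cropping step the older version required.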
Multi-Shot Narrative Control
Wan 2.6's multi-shot system uses structured prompt syntax for scene timing:
Overall description. Shot 1 [0-3s] content. Shot 2 [3-5s] content.
The multi_shots parameter (enabled by default when prompt expansion is active) processes these segments with proper transitions. This matters for commercial work requiring precise timing, particularly when coordinating with external audio tracks.
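The structured syntax above can be assembled programmatically. This is a minimal sketch, not an official helper: `build_multi_shot_prompt` and the `Shot` tuple are names invented here, and the requirement that shots start at 0 seconds and follow each other without gaps is an assumption drawn from the single example pattern, not a documented rule.

```python
from typing import List, Tuple

# (start_second, end_second, description); a hypothetical shape for this sketch.
Shot = Tuple[int, int, str]


def build_multi_shot_prompt(overall: str, shots: List[Shot]) -> str:
    """Format a prompt as 'Overall. Shot 1 [0-3s] ... Shot 2 [3-5s] ...'."""
    if not shots:
        raise ValueError("at least one shot is required")
    parts = [overall.strip().rstrip(".") + "."]
    previous_end = 0
    for index, (start, end, content) in enumerate(shots, start=1):
        # Assumption: shots are contiguous and each has positive length.
        if start != previous_end or end <= start:
            raise ValueError(f"shot {index} has an invalid time range")
        text = content.strip().rstrip(".")
        parts.append(f"Shot {index} [{start}-{end}s] {text}.")
        previous_end = end
    return " ".join(parts)
```

Building the string from explicit time ranges makes it easier to keep shot boundaries aligned with cue points in an external audio track.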
Prompt Expansion via LLM
Both versions include LLM-based prompt expansion, but Wan 2.6's implementation better preserves narrative context across shot transitions, reducing manual prompt engineering for multi-scene sequences.
Image-to-Video: Duration and Scene Complexity
Wan 2.6 supports 5-, 10-, and 15-second clips in image-to-video mode, whereas Wan 2.5 was capped at 10 seconds. The additional duration allows more complex visual narratives from a single source image.
Wan 2.5 generated single continuous shots from images. Wan 2.6 transforms a single image into multi-scene narratives with proper transitions when using prompt expansion and the multi_shots parameter.
Reference-to-Video: Subject Consistency
Reference-to-video addresses character identity persistence across generated scenes. The system accepts 1-3 reference videos, referenced in prompts using @Video1, @Video2, and @Video3 syntax.
The feature works for people, animals, and objects. A prompt like "Dance battle between @Video1 and @Video2" maintains each subject's identity throughout the generated video.
Current limitations:
- Supports only 5- and 10-second durations (no 15-second option)
- Requires publicly accessible video URLs
- Subject consistency depends on reference video quality
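The rules above lend themselves to a pre-submission check. The sketch below is illustrative only: `check_reference_request` is a hypothetical helper, the mapping of `@VideoN` to the Nth URL in the list is an assumption, and the URL scheme test only approximates "publicly accessible" since it cannot confirm that a URL is actually reachable.

```python
import re
from typing import List

ALLOWED_REFERENCE_DURATIONS = {5, 10}  # seconds; 15 is not available here
MAX_REFERENCE_VIDEOS = 3


def check_reference_request(prompt: str, video_urls: List[str], duration: int) -> List[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    if not 1 <= len(video_urls) <= MAX_REFERENCE_VIDEOS:
        problems.append("expected 1-3 reference videos")
    if duration not in ALLOWED_REFERENCE_DURATIONS:
        problems.append("duration must be 5 or 10 seconds")
    for url in video_urls:
        # Scheme check only; reachability is not verified.
        if not url.startswith(("http://", "https://")):
            problems.append(f"not a public URL: {url}")
    # Assumption: @VideoN refers to the Nth entry of video_urls.
    referenced = sorted({int(n) for n in re.findall(r"@Video(\d+)", prompt)})
    for n in referenced:
        if not 1 <= n <= len(video_urls):
            problems.append(f"@Video{n} has no matching reference video")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.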
Audio Integration
Both versions support:
- External audio via URL (WAV/MP3, 3-30 seconds, up to 15MB)
- Automatic audio trimming to match video duration
- Native audio generation with synchronized dialogue (introduced in Wan 2.5)
Wan 2.6 maintains these capabilities while ensuring compatibility with longer durations and multi-shot sequences.
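The audio limits listed above can be checked before upload. This is a sketch under stated assumptions: the helper names are invented for this example, the format is inferred from the file extension only, and "15MB" is read as 15 MiB, which the source does not specify.

```python
import os
from typing import List

ALLOWED_EXTENSIONS = {".wav", ".mp3"}
MIN_AUDIO_SECONDS = 3.0
MAX_AUDIO_SECONDS = 30.0
MAX_AUDIO_BYTES = 15 * 1024 * 1024  # assumption: "15MB" interpreted as 15 MiB


def check_audio(filename: str, duration_s: float, size_bytes: int) -> List[str]:
    """Return a list of limit violations for an external audio track."""
    issues = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        issues.append("audio must be WAV or MP3")
    if not MIN_AUDIO_SECONDS <= duration_s <= MAX_AUDIO_SECONDS:
        issues.append("audio must be 3-30 seconds long")
    if size_bytes > MAX_AUDIO_BYTES:
        issues.append("audio must be 15MB or smaller")
    return issues


def audio_seconds_used(audio_s: float, video_s: float) -> float:
    """Length of audio left after trimming to the video duration."""
    return min(audio_s, video_s)
```

For instance, a 20-second track attached to a 15-second clip would have its last 5 seconds trimmed, so `audio_seconds_used(20.0, 15.0)` gives `15.0`.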
Performance Characteristics
Generation speed varies based on queue depth, system load, and complexity of the requested output. Both versions process requests through fal's infrastructure with comparable performance profiles for standard generation tasks.
Wan 2.6 demonstrates improved handling of multi-shot prompts and scene transitions, resulting in fewer failed generations when processing complex narrative structures.
Both versions include safety checkers (enabled by default) to prevent inappropriate content generation.
Production Use Cases
Wan 2.6 provides specific value for:
- Cross-platform content strategies: Expanded aspect ratios eliminate multiple generation passes for different platforms.
- Narrative projects: Multi-shot capabilities support more sophisticated storytelling without external editing tools.
- Character-based content: Reference-to-video ensures identity consistency across scenes.
- Extended sequences: 15-second duration support accommodates longer narrative arcs.
Known Constraints
Despite improvements, limitations remain:
- Reference-to-video excludes 15-second duration
- Text-to-video minimum resolution is 720p (no 480p option)
- Maximum prompt length: 800 characters
- Multi-shot timing depends on prompt expansion quality
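The first three constraints above are mechanical and can be caught before a request is sent. The following preflight sketch uses hypothetical mode labels and a hypothetical function name; the limits themselves are the ones listed. The fourth constraint, multi-shot timing quality, cannot be checked this way.

```python
from typing import List

MAX_PROMPT_CHARS = 800
TEXT_TO_VIDEO_RESOLUTIONS = {"720p", "1080p"}  # no 480p option


def preflight(mode: str, prompt: str, resolution: str, duration_s: int) -> List[str]:
    """Check a request against the listed constraints.

    `mode` is an illustrative label: "text-to-video", "image-to-video",
    or "reference-to-video".
    """
    issues = []
    if len(prompt) > MAX_PROMPT_CHARS:
        issues.append(f"prompt is {len(prompt)} characters; limit is {MAX_PROMPT_CHARS}")
    if mode == "text-to-video" and resolution not in TEXT_TO_VIDEO_RESOLUTIONS:
        issues.append("text-to-video resolution must be 720p or 1080p")
    if mode == "reference-to-video" and duration_s == 15:
        issues.append("reference-to-video does not support 15-second clips")
    return issues
```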
Migration Considerations
Wan 2.6 represents a substantial upgrade if your workflows require:
- Multiple aspect ratios for platform-specific content
- Narrative sequences with distinct scenes
- Character consistency across generated videos
- Duration support beyond 10 seconds
Existing Wan 2.5 implementations may continue to function adequately for simpler single-shot generation or workflows already optimized around current limitations.
API compatibility note: Both versions use similar parameter structures, but Wan 2.6 adds reference-to-video as a separate endpoint. Text-to-video and image-to-video migrations require only endpoint updates and optional parameter adjustments for new capabilities.
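A migration along these lines might look like the sketch below. The endpoint IDs are placeholders, not real fal endpoint names, and argument names such as `aspect_ratio` are assumptions made for illustration; consult the model pages for the actual identifiers and schema.

```python
from typing import Any, Dict, Tuple

# Placeholder endpoint IDs for illustration only; these are NOT real names.
ENDPOINT_MIGRATION = {
    "example/wan-2.5/text-to-video": "example/wan-2.6/text-to-video",
    "example/wan-2.5/image-to-video": "example/wan-2.6/image-to-video",
}


def migrate_request(
    endpoint: str, arguments: Dict[str, Any], **new_options: Any
) -> Tuple[str, Dict[str, Any]]:
    """Swap the endpoint and keep existing arguments, optionally
    overriding or adding options that only the newer version accepts."""
    new_endpoint = ENDPOINT_MIGRATION[endpoint]
    # Copy so the caller's original arguments are left untouched.
    return new_endpoint, {**arguments, **new_options}
```

An existing request passes through unchanged apart from the endpoint, and a call such as `migrate_request(endpoint, args, aspect_ratio="1:1")` opts in to a newly supported format.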
Technical Assessment
The changes go beyond incremental parameter tuning. Wan 2.5 established native audio generation and baseline output quality. Wan 2.6 expands creative control through reference-based generation, enhanced multi-shot capabilities, and flexible format support, addressing real production constraints in multi-platform content creation.
References
[1] fal.ai. "Wan 2.5 Preview is Now Available on fal." fal.ai, 2025. https://blog.fal.ai/wan-2-5-preview-is-now-available-on-fal/
















