Wan-2.2 Speech-to-Video 14B (Audio to Video)

fal-ai/wan/v2.2-14b/speech-to-video
Wan-S2V is a video model that generates high-quality video from a static image and an audio track, with realistic facial expressions, body movements, and professional camera work for film and television applications.
Inference
Commercial use
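
As a rough illustration of how this endpoint might be called from Python, the sketch below uses the `fal_client` package's `subscribe` call with the endpoint ID shown above. The input field names (`image_url`, `audio_url`, `prompt`) are assumptions for illustration and are not confirmed by this page; check the model's input schema before relying on them. The helper names `build_arguments` and `generate` are hypothetical.

```python
from typing import Any, Dict

# Endpoint ID as listed on the page.
ENDPOINT = "fal-ai/wan/v2.2-14b/speech-to-video"


def build_arguments(image_url: str, audio_url: str, prompt: str = "") -> Dict[str, Any]:
    """Assemble a request payload.

    The field names used here are ASSUMPTIONS, not a confirmed schema.
    """
    if not image_url or not audio_url:
        raise ValueError("both image_url and audio_url are required")
    arguments: Dict[str, Any] = {"image_url": image_url, "audio_url": audio_url}
    if prompt:
        # Optional text guidance; assumed field name.
        arguments["prompt"] = prompt
    return arguments


def generate(image_url: str, audio_url: str, prompt: str = "") -> Any:
    """Submit a request and block until the result is ready.

    Requires the fal-client package and a FAL_KEY environment variable.
    """
    import fal_client  # imported lazily so the payload helper works without it

    return fal_client.subscribe(
        ENDPOINT,
        arguments=build_arguments(image_url, audio_url, prompt),
    )
```

Calling `generate(...)` performs a billed request, so it is worth validating the payload with `build_arguments` first.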

A generation takes approximately 5 minutes.

Requests cost $0.20 per video second at 720p, $0.15 per video second at 580p, and $0.10 per video second at 480p.
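
The per-second pricing can be turned into a quick cost estimate. The sketch below uses only the rates quoted above; the function name `estimate_cost` is hypothetical, and it assumes cost scales linearly with output duration. How fractional seconds are actually billed is not stated on this page.

```python
# Per-second prices quoted on the page, in US dollars.
PRICE_PER_SECOND = {"720p": 0.20, "580p": 0.15, "480p": 0.10}


def estimate_cost(duration_seconds: float, resolution: str) -> float:
    """Estimate the cost of one video, rounded to the nearest cent.

    Assumes simple linear billing by output duration (an assumption).
    """
    if resolution not in PRICE_PER_SECOND:
        raise ValueError(f"unknown resolution: {resolution!r}")
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return round(duration_seconds * PRICE_PER_SECOND[resolution], 2)
```

For example, under this assumption an 8-second clip at 720p comes to 8 × $0.20 = $1.60, while the same clip at 480p comes to $0.80.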