Veo 3.1
Cinema-Quality Video. With Sound.
4K Video. Native Audio. Zero Compromise.
True 4K Output
The first mainstream AI video model to support true 4K resolution output. Generate at 720p, 1080p, or 4K with aspect ratios of 16:9 or 9:16. Every frame is sharp enough for professional delivery.
Synchronized Sound Design
Generate rich native audio alongside your video: natural dialogue with lip sync, ambient sound effects, and music. No post-production audio work needed.
One Model, Many Workflows
Text-to-video, image-to-video, first/last frame interpolation, reference-based generation, and video extension. Standard and Fast tiers for every mode give you the right speed-quality tradeoff.
Every mode, standard and fast
Generate video from text, images, frames, or references. Extend existing clips. Each mode has a fast variant for rapid iteration.
See what Veo 3.1 can create
Copy any prompt below and try it yourself in the playground.
"The white Lamborghini Countach drifts sharply around a corner and slides into a perfect park on a sunlit city street, smoke and tire screech filling the air, camera panning fast with cinematic motion blur, dust particles and heat haze, dynamic reflections on the car, hyper-realistic lighting, upbeat and energetic vibe."
"The man puts the net down as he turns and speaks to his apprentice, saying 'without patience, one cannot fish, and without fish, one will die' and then he smiles"
"The camera pans around the house, mysterious music playing"
"Slow drone shot around the colosseum as the naval battle takes place"
A few lines of code.
Cinema-quality video.
fal.ai handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPUs to manage.
- Serverless: scales to zero, scales to millions
- Pay per second, no minimums
- Python and JavaScript SDKs, plus REST API
import fal_client

result = fal_client.run(
    "fal-ai/veo3.1",
    arguments={
        "prompt": "Cinematic drone shot over misty mountains",
        "resolution": "1080p",
        "audio": True,
    },
)
# result["video"]["url"] → your generated video

Common questions about Veo 3.1
What can I create with Veo 3.1?
Text-to-video, image-to-video, first/last frame interpolation, reference-based generation, and video extension. Supports 720p, 1080p, and 4K at 16:9 or 9:16. Each generation produces up to 8 seconds of video, and clips can be lengthened with the extend-video endpoint.
What's the difference between Standard and Fast?
Both tiers are available for every mode and endpoint variant. Standard delivers higher visual and audio quality; Fast is optimized for speed and iteration.
How does native audio work?
Veo 3.1 generates synchronized audio alongside video, including dialogue with lip sync, sound effects, ambient noise, and music. You can enable or disable audio per request. Supports natural conversations in multiple languages.
What resolutions does Veo 3.1 support?
720p, 1080p, and 4K. It's the first mainstream AI video model with true 4K output. Available in 16:9 landscape and 9:16 vertical formats.
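To make the resolution labels concrete, here is a minimal sketch that maps each label and aspect ratio to pixel dimensions. It assumes the conventional sizes for these labels (with "4K" taken to mean UHD, 3840×2160); the exact dimensions Veo 3.1 returns are not stated on this page, and `frame_size` is a hypothetical helper, not part of the fal.ai SDK.

```python
# Conventional pixel dimensions for each landscape (16:9) resolution label.
# Assumption: "4K" means UHD (3840x2160); this page does not give exact sizes.
LANDSCAPE_SIZES = {
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "4K": (3840, 2160),
}


def frame_size(resolution, aspect_ratio="16:9"):
    """Hypothetical helper: return (width, height) for a resolution label."""
    if resolution not in LANDSCAPE_SIZES:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    width, height = LANDSCAPE_SIZES[resolution]
    if aspect_ratio == "16:9":
        return width, height
    if aspect_ratio == "9:16":
        # Vertical output swaps the two dimensions.
        return height, width
    raise ValueError(f"unsupported aspect ratio: {aspect_ratio!r}")


print(frame_size("4K"))             # (3840, 2160)
print(frame_size("1080p", "9:16"))  # (1080, 1920)
```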
How much does Veo 3.1 cost on fal.ai?
Pay per second with no minimums. Standard costs $0.20/s at 720p or 1080p and $0.40/s at 4K without audio, or $0.40/s and $0.60/s respectively with audio. Fast costs $0.10/s at 720p or 1080p and $0.30/s at 4K without audio, or $0.15/s and $0.35/s respectively with audio. For example, a 5-second 1080p video with audio costs $2.00 on Standard (5 × $0.40) or $0.75 on Fast (5 × $0.15).
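The rates quoted above can be turned into a quick budget estimate. The sketch below copies the per-second prices from this answer into a lookup table; `estimate_cost` is a hypothetical helper for illustration, not an fal.ai API, and actual billing may differ.

```python
# Per-second prices in USD, copied from the pricing answer above.
# Keys are (tier, is_4k, with_audio).
PRICE_PER_SECOND = {
    ("standard", False, False): 0.20,
    ("standard", True, False): 0.40,
    ("standard", False, True): 0.40,
    ("standard", True, True): 0.60,
    ("fast", False, False): 0.10,
    ("fast", True, False): 0.30,
    ("fast", False, True): 0.15,
    ("fast", True, True): 0.35,
}


def estimate_cost(tier, resolution, seconds, with_audio=True):
    """Hypothetical helper: estimated USD cost of one generation."""
    if tier not in ("standard", "fast"):
        raise ValueError(f"unknown tier: {tier!r}")
    if resolution not in ("720p", "1080p", "4K"):
        raise ValueError(f"unsupported resolution: {resolution!r}")
    rate = PRICE_PER_SECOND[(tier, resolution == "4K", with_audio)]
    # Round to whole cents to hide floating-point noise.
    return round(rate * seconds, 2)


print(estimate_cost("standard", "1080p", 5))  # 2.0
print(estimate_cost("fast", "1080p", 5))      # 0.75
```

The two printed values reproduce the worked example in the answer: $2.00 on Standard and $0.75 on Fast for a 5-second 1080p video with audio.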
How do I get started with the API?
Install the fal.ai SDK (Python or JavaScript), grab an API key from your dashboard, and make your first request in a few lines of code. The API is serverless, so there are no GPUs to manage and no infrastructure to set up. Check the API documentation for all available parameters.
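The steps above might look like the following in Python. This is a sketch under stated assumptions: the SDK is installed with `pip install fal-client`, the API key is supplied through the `FAL_KEY` environment variable, and the endpoint name and argument names are the ones used in the snippet earlier on this page. `build_arguments` and `generate` are hypothetical helpers; confirm parameter names and the response shape against the API documentation.

```python
import os


def build_arguments(prompt, resolution="1080p", audio=True):
    """Hypothetical helper: assemble the arguments shown earlier on this page."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {"prompt": prompt, "resolution": resolution, "audio": audio}


def generate(prompt, **options):
    """Send one request. Assumes fal-client is installed and FAL_KEY is set."""
    if not os.environ.get("FAL_KEY"):
        raise RuntimeError("Set the FAL_KEY environment variable to your API key")
    # Imported lazily so build_arguments works without the SDK installed.
    import fal_client

    return fal_client.run(
        "fal-ai/veo3.1",
        arguments=build_arguments(prompt, **options),
    )
```

Calling `generate("Cinematic drone shot over misty mountains")` sends the request and returns the response, which you can print to locate the generated video's URL.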
Can I use Veo 3.1 for commercial projects?
Yes. Videos generated through the fal.ai API can be used in commercial projects. Check fal.ai's terms of service for full details on usage rights and licensing.
Ready to create?
Start generating cinema-quality AI video with Veo 3.1 on fal.ai.

