google/gemini-omni-flash/edit
Edits generated video across multiple conversational turns while preserving scene coherence. Applies iterative changes through natural-language instructions without regenerating the full sequence from scratch.
Inference
Commercial use
Partner
Input
Provide the source video as a file from your computer, a video from a web page, clipboard contents (Ctrl/Cmd+V), or a URL. Accepted file types: mp4, mov, webm, m4v, gif.
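As a rough illustration, a client could check a file name or URL against the accepted types before uploading. This is a minimal sketch only: the helper name `has_accepted_extension` is hypothetical and not part of any documented API, and it checks the extension alone, not the actual container or codec.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# File types listed as accepted on this page.
ACCEPTED_EXTENSIONS = {"mp4", "mov", "webm", "m4v", "gif"}


def has_accepted_extension(source: str) -> bool:
    """Return True if a local path or URL ends in an accepted extension.

    Hypothetical helper: it only inspects the name, not the file contents.
    """
    # For URLs, drop the query string and fragment before reading the suffix.
    path = urlparse(source).path if "://" in source else source
    suffix = PurePosixPath(path).suffix.lower().lstrip(".")
    return suffix in ACCEPTED_EXTENSIONS


if __name__ == "__main__":
    for candidate in ["clip.MP4", "https://example.com/a/b.webm?x=1", "notes.txt"]:
        print(candidate, has_accepted_extension(candidate))
```

A check like this only catches obvious mismatches early; the service itself remains the authority on what it accepts.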
Billing is based on total token consumption. Input tokens (text, audio, and video) cost $1.875 per 1 million tokens; output tokens cost $21.875 per 1 million tokens. For 720p video, this works out to approximately $0.13 per second of video.
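The pricing above can be turned into a quick estimate. The sketch below uses only the rates quoted on this page; the function names are illustrative, and the per-second figure is the page's own approximation for 720p, so actual charges depend on the real token counts of each request.

```python
# Rates quoted on this page, in US dollars.
INPUT_USD_PER_MILLION_TOKENS = 1.875
OUTPUT_USD_PER_MILLION_TOKENS = 21.875
APPROX_USD_PER_SECOND_720P = 0.13  # approximate, 720p only


def estimate_token_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost in USD from input and output token counts."""
    input_cost = input_tokens * INPUT_USD_PER_MILLION_TOKENS / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MILLION_TOKENS / 1_000_000
    return input_cost + output_cost


def estimate_720p_cost(seconds: float) -> float:
    """Rough cost in USD for a 720p video of the given duration."""
    return seconds * APPROX_USD_PER_SECOND_720P


if __name__ == "__main__":
    # Hypothetical token counts, chosen only to illustrate the arithmetic.
    print(f"${estimate_token_cost(200_000, 60_000):.4f}")
    print(f"${estimate_720p_cost(10):.2f}")
```

Because each editing turn is billed on tokens consumed, a multi-turn session would be estimated by summing the per-turn results.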