google/gemini-omni-flash/edit

Edits generated video across multiple conversational turns while preserving scene coherence. Applies iterative changes through natural-language instructions without regenerating the full sequence from scratch.
Inference
Commercial use
Partner

Billing is based on total token consumption. Input tokens (text, audio, or video) cost $1.875 per 1 million tokens, and output tokens cost $21.875 per 1 million tokens. For 720p video, this works out to approximately $0.13 per second of video.
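
As a rough illustration of the arithmetic behind these rates, the sketch below estimates a charge from token counts and from clip length. The function and constant names are hypothetical and are not part of any official SDK; the token counts in the example are made up, and the per-second figure is the approximate one quoted above, so actual invoices may differ.

```python
# Rates quoted in the billing note, in USD per 1 million tokens.
INPUT_USD_PER_MILLION = 1.875
OUTPUT_USD_PER_MILLION = 21.875
# Approximate per-second figure quoted for 720p video.
APPROX_USD_PER_SECOND_720P = 0.13


def estimate_token_cost(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: estimated cost in USD for the given token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


def estimate_720p_cost(seconds: float) -> float:
    """Hypothetical helper: estimated cost in USD using the quoted ~$0.13/s figure."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return seconds * APPROX_USD_PER_SECOND_720P


if __name__ == "__main__":
    # Illustrative (made-up) token counts for a single edit turn.
    print(f"{estimate_token_cost(200_000, 60_000):.4f}")  # prints 1.6875
    # A 10-second 720p clip at the approximate per-second rate.
    print(f"{estimate_720p_cost(10):.2f}")  # prints 1.30
```

If the per-second figure were made up entirely of output tokens (an assumption, not something the billing note states), $0.13 would correspond to roughly 5,900 output tokens per second of 720p video.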
