Google DeepMind

Gemini Omni

Coming Soon

Public API rolling out in the weeks following Google I/O 2026

Gemini Omni Flash launched on May 19, 2026. LoveGen AI will add it as soon as the public Vertex AI API rolls out.

Published May 12, 2026 · Updated May 12, 2026

Gemini Omni Flash AI Video Generator

Create and Edit AI Videos with Google's Unified Omni Model

Gemini Omni Flash is Google DeepMind's new unified video generation model, announced and launched at Google I/O 2026 on May 19, 2026. Unlike the dedicated Veo models, Gemini Omni Flash is built on a single transformer-based omni-architecture that natively accepts text, image, audio, and video inputs and produces high-resolution video with synchronized audio in one pass. It supports conversational multi-turn editing — change the camera angle, swap objects, rewrite scenes, or modify backgrounds using plain-language prompts.

Gemini Omni was unveiled at Google I/O 2026, with the first shipping variant — Gemini Omni Flash — rolling out the same day (May 19, 2026). Google describes it as a model that can create anything from any input, starting with video, combining Gemini's reasoning with generative media for stronger world understanding, multimodality, and editing.

At launch, Gemini Omni Flash produces 10-second high-resolution clips paired with native synchronized audio (dialogue with lip-sync, sound effects timed to on-screen action, and ambient background sound), all generated in a single forward pass. Google has confirmed the 10-second limit is a deployment decision rather than a model constraint. An improved understanding of physics, including gravity, kinetic energy, and fluid dynamics, allows for more realistic motion.

The headline shipping feature is conversational multi-turn editing. Once you have a clip, you describe changes in plain language — "shift the camera angle to the left," "make the sculpture out of bubbles," "when the person touches the mirror, make it ripple like liquid" — and Omni reworks the targeted element while keeping the rest intact. Reference stacking lets you combine a character image, an audio file, and a style reference in a single prompt, and template-based creation with single-click application is built into the Gemini app and Google Flow.

Gemini Omni Flash is rolling out globally to Google AI Plus, Pro, and Ultra subscribers through the Gemini app and Google Flow, and at no cost to users 18+ in YouTube Shorts Remix and the YouTube Create app. Every generated video carries an imperceptible SynthID watermark plus C2PA Content Credentials. Public developer and enterprise API access via Vertex AI is rolling out in the weeks following I/O; LoveGen AI will integrate Gemini Omni Flash as soon as that API becomes publicly available.

How to Use Gemini Omni Flash

Step 1: Choose Your Creation Mode

Generate from a text prompt, animate an image, mix multiple references (image, audio, style), or pick a built-in template for one-click creation.

Step 2: Describe Your Video or Edit

Write a detailed prompt or describe an edit in plain language — Gemini Omni Flash understands camera moves, object swaps, background changes, and style shifts via chat.

Step 3: Generate and Refine

Click Generate. Gemini Omni Flash returns a 10-second high-resolution clip with native synchronized audio. Use multi-turn chat to refine specific elements without starting over.
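
The generate-then-refine loop in the steps above can be pictured as an ordered conversation: one generation prompt followed by plain-language edits that each build on the earlier turns. The sketch below is only a local bookkeeping model of that idea. The `EditSession` class, its method names, and the sample starting prompt are hypothetical illustrations, not part of any Google API; the public Vertex AI interface for Gemini Omni Flash has not been published yet.

```python
from dataclasses import dataclass, field


@dataclass
class EditSession:
    """Hypothetical record of one multi-turn editing conversation."""
    initial_prompt: str
    edits: list[str] = field(default_factory=list)

    def refine(self, instruction: str) -> None:
        """Append a plain-language edit; earlier turns stay as context."""
        instruction = instruction.strip()
        if not instruction:
            raise ValueError("edit instruction must not be empty")
        self.edits.append(instruction)

    def history(self) -> list[dict[str, str]]:
        """Return the full conversation in order, first prompt included."""
        turns = [{"turn": "generate", "text": self.initial_prompt}]
        turns += [{"turn": "edit", "text": e} for e in self.edits]
        return turns


# Illustrative starting prompt; the two edits are quoted from this page.
session = EditSession("A marble sculpture in a sunlit gallery")
session.refine("shift the camera angle to the left")
session.refine("make the sculpture out of bubbles")

for turn in session.history():
    print(turn["turn"], "->", turn["text"])
```

Keeping the whole history, rather than only the latest instruction, mirrors the page's claim that multi-turn edits build on prior context.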

Gemini Omni Flash Technical Specifications

Provider	Google DeepMind
Release Date	May 19, 2026 (Google I/O 2026)
Variant	Gemini Omni Flash (first shipping model in the Omni family)
Architecture	Unified transformer-based omni model (text + image + audio + video inputs → video + audio output)
Input Modes	Text, image, audio, video — including multi-reference stacking
Output	High-resolution video with native synchronized audio
Max Duration	10 seconds per clip (deployment limit, not model constraint)
Native Audio	Dialogue (lip-sync), SFX, ambient — generated in a single pass
Editing	Conversational multi-turn — camera, backgrounds, objects, actions, style
Physics	Improved gravity, kinetic energy, and fluid dynamics
Provenance	SynthID watermark + C2PA Content Credentials (mandatory)
Availability	Gemini app & Google Flow (AI Plus/Pro/Ultra); YouTube Shorts Remix & Create app (free, 18+)
API Access	Public Vertex AI API rolling out in the weeks following I/O 2026
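
For anyone planning an integration ahead of the API rollout, the limits in the table above can be checked locally before a request is ever sent. The following is a minimal sketch under stated assumptions: `ClipRequest` and `validate` are hypothetical names, the request shape is invented for illustration and is not a real Vertex AI payload, and the whole-second duration with a 1-second minimum is an assumption. Only the 10-second cap and the four input modes come from the table.

```python
from dataclasses import dataclass

# Values taken from the specification table above.
MAX_DURATION_SECONDS = 10
INPUT_MODES = {"text", "image", "audio", "video"}


@dataclass(frozen=True)
class ClipRequest:
    """Hypothetical request shape; not a real Vertex AI payload."""
    prompt: str
    duration_seconds: int  # whole seconds is an assumption
    reference_modes: tuple[str, ...] = ()


def validate(request: ClipRequest) -> list[str]:
    """Return a list of problems; an empty list means the request fits the table."""
    problems = []
    if not request.prompt.strip():
        problems.append("prompt is empty")
    if not 0 < request.duration_seconds <= MAX_DURATION_SECONDS:
        problems.append(f"duration must be 1-{MAX_DURATION_SECONDS} seconds")
    unknown = sorted(set(request.reference_modes) - INPUT_MODES)
    if unknown:
        problems.append(f"unsupported input modes: {', '.join(unknown)}")
    return problems


ok = ClipRequest("A kite over a beach", 10, ("image", "audio"))
bad = ClipRequest("A kite over a beach", 12, ("image", "3d-mesh"))
print(validate(ok))
print(validate(bad))
```

Because the 10-second cap is described as a deployment limit rather than a model constraint, keeping it in a single named constant makes it easy to adjust if Google changes it.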

Why Gemini Omni Flash Stands Out

Unified Omni-Model Architecture

Gemini Omni Flash is Google's first shipping video model built on a unified transformer-based omni-architecture — one model handling text, image, audio, and video in a single pass, eliminating the seams between modalities that separate-pipeline systems introduce. Reference stacking lets you combine a character image, an audio file, and a style reference in a single prompt.

Conversational Multi-Turn Editing

Describe changes in plain language and Gemini Omni Flash applies them directly — shift the camera, swap an object, rewrite a scene, or change a background — while leaving the rest of the clip intact. Multi-turn edits build on prior context so you can iterate without starting over.

Native Synchronized Audio + Improved Physics

Dialogue with lip-sync, on-screen sound effects, and ambient background audio are produced jointly with the video in a single forward pass — no separate TTS or Foley stage. Improved understanding of gravity, kinetic energy, and fluid dynamics delivers more realistic motion, and every output carries SynthID and C2PA provenance.

Gemini Omni Flash vs Other AI Video Generators

Feature	Gemini Omni Flash	Veo 3.1	Sora 2	Grok Imagine
Provider	Google DeepMind	Google DeepMind	OpenAI	xAI
Architecture	Unified transformer omni model	Diffusion	Diffusion	Aurora (autoregressive)
Conversational Editing	Yes — multi-turn	No	No	No
Max Resolution	High-resolution	1080p	1080p	720p
Max Duration	10s (deployment limit)	8s (extendable)	20s	15s
Native Audio	Yes — single pass	Yes	Yes	Yes
Input Modes	Text, image, audio, video	Text, image (up to 3)	Text, image + Cameos	Text, 1 image
Templates	Yes	No	No	No
Provenance	SynthID + C2PA	SynthID	C2PA	—
Availability	Gemini app, Flow, YouTube	Available	Available	Available
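
To make the comparison easier to query, a few columns of the table above can be encoded as plain data and filtered. This is only an illustrative sketch: the `MODELS` list and the `shortlist` helper are hypothetical names, the values are copied from the table as published here, Veo 3.1 is recorded at its 8-second base length without extension, and the dash in Grok Imagine's provenance cell is treated as "none listed".

```python
# Rows copied from the comparison table above; only a few columns are encoded.
MODELS = [
    {"name": "Gemini Omni Flash", "max_seconds": 10, "conversational_editing": True,
     "provenance": {"SynthID", "C2PA"}},
    {"name": "Veo 3.1", "max_seconds": 8, "conversational_editing": False,
     "provenance": {"SynthID"}},  # 8s base length; the table notes it is extendable
    {"name": "Sora 2", "max_seconds": 20, "conversational_editing": False,
     "provenance": {"C2PA"}},
    {"name": "Grok Imagine", "max_seconds": 15, "conversational_editing": False,
     "provenance": set()},  # table shows a dash, read here as none listed
]


def shortlist(min_seconds=0, need_editing=False, need_provenance=frozenset()):
    """Return names of models meeting every stated requirement, in table order."""
    return [
        m["name"] for m in MODELS
        if m["max_seconds"] >= min_seconds
        and (m["conversational_editing"] or not need_editing)
        and set(need_provenance) <= m["provenance"]
    ]


print(shortlist(min_seconds=10))
print(shortlist(need_editing=True))
print(shortlist(need_provenance={"C2PA"}))
```

Each filter is optional, so the same helper answers "which models reach 10 seconds", "which support conversational editing", and "which attach C2PA credentials" without separate functions.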

What You Can Build with Gemini Omni Flash

Conversational Video Editing

Skip the timeline editor entirely — describe the change you want in plain language and Gemini Omni Flash applies it directly. Shift camera angles, swap objects, change backgrounds, or rewrite an entire action with a single prompt.

Template-Driven Social Content

Pick a built-in template, drop in your prompt, and get a fully composed 10-second clip with synchronized audio — designed for YouTube Shorts, Reels, and TikTok formats with no production experience required.

Dialogue Scene Creation

Generate realistic conversation scenes with accurate lip-sync and ambient audio in a single pass — ideal for marketing scripts, educational content, or short-film dialogue.

Reference-Stacked Generation

Combine a character image, an audio file, and a style reference in a single prompt to generate consistent characters that match a specific look, voice, and aesthetic across clips.

Scene Storyboarding

Rapidly visualize script beats as short clips with native audio. Use multi-turn chat editing to adjust framing, swap objects, or rewrite actions across shots without re-generating from scratch.

Brand Video Production

Use templates for fast branded video creation, then refine with conversational editing — swap product shots, change backgrounds, or adjust the visual tone to match your brand.

Explore Related AI Video Generators

Veo 3.1

Google DeepMind's 1080p video model with frames-to-video and native audio generation.

Sora 2

OpenAI's cinematic video generator with physics-accurate motion and 20s duration.

Grok Imagine

xAI's Aurora-engine video model with Fun/Normal/Spicy style modes and native audio.

Happy Horse 1.0

Alibaba's #1-ranked video model with cinematic motion quality and 7-language lip-sync.

Seedance 2.0

ByteDance's video model with web search integration and synchronized audio.

Kling 3.0

Director-grade 4K video with multi-shot AI cinematics and native audio.

Frequently Asked Questions About Gemini Omni Flash

What is Gemini Omni Flash?

Gemini Omni Flash is Google DeepMind's new unified video generation model, announced and launched at Google I/O 2026 on May 19, 2026. It is the first shipping model in the Gemini Omni family — built on a single transformer-based omni-architecture that natively handles text, image, audio, and video inputs and produces high-resolution video with synchronized audio in a single pass. Headline features include conversational multi-turn editing, improved physics understanding, and reference stacking.

How is Gemini Omni Flash different from Veo 3.1?

Veo 3.1 is a dedicated video diffusion model focused purely on text- and image-to-video. Gemini Omni Flash is built on a unified transformer-based omni-architecture (one model handling text, image, audio, and video in a single pass, similar in concept to GPT-4o), and it ties video generation to Gemini's reasoning. That unlocks conversational multi-turn editing, reference stacking, and template-driven creation that Veo 3.1 does not offer. Veo 3.1 clips start at 8 seconds, shorter than Omni Flash's 10, but can be extended, and Veo 3.1 accepts up to three reference images for richer multi-image input control.

What is conversational editing in Gemini Omni Flash?

Once you have a clip, you describe changes in plain language — "shift the camera angle to the left," "make the sculpture out of bubbles," "swap the red cup for a coffee mug," or "rewrite this scene so the character is outside" — and Gemini Omni Flash reworks the targeted element while keeping the rest intact. Multi-turn edits build on prior context so you can iterate without restarting. Editing audio on existing videos is deliberately withheld at launch.

Does Gemini Omni Flash generate synchronized audio?

Yes. Gemini Omni Flash produces native synchronized audio (dialogue with lip-sync, sound effects timed to on-screen action, and ambient background sound) in a single forward pass alongside the video, with no separate TTS or Foley stage. All generated output is automatically tagged with a SynthID watermark and C2PA Content Credentials.

When will Gemini Omni Flash be available on LoveGen AI?

Gemini Omni Flash launched on May 19, 2026 inside the Gemini app, Google Flow, YouTube Shorts Remix, and the YouTube Create app. Public developer and enterprise API access via Vertex AI is rolling out in the weeks following Google I/O 2026. LoveGen AI will integrate Gemini Omni Flash as soon as that API becomes publicly available.

What video templates does Gemini Omni Flash include?

Gemini Omni Flash ships with template-based video creation, applied with a single click inside the Gemini app and Google Flow. Templates handle composition, pacing, and audio for quick generation, and a custom AI avatar creation flow is also available. The current template catalog lives inside the Gemini app and Flow product surfaces.