BipBiz


Google’s new anything-to-anything AI model is wild

May 26, 2026  Twila Rosenbaum

Google's Omni AI Model: From Stuffed Animals to Deepfakes

Google has unveiled its latest generative AI model, Omni, which it bills as an anything-to-anything system capable of transforming any type of input—photos, videos, text—into any desired output. Currently, Omni Flash is available in Google's Flow platform for video generation and editing, replacing the older Veo model. This new model promises improved consistency, better real-world knowledge, and the ability to maintain character identity across clips. But just how well does it perform? A hands-on test involved creating videos of a stuffed deer named Buddy and even deepfaking the tester herself.

The results are a mixed bag. Some clips were notably more consistent and better aligned with prompts than those from earlier versions of Veo. For example, when prompted to show Buddy packing for a cruise and then using a jar of honey as sunscreen, the model created a coherent narrative. However, the honey container constantly changed shape and color throughout the scene, revealing the underlying instability of AI-generated imagery. Similarly, when asked to remove antlers that had appeared on Buddy (a baby deer without antlers), Omni complied for that specific frame but then added antlers to all subsequent clips, a classic example of AI hallucination.

Despite these quirks, Omni's ability to edit existing videos is a significant step forward. Users can upload a video and provide text prompts for edits, and Omni generally incorporates those changes more faithfully than its predecessor. Yet the quality remains inconsistent. Facial expressions and subtle movements often appear unnatural, and occasional 'jump scares' occur when objects or characters suddenly change orientation. The model also struggles with long sequences, as evidenced by a final frame that seemed to randomly combine elements from the preceding clip.

Deepfakes: Disturbingly Realistic

The most startling feature of Omni is its capacity to generate deepfakes using a single selfie video. The tester uploaded a short video of herself with a neutral expression and prompted Omni to create clips of her eating spaghetti, sitting on an airplane, and posing in front of the Eiffel Tower with a baguette. The results were convincing enough to fool her husband, who has seen her daily for years. He noticed only an unfamiliar bowl as a clue; the act of eating pasta appeared authentic. The Eiffel Tower clips had a slightly cartoonish quality in some versions, but one was so realistic that it would likely pass casual inspection on social media.

Such realism comes with troubling implications. While there are still minor AI tells—like a fork that clinks too perfectly or a background character appearing twice—these flaws are becoming harder to spot. The technology lowers the barrier to creating convincing fake videos, raising concerns about misinformation, identity theft, and privacy. The tester herself felt unsettled watching the deepfakes, noting that she knew it wasn't her but doubted others would.

Cost and Accessibility

Omni is not free. Generating videos consumes credits, with prices ranging from 15 to 40 credits per clip depending on length and complexity. Editing runs at 40 credits. Google's $20-per-month AI Pro plan provides 1,000 credits monthly, which the tester exhausted after roughly 20 clips and a few edits. For casual users, this might be manageable, but anyone seeking precise control over a final video could face significant costs from repeated iterations.
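To make the credit math concrete, here is a minimal Python sketch that works only from the figures quoted above ($20 per month, 1,000 credits, 15 to 40 credits per clip, 40 credits per edit). The function names are hypothetical helpers for illustration, not part of any Google API, and the choice of five edits to stand in for "a few" is an assumption.

```python
# Figures quoted in the article; helper names below are hypothetical.
PLAN_PRICE_USD = 20.0        # AI Pro plan, per month
PLAN_CREDITS = 1000          # credits included per month
CLIP_COST_RANGE = (15, 40)   # credits per generated clip
EDIT_COST = 40               # credits per edit


def usd_per_credit() -> float:
    """Effective dollar price of one credit on the monthly plan."""
    return PLAN_PRICE_USD / PLAN_CREDITS


def clips_per_month(cost_per_clip: int, edits: int = 0) -> int:
    """Whole clips that fit in the monthly allowance after paying for edits."""
    remaining = PLAN_CREDITS - edits * EDIT_COST
    return max(remaining, 0) // cost_per_clip


def cost_of_iterations(attempts: int, cost_per_clip: int) -> float:
    """Dollar cost of regenerating a clip a given number of times."""
    return attempts * cost_per_clip * usd_per_credit()


# Assumption: "a few edits" is modeled as 5 edits at 40 credits each.
clips_at_top_rate = clips_per_month(40, edits=5)   # 20 clips
clips_at_low_rate = clips_per_month(15)            # 66 clips, no edits
ten_retries_usd = cost_of_iterations(10, 40)       # about $8
```

Under these assumptions, 20 clips at the 40-credit rate plus five edits uses the full 1,000 credits, which is consistent with the tester's experience. Ten retries of a single 40-credit clip would consume about $8 of the $20 allowance, which illustrates why repeated iteration adds up quickly.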

Google positions Omni as a tool for creative expression, but the persistent errors and unexpected results mean that generating a professional-looking video often requires many attempts. The tester found that editing existing clips was still buggy, so starting fresh with a new prompt was often easier. Despite these imperfections, the model's ability to produce realistic footage with minimal effort is unprecedented.

The broader context of AI video generation includes other players like OpenAI's Sora, Runway, and Pika. Omni's main selling point is its multimodal input, which lets users guide generation with videos and text simultaneously. Google also claims superior character consistency, but in practice this varies widely. The technology is evolving rapidly, and Omni may soon be supplanted by even more capable models. For now, it represents a landmark step toward the 'anything-to-anything' vision, though the singularity remains distant.

From a technical standpoint, Omni likely uses a diffusion-based architecture similar to other video generation models, trained on vast datasets to learn real-world physics and relationships. Maintaining object identity across frames is a common challenge for such models, and the honey container's shifting shape and color shows that temporal coherence is still weak. Deepfake generation, on the other hand, benefits from the model's ability to map facial features onto different body poses and backgrounds, leveraging learned representations from millions of human images.

The ethical considerations are paramount. With such easy-to-use deepfake capabilities, the line between real and generated content blurs further. Google has implemented safeguards, but determined users can still create deceptive videos. The tester noted that while she used the tool for harmless fun, the potential for abuse is clear. She refrained from sharing the deepfake videos widely, conscious of how they could be misused.

In terms of user experience, Flow is a polished platform that integrates with Google's ecosystem. Creating a video involves uploading a base image or video, typing a prompt, and waiting 30–60 seconds for generation. The interface is intuitive, but the credit system may deter extensive experimentation. For professionals, the cost might be acceptable, but for hobbyists, the limitations could be frustrating.

Compared with its predecessor Veo, Omni's improvements are real but incremental. Five months ago, the tester found Veo nearly unusable for video editing; now Omni can actually execute some prompts correctly. However, overall reliability remains below what Google's marketing suggests. The model still struggles with long-range consistency, physics, and subtle human nuances like blinking or hand movements.

Looking ahead, Omni is likely a stepping stone to more robust models. Google's research division continues to publish on video generation, and the integration with other AI services (like Gemini) suggests a future where users can seamlessly combine text, image, audio, and video generation. The 'anything-to-anything' promise may eventually be fulfilled, but current limitations show that we are still in the early stages.

For now, Omni offers a glimpse into a world where anyone can create synthetic videos with minimal effort. Whether for entertainment, education, or deception, the tool is remarkable and unsettling in equal measure. The tester's final reflection sums it up: we are deep in the uncanny valley, and the view from there is disorienting.


Source: The Verge News

