Google Cloud has integrated its newest AI models, Veo and Imagen 3, into Vertex AI, its platform for AI application development. This development makes Google the first major cloud provider to offer a video generation model to its customers, enhancing its multimodal AI offerings and providing powerful tools for marketing, advertising, and other creative industries. While Imagen 3 will become generally available to all Vertex AI users next week, Veo is currently in private preview.
Veo, first introduced at Google’s I/O developer conference, is Google DeepMind’s advanced video generation model. It transforms text or image prompts into cinematic, high-definition videos exceeding 60 seconds in length, maintaining frame-level consistency for seamless subject movement within shots. This positions Veo as a competitor to models like Runway’s Gen-3 and OpenAI’s Sora.
Imagen 3, also developed by DeepMind, is a text-to-image generation model that produces photorealistic visuals in various styles. Google claims it surpasses previous iterations in detail, lighting accuracy, and reduction of artifacts. Beyond generating images, Imagen 3 offers editing features such as image upscaling, inpainting, outpainting, and background replacement—all guided by text prompts. Users can also provide reference images to create content aligned with specific brand aesthetics, logos, or product features.
By adding Veo and Imagen 3 to Vertex AI, Google Cloud provides organizations with comprehensive tools to innovate in areas like marketing and sales. Imagen 3 simplifies the creation of high-quality assets like product images and social media content. Veo extends these capabilities by enabling teams to convert visuals into polished videos, accelerating production, reducing costs, and speeding up prototyping.
Warren Barkley, senior director of product management at Google, wrote in a blog post that customers such as Agoda are using AI models including Veo, Gemini, and Imagen to streamline video ad production and considerably reduce production time. He also highlighted that both models incorporate safety features such as digital watermarking and content moderation to mitigate risks associated with generative AI.
Early adopters of these models include companies like Mondelez International—owner of brands such as Oreo, Cadbury, and Milka—and global marketing and communications firm WPP. As these foundation models expand their reach, businesses across various industries have new opportunities to reimagine how they create and deliver visual content.
Competition in the AI space continues to intensify. Shortly after Google’s announcement, Amazon Web Services (AWS) revealed its own video generation model, Nova Reel, at its re:Invent conference. Nova Reel generates six-second, studio-quality videos from text and image prompts and will be available through Amazon Bedrock, AWS’s fully managed service for generative AI applications.
Meanwhile, Microsoft appears to be trailing in this specific area, as Azure AI Foundry does not currently include models for video generation. However, this may change with the anticipated release of OpenAI’s Sora.