Google’s updates in Vertex AI, Imagen 2 support GenAI apps
Google on Tuesday unveiled several updates to its Vertex AI platform and an updated version of its text-to-image model, Imagen 2.
At its Google Next ’24 conference in Las Vegas, the cloud provider revealed that Gemini 1.5 Pro, its new LLM, is available in public preview on Google’s enterprise AI platform Vertex AI.
Imagen 2, the new version of its image generating model, can now also create four-second live images from text prompts and has new image editing capabilities.
Vertex AI also now has new grounding capabilities, including the ability to ground responses with Google Search, as well as new prompt management and evaluation services for large models. Grounding is the extra step of making sure a model’s responses are accurate and based on trusted sources rather than only the data the model was trained on.
With these new updates and advancements, Google continues the pattern it set at the beginning of the year of advancing its GenAI technologies despite growing competition.
With its newest updates in Vertex AI around Gemini 1.5 Pro, unveiled in February, Google is aiming to provide tools for enterprises to build around Gemini, said Rowan Curran, an analyst at Forrester Research.
“The ability to have a model with such a huge context window changes the type of use cases and applications you can get around it,” Curran said, referring to Gemini 1.5 Pro’s 1 million-token context window option.
It’s hard for enterprises to build around a model such as Gemini if they don’t have tools for prompt management or for testing new responses, he added.
“It’s the totality of tooling that they’re advancing and introducing around supporting generative AI specifically,” he said.
The various tools and capabilities support the Gemini family of models and open a new set of possibilities for how enterprises can apply and build with generative AI, he continued.
Updates in Vertex AI
One way Google provides support for this is by including new prompt management and evaluation services for large models such as Gemini 1.5 Pro in Vertex AI.
The new services let users organize, track and modify prompts for machine learning models.
“This advantage streamlines the process of creating, editing and managing prompts,” Futurum Group analyst Paul Nashawaty said.
Moreover, for enterprises looking to build GenAI applications, prompt management and evaluation services will be crucial because of the capacity to evaluate previous prompts and responses, Curran said.
“You need to have that ability to log those queries and responses because there’s no way to regenerate them precisely in the future,” Curran added.
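Curran’s point about logging is easy to picture. The following is a minimal, hypothetical sketch of recording prompts and model responses to an append-only log so they can be evaluated later; the function, file and model names are illustrative and are not part of Vertex AI’s prompt management service.

```python
import json
import time
from pathlib import Path

# Hypothetical append-only log for prompt/response pairs (illustrative only).
LOG_PATH = Path("prompt_log.jsonl")

def log_interaction(model_name: str, prompt: str, response: str) -> None:
    """Append one prompt/response record so it can be reviewed and evaluated later."""
    record = {
        "timestamp": time.time(),
        "model": model_name,
        "prompt": prompt,
        "response": response,
    }
    with LOG_PATH.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

# Example usage: capture a response that could not be regenerated identically later.
log_interaction("gemini-1.5-pro", "Summarize our Q1 sales report.", "Q1 sales rose 12%...")
```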
Meanwhile, with the new grounding capabilities in Vertex AI now in preview, users can ground LLM responses with Google Search or enterprise data sources using retrieval-augmented generation (RAG). RAG improves LLM output by retrieving relevant information from an external source and supplying it to the model along with the prompt.
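As a rough sketch of what grounding with Google Search looks like from the Vertex AI Python SDK, the snippet below attaches a search-grounding tool to a Gemini model; the module path, class names and model ID reflect the preview-era SDK and may differ by version, and the project and region values are placeholders.

```python
import vertexai
from vertexai.preview.generative_models import GenerativeModel, Tool, grounding

# Placeholder project and region.
vertexai.init(project="my-project", location="us-central1")

# Tool that lets the model ground its answer in Google Search results.
search_tool = Tool.from_google_search_retrieval(grounding.GoogleSearchRetrieval())

model = GenerativeModel("gemini-1.5-pro-preview-0409")  # example model ID
response = model.generate_content(
    "What did Google announce at Cloud Next 2024?",
    tools=[search_tool],
)
print(response.text)
```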
The new grounding function promises to reduce LLM hallucinations, Nashawaty said.
“That is the primary advancement,” he said. “By doing so, enterprises can increase their use of LLMs with confidence.”
Grounding helps enterprises ensure what the AI systems comprehend or interact with in the real world is accurate, Gartner analyst Sid Nag said.
“It’s a bridge between abstract AI concepts and practical, tangible outcomes,” he said.
It not only brings real-world accuracy but also adds a layer of human sentiment analysis that helps users avoid errors that could arise from simulated data, he added.
The grounding technology comes as more companies are committing to using Google Search to support the grounding of generative models, Curran said.
“More and more enterprises just want to ground the responses of large language models in their data,” he said.
The popularity of RAG has led to recent developments for Google’s competitors, such as Microsoft, which recently unveiled changes to Azure AI Search that let customers run RAG at any scale.
Enterprises looking to build AI applications in the future will need predictive AI to understand customers’ likelihood to act, generative AI to understand customers’ intent based on natural language or images, and search tools to retrieve the correct information, he said.
“I see the future of this as being a nice trifecta or tripod built upon predictive, generative and search,” he continued.
Other updates in Vertex AI include Gemini 1.5 Pro’s ability to process audio streams, including speech and the audio portions of videos.
Google also revealed that the Anthropic Claude 3 family of models is available on Vertex AI.
Open models such as Llama 2, Mistral 7B and Mixtral 8x7B are also available in Vertex AI.
New Imagen 2 capabilities
Introduced in preview, Imagen 2’s text-to-live image capability lets marketing teams generate GIFs and video loops from text prompts. Imagen 2 also has advanced photo editing capabilities.
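For reference, generating a still image with Imagen 2 through the Vertex AI Python SDK looks roughly like the sketch below; the model version string is an example, the project and region are placeholders, and the new text-to-live-image (GIF) path, which was only in preview, is not shown.

```python
import vertexai
from vertexai.preview.vision_models import ImageGenerationModel

# Placeholder project and region.
vertexai.init(project="my-project", location="us-central1")

# Example Imagen 2 model version; check Model Garden for current versions.
model = ImageGenerationModel.from_pretrained("imagegeneration@006")

# Generate a single still image from a text prompt and save it locally.
images = model.generate_images(
    prompt="A product photo of a red running shoe on a white background",
    number_of_images=1,
)
images[0].save(location="shoe.png")
```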
The Imagen 2 update comes after Google paused the image generation feature for its Gemini conversational app, formerly known as Bard, after the app generated inaccurate images of historical figures.
Benefits of Imagen 2’s live image capabilities include a faster creation process, reduction in human error, and the ability to minimize tedious tasks, Nashawaty said.
Disadvantages include a steep learning curve for users and the cost of implementation, he said. And there are similar models in the market.
“This isn’t any kind of technological breakthrough,” Curran said.
For example, open source tools such as Text2Live, as well as Stability AI’s Stable Diffusion 3, have similar capabilities, Nag noted.
Text-to-live images might also come with privacy problems, he added.
“I don’t know the limits or utilization of text-to-live functionality from a privacy perspective,” Nag said. “If that functionality is restricted to certain types of enterprise-grade workloads, that’s a good thing.”
Google also revealed infrastructure updates and partner news:
- Cloud TPU v5p, the vendor’s next-generation custom accelerator for training GenAI models, is now generally available.
- New A3 Mega virtual machine instances, powered by Nvidia H100 Tensor Core GPUs, are coming to Google Cloud.
- Nvidia Blackwell GPUs are also coming to Google Cloud.
- Nvidia and Google are collaborating to help startups create GenAI applications and services. Members of Nvidia Inception, Nvidia’s startup program, now have a path to use Google Cloud infrastructure.
All the updates from Google show that the GenAI race continues to heat up.
For enterprises, it’s more important to focus on the applications of GenAI than the newest product introductions, Curran said.
“It’s really important to understand how to build things and what are the emerging best practices,” he said.
Esther Ajao is a TechTarget Editorial news writer and podcast host covering artificial intelligence software and systems.