Google on Thursday introduced the next version of its Gemini large language model: Gemini 1.5.
Gemini 1.5 Pro is the first Gemini 1.5 model.
It’s a midsize multimodal AI model that performs at a level similar to Gemini 1.0 Ultra, released in early February, but uses less compute, according to the cloud provider.
Gemini 1.5 Pro comes with a standard context window of 128,000 tokens. The context window determines how much text a large language model (LLM) can process at once. However, developers and enterprise customers can try 1.5 Pro with a context window of up to one million tokens via AI Studio and Google’s Vertex AI platform in a private preview.
A large context window
This is the largest context window on the market so far. It is about eight times larger than that of OpenAI’s GPT-4 and five times larger than that of Anthropic’s Claude 2.1.
The large context window can hold approximately one hour of video, 11 hours of audio, 30,000 lines of code or 750,000 words.
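The size comparison above is simple arithmetic on token limits, and a short Python sketch makes it concrete. The one-million-token and 750,000-word figures come from the article. The 128,000-token limit for GPT-4 Turbo and the 200,000-token limit for Claude 2.1 are assumptions based on the limits those vendors published at the time, not figures stated here.

```python
# Context window sizes in tokens. The Gemini figure comes from the article;
# the GPT-4 Turbo and Claude 2.1 figures are assumptions based on the
# vendors' published limits at the time.
GEMINI_1_5_PRO_PREVIEW = 1_000_000
GPT_4_TURBO = 128_000   # assumption
CLAUDE_2_1 = 200_000    # assumption


def ratio(larger: int, smaller: int) -> float:
    """Return how many times `smaller` fits into `larger`."""
    return larger / smaller


vs_gpt4 = ratio(GEMINI_1_5_PRO_PREVIEW, GPT_4_TURBO)
vs_claude = ratio(GEMINI_1_5_PRO_PREVIEW, CLAUDE_2_1)

# Words-per-token rate implied by the article's 750,000-word figure.
words_per_token = 750_000 / GEMINI_1_5_PRO_PREVIEW

print(f"vs GPT-4 Turbo: {vs_gpt4:.1f}x")
print(f"vs Claude 2.1: {vs_claude:.1f}x")
print(f"words per token: {words_per_token:.2f}")
```

Under those assumptions, the ratios round to the "about eight times" and "five times" cited above.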
Gemini 1.5 Pro can analyze, categorize and summarize large amounts of content in a single prompt. It can also perform sophisticated understanding and reasoning tasks across different modalities, including video.
Google’s Gemini update comes a week after the company renamed its Bard AI chatbot Gemini.
It also comes after a year in which competitor Microsoft appeared to lead the generative AI market, notably thanks to its partnership with OpenAI.
However, with its recent Gemini developments, Google is gaining the upper hand.
“Google is now setting the tone for the future of GenAI,” said Chirag Dekate, an analyst at Gartner. “It’s no longer a question of whether Google will catch up. It’s a question of when the others will catch up to Google.”
The 1.5 Pro context window aims to address one of the biggest limitations of generative AI systems today, said William McKeon-White, an analyst at Forrester Research.
That limitation is the difficulty generative AI systems have in understanding state, the collection of information that indicates where the elements of an AI system stand at any given time.
While retrieval-augmented generation has been used to work around the issue, the limited context window has always proven problematic for LLMs.
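The retrieval workaround described above, commonly called retrieval-augmented generation (RAG), can be illustrated with a toy sketch. It ranks text chunks against a query and packs only the most relevant ones into a fixed token budget. Everything in it is an assumption made for illustration: the function names are hypothetical, relevance is a naive word-overlap count rather than an embedding search, and tokens are approximated by whitespace-separated words. It does not represent how Google or any vendor implements retrieval.

```python
from typing import List


def score(query: str, chunk: str) -> int:
    """Toy relevance score: count of distinct query words found in the chunk."""
    query_words = set(query.lower().split())
    chunk_words = set(chunk.lower().split())
    return len(query_words & chunk_words)


def build_prompt(query: str, chunks: List[str], token_budget: int) -> str:
    """Pack the most relevant chunks into the prompt without exceeding the budget.

    Tokens are approximated by whitespace-separated words (an assumption).
    """
    ranked = sorted(chunks, key=lambda c: score(query, c), reverse=True)
    selected = []
    used = len(query.split())
    for chunk in ranked:
        cost = len(chunk.split())
        if used + cost > token_budget:
            continue  # skip chunks that do not fit in the remaining budget
        selected.append(chunk)
        used += cost
    return "\n".join(selected + [query])


chunks = [
    "The claims policy covers water damage.",
    "Quarterly revenue rose in the finance unit.",
    "Water damage claims must be filed within 30 days.",
]
query = "How are water damage claims filed?"

small_window = build_prompt(query, chunks, token_budget=16)
large_window = build_prompt(query, chunks, token_budget=1_000)
```

With the small budget, only the best-matching chunk fits alongside the query. With the large budget, every chunk fits and nothing has to be filtered out, which is the appeal of a larger context window.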
However, Google’s large context window doesn’t completely eliminate the state challenge, McKeon-White said. AI models still struggle to store information in a way that can be updated over time but is not ephemeral.
The 1.5 Pro context window is also useful because it will bring Gemini’s behavior closer to what end users expect it to be able to do, McKeon-White added.
“It’s able to maintain context, it’s able to maintain previous interactions, relevant responses,” he said. “It is capable of being much more refined and closer to simple passive human perception of context, relevance and understanding.”
Google’s large context window is also important for businesses, because Gemini 1.5’s current one-million-token window is expandable to 10 million tokens for research, and Google might be able to extend that to enterprise versions, said Constellation Research founder R “Ray” Wang.
“A business user can improve personalization at scale and also scale faster,” Wang said. “Google has achieved this faster, better and hopefully cheaper thanks to its efficient transformer and Mixture-of-Experts architecture.”
With the Mixture-of-Experts (MoE) architecture, a model is divided into smaller neural networks. This makes the model more efficient and more relevant to the information it is given.
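The idea of splitting a model into smaller networks and using only the relevant ones can be sketched in a few lines of Python. This is a minimal illustration of top-k expert routing under stated assumptions, not a description of Gemini’s internals: the names `moe_forward`, `experts` and `router_scores` are hypothetical, the "experts" are trivial functions, and the router scores are supplied as fixed numbers, whereas a real MoE layer computes them from the input with a learned gating network.

```python
import math
from typing import Callable, List


def softmax(scores: List[float]) -> List[float]:
    """Convert raw router scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(
    x: float,
    experts: List[Callable[[float], float]],
    router_scores: List[float],
    top_k: int = 1,
) -> float:
    """Run only the top_k highest-scoring experts and blend their outputs."""
    order = sorted(range(len(experts)), key=lambda i: router_scores[i], reverse=True)
    top = order[:top_k]
    weights = softmax([router_scores[i] for i in top])
    return sum(w * experts[i](x) for w, i in zip(weights, top))


# Three toy "experts"; only the selected ones are ever evaluated.
experts = [lambda x: 2 * x, lambda x: x + 10, lambda x: -x]
router_scores = [0.1, 2.0, -1.0]  # assumed fixed; normally learned

one_expert = moe_forward(3.0, experts, router_scores, top_k=1)
two_experts = moe_forward(3.0, experts, router_scores, top_k=2)
```

Because only `top_k` experts run for each input, the compute cost depends on the number of active experts rather than on the total size of the model.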
Beyond innovation
Even though Google’s innovation is impressive and seems hard to beat or match, the cloud provider will still have to prove to enterprise customers how it translates to business use, Dekate said.
“What they need to learn to do effectively is connect the dots on behalf of the client,” he said.
Google will need to show how version 1.5 Pro applies to industries such as insurance, finance and manufacturing.
Microsoft was successful in this area because it quickly made its generative AI technology useful to the enterprise.
“Google needs to make its innovation relevant to the business,” Dekate said. “If they can do that, innovate on behalf of customers and create industry alliances and execution strategies, then they can create a moment of market share change.”
Without it, Google’s innovation with Gemini would be impressive but forgettable, Dekate added.
Google plans to introduce pricing tiers that start at the standard 128,000-token context window and scale up to one million tokens as it improves the model.
Early testers can try the one-million-token context window at no cost.
Esther Ajao is a TechTarget editorial writer covering artificial intelligence software and systems.