Barely two months after releasing Gemini, the large language model Google hopes will propel it to the top of the AI market, the company is revealing its successor. Gemini 1.5 is now available to developers and enterprise users, ahead of a full consumer rollout planned for later. Google is betting heavily on Gemini as a personal assistant, a business tool, and everything in between, and it is making significant progress toward that goal.
Gemini 1.5 brings a number of improvements. The general-purpose Gemini 1.5 Pro, the model behind most of Google's systems, outperforms Gemini 1.0 Pro on 87 percent of benchmark tests and roughly matches the company's recently released high-end Gemini Ultra. It was built using an increasingly popular technique called "Mixture of Experts," or MoE, in which the model runs only the relevant portion of itself in response to a query rather than processing the whole thing every time. (Here's a helpful explainer on the topic.) The approach should make the model faster for users and more efficient for Google to run.
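The routing idea behind MoE can be sketched in a few lines. The snippet below is a toy illustration, not Google's implementation: in a real MoE model the gate is a learned neural network and the experts are sub-networks inside the transformer. The expert names and keyword scoring here are invented for the example; only the control flow, score every expert but run just the top-k, reflects the actual technique.

```python
def expert(name):
    # Hypothetical expert: just tags the input so we can see which one ran.
    return lambda query: f"{name} handled: {query}"

EXPERTS = {
    "code": expert("code"),
    "math": expert("math"),
    "prose": expert("prose"),
    "vision": expert("vision"),
}

def gate_scores(query):
    # Stand-in for a learned gating network: score experts by crude
    # keyword substring matches instead of a neural router.
    keywords = {
        "code": ["function", "bug", "compile"],
        "math": ["sum", "integral", "prove"],
        "prose": ["essay", "story", "summarize"],
        "vision": ["image", "photo", "frame"],
    }
    return {
        name: sum(word in query.lower() for word in words)
        for name, words in keywords.items()
    }

def route(query, k=1):
    # Run only the top-k scoring experts; the others are skipped entirely,
    # which is where MoE gets its efficiency.
    scores = gate_scores(query)
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    return [EXPERTS[name](query) for name in top]

print(route("summarize this essay"))
```

Because only k of the experts execute per query, the model can grow much larger overall without a proportional increase in per-query compute, which is exactly the trade-off that makes it faster for users and cheaper for Google.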
But one new feature in Gemini 1.5 has the whole company buzzing, starting with CEO Sundar Pichai: thanks to its huge context window, Gemini 1.5 can handle much larger queries and consider far more information at once. That window is a massive 1 million tokens, compared with 128,000 for OpenAI's GPT-4 and 32,000 for the current Gemini Pro. Tokens are a tricky metric to grasp (here's a nice breakdown), so Pichai puts it more simply: 1 million tokens is roughly 10 or 11 hours of video, or tens of thousands of lines of code. The context window means you can ask the AI bot about all of that content at once.
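To put those token counts in perspective, a common rule of thumb, and it is only a heuristic, not an official figure from Google or OpenAI, is that one English token works out to roughly three-quarters of a word. A quick back-of-envelope sketch under that assumption:

```python
# Rough capacity of each context window in English words, using the
# common ~0.75 words-per-token heuristic. Ballpark figures only.

WORDS_PER_TOKEN = 0.75

def approx_words(context_window_tokens):
    return int(context_window_tokens * WORDS_PER_TOKEN)

windows = {
    "Gemini 1.0 Pro": 32_000,
    "GPT-4": 128_000,
    "Gemini 1.5 Pro": 1_000_000,
}

for model, tokens in windows.items():
    print(f"{model}: {tokens:,} tokens ~ {approx_words(tokens):,} words")
```

At a million tokens, that is on the order of 750,000 words in a single query, which is several long novels' worth of text.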
According to Pichai, Google researchers are already exploring a 10-million-token context window, enough to take in the entire run of Game of Thrones at once.
Pichai casually mentions that the whole Lord of the Rings trilogy fits inside that context window. Has that already happened? It seems too specific a detail, as if somebody at Google is checking Gemini for continuity errors, trying to untangle Middle-earth's convoluted genealogies, maybe even seeing whether AI can finally make sense of Tom Bombadil. "I'm sure it has happened, or will happen," Pichai says with a laugh. "One of the two."
Pichai says businesses will find the expanded context window especially useful. There are use cases, he says, where you can add a lot of personal context and information at the moment of the query. "Consider it as we have significantly increased the query window." He imagines firms using Gemini to comb through mountains of financial records, or filmmakers uploading entire films and asking Gemini what reviewers might say. "I think it's one of our bigger breakthroughs," he says.
For now, Gemini 1.5 will be accessible only to developers and business users through Google's AI Studio and Vertex AI. Eventually, it will replace Gemini 1.0: the standard version of Gemini Pro, the one available to everyone through the company's apps and at gemini.google.com, will become 1.5 Pro with a 128,000-token context window. Getting the full million will cost extra. Google is also still testing the model's ethics and safety, particularly given the newly enlarged context window.
Google is racing to build the best AI tool on the market, just as companies everywhere are trying to settle on their own AI strategy and decide whether to sign developer agreements with Google, OpenAI, or someone else. Just this week, OpenAI announced "memory" for ChatGPT, and it appears to be gearing up a push into web search. There is still much work to be done on all fronts, but so far Gemini looks impressive, particularly for those already inside Google's ecosystem.
Pichai maintains that, in the long run, customers won't care much about these 1.0s and 1.5s, Pros and Ultras, and corporate battles. "People will just be consuming the experiences," he says. "It's similar to using a smartphone without constantly focusing on the processor underneath." But right now, he adds, we're still in the phase where everyone knows the chip in their phone, because it matters. "The underlying technology is shifting so fast," he says. "People genuinely care."