Until recently, you may not have heard of Google Gemini, though Google's Bard chatbot might have crossed your radar. Now a rebrand has put the spotlight on Gemini, and you're certain to hear about it more and more over the coming years.
The Gemini 1.0 model, which was initially previewed at Google I/O in May, is more powerful than the existing technology and potentially better equipped to compete with OpenAI's ChatGPT, the technology used by Microsoft. Google's own tests say it's more powerful.
That’s some of the key context, but let’s get into the basics.
What is Google Gemini?
On February 8th 2024, Google announced that Bard would now become Gemini, with the chatbot adopting the name of the AI model that powers it. As such, Google Gemini now refers to both the model and the public-facing chatbot.
Google says Gemini is its “largest and most capable AI model” and it’ll be responsible for powering everything from Bard to the Google Pixel range of smartphones.
The company says the key to Gemini is its "multimodal" design. That means it can "generalise and seamlessly understand, operate across and combine different types of information including text, code, audio, image and video." Previous iterations achieved more limited capabilities by stitching separate models together; Gemini is natively multimodal.
Three different sizes for Gemini
Google says Gemini 1.0 is the first version of the model, as the numbering convention would suggest. There are tiers built for different purposes. All of them benefit from the multimodal design and their purposes are detailed below.
Nano, for example, is built for on-device AI and will soon be available on the Pixel 8 Pro, while Ultra is geared towards extreme use cases like data centres. Pro is the happy middle ground, and it is the model powering Bard from today.
- Gemini Ultra — our largest and most capable model for highly complex tasks.
- Gemini Pro — our best model for scaling across a wide range of tasks.
- Gemini Nano — our most efficient model for on-device tasks.
Gemini Pro has been available within Google Bard since December and remains the default for standard Google Gemini use, while Gemini Nano launched on the Pixel 8 Pro as part of the December Feature Drop.
Google adds: “We’re also bringing Gemini to Pixel. Pixel 8 Pro is the first smartphone engineered to run Gemini Nano, which is powering new features like Summarize in the Recorder app and rolling out in Smart Reply in Gboard, starting with WhatsApp — with more messaging apps coming next year.”
On February 8th, Google launched Gemini Advanced, giving users access to the Ultra 1.0 model. Google says this is its “largest and most capable state-of-the-art AI model.” Advanced is aimed at tackling more complex tasks like coding, logical reasoning, following nuanced instructions and collaborating on creative projects. Ultra 1.0 also better understands context, drawing on previous conversations. Gemini Advanced is available in English across 150 countries and territories, with more languages to follow.
You can get access to Gemini Advanced by signing up for the Google One AI Premium plan, which costs £18.99/month ($19.99/month). Along with access to Gemini Advanced, you get 2TB of Google Drive storage and “access to other Google One benefits”.
Gemini and Gemini Advanced are rolling out on Android via the Google Assistant, letting you choose between the AI models and the standard Assistant. For iOS, Gemini is rolling out within the Google app.
Google says Gemini’s performance has been rigorously tested on tasks like natural image, audio and video understanding, and mathematical reasoning. Gemini Ultra beats incumbent models on 30 of 32 academic benchmarks for large language models.
Now, Google says, Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding), a benchmark spanning 57 subjects. Gemini scored 90.0% in those tests, while OpenAI’s GPT-4 scored 86.4%. That’s a key takeaway.
How and why the next-generation model was built
Google says it has approached the training of Gemini differently to previous multimodal efforts. Previously, separate models were trained for each modality and stitched together afterwards. That meant the result was good at describing images, for instance, but lacked the ability to carry out competent complex reasoning.
Google says Gemini was pre-trained from the start to be natively multimodal and that results in a massive upgrade.
“Then we fine-tuned it with additional multimodal data to further refine its effectiveness,” the company says in the blog post. “This helps Gemini seamlessly understand and reason about all kinds of inputs from the ground up, far better than existing multimodal models — and its capabilities are state of the art in nearly every domain.”