Until recently, you may not have heard of Google Gemini, though Google's Bard chatbot might have crossed your radar. Now a rebrand has put the spotlight on Gemini, and you're certain to hear about it more and more over the coming years.
The Gemini 1.0 model, which was initially previewed at Google I/O in May, is more powerful than the existing technology and potentially better equipped to compete with OpenAI's ChatGPT, the technology used by Microsoft. Google's own tests say it's more powerful.
That’s some of the key context, but let’s get into the basics.
What is Google Gemini?
On February 8th 2024, Google announced that Bard would now become Gemini, with the chatbot adopting the name of the AI model that powers it. As such, Google Gemini now refers to both the model and the public-facing chatbot.
Google says Gemini is its “largest and most capable AI model” and it’ll be responsible for powering everything from Bard to the Google Pixel range of smartphones.
The company says the key to Gemini is its "multimodal" design. That means it can "generalise and seamlessly understand, operate across and combine different types of information including text, code, audio, image and video." Previous iterations achieved more limited capabilities by stitching separate models together; Gemini is natively multimodal.
Three different sizes for Gemini
Google says Gemini 1.0 is the first version of the model, as the numbering convention would suggest. There are tiers built for different purposes. All of them benefit from the multimodal design and their purposes are detailed below.
Nano, for example, is built for on-device AI and will soon be available on the Pixel 8 Pro, while Ultra is geared towards extreme use cases like data centres. Pro is the happy middle ground, and it is the model powering Bard from today.
- Gemini Ultra — our largest and most capable model for highly complex tasks.
- Gemini Pro — our best model for scaling across a wide range of tasks.
- Gemini Nano — our most efficient model for on-device tasks.
Gemini Pro has been available within Google Bard since December and remains the default for standard Google Gemini use, while Gemini Nano launched on the Pixel 8 Pro as part of the December Feature Drop.
Google adds: “We’re also bringing Gemini to Pixel. Pixel 8 Pro is the first smartphone engineered to run Gemini Nano, which is powering new features like Summarize in the Recorder app and rolling out in Smart Reply in Gboard, starting with WhatsApp — with more messaging apps coming next year.”
On February 8th, Google launched Gemini Advanced, giving users access to the Ultra 1.0 model. Google says this is its “largest and most capable state-of-the-art AI model.” Advanced is aimed at tackling more complex tasks like coding, logical reasoning, following nuanced instructions and collaborating on creative projects. Ultra 1.0 also better understands context, drawing on previous conversations. Gemini Advanced is available in English across 150 countries and territories, with more languages to follow.
You can get access to Gemini Advanced by signing up for the Google One AI Premium plan, which costs £18.99/month ($19.99/month). Along with access to Gemini Advanced, you get 2TB of Google Drive storage and “access to other Google One benefits”.
Gemini and Gemini Advanced are rolling out on Android via the Google Assistant, letting you choose between the AI models and the standard Assistant. For iOS, Gemini is rolling out within the Google app.
Google says Gemini’s performance has been rigorously tested on tasks like natural image, audio and video understanding, and mathematical reasoning. Gemini Ultra beats incumbent models on 30 of 32 academic benchmarks for large language models.
Now, Google says, Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding), a benchmark spanning 57 subjects. Gemini scored 90.0% in those tests, while OpenAI’s GPT-4 scored 86.4%. That’s a key takeaway.
How and why the next-generation model was built
Google says it has approached the training of Gemini differently to previous multimodal efforts. Previously, separate models were trained for each modality and stitched together afterwards. That meant the result was good at describing images, for instance, but lacked the ability to carry out competent complex reasoning.
Google says Gemini was pre-trained from the start to be natively multimodal and that results in a massive upgrade.
“Then we fine-tuned it with additional multimodal data to further refine its effectiveness,” the company says in the blog post. “This helps Gemini seamlessly understand and reason about all kinds of inputs from the ground up, far better than existing multimodal models — and its capabilities are state of the art in nearly every domain.”