Google Gemini (formerly Bard): What It Is And How To Use It?

Google Bard has been renamed Google Gemini. Learn what Google Gemini is, what has changed, and how to use it.

Google Gemini Presentation from Google DeepMind Website

What is Gemini (formerly Google Bard)? Google Gemini is a generative AI conversational model, similar to the models behind OpenAI's ChatGPT, that is used for chatbots. Gemini belongs to a family of next-generation generative AI models developed by Google DeepMind, Google's AI research lab. Google Gemini comes in the following versions:

Gemini 1.5. Announced on February 15, 2024 in a statement from Google CEO Sundar Pichai and Google DeepMind CEO Demis Hassabis, Gemini 1.5 is the latest next-generation model, and Google presents it as delivering dramatically enhanced performance. For early testing, Gemini 1.5 is available as a mid-size multimodal model called Gemini 1.5 Pro, which performs at a similar level to Gemini 1.0 Ultra, the most capable model before Gemini 1.5. According to the announcement, Gemini 1.5 Pro can process up to 1 million tokens of input, which corresponds to 1 hour of video, 11 hours of audio, more than 700,000 words, or more than 30,000 lines of code. Gemini 1.5 Pro is already available to a limited group of developers and enterprise customers via AI Studio and Vertex AI in private preview.

Gemini 1.0 Ultra. It was Google DeepMind's largest and most capable generative AI model for highly complex tasks before Gemini 1.5 was released.
Gemini 1.0 Pro. It's the best model for scaling across a wide range of tasks.
Gemini 1.0 Nano. It's the most efficient model for on-device tasks.
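
The Gemini 1.5 Pro figures quoted above (1 million tokens, more than 700,000 words) imply a rough ratio of about 0.7 words per token. The sketch below uses that ratio to estimate whether a piece of text would fit in the context window. It is only a back-of-the-envelope heuristic: the function names are hypothetical, counting words by splitting on whitespace is an assumption, and real token counts depend on the model's tokenizer.

```python
# Figures taken from the article: 1M tokens is quoted as "more than 700k words".
CONTEXT_TOKENS = 1_000_000
WORDS_QUOTED = 700_000


def estimate_tokens(text: str) -> int:
    """Estimate the token count from a whitespace word count (rough heuristic)."""
    words = len(text.split())
    # Integer ceiling division of words / (WORDS_QUOTED / CONTEXT_TOKENS),
    # written with integers to avoid floating-point rounding surprises.
    return -(-words * CONTEXT_TOKENS // WORDS_QUOTED)


def fits_in_context(text: str, limit: int = CONTEXT_TOKENS) -> bool:
    """Return True if the estimated token count is within the context limit."""
    return estimate_tokens(text) <= limit


if __name__ == "__main__":
    sample = "one two three four five six seven"
    print(estimate_tokens(sample), fits_in_context(sample))
```

Because the announcement says "more than" 700,000 words, this estimate is slightly conservative: it may report that a text does not fit when it actually would.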

What Is Google Bard?

Google Bard was a conversational artificial intelligence (AI) service that was initially powered by LaMDA. It was introduced on February 6, 2023 in a statement from Google CEO Sundar Pichai as an experimental conversational AI service. Google Bard was later rebranded as Google Gemini in February 2024.

How Does Gemini Work?

Gemini 1.5 Pro is built on a sparse mixture-of-experts (MoE) Transformer-based architecture. This architecture allows the model to scale efficiently by directing each input to a subset of the model's parameters for processing, which is known as conditional computation. This approach significantly enhances the model's ability to handle long contexts, up to at least 10 million tokens, without a proportional increase in computational requirements. According to Google DeepMind's research paper, the model benefits from advances in training and serving infrastructure, allowing it to push the boundaries of efficiency and long-context performance. It is trained on Google's TPUv4 accelerators, distributed across multiple datacenters, using a variety of multimodal and multilingual data. This data includes web documents and code, as well as image, audio, and video content.
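
To make the idea of conditional computation more concrete, here is a minimal sketch of top-k expert routing in plain Python. It is a toy illustration of the general sparse MoE technique, not Gemini's actual implementation: the class name, the layer sizes, the random weights, and the choice of softmax over the selected experts are all assumptions made for this example.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, vec)) for row in matrix]


class SparseMoELayer:
    """Toy sparse mixture-of-experts layer (hypothetical, for illustration only)."""

    def __init__(self, dim, num_experts, top_k, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # Router: one scoring vector per expert.
        self.router = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(num_experts)]
        # Each expert is a small dim x dim linear map.
        self.experts = [[[rng.gauss(0, 0.1) for _ in range(dim)]
                         for _ in range(dim)]
                        for _ in range(num_experts)]

    def route(self, token):
        """Pick the top_k experts for this token and their mixing weights."""
        scores = matvec(self.router, token)
        top = sorted(range(len(scores)), key=lambda i: scores[i],
                     reverse=True)[:self.top_k]
        weights = softmax([scores[i] for i in top])
        return list(zip(top, weights))

    def forward(self, token):
        """Run only the selected experts and combine their outputs."""
        out = [0.0] * len(token)
        for idx, weight in self.route(token):
            expert_out = matvec(self.experts[idx], token)
            out = [o + weight * e for o, e in zip(out, expert_out)]
        return out


if __name__ == "__main__":
    layer = SparseMoELayer(dim=4, num_experts=8, top_k=2)
    token = [1.0, 0.5, -0.5, 2.0]
    print(layer.route(token))
    print(layer.forward(token))
```

In this sketch only 2 of the 8 experts run for any given token, so the total parameter count can grow with the number of experts while the work done per token stays roughly constant.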

Google Bard Integration Across Google Workspace, image from Google

What Are The Capabilities Of Gemini?

Gemini 1.5 Pro demonstrates exceptional capabilities in processing and understanding long-form mixed-modality inputs. It can comfortably process entire collections of documents, multiple hours of video, and almost a day's worth of audio recordings. The model achieves this by extending the context length it can handle by over an order of magnitude compared to existing models, maintaining high performance even as the context window increases. Some of the performance benchmarks described in the report by Google DeepMind are the following:

Near-Perfect Recall: Gemini 1.5 Pro achieves near-perfect recall (>99%) on long-context retrieval tasks across different modalities. This level of performance is a significant leap over existing models, such as Claude 2.1 and GPT-4 Turbo.
State-of-the-Art in QA and ASR: The model sets new standards in long-document question answering (QA), long-video QA, and long-context automatic speech recognition (ASR), outperforming all competing models across these tasks.
Surprising New Capabilities: Gemini 1.5 Pro demonstrates the ability to learn new languages from minimal instruction, learning to translate English to Kalamang, a language with fewer than 200 speakers, based solely on a grammar manual provided in its context.

What Can Gemini Do?

Gemini understands text, images, video, audio and more at the same time, and it can clearly explain complex subjects such as math and physics. Here are two key features of Google Gemini:

Advanced Coding: Gemini excels in several industry-standard benchmarks for coding tasks, including HumanEval and Natural2Code. Gemini can explain and generate high-quality code in the world's most popular languages, such as Python, Java, C++ and Go.
Reasoning: Gemini can help make sense of complex written and visual information. It can extract insights from hundreds of thousands of documents by reading, filtering and understanding information, uncovering key details that can be difficult to discern amid vast amounts of data.

Gemini vs. GPT-4

Google launched Gemini as a direct competitor to OpenAI's ChatGPT. The table below shows the scores achieved by Google Gemini and OpenAI's GPT-4 on a range of benchmarks, including text and coding:

Benchmark	Google Gemini	OpenAI GPT-4 / GPT-4V
General	90%	87.29%
Reading comprehension	82.4%	80.9%
Reasoning Commonsense	87.8%	95.3%
Math Problems	53.2%	52.9%
Python Code Generation	74.4%	67.0%
Natural Image Understanding	77.8%	77.2%
Document Understanding	90.9%	88.4%
Infographic Understanding	80.3%	75.1%
English Video Captioning	62.7%	56.0%
Video Question answering	54.7%	46.3%
Automatic speech translation	40.1%	29.1%
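
To see at a glance where each model leads, the margins can be computed directly from the table. The short sketch below copies the scores as listed above and subtracts the GPT-4 score from the Gemini score for each row. The dictionary layout and function name are illustrative choices, and the margins are simple point differences between the listed figures, not a statement about statistical significance.

```python
# Scores copied from the table above: (Google Gemini, OpenAI GPT-4 / GPT-4V).
SCORES = {
    "General": (90.0, 87.29),
    "Reading comprehension": (82.4, 80.9),
    "Reasoning Commonsense": (87.8, 95.3),
    "Math Problems": (53.2, 52.9),
    "Python Code Generation": (74.4, 67.0),
    "Natural Image Understanding": (77.8, 77.2),
    "Document Understanding": (90.9, 88.4),
    "Infographic Understanding": (80.3, 75.1),
    "English Video Captioning": (62.7, 56.0),
    "Video Question answering": (54.7, 46.3),
    "Automatic speech translation": (40.1, 29.1),
}


def margins(scores):
    """Return Gemini minus GPT-4 for each benchmark, rounded to 2 decimals."""
    return {name: round(gemini - gpt4, 2)
            for name, (gemini, gpt4) in scores.items()}


if __name__ == "__main__":
    diffs = margins(SCORES)
    # Print benchmarks from the largest Gemini lead to the largest GPT-4 lead.
    for name, diff in sorted(diffs.items(), key=lambda kv: kv[1], reverse=True):
        leader = "Gemini" if diff > 0 else "GPT-4"
        print(f"{name:32s} {diff:+6.2f}  ({leader} leads)")
```

On the figures listed in the table, Gemini leads on every row except commonsense reasoning, where GPT-4 is ahead.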

How To Use Gemini?

Users can start using Gemini Ultra, also called Gemini Advanced, by subscribing to the new Google One AI Premium Plan for $19.99/month. The plan offers Gemini in Gmail, Docs, Slides, Sheets and Meet, is available in English in more than 150 countries, and includes 2TB of storage and other Google One benefits. Users who are not yet Google One AI Premium members can start a two-month trial at no cost. Users can also access Gemini 1.0 Pro for free and start writing, planning, learning and more.
