Google Gemini AI: Video shows how the AI model can see, talk, reason and play just like humans

To demonstrate its multi-modal prowess, Google released a demo showcasing how Gemini can not only understand direct prompts but also grasp the underlying meaning and purpose of certain actions

Google showcases Gemini's ability to identify visual input
SUMMARY
  • The video begins with Gemini identifying a piece of paper and a squiggly line drawn on it
  • It then correctly identifies a blue duck and a rubber duck, demonstrating its ability to recognise objects in both real life and images
  • Gemini then plays a game of 'guess the country' with its human partner

Google's new AI model, Gemini, is making waves in the tech world with its ability to interact with the real world through sight, sound, and touch. In a recent video demonstration, Google showcased Gemini's capabilities by having it play a variety of games and interact with objects. The pre-recorded interaction seemed almost natural, and the AI model was able to make complex deductions in the video. The search giant released the new Gemini model on Wednesday and claims it is its most powerful AI model yet, capable of beating the likes of GPT-4, the model that powers OpenAI's ChatGPT.

Google emphasises that Gemini is natively multi-modal, unlike other AI models and tools. This means the AI can take mediums like audio, video, and text as inputs in tandem and provide responses that are more natural and human-like. Google claims most other similarly functioning AI models are multi-modal only on the surface and don't have the same capability of seeing, listening to, and analysing inputs that Gemini has.

A new viral video shows Gemini’s capabilities

To demonstrate this multi-modal prowess, Google released a demo showcasing how Gemini can not only understand direct prompts but also grasp the underlying meaning and purpose of certain actions. The video begins with Gemini identifying a piece of paper and a squiggly line drawn on it. It then correctly identifies a blue duck and a rubber duck, demonstrating its ability to recognise objects both in real life and in images.

Gemini then plays a game of "guess the country" with its human partner. It correctly identifies Australia based on clues about kangaroos, koalas, and the Great Barrier Reef.

Gemini Pro has been integrated into the Bard chatbot

The video also shows Gemini playing games like "rock paper scissors" and "guess which hand the coin is in," demonstrating its ability to understand and respond to human gestures and language.

In another segment, Gemini helps its partner create art by suggesting ideas based on colours and shapes, showing its potential for use in creative applications.

The video concludes with Gemini identifying constellations and describing a drawing of the constellation Gemini, demonstrating its ability to understand and process complex visual information.


Published on: Dec 07, 2023, 2:08 PM IST