You don’t wake a sleeping giant. And yet that is exactly what OpenAI and its generative AI, ChatGPT, did two years ago. After a year spent playing catch-up, Google is going on the offensive this year and promises to deploy Gemini, its own AI, everywhere, everywhere, everywhere.
In any case, that is what emerged from the long opening keynote of the Google I/O conference, aimed at creators, programmers and users of the Mountain View giant’s products, which took place Tuesday morning in California. “A year ago we created Gemini. Since then, we have done everything so that everyone can use it. Today, more than 2 billion people use it,” summarized Sundar Pichai, CEO of Google and Alphabet, on stage.
“Multimodal” AI
Did we say everywhere? There will be an improved Gemini app for Android. There will be Gemini in Google’s text messaging. There will be Gemini music-creation tools. There will even be Gemini tools for generating HD video, aimed at imaging professionals. And of course, Gemini will power a new version of Google’s search engine: its bread, its butter and, of course, its money.
Gemini’s AI-powered online search is Google’s response to pressure that has been mounting from all sides for at least a year. First, its Californian rival OpenAI plans to launch its own search engine any day now, one that could threaten Google’s hegemony. That hegemony has already been eroded by algorithms that are less and less understood by the web professionals whose job is to place their sites as high as possible in search results.
Launched in beta last year, the AI product-recommendation summary feature is now rolling out to Internet users in the United States. The tool summarizes most of the reviews and comments about products or services that consumers are considering buying. In its tests, Google says, not only do buyers find the right product more easily, but sites that conform to this new format “generate more clicks than if their page had simply appeared in a list of results.”
More traditional web search will also improve, Google promises. Results can be refined and formatted according to a specific context and to more nuanced questions.
Google also promises multimodality: it will be possible to continue the same search while switching from one device to another. The Mountain View giant even presented a simulation in which its search engine creates a podcast from scratch that answers, out loud, a question typed by a young Internet user. Two voices created from scratch explain to him how Isaac Newton formalized gravity, using references drawn from basketball, a sport the young man loves.
“We have built an immense store of knowledge, made up of billions of details about people, places and things, so that you can get the reliable information you need in the blink of an eye,” promises Liz Reid, head of Google Search.
From AI to your glasses
OpenAI dreams of a general AI that would combine all human knowledge into a single digital assistant. The firm, whose valuation these days reportedly reaches US$80 billion, presented its own strategic update on Monday, the day before Google I/O. It introduced GPT-4o, the “omni” (hence the “o”) version of its advanced GPT-4 language model. The new tool will be faster and cheaper to access for developers looking to integrate generative AI into their applications. GPT-4o can “reason across voice, image and text,” OpenAI promises.
Google has much the same vision, called Project Astra. “To be truly useful, an AI agent must understand the world around us, and remember what it sees and hears, to give it the right context in which to act,” explained Eli Collins and Doug Eck on stage, respectively director of products and senior director of research at Google. “It’s easy to imagine a near future in which an expert assistant will be with you at all times, in your phone or your glasses.”
Yes, in your glasses, apparently the new frontier of the computing world. “People should be able to talk to their assistant naturally, without delay,” Google adds. We already knew Google wanted to launch new mixed-reality glasses; now we suddenly wonder whether the defunct Google Glass will soon rise from the graveyard of the many Google products considered, until now, to be failures.
In the meantime, Google intends to win over the individuals and SMEs who have already adopted ChatGPT (or even Microsoft’s Copilot) with a revamped version of Gemini that is already online. Gemini 1.5 is more refined and will be able to adopt roles, giving a precise personality to the chatbots that provide customer service on commercial websites.
Gemini will also be able to live on a phone, like the Pixel 8a that Google released just this week. Users will be able to start a conversation with the AI on their mobile, then move it to a PC or elsewhere, in writing or out loud, and request a response by voice, in writing or, why not, in the form of an image. Sundar Pichai cited an example in which a person looks at photos of a young girl swimming in a pool, then asks the AI: “At what age did she start swimming? How has she progressed?”
The AI, in this scenario, can answer based on the images in front of it and on other personal information it has gathered elsewhere. “Gemini will now be able to understand the world the way people do, not just through text, but through vision, sound and language,” Eli Collins later explained.
This is sure to raise questions about privacy, a subject little discussed during Google’s keynote speech.
Never mind. If it were up to Google, AI would be everywhere.
Travel and accommodation costs for this article were covered by Google Canada.