With the launch of Gemini, Google hopes to take the lead in the AI race

On Wednesday, Google will begin deploying Gemini, its new artificial intelligence (AI) model, which is expected to help it compete better with OpenAI (the creator of ChatGPT) and Microsoft, from consumer applications to computing capacity for businesses.

“It is our most consistent, most talented and also most general AI model,” said Eli Collins, a vice-president of Google DeepMind, the Californian group's AI research laboratory, during a presentation to the press.

He then showed a video in which a user presents objects, drawings and videos to Gemini. The AI system comments aloud on what it “sees”, identifies objects, plays music and answers questions that require a certain degree of analysis, justifying its “reasoning”.

For example, when shown a rubber duck that must choose between two paths (the one on the left leading to another duck drawn on the paper, the one on the right leading to a menacing-looking bear), Gemini suggests the left path, because “it is better to make friends rather than enemies”.

The video also shows that Gemini can recognize references with very little context, such as a scene from the film The Matrix acted out by a person pretending to dodge bullets in slow motion.

Reasoning

The new model is “multimedia from the outset, it has sophisticated reasoning capabilities and it can code at an advanced level,” Eli Collins explained.

According to him, Gemini is the first AI model to outperform human experts on an industry-standard test, the “MMLU”, which is used to evaluate the ability of these computer programs to reason in different fields, from mathematics and science to history and law.

Since the launch of ChatGPT a year ago, the giants of Silicon Valley have been engaged in a frantic race for so-called generative AI, which can produce text, images or lines of code of a level equivalent to those produced by humans, from a simple request in everyday language.

Google, the AI leader taken by surprise by the phenomenal success of ChatGPT, responded notably with its own chatbot, Bard.

But the real contest is being played out at the level of the models, the computer systems that underpin these applications. They were first stuffed with text collected online and are now fed all kinds of data so that they can process requests containing images and converse aloud with their users.

OpenAI said in September that it had equipped ChatGPT with speech and vision to make it “more intuitive.”

“Best collaborator”

For her part, Sissie Hsiao, the Google vice-president in charge of Bard, said on Wednesday that Gemini is “one more step towards our vision: to bring you the best AI collaborator in the world”.

Bard is due to gain capabilities starting Wednesday, but still only through written requests, and only in English.

Other functions and formats, such as advanced help with solving math problems, will have to wait until 2024.

Less well known than ChatGPT, Bard has an opportunity to regain ground on its rival, which has become a victim of its own success: in mid-November, overwhelmed by demand, OpenAI paused subscriptions to the paid version.

On December 13, Google will also give its cloud (remote computing) customers access to a first version of Gemini, including developers who use its Vertex AI platform to build their own AI applications.

In this area, the Internet giant is in direct competition with Microsoft, the main investor in OpenAI and world number 2 in the cloud, behind Amazon.

The two American groups spent the year adding generative AI tools to their respective software (search engine, office and productivity software, cloud platform, etc.).

“This new era of models represents one of the greatest scientific and technical efforts we have undertaken as a society,” said Sundar Pichai, Google's chief executive, quoted in a press release.
