OpenAI brings vision and voice to ChatGPT

(San Francisco) OpenAI on Monday presented a new version of ChatGPT that can now hold fluid spoken conversations with its users, a further step towards ultra-sophisticated artificial intelligence (AI) assistants, the current holy grail of Silicon Valley.


Thanks to a new model, GPT-4o (“o” for “omni”), ChatGPT will be able to understand text, sound and images, and respond in writing, by voice or by generating images.

These new capabilities will be added to ChatGPT gradually, starting with text and images for paid subscribers as well as for free users, who will face usage limits. The new version of “Voice Mode” should arrive in the coming weeks for subscribers.

It reproduces conversation between humans in a striking way.

In a live video demonstration, ChatGPT read users’ facial emotions via a smartphone camera, guided them through breathing exercises, told them a story, and helped them solve a math problem. Above all, users can easily interrupt it.

“You look happy. […] Do you want to tell me what is the source of all this good humor?” the machine asked an OpenAI engineer, who replied that he was showing the audience how “useful and fabulous” it was. “Oh stop, you’re making me blush,” it exclaimed in reply.

“Prophetic”

At the end of 2022, with the launch of ChatGPT, which generates content from a simple request in everyday language, OpenAI set generative AI in motion, a revolution that took all the tech giants by surprise.

Since then, all of Silicon Valley has been racing to build ever more capable AI tools and assistants. Google is due to present its latest innovations on Tuesday, while Microsoft, OpenAI’s main investor, has planned an event for the press and developers next week.

On Friday, Sam Altman, the boss of OpenAI, denied rumors about the announcements his company was preparing. “Not GPT-5, not a search engine,” he declared on X. “But […] we have been working on some new things and we think people will love it,” he added. “For me, it’s like magic.”

In the past, he had confessed to loving the science fiction film Her, in which a man falls in love with an AI by talking with it.

“It was incredibly prophetic,” he said last September at a conference. “And it inspired us in more than one way, […] notably the idea that we all have a personalized agent who tries to help us.”

ChatGPT is still a long way from the omniscient, proactive, personalized AI agents that companies promise. But this update has impressed, or worried, industry experts.

“Anthropomorphization”

“I was struck by the extent to which the demonstrations anthropomorphize the models,” Jeff Boudier, of Hugging Face, told AFP. “It creates confusion and false expectations.”

“People risk projecting qualities onto models, and becoming emotionally attached. They will not understand why models can create false information, nor know in which situations they can trust them or not,” explained the product manager of the collaborative, open generative AI platform.

Sam Altman regularly promotes his vision of an AI that will one day be “general”, that is to say endowed with human cognitive abilities and capable of achieving scientific breakthroughs in the service of humanity.

OpenAI, initially created as a non-profit research laboratory, was valued at some $80 billion during a securities sale last February, according to the New York Times. And according to the Financial Times, its annualized revenues have been around $2 billion since December 2023.

“A very important part of our mission is to make all of our advanced AI tools freely available to the public [so that] people intuitively understand what technology can do,” said Mira Murati, chief technology officer of the Californian start-up, during Monday’s presentation.

“This is the first time we’ve taken a big step forward in ease of use,” she added. “This is extremely important, this is the future of interaction between us and machines.”

