ChatGPT, the popular software that generates text using artificial intelligence, achieved a score approaching the threshold required to pass a difficult medical licensing exam in the United States, according to a study published Thursday.
The Californian company OpenAI launched the chatbot last November, and it has been causing a stir ever since. Easy to use, it produces texts (essays, articles or even poems) on request.
For the study, published in the journal PLOS Digital Health, researchers from the company AnsibleHealth tested the software's performance on an exam that medical students in the United States must take, and which tests them in various fields (scientific knowledge, clinical reasoning, bioethics, etc.).
Called the USMLE (United States Medical Licensing Examination), the exam is divided into three parts: the first is taken after about two years of study, the second after four years, and the third is required to become a doctor.
ChatGPT was tested on 350 of the 376 questions from the June 2022 exam that were posted on the USMLE website. Image-based questions had to be excluded.
The questions were presented in three formats: open-ended questions (“What would be the diagnosis for this patient given the information presented?”), multiple-choice questions without justification (“Which is the most appropriate next follow-up step among the following?”), and multiple-choice questions with justification (“What is the most likely reason for the patient’s nocturnal symptoms? Explain your reasoning.”).
Two reviewers graded the work, and a third adjudicated the discrepancies between them.
The software answered between 52.4% and 75% of the questions correctly. The score needed to pass the exam is generally around 60%.
“ChatGPT is approaching the margin of success,” the study concludes.
Some outside experts have criticized the method used. The researchers could have introduced a degree of anonymization by mixing human responses with those of the robot, said Nello Cristianini, professor of artificial intelligence at the University of Bath in the United Kingdom.
Still, he called the work “part of a series of exciting new developments in the field of artificial intelligence” (AI).
According to Lucia Ortiz de Zarate, a researcher at the Autonomous University of Madrid, this study demonstrates “the potential of AI in the medical field”. It “can be of great help to doctors when they make diagnoses and prescribe treatments,” she said.
At the end of January, another study showed that ChatGPT could pass the exams of an American law school, although it finished last in the class.