(Denver) People who stutter and deaf people struggle to make themselves understood by artificial intelligence. At the most recent meeting of the American Association for the Advancement of Science (AAAS), held in February in Colorado, researchers described the obstacles facing these and other minority groups.
“Stuttering is relatively rare and has a lot of variation,” said Shaomei Wu, a California researcher who founded the technology equity firm Almpower. “But three million Americans suffer from it. Like other language disorders, stuttering prevents many people from taking advantage of advances in voice recognition technology. For example, there can be significant delays when calling 911. This is a real problem.”
Ms. Wu was joined in the AAAS session on artificial intelligence (AI) and language disorders by Abraham Glasser, a computer scientist at Gallaudet University in Washington who specializes in AI recognition of sign language, and by Hannah Rowe, a speech pathologist at Boston University who works on neurological speech disorders. Ms. Wu stutters. Mr. Glasser is deaf and gave his presentation in sign language, voiced by a human interpreter. Gallaudet University serves deaf and hard of hearing students.
“The main problem is that there are not enough language databases to train AI to recognize the speech of people who stutter,” Ms. Wu explains.
Stuttering generally has three aspects, according to Ms. Wu: repetition of words, lengthening of certain syllables and an abnormally long pause between two words.
Only one speech recognition tool for people who stutter exists, according to Ms. Wu: a Meta pilot that makes a lot of mistakes.
“Otherwise, there are half a dozen databases of stuttered speech, notably from Google and Apple, but they do not exceed about fifty hours of recordings. It takes thousands of hours to train voice recognition AI software.”
Shaomei Wu, California researcher
Dysarthria
Ms. Rowe specializes in recognizing the voices of patients with dysarthria, a speech disorder caused by problems with muscle strength and control linked to neurological conditions such as Parkinson’s, stroke and amyotrophic lateral sclerosis.
“These are patients who cannot take advantage of voice assistants like Alexa or Siri, even though they often have motor problems that would make these technologies particularly useful,” says Ms. Rowe. “There is a lot of variability between patients, so we have not yet succeeded in generating language databases to train AI software.”
For patients with mild dysarthria, the software has an error rate of 10%, compared with less than 1% for average speakers. At a moderate stage of dysarthria, the error rate jumps to 35%, and at a severe stage, to 80%.
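The article does not say how these error rates were measured. A common metric in speech recognition is the word error rate: the number of word substitutions, deletions and insertions needed to turn the software's transcript into what was actually said, divided by the number of words actually said. The sketch below is an illustration under that assumption, not the researchers' method. The function name and the sample sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: assumes the error rates quoted above are word
    error rates, which the article does not confirm.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # a reference word was missed
            insertion = curr[j - 1] + 1  # an extra word was transcribed
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word out of ten spoken words.
spoken = "call an ambulance to my house right now please thanks"
transcribed = "call an ambulance to my horse right now please thanks"
print(f"{word_error_rate(spoken, transcribed):.0%}")
```

On this scale, an 80% error rate means the transcript gets roughly four out of every five words wrong, which makes it unusable in practice.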
One of the avenues being considered is individual voice recognition models. “In the laboratory, we have managed to reduce the error rate to 5% for recognizing the voice of patients with severe dysarthria,” says Ms. Rowe. “But it takes dozens of hours of recordings paired with transcriptions. You have to validate the transcriptions with the patients, which is very cumbersome, especially since these are patients who tire more easily and for whom it is difficult to speak.”
Ms. Rowe is therefore working on characterizing the different types of dysarthria in order to reduce the number of hours of recordings needed for personalized voice recognition models. “We have published a few proposals in recent years.”
“Parkinson’s, for example, is characterized by imprecise consonants. If the AI speech recognition software takes this into account, it will learn more quickly to recognize the speech of a specific patient.”
Hannah Rowe, Boston University speech pathologist
In the journal Frontiers in Computer Science in 2022, Ms. Rowe outlined her work plan: identify the sources of variation between individuals with motor speech disorders and their potential impact on speech recognition by AI. “With this data, we should be able to speed up the training of software for an individual patient,” she says.
Deafness
Deaf and hard of hearing people who speak also have problems with voice recognition, Mr. Glasser said. “If a person has been deaf since birth, the intonation is not the same as for the average population. Voice recognition AI makes a lot of mistakes. This means that this population uses this technology very little.”
When it comes to recognizing sign language, privacy protection poses a very concrete problem. “You have to see the person in the AI training video. With voice, it’s more anonymous. So sign language recognition software is not progressing.”
Another avenue Mr. Glasser is exploring is an avatar that would allow deaf people to follow audio conversations through lip reading. “But an avatar whose lips move like those of a speaking human is not yet ready.”
Learn more
- 15 million: number of patients with dysarthria worldwide (Source: Boston University)
- 70 million: number of people who stutter worldwide (Source: Almpower)