Advanced AI system converts brain signals into speech

Researchers have developed an algorithm that can translate neural signals from the auditory parts of the brain into synthesized speech

In a landmark breakthrough, scientists have demonstrated a computer system that effectively translates brain signals into intelligible speech. The experiment presents a proof of concept that could pave the way for a wide variety of brain-controlled communication devices in the future.

A huge hurdle neuroengineers face on the road to effective brain-computer interfaces is translating the wide array of signals produced by our brains into words and images that can be easily communicated. The science fiction idea of controlling devices or communicating with others just by thinking is slowly but surely getting closer to reality.

Recent advances in machine learning technology have allowed scientists to crunch masses of abstract data. Just last year a team of Canadian researchers revealed an algorithm that could use electroencephalography (EEG) data to digitally recreate faces that a test subject had been shown.

Translating brainwaves into words has been another massive challenge for researchers, but again, with the aid of machine learning algorithms, remarkable advances have been made in recent years. The latest leap forward, from a team of American neuroengineers, is a computer algorithm that can decode signals recorded from the human auditory cortex and translate them into intelligible speech.

The study first gathered data from five patients while they were undergoing neurosurgery for epilepsy. The patients had a variety of electrodes implanted in their brains, allowing the researchers to record comprehensive electrocorticography measurements while the patients listened to short continuous stories spoken by four different speakers. Because this data could only be gathered invasively, while patients were undergoing brain surgery, only around 30 minutes of neural recordings could be collected from each individual.

"Working with Dr. Mehta [the neurosurgeon performing the procedure], we asked epilepsy patients already undergoing brain surgery to listen to sentences spoken by different people, while we measured patterns of brain activity," explains Nima Mesgarani, senior author on the new study. "These neural patterns trained the vocoder."

To test the efficacy of the algorithm, the system was asked to decode voices counting from zero to nine that were not included in the original training data. As the speakers recited the digits, the brain signals of the patients were recorded and run through the vocoder. A neural network then analyzed and cleaned up the output produced by the vocoder.
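The core idea is a regression from recorded neural activity to the parameters of a speech vocoder, which then synthesizes audio. The sketch below is a minimal, hypothetical illustration of that mapping step only: it uses synthetic data in place of real electrocorticography recordings, and a closed-form linear ridge regression in place of the deep neural network the researchers actually trained. All dimensions (128 electrodes, 32 vocoder parameters) are assumptions for the example, not figures from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 128 ECoG electrodes, 32 vocoder parameters per frame.
n_train, n_test, n_elec, n_voc = 600, 100, 128, 32

# Simulated data standing in for the real recordings: neural activity that is
# (approximately) linearly related to the vocoder parameters of heard speech.
W_true = rng.normal(size=(n_elec, n_voc))
X_train = rng.normal(size=(n_train, n_elec))
Y_train = X_train @ W_true + 0.1 * rng.normal(size=(n_train, n_voc))
X_test = rng.normal(size=(n_test, n_elec))          # held-out "digit" trials
Y_test = X_test @ W_true + 0.1 * rng.normal(size=(n_test, n_voc))

def fit_ridge(X, Y, lam=1.0):
    """Closed-form ridge regression: W = (X'X + lam*I)^-1 X'Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

# Train on the story-listening data, then decode the unseen test trials.
W = fit_ridge(X_train, Y_train)
Y_pred = X_test @ W

# Score reconstruction quality as the mean per-parameter correlation
# between true and decoded vocoder parameters.
corrs = [np.corrcoef(Y_test[:, j], Y_pred[:, j])[0, 1] for j in range(n_voc)]
mean_corr = float(np.mean(corrs))
print(f"mean correlation: {mean_corr:.3f}")
```

In the real system the decoded vocoder parameters would be fed to a speech synthesizer, and intelligibility was judged by human listeners rather than by correlation scores.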

"We found that people could understand and repeat the sounds about 75 percent of the time, which is well above and beyond any previous attempts," says Mesgarani. "The sensitive vocoder and powerful neural networks represented the sounds the patients had originally listened to with surprising accuracy."

Mesgarani readily admits to Inverse that it may be at least a decade before this technology becomes realistically available. After all, we can't easily implant a vast array of electrodes into the brain to record these neural signals. As a proof of concept, however, the research is groundbreaking, demonstrating that signals processed by the human auditory cortex can be decoded into speech. If these preliminary results can be produced from such a small dataset, one can only imagine what could be generated from larger volumes of data.

The next step for Mesgarani and his team is to refine the algorithms to see if more complex words and sentences can be decoded from the same auditory neural data. Following on from that, the goal would be to move from simply decoding aural data to finding accurate neural data that can translate the act of imagining speaking into synthesized words.

"In this scenario, if the wearer thinks 'I need a glass of water,' our system could take the brain signals generated by that thought, and turn them into synthesized, verbal speech," says Mesgarani. "This would be a game changer. It would give anyone who has lost their ability to speak, whether through injury or disease, the renewed chance to connect to the world around them."

The new study was published in the journal Scientific Reports.

Source: The Zuckerman Institute, Columbia University

5 comments
guzmanchinky
My mind immediately goes to eliminating the court system. Hopefully someday crime will be impossible to hide due to brain scans or technology like this which will reveal someone's guilt (or innocence) without possible concealment.
Paul Robertson
Dog and baby translator, in that order. There's your market:)
Brendan Dunphy
Not really, they have 'only' managed to convert heard statements into speech. A big step maybe but converting thoughts is another thing altogether.
noteugene
It'd be great for the hearing impaired. We'd finally be able to call the police station, hospital, tow truck....nah, never happen. People would still say we'd have to use a TDD if we wanted to talk to them.
Jean Lamb
"Please change the bleeping channel!" would be the first words from some people...