Scientists have developed an implant that interprets signals in the brain and converts them into understandable, synthesized speech. This innovative piece of technology could one day give a voice to people otherwise unable to communicate.
It's the work of researchers from the University of California, San Francisco, who trained their model on five volunteers reading passages from children's stories aloud. Electrodes recorded the electrical activity in the volunteers' brains as they read, and those activity patterns were then matched to the speech being produced.
The crucial part of this particular approach is that rather than attempting to read thoughts, it aims to read the brain's efforts to control the lips, jaw and larynx. Even people who have lost the ability to speak can think about forming their mouths to produce words, and that's the signalling this new device taps into.
While the system is far from perfect at the moment – listeners could correctly identify an average of 69 percent of the words, given 25 options to pick from – the focus on simulating the vocal tract has led to a significant step forward in accuracy.
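For a sense of what that figure measures: listeners wrote down what they heard, choosing each word from a fixed pool of 25 candidates, so guessing at random would score around 4 percent. A toy version of the scoring, with made-up transcripts (not data from the study), might look like this:

```python
def word_accuracy(heard_sentences, true_sentences):
    """Fraction of words listeners identified correctly. With each word
    picked from a pool of 25 candidates, chance is 1/25 = 4 percent."""
    correct = total = 0
    for heard, truth in zip(heard_sentences, true_sentences):
        for h, t in zip(heard.split(), truth.split()):
            correct += (h == t)   # count a hit when the words match
            total += 1
    return correct / total

# Made-up example: 5 of 6 words right across two sentences -> ~83 percent.
print(word_accuracy(["the cat sat down", "ship sails"],
                    ["the cat sat down", "shop sails"]))  # 0.833...
```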
"The relationship between the movements of the vocal tract and the speech sounds that are produced is a complicated one," says lead researcher Gopala Anumanchipalli. "We reasoned that if these speech centers in the brain are encoding movements rather than sounds, we should try to do the same in decoding those signals."
What Anumanchipalli and his colleagues ended up with are two neural networks: one to match brain signals with movements of the vocal tract, and one to turn those movements into synthesized speech.
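To make that division of labor concrete, here's a minimal sketch of such a two-stage pipeline in PyTorch. Everything in it is illustrative rather than taken from the paper: the electrode count, the number of movement and acoustic features, and the layer sizes are placeholders, and the LSTM layers are just one plausible way to map one time series onto another.

```python
import torch
import torch.nn as nn

class BrainToArticulation(nn.Module):
    """Stage 1: map recorded brain signals to vocal-tract movements
    (lips, jaw, tongue, larynx). All sizes here are placeholders."""
    def __init__(self, n_electrodes=256, n_movements=33, hidden=128):
        super().__init__()
        # A recurrent layer is one natural choice for mapping one
        # time series (neural activity) onto another (movements).
        self.rnn = nn.LSTM(n_electrodes, hidden, batch_first=True,
                           bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_movements)

    def forward(self, neural):                 # (batch, time, electrodes)
        h, _ = self.rnn(neural)
        return self.out(h)                     # (batch, time, movements)

class ArticulationToSpeech(nn.Module):
    """Stage 2: map vocal-tract movements to acoustic features
    (e.g. spectrogram frames) that a vocoder can render as audio."""
    def __init__(self, n_movements=33, n_acoustic=80, hidden=128):
        super().__init__()
        self.rnn = nn.LSTM(n_movements, hidden, batch_first=True,
                           bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_acoustic)

    def forward(self, movements):
        h, _ = self.rnn(movements)
        return self.out(h)

# Chaining the two stages: brain signals -> movements -> acoustics.
stage1, stage2 = BrainToArticulation(), ArticulationToSpeech()
neural = torch.randn(1, 200, 256)              # 200 time steps of fake data
acoustics = stage2(stage1(neural))
print(acoustics.shape)                         # torch.Size([1, 200, 80])
```

Splitting the problem this way means each network solves a simpler mapping than going directly from brain signals to audio, which is the intuition behind the accuracy gains described above.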
As with previous attempts to turn brain patterns into intelligible sounds, short words and simple sentences work best. As the sentences get more complicated, the synthesized speech becomes less intelligible – though it still represents impressive progress.
Higher-density electrode arrays for monitoring and more sophisticated machine learning algorithms in the system itself could both boost the accuracy of the translations. Encouragingly, the new implant showed signs of being able to decode sentences that hadn't been included in its training data.
What's more, the scientists noticed some overlap between the neural patterns of different participants in the study – that should make it easier and quicker to train systems like this on new users.
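The paper doesn't spell out a recipe for exploiting that overlap, but the standard transfer-learning pattern would look something like the sketch below: start a new user's decoder from weights already trained on previous participants, then fine-tune briefly on a small amount of the new user's data instead of training from scratch. It reuses the illustrative BrainToArticulation model from the earlier sketch and, like it, is an assumption-laden outline rather than the team's method.

```python
from copy import deepcopy

import torch
import torch.nn as nn

def adapt_to_new_user(pretrained, new_user_batches, epochs=5, lr=1e-4):
    """Fine-tune a stage-1 decoder, pretrained on other participants,
    on a small set of (neural, movement) recordings from a new user."""
    model = deepcopy(pretrained)           # keep the shared weights intact
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()                 # regression onto movement traces
    for _ in range(epochs):
        for neural, movements in new_user_batches:
            optimizer.zero_grad()
            loss = loss_fn(model(neural), movements)
            loss.backward()
            optimizer.step()
    return model

# A few made-up recordings from the new user, shaped like the earlier data.
new_data = [(torch.randn(1, 200, 256), torch.randn(1, 200, 33))
            for _ in range(4)]
personalized = adapt_to_new_user(stage1, new_data)
```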
"We still have a ways to go to perfectly mimic spoken language," says another member of the team, Josh Chartier. "We're quite good at synthesizing slower speech sounds like 'sh' and 'z' as well as maintaining the rhythms and intonations of speech and the speaker's gender and identity, but some of the more abrupt sounds like 'b's and 'p's get a bit fuzzy.
"Still, the levels of accuracy we produced here would be an amazing improvement in real-time communication compared to what's currently available."
It may be some time before an implant device like this is ready to give an artificially generated voice to those who have lost the natural ability to speak, but it's a very encouraging breakthrough.
A paper on the research has been published in the journal Nature. You can listen to some of the generated sounds in the video below.
Source: University of California, San Francisco