Losing the ability to speak after a brain injury or due to disease leaves a person unable to express their thoughts, feelings and ideas, and can be incredibly isolating. In two recently published studies, researchers demonstrated how combining brain implants and AI gave two women – one paralyzed after a stroke, the other with a progressive neurodegenerative disorder – a voice.
Stroke can damage the regions of the brain that control language and speech. Similarly, amyotrophic lateral sclerosis (ALS), a progressive neurodegenerative disease that attacks the neurons controlling muscles, can cause speech problems when it affects the muscles that move the lips, tongue, soft palate, jaw, and voice box. But researchers have found a way to give the voiceless their voice back.
Brain-computer interfaces (BCIs) are here, and they’re only getting better. Two studies published on the 23rd of August in the journal Nature show how far we’ve come in translating attempted speech into words. The first, by researchers at UC San Francisco (UCSF) and UC Berkeley, enabled a woman who’d suffered a stroke to speak and express emotion via a digital avatar. The second, by Stanford Medicine, converted the brain activity of a woman who’d lost the ability to speak due to ALS into text displayed on a computer screen.
Ann’s story
Ann had a brainstem stroke at 30 that left her severely paralyzed, with profound weakness in her facial and vocal muscles. Before her stroke, she was a high school math teacher in Canada.
After years of rehabilitation, Ann learned to communicate by painstakingly typing one letter at a time on a computer screen. Then, in 2021, she read about how researchers from UCSF had allowed a paralyzed man named Pancho, who’d also suffered a brainstem stroke, to translate his brain signals into text as he attempted to speak.
Now 47, Ann has helped the same team, in collaboration with researchers at UC Berkeley, develop a way of communicating more naturally: a digital avatar that uses AI to turn her brain signals into speech and facial expressions.
“Our goal is to restore a full, embodied way of communicating, which is the most natural way for us to talk with others,” said Edward Chang, corresponding author of the study. “These advancements bring us much closer to making this a real solution for patients.”
With Ann, the researchers wanted to improve upon what they’d achieved with Pancho. They implanted a paper-thin rectangle of 253 electrodes onto the surface of her brain over critical speech-related areas, which, if not for the stroke, would’ve animated the muscles in Ann’s lips, tongue, jaw, and voice box. The electrodes were connected by a cable to a bank of computers.
Ann worked with the researchers to train the system’s AI algorithms to recognize her unique brain signals. For weeks, she repeated different phrases from a 1,024-word conversational vocabulary. Instead of training the AI to recognize whole words, the researchers created a system that decodes words from smaller components called phonemes. This way, the AI only needed to learn 39 phonemes to decipher any English word.
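To get a feel for the approach, the sketch below shows how sequences drawn from a small phoneme inventory can be composed into words using a pronunciation dictionary. It is purely illustrative: the dictionary entries and the greedy matching are assumptions made for this example, not the trained neural decoder the UCSF team actually used.

```python
# Minimal sketch: composing decoded phonemes into words with a tiny
# pronunciation dictionary (ARPABET-style phoneme labels). The real system
# uses trained neural networks and a far larger vocabulary; these entries
# are illustrative only.

PRONUNCIATIONS = {
    ("HH", "AH", "L", "OW"): "hello",
    ("HH", "AW"): "how",
    ("AA", "R"): "are",
    ("Y", "UW"): "you",
}

def decode_words(phonemes):
    """Greedily match the longest known phoneme sequence at each position."""
    words, i = [], 0
    while i < len(phonemes):
        for length in range(len(phonemes) - i, 0, -1):
            chunk = tuple(phonemes[i:i + length])
            if chunk in PRONUNCIATIONS:
                words.append(PRONUNCIATIONS[chunk])
                i += length
                break
        else:
            i += 1  # skip an unrecognized phoneme
    return " ".join(words)

print(decode_words(["HH", "AH", "L", "OW", "HH", "AW", "AA", "R", "Y", "UW"]))
# -> "hello how are you"
```

Working at the phoneme level keeps the learning problem small: any new word can be assembled from the same 39 building blocks instead of requiring fresh training data for every vocabulary item.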
“The accuracy, speed and vocabulary are crucial,” said Sean Metzger, who helped develop the text decoder and is the study’s lead author. “It’s what gives Ann the potential, in time, to communicate almost as fast as we do, and to have much more naturalistic and normal conversations.”
Next, the researchers devised an algorithm for synthesizing speech, which they personalized by using a recording of Ann speaking at her wedding, and used software to create an avatar. The team created customized machine-learning processes that allowed the avatar software to integrate with the signals Ann’s brain sent as she was trying to speak, moving the avatar’s face and displaying emotions like happiness, sadness and surprise.
“We’re making up for the connections between her brain and vocal tract that have been severed by the stroke,” said Kaylo Littlejohn, one of the study’s co-authors. “When Ann first used this system to speak and move the avatar’s face in tandem, I knew that this was going to be something that would have a real impact.”
The BCI system was able to decode a large vocabulary and turn it into text at a median rate of 78 words per minute, with a median word error rate of 25%. The speed of natural conversation among English speakers is around 160 words per minute.
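Word error rate, the accuracy figure quoted in both studies, is the minimum number of word substitutions, insertions, and deletions needed to turn the decoded sentence into what the speaker intended, divided by the number of intended words. A minimal sketch of that calculation is below; the example sentences are made up.

```python
# Minimal sketch of the word error rate (WER) metric: edit distance over
# words between the decoded sentence and the reference, divided by the
# reference length. Example sentences are invented for illustration.

def word_error_rate(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # Standard edit-distance dynamic program over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

print(word_error_rate("i would like some water please",
                      "i would like some water"))  # 1 error / 6 words ≈ 0.17
```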
Ann says she’s found helping develop the technology a life-changing experience.
“When I was at the rehab hospital, the speech therapist didn’t know what to do with me,” she typed. “Being a part of this study has given me a sense of purpose, I feel like I am contributing to society. It feels like I have a job again. It’s amazing I have lived this long; this study has allowed me to really live while I’m still alive!”
The researchers are working on creating a wireless version of the system that wouldn’t require Ann to be physically connected to the BCI. They hope that their study will lead to an FDA-approved system that enables brain signal-to-speech communication in the near future.
The video below, produced by UCSF, demonstrates how the technology works as Ann converses with her husband, Bill.
Pat’s story
Sixty-eight-year-old Pat was diagnosed with ALS in 2012. Unlike the usual presentation of ALS, where deterioration begins in the spinal cord and affects the limbs, hers started in her brain stem. This means she can still move around, dress herself and use her fingers to type, but can’t use the muscles associated with speech to clearly enunciate phonemes.
Thankfully, Pat’s brain can still formulate directions for generating phonemes, which is what researchers at Stanford Medicine capitalized on. In March 2022, a neurosurgeon implanted two tiny sensor arrays on the surface of Pat’s brain in two separate speech-production regions. Each array contains 64 electrodes which penetrate the cerebral cortex to a depth roughly equal to the height of two stacked US quarters (3.5 mm).
As with Ann, AI was trained to distinguish the brain activity associated with Pat’s attempts to formulate each of the 39 phonemes that comprise spoken English. Pat had about 25 training sessions, where, in each session, she attempted to repeat 260 to 480 randomly chosen sentences from a large data set of samples of phone conversations.
“This system is trained to know what words should come before other ones, and which phonemes make what words,” said Francis Willett, the study’s lead author. “If some phonemes were wrongly interpreted, it can still take a good guess.”
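The sketch below illustrates that “good guess” idea in miniature: when the decoded phoneme string doesn’t exactly match any word, pick the vocabulary entry whose pronunciation is closest. The three-word vocabulary and the similarity measure are assumptions made for this example; the Stanford system relies on a trained language model over a 125,000-word vocabulary.

```python
# Minimal sketch of error-tolerant word selection: if one phoneme is
# misdecoded, choose the vocabulary word whose pronunciation is most
# similar to the decoded sequence. Vocabulary entries are hypothetical.

from difflib import SequenceMatcher

VOCAB = {
    "water":  ("W", "AO", "T", "ER"),
    "coffee": ("K", "AO", "F", "IY"),
    "tea":    ("T", "IY"),
}

def best_guess(decoded_phonemes):
    """Return the vocabulary word whose pronunciation best matches the input."""
    def similarity(pronunciation):
        return SequenceMatcher(None, decoded_phonemes, pronunciation).ratio()
    return max(VOCAB, key=lambda word: similarity(VOCAB[word]))

# The "T" in "water" was misdecoded as "D", but the closest word still wins.
print(best_guess(("W", "AO", "D", "ER")))  # -> "water"
```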
The BCI achieved a 9.1% word error rate on a vocabulary of 50 words, with the error rate increasing to 23.8% on a 125,000-word vocabulary. It could convert Pat’s attempted speech at a rate of 62 words per minute.
“We’ve shown you can decode intended speech by recording activity from a very small area on the brain’s surface,” said Jaimie Henderson, the neurosurgeon who performed Pat’s surgery and one of the study’s co-authors.
Pat hopes that BCI-assisted communication will soon be available to more people.
“These initial results have proven the concept, and eventually, technology will catch up to make it easily accessible to people who cannot speak,” Pat wrote. “For those who are nonverbal, this means they can stay connected to the bigger world, perhaps continue to work, maintain friends and family relationships.”
Although not yet commercially available, the researchers see the vast potential of their device.
“This is a scientific proof of concept, not an actual device people can use in everyday life,” Willett said. “But it’s a big advance toward restoring rapid communication to people with paralysis who can’t speak.”
The video below, produced by Stanford Medicine, shows the proof-of-concept BCI in action as Pat communicates, and outlines the team’s plans for the device’s future use.
Both studies were published in the journal Nature. The study by UCSF can be found here; the Stanford Medicine study can be found here.
Sources: UCSF, Stanford Medicine