It’s difficult to hear what one person is saying in a crowded, noisy space where lots of other people are speaking. This is especially true for people who are hard of hearing. While modern hearing aids use noise-cancelling technology, they can’t eliminate background noise completely.
University of Washington (UW) researchers have devised a way to hear better in a noisy environment. Using run-of-the-mill noise-cancelling headphones augmented with AI, they developed a system that can single out a speaker’s voice after the wearer looks at them just once.
“We tend to think of AI now as web-based chatbots that answer questions,” said Shyam Gollakota, a professor at UW’s Paul G. Allen School of Computer Science and Engineering and a senior author on the study. “But in this project, we develop AI to modify the auditory perception of anyone wearing headphones, given their preferences. With our devices you can now hear a single speaker clearly even if you are in a noisy environment with lots of other people talking.”
The ‘target speech hearing’ (TSH) system developed by the researchers is simple but effective. Off-the-shelf headphones are fitted with two microphones, one on each earcup. While looking at the person they want to hear, the wearer holds down a button on the side of the headphones for three to five seconds. Sound waves from that speaker’s voice reach both microphones simultaneously – there’s a 16-degree margin of error – and are sent to an onboard computer, where machine learning software learns the speaker’s vocal patterns. The speaker’s voice is then isolated and channeled through the headphones, even as they move around, and extraneous noise is filtered out.
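To make the enrollment step concrete, here is a minimal sketch – not the researchers’ code – of the direction check it relies on: a voice coming from straight ahead reaches both earcup microphones at nearly the same time, so the inter-microphone delay estimated by cross-correlation should be close to zero. The sample rate, microphone spacing and the mapping of the 16-degree tolerance to a maximum delay are all illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of an enrollment direction check:
# a source straight ahead hits both earcup mics at (nearly) the same time,
# so the estimated inter-mic delay should be ~0. All constants are assumed.
import numpy as np

SAMPLE_RATE = 16_000     # Hz (assumed)
MIC_SPACING = 0.18       # metres between earcups (assumed)
SPEED_OF_SOUND = 343.0   # m/s

def estimate_delay(left: np.ndarray, right: np.ndarray) -> float:
    """Estimate the delay (seconds) between the two mic signals
    from the peak of their full cross-correlation."""
    corr = np.correlate(left, right, mode="full")
    lag = np.argmax(corr) - (len(right) - 1)   # lag in samples
    return lag / SAMPLE_RATE

def target_is_ahead(left, right, tolerance_deg=16.0) -> bool:
    """True if the dominant source lies within `tolerance_deg` of
    straight ahead, judged by its inter-mic time difference."""
    delay = estimate_delay(left, right)
    # Far-field approximation: delay = d * sin(theta) / c, so this is
    # the largest delay a source at `tolerance_deg` off-axis produces.
    max_delay = MIC_SPACING * np.sin(np.radians(tolerance_deg)) / SPEED_OF_SOUND
    return abs(delay) <= max_delay

# Toy usage: a source dead ahead yields identical signals at both mics,
# so the estimated delay is zero and the check passes.
t = np.arange(2048) / SAMPLE_RATE
voice = np.sin(2 * np.pi * 220 * t)
print(target_is_ahead(voice, voice))  # True
```

Only once this check passes does it make sense to hand the captured snippet to the speaker-embedding model, since the direction of arrival is what tells the system whose voice to learn.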
The video below shows how effective the headphones are. They quickly filter out environmental noise to focus on the speaker, removing the noise generated by a person speaking on their phone nearby (indoors) and a very noisy outdoor fountain.
How fast can the AI process the speaker’s voice and remove unwanted sounds? When tested, the researchers found that their system had an end-to-end latency of 18.24 milliseconds. For comparison, an eye blink lasts between 300 and 400 milliseconds. That means there’s virtually no lag time between looking at someone you want to listen to and hearing only their voice in your headphones; it all happens in real time.
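As a quick sanity check on those numbers, here is a tiny sketch – purely illustrative, with the 16 kHz sample rate as an assumption – showing that the reported latency spans only a few hundred audio samples and is roughly a twentieth of a blink:

```python
# Back-of-envelope: what an 18.24 ms end-to-end budget means for streaming
# audio. The 16 kHz sample rate is an assumption for illustration; the
# latency and blink figures come from the article.
SAMPLE_RATE = 16_000   # Hz (assumed)
LATENCY_S = 18.24e-3   # reported end-to-end latency
BLINK_S = 0.35         # a typical eye blink, ~300-400 ms

print(f"latency spans {LATENCY_S * SAMPLE_RATE:.0f} audio samples")  # ~292
print(f"{BLINK_S / LATENCY_S:.0f}x shorter than an eye blink")       # ~19x
```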
They gave their TSH system to 21 subjects, who rated the headphones’ noise suppression in real-world indoor and outdoor environments. On average, the subjects rated the clarity of the processed speaker’s voice nearly twice as high as that of the unprocessed audio.
Their TSH system builds on ‘semantic hearing’ tech the UW researchers had previously developed. Like TSH, that technology used an AI algorithm running on a smartphone wirelessly connected to noise-cancelling headphones. The semantic hearing system let wearers pick out particular classes of sound, such as birdsong, sirens and alarms.
Currently, the new system can filter only one target speaker at a time, and only when no other loud voice is coming from the same direction. If the headphone wearer isn’t happy with the sound quality, they can re-sample the speaker’s voice to improve clarity. The researchers are working on expanding the system to earbuds and hearing aids, and they’ve made their TSH code publicly available on GitHub so that others can build on it. The system is not commercially available.
The researchers presented their work earlier this month at the Association for Computing Machinery (ACM) CHI Conference on Human Factors in Computing Systems, held in Honolulu, Hawai’i, where it received an Honorable Mention. The unpublished research paper is available here.
Source: UW