
‘Selective hearing’ headphones: Hear clearly in a crowd with one look

Headphones that use AI to single out one voice in a crowded space

Researchers have used AI attached to off-the-shelf headphones to isolate the voice of one speaker in a noisy crowd just by looking at them. The code for their next-level noise cancelling system is freely available if you want to build your own.

It’s difficult to hear what one person is saying in a crowded, noisy space where lots of other people are speaking. This is especially true for people who are hard of hearing. While modern hearing aids use noise-cancelling technology, they can’t eliminate background noise completely.

University of Washington (UW) researchers have devised a solution to hearing better in a noisy environment. Using run-of-the-mill noise-cancelling headphones fitted with AI, they developed a system that can single out a speaker’s voice just by the wearer looking at them once.

“We tend to think of AI now as web-based chatbots that answer questions,” said Shyam Gollakota, a professor at UW’s Paul G. Allen School of Computer Science and Engineering and a senior author on the study. “But in this project, we develop AI to modify the auditory perception of anyone wearing headphones, given their preferences. With our devices you can now hear a single speaker clearly even if you are in a noisy environment with lots of other people talking.”

Off-the-shelf headphones are fitted with microphones and a button

The ‘target speech hearing’ (TSH) system developed by the researchers is simple but effective. Off-the-shelf headphones are fitted with two microphones, one on each earcup. While looking at the person they want to hear, the wearer presses and holds a button on the side of the headphones for three to five seconds. Sound waves from that speaker’s voice reach both microphones simultaneously – there’s a 16-degree margin of error – and are sent to an onboard computer, where machine-learning software learns the speaker’s vocal patterns. The speaker’s voice is then isolated and channeled through the headphones, even as they move around, and extraneous noise is filtered out.
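The “reaches both microphones simultaneously” check above can be illustrated with a toy direction-of-arrival estimate: cross-correlating the two earcup signals gives the inter-microphone delay, and a delay near zero means the speaker is where the wearer is looking. This is a minimal sketch, not the researchers’ code; the sample rate, microphone spacing and simple far-field geometry are assumptions made for illustration.

```python
import numpy as np

SAMPLE_RATE = 48_000     # Hz (assumed)
MIC_SPACING = 0.18       # m, approximate earcup-to-earcup distance (assumed)
SPEED_OF_SOUND = 343.0   # m/s

def estimate_angle(left, right):
    """Estimate direction of arrival from the inter-microphone delay.

    Cross-correlates the two channels to find the lag (in samples) at
    which they best align, converts that lag to a time delay, and maps
    the delay to an angle off the wearer's gaze direction using a
    far-field model: delay = spacing * sin(angle) / c.
    """
    corr = np.correlate(left, right, mode="full")
    lag = np.argmax(corr) - (len(right) - 1)            # samples
    delay = lag / SAMPLE_RATE                           # seconds
    sin_a = np.clip(delay * SPEED_OF_SOUND / MIC_SPACING, -1.0, 1.0)
    return np.degrees(np.arcsin(sin_a))

# Simulate a speaker directly in front: both mics hear the same
# waveform with zero relative delay, so the estimated angle is ~0 and
# falls inside the 16-degree enrollment cone described in the article.
rng = np.random.default_rng(0)
voice = rng.standard_normal(4800)        # 100 ms of noise-like "speech"
angle = estimate_angle(voice, voice)
in_cone = abs(angle) <= 16.0

# A source off to one side reaches one mic later; the resulting delay
# pushes the estimate outside the enrollment cone.
off_axis = estimate_angle(np.roll(voice, 10), voice)
outside_cone = abs(off_axis) > 16.0
```

In the real system this angular gating is only the enrollment step; the heavy lifting of learning and tracking the speaker’s vocal patterns is done by the neural network on the onboard computer.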

The video below shows how effective the headphones are. They quickly filter out environmental noise to focus on the speaker, removing the noise generated by a person speaking on their phone nearby (indoors) and a very noisy outdoor fountain.

AI headphones filter out noise so you hear one voice in a crowd

How fast can the AI process the speaker’s voice and remove unwanted sounds? When tested, the researchers found that their system had an end-to-end latency of 18.24 milliseconds. For comparison, an eye blink lasts between 300 and 400 milliseconds. That means there’s virtually no lag time between looking at someone you want to listen to and hearing only their voice in your headphones; it all happens in real time.

They gave their TSH system to 21 subjects, who rated the noise suppression the headphones provided in real-world indoor and outdoor environments. On average, subjects rated the clarity of the target speaker’s voice nearly twice as high as that of the unprocessed audio.

Their TSH system builds on ‘semantic hearing’ tech the UW researchers had previously developed. Like TSH, that technology used an AI algorithm running on a smartphone wirelessly connected to noise-cancelling headphones. The semantic hearing system could pinpoint noises like birdsong, sirens and alarms.

Currently, the new system can filter only one target speaker at a time, and only when no other loud voice is coming from the same direction as that speaker. If the headphone wearer isn’t happy with the sound quality, though, they can re-sample the speaker’s voice to improve clarity. The researchers are working on expanding their system to earbuds and hearing aids, and they’ve made their TSH code publicly available on GitHub so that others can build on it. The system is not commercially available.

The researchers presented their work earlier this month at the Association for Computing Machinery (ACM) CHI Conference on Human Factors in Computing Systems held in Honolulu, Hawai’i, where it received an Honorable Mention. The unpublished research paper is available here.

Source: UW

3 comments
BlueOak
This is pretty fascinating stuff. While my hearing - freshly in my 7th decade - is generally fine, I lose the ability to communicate when in a noisy room of people talking. From what I understand, this impaired ability to discriminate voices in a noisy room is fairly common as we age.

Now that the FDA in the US has loosened the standards for “hearing aids”, it seems “over the counter” reasonably priced earbuds that restore this ability should soon be available.
1stClassOPP
I look forward to seeing this technology integrated with current hearing aid technology. It’s frustratingly difficult to pay attention to someone speaking when several voices or noises interfere with your hearing capability.
Rob van Damme
I also would like to see this integrated into hearing aids. As an experienced wearer of hearing aids, I am very aware that a sound-cancelling headset is not really helpful in combination with hearing aids, because they will cause feedback.
So you have to switch them off and take them out before using the headset.

Also a sound cancelling headset is very tight.