Smart headphones use AI to follow conversations in noisy rooms

By following the rhythm of a conversation, the "proactive hearing assistant" headphones are able to determine which person the user is speaking to within a noisy environment, then isolate and boost that person's voice

Individuals with limited hearing struggle in situations where multiple people around them are speaking at once. New headphone tech could help, by boosting the voice of the person they're talking to based on the rhythm of the conversation.

Conventional hearing aids are typically stymied by the "cocktail party" problem: they can't amplify one person's voice without also boosting the voices of everyone else in the room. If you're a hearing aid user in a group of people who are all talking back and forth over one another, this can make for a very frustrating experience.

In recent years, scientists at the University of Washington have set out to address that problem by developing headphones that isolate the voice of whoever the wearer is looking at, and that create a "sound bubble" that tunes out voices more than a few feet away.

The researchers' latest innovation, however, doesn't require the user to be looking at their conversational partner, nor is it thwarted by other people who may be speaking within the sound bubble. It utilizes two AI systems, running on an off-the-shelf set of noise-cancelling headphones equipped with binaural microphones.

One of those systems initially sets the user's voice as an "anchor," then detects the voices of other people in the immediate area. It's soon able to determine which of those people the user is talking to, as there will be very little overlap between the speech of that person and the user – after all, they're taking turns speaking back and forth.
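To make that turn-taking cue concrete, here's a minimal Python sketch of the idea, assuming we already have per-frame voice-activity flags for the user and each nearby speaker. The function name, the simple overlap ratio and the toy data are all illustrative: the actual system infers the conversational partner with a trained neural model, not a hand-written heuristic like this.

import numpy as np

def likely_partner(user_vad, speaker_vads):
    # user_vad: 1-D array of 0/1 flags, one per audio frame (1 = user speaking).
    # speaker_vads: dict mapping a speaker label to a same-length 0/1 array.
    # In a turn-taking conversation, the partner rarely talks over the user,
    # so the speaker with the lowest overlap ratio is the best guess.
    scores = {}
    for name, vad in speaker_vads.items():
        overlap = np.logical_and(user_vad == 1, vad == 1).sum()
        scores[name] = overlap / max(vad.sum(), 1)
    return min(scores, key=scores.get)

# Toy example: "B" alternates cleanly with the user, "A" keeps talking over them.
user = np.array([1, 1, 0, 0, 1, 1, 0, 0])
print(likely_partner(user, {
    "A": np.array([1, 1, 0, 1, 1, 0, 0, 1]),  # heavy overlap with the user
    "B": np.array([0, 0, 1, 1, 0, 0, 1, 1]),  # clean turn-taking
}))  # prints "B"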

At that point, the other AI system takes over. It isolates the person's voice from the others and amplifies it, playing it back through the headphones for the user. There is a slight lag in playback, but it's reportedly minimal. In fact, the system can handle a conversation with up to four people (plus the user) at once.
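As a rough illustration of that second stage, the sketch below processes audio in short frames, applies a stand-in separation function to pull out the partner's voice, and remixes it louder than the residual background. Working frame by frame is what introduces the slight playback lag mentioned above. The separator here is a placeholder, not the study's neural network, and the audio is mono for simplicity rather than binaural; the frame size and gain values are likewise arbitrary choices for the demo.

import numpy as np

FRAME = 128  # samples per frame; bigger frames mean more lag but steadier output

def boost_partner(mix, separate, partner_gain=2.0, background_gain=0.25):
    # mix: 1-D float array in [-1, 1]; separate(frame) returns the partner-only
    # portion of that frame. Frames are handled one at a time, as a real-time
    # system would be, so the output always trails the input by at least one frame.
    out = np.zeros_like(mix)
    for start in range(0, len(mix) - FRAME + 1, FRAME):
        frame = mix[start:start + FRAME]
        partner = separate(frame)      # stand-in for the extraction model
        residual = frame - partner     # everything that isn't the partner
        out[start:start + FRAME] = np.clip(
            partner_gain * partner + background_gain * residual, -1.0, 1.0)
    return out

# Placeholder separator that just scales the frame; a real target-speaker
# extraction network would go here.
demo_mix = np.random.uniform(-0.5, 0.5, 1024)
enhanced = boost_partner(demo_mix, separate=lambda f: 0.5 * f)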

Although the technology is currently being demonstrated in a set of over-the-ear headphones, the scientists hope that it could ultimately be incorporated into earbuds or a hearing aid. It has so far been tested on English, Mandarin and Japanese dialog – its effectiveness on other languages has yet to be determined.

“Everything we’ve done previously requires the user to manually select a specific speaker or a distance within which to listen, which is not great for user experience,” said doctoral student Guilin Hu, lead author of the study. “What we’ve demonstrated is a technology that’s proactive – something that infers human intent non-invasively and automatically.”

A paper on the research, which was led by Prof. Shyam Gollakota, was recently presented at the Conference on Empirical Methods in Natural Language Processing in Suzhou, China. You can see and hear a demo of the technology in a video via the link below.

Source: University of Washington
