Eavesdropping AI detects the tone of conversations

By Rich Haridy

February 01, 2017

Researchers from MIT have developed an AI system that can detect the tone of a conversation

Jason Dorfman/MIT CSAIL

The system tracks a conversation in real-time across five-second intervals and evaluates whether the mood is happy, sad or neutral

MIT

PhD candidate Mohammad Ghassemi (left) and graduate student Tuka Alhanai have developed a system that can detect the tone of a conversation using a wearable device

Jason Dorfman/MIT CSAIL

MIT researchers have developed a system that uses a wearable device to detect whether the tone of a conversation is happy, sad or neutral. For people with Asperger's, or other conditions that make it difficult to read regular social cues, this could point to a future in which a digital social coach in the pocket helps relieve anxiety.

The prototype system uses a Samsung Simband to collect physiological data, such as movement, heart rate, blood pressure and skin temperature, in real time. The system also captures the audio of a given conversation to analyze the speaker's tone, pitch, energy and vocabulary. A neural network algorithm then processes the mood of the conversation across five-second intervals.
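To make the idea of five-second intervals concrete, here is a minimal Python sketch of how a stream of audio samples might be cut into fixed-length windows, with one simple feature (root-mean-square energy) computed per window. This is an illustration only, not the MIT team's code: the function names, the choice of RMS as the energy measure, and the handling of the final partial window are all assumptions.

```python
import math


def split_into_windows(samples, sample_rate, window_seconds=5.0):
    """Split a sequence of samples into consecutive fixed-length windows.

    Hypothetical helper: the last window may be shorter than the others
    if the recording does not divide evenly into five-second chunks.
    """
    size = int(sample_rate * window_seconds)
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def rms_energy(window):
    """Root-mean-square energy of one window (assumed stand-in for 'energy')."""
    if not window:
        return 0.0
    return math.sqrt(sum(x * x for x in window) / len(window))


def window_energies(samples, sample_rate, window_seconds=5.0):
    """Return one energy value per five-second window."""
    windows = split_into_windows(samples, sample_rate, window_seconds)
    return [rms_energy(w) for w in windows]
```

In a real pipeline each window would yield many features (pitch, vocabulary, heart rate and so on) that are fed to a trained classifier; the sketch covers only the windowing step and a single toy feature.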

The AI correlated long pauses and monotonous vocal patterns with sad stories, and energetic, varied tones with happy stories. Body language was also factored into the results, with movements such as raising a hand to the face or increased fidgeting being associated with sadder stories. The researchers claim the system determined the overall tone of a given conversation with an accuracy rate of 83 percent.
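The article does not say how the per-interval judgments are combined into an overall tone, so the following Python sketch simply assumes a majority vote over interval labels, and shows how an accuracy figure of this kind is computed. The function names and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def overall_tone(interval_labels):
    """Pick the most frequent label across a conversation's intervals.

    Majority voting is an assumption made for illustration; ties go to
    the label that appeared first.
    """
    if not interval_labels:
        raise ValueError("need at least one interval label")
    return Counter(interval_labels).most_common(1)[0][0]


def accuracy(predicted, actual):
    """Fraction of conversations whose predicted tone matches the true tone."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Toy example with made-up labels, not data from the study.
tone = overall_tone(["happy", "sad", "happy", "neutral"])
score = accuracy(["happy", "sad", "sad", "happy"],
                 ["happy", "sad", "happy", "happy"])
```

With these toy inputs, three of the four predicted tones match, so `score` is 0.75; the reported 83 percent is the same kind of ratio measured over the researchers' own test conversations.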

The most significant innovation the team developed was the ability for the AI to classify the emotional tone of a conversation in real time. As PhD candidate Mohammad Ghassemi explains, "This is the first experiment that collects both physical data and speech data in a passive but robust way, even while subjects are having natural, unstructured interactions."

The technology is still in its nascent stages, though, and any broader implementation is set to face some major hurdles, primarily in terms of privacy. For the device to be completely functional, it would need to analyze physical data from both sides of the conversation, and to record and monitor conversations in real time. It's currently illegal in many places to record a conversation without prior consent, so any real-life consumer application would need to navigate some tricky privacy issues.

The team still has plenty of development work planned, with the study's co-author, graduate student Tuka Alhanai, saying, "Our next step is to improve the algorithm's emotional granularity so that it is more accurate at calling out boring, tense, and excited moments, rather than just labeling interactions as 'positive' or 'negative'."

The researchers are set to present their study at the upcoming Association for the Advancement of Artificial Intelligence (AAAI) conference in San Francisco.

Take a look at how the system classifies the tone of a conversation in real-time in the video below.

Source: MIT