University of Cambridge debuts virtual talking head capable of expressing human emotions
The University of Cambridge in the United Kingdom has unveiled a virtual “talking head” that is capable of expressing a range of human emotions. The lifelike face, called Zoe, which the team believes is the most expressive controllable avatar ever created, could one day serve as a digital personal assistant.
The voice-controlled virtual assistant has been a staple of science fiction for decades. From Star Trek and Red Dwarf to Iron Man, it's up there with faster-than-light travel as one of the most mythologized technologies going. Though recent technologies like Apple's Siri and Samsung's S Voice have brought us a little closer to that dream by allowing us to have (fairly) lifelike conversations with our smartphones, there's still a palpable sense that we're just talking to lifeless machines.
The University of Cambridge has been working to tackle this exact problem, and while the team's “Zoe” digital talking head may not be completely convincing, it's certainly a step in the right direction.
The team has developed a virtual, controllable avatar that is capable of expressing a range of emotions with what the team believes to be “unprecedented realism.” The user enters a line of text and adjusts a set of sliders that determine the emotion. Hit the enter key and Zoe will read the message in whatever mix of happy, angry, or sad tones you desire.
To create the system, the team spent days recording the face and voice of actress Zoe Lister while she recited more than 7,000 lines of text in varying emotional states. This data was then used to create six basic emotions for Zoe: happiness, sadness, fear, anger, tenderness and neutrality, along with adjustable pitch, speed and depth settings. Combinations of these levels allow for a huge range of emotions, something that has not been possible in other avatars of this type.
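The slider model described above can be imagined as a small data structure: six emotion weights plus prosody settings, blended into a single delivery. The sketch below is purely illustrative; the class, parameter names and ranges are assumptions, not the Cambridge team's actual software.

```python
from dataclasses import dataclass, field

# Hypothetical model of the article's sliders: six basic emotion
# weights plus pitch/speed/depth prosody multipliers. All names
# and ranges are assumed for illustration only.
BASIC_EMOTIONS = ("happiness", "sadness", "fear", "anger",
                  "tenderness", "neutrality")

@dataclass
class ExpressionSettings:
    # Each slider is a weight in [0, 1]; weights are normalized on blend.
    emotions: dict = field(
        default_factory=lambda: {e: 0.0 for e in BASIC_EMOTIONS})
    pitch: float = 1.0   # relative multipliers, 1.0 = unchanged
    speed: float = 1.0
    depth: float = 1.0

    def blend(self):
        """Return normalized emotion weights; all-zero sliders fall
        back to a fully neutral delivery."""
        total = sum(self.emotions.values())
        if total == 0:
            return {e: (1.0 if e == "neutrality" else 0.0)
                    for e in BASIC_EMOTIONS}
        return {e: w / total for e, w in self.emotions.items()}

# Example: a mostly happy, slightly tender delivery
settings = ExpressionSettings()
settings.emotions["happiness"] = 0.8
settings.emotions["tenderness"] = 0.2
print(settings.blend())
```

Normalizing the weights is one simple way such a system could combine sliders so that, say, 80 percent happiness and 20 percent tenderness always sum to a complete expression.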
The resulting system was tested on volunteers who were able to recognize the emotion being conveyed with an impressive 77 percent success rate. Bizarrely, this was actually higher than the 73 percent recognition rate when the volunteers were shown the real-life Zoe Lister.
The program itself is impressively data-light, coming in at just ten megabytes in size. This means we could well see the technology in future mobile devices such as smartphones and tablets. The underlying template could also allow people to upload their own faces and voices in just a few seconds, rather than the days it took the team to create Zoe.
“It took us days to create Zoe, because we had to start from scratch and teach the system to understand language and expression,” Professor Roberto Cipolla from the University's Department of Engineering said. “Now that it already understands those things, it shouldn't be too hard to transfer the same blueprint to a different voice and face.”
The team is working to improve the realism of the technology while exploring real world applications, such as sending friends a digital “face message” that conveys the emotion you're feeling, or helping autistic and deaf children to understand emotions and learn to lip-read.
The impressive (and occasionally terrifying) Zoe can be seen in action below.
Source: University of Cambridge
For me the head movement is the next big bump to smooth over.
I find it too heavily mechanically derived from every individual syllable.
Individual syllables should contribute only a secondary element to the modulation of the head movement – the primary head movement modulation needs to come more from the points of emphasis in PHRASES, perhaps peaking on the adjectives, or the verbs when there are no adjectives.
The average position that the head returns to must never be locked on dead centre. Mood should also modulate the average position - and the intensity of the modulation.
I appreciate the achievement and realize that these are early days yet.
Anybody else agree with me?
Easy to criticise when there is something there - hats off to you guys - keep going!