Science

New software translates users' speech, using their own voice

New software developed by Microsoft is able to reproduce the user's speech in another language, using their own voice (Image via Shutterstock)

For some time now, speech-recognition programs have existed that attempt to reproduce the user's spoken words in another language. Such "speech-to-speech" apps, however, deliver their translations in a very flat, synthetic voice. Now, experimental new software developed by Microsoft not only translates between 26 different languages, but also plays the translated speech back in the user's own voice, complete with the inflections they used when speaking in their own language. It looks like a real-life version of Star Trek's universal translator could soon be here.

The system was demonstrated this Tuesday at Microsoft's Redmond, Washington, campus, by its inventor, Microsoft research scientist Frank Soong. He started by using the software to read out Spanish text using the voice of his boss, Rick Rashid, and then proceeded to use it to allow the company's chief research and strategy officer, Craig Mundie, to converse in Mandarin.

For now, the program can't be used straight after installation. Users must first spend about an hour with it, training it to recognize and reproduce their voice. Once that's been accomplished, the software applies that user-specific speech model to a generic text-to-speech model for the desired output language. Individual sounds of the user's voice are selected from the training session, then strung together and appropriately altered, in order to create a natural-sounding translation.
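To make the "select sounds, then string them together" idea concrete, here is a toy sketch in Python. It is not Microsoft's code and makes no claim about how the actual system is built. The `Unit` record, the `select_unit` and `synthesize` helpers, the pitch-matching rule and all of the sample data are assumptions invented for illustration, and the step in which sounds are "appropriately altered" is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """One snippet of the user's voice from a (hypothetical) training session."""
    phoneme: str      # label of the sound, e.g. "o"
    pitch_hz: float   # average pitch of the recorded snippet
    samples: tuple    # stand-in for the real audio samples


def select_unit(inventory, phoneme, target_pitch_hz):
    """Pick the recorded unit for `phoneme` whose pitch is closest to the target."""
    candidates = [u for u in inventory if u.phoneme == phoneme]
    if not candidates:
        raise KeyError(f"no recorded unit for phoneme {phoneme!r}")
    return min(candidates, key=lambda u: abs(u.pitch_hz - target_pitch_hz))


def synthesize(inventory, targets):
    """Concatenate the best-matching unit for each (phoneme, pitch) target."""
    audio = []
    for phoneme, pitch in targets:
        audio.extend(select_unit(inventory, phoneme, pitch).samples)
    return audio


# Made-up inventory: two recordings of "o" at different pitches, one each of "l" and "a".
inventory = [
    Unit("o", 110.0, (1, 2)),
    Unit("o", 180.0, (3, 4)),
    Unit("l", 120.0, (5,)),
    Unit("a", 130.0, (6, 7)),
]

# Made-up targets, standing in for what a generic output-language model might request.
targets = [("o", 170.0), ("l", 125.0), ("a", 128.0)]
audio = synthesize(inventory, targets)
```

In this sketch the higher-pitched recording of "o" is chosen because it lies closer to the requested pitch, which hints at how the speaker's original inflections might carry over into the translated speech.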

It's been suggested that such a system would make users more confident that their speech was being translated accurately, and that fewer misunderstandings would occur due to a lack of context - in other words, it would be more obvious if the speaker was being sarcastic, or exaggerating. It could also help facilitate the learning of foreign languages, as students may find it easier to imitate phrases spoken in their own voice.

Examples of a phrase spoken in different languages via the system can be heard in the link below.

Via: Technology Review

6 comments
zekegri
FINALLY! I have been saying this for many years-WOW someone finally is getting it!
Bravo
Renārs Grebežs
Damnit, it still is primitive.. :/
jonoxn
I don't really sound like that do I!? I hope my voice comes out better than it usually sounds on tape.
Chi Sup
color me utterly disappointed by the dismal demo
electric38
Good timing! Now that many countries are offering several levels of multimedia education online, they can do so in many languages at once. No reason why anyone in the world with a little motivation can't learn the skills that are needed for tomorrow's challenges.
Steven Hawes
Haven't programs like this already been around for a decade or so such as Dragon Reader/NaturallySpeaking?