This video was sent to me by a couple of people today, and shows Rick Rashid of Microsoft Research demonstrating their advances in the field of natural user interface technology and speech recognition.
Here’s the video on YouTube:
Rick first explains the difficulties in voice recognition, while at the same time a transcript of the talk is shown on a screen above – a live demonstration of the technology. The recognition isn’t perfect, and the transcription on the screen has some errors, but not many.
The highlight of the talk comes when the translation aspect is introduced, and the words Rick is saying are not only translated into Chinese on the screen, but are also spoken using a synthesised voice that is based on samples of Rick’s own voice. The idea is that it sounds as if Rick himself were speaking Chinese. It still sounds robotic, but it’s amazing that this is the kind of technology we can expect in the coming years.
Anyway, after tweeting the video out, @marcardar (of Hanping fame) reminded me that Google actually has a similar technology built into its Translate service. Just choose which languages you want to translate, enter some text, then press the speaker icon at the bottom of the translated text box. The outcome isn’t always what you’d want, though, as can be seen below – the translation comes out as “Hackers from China”:
There’s also an interesting feature built into the Google Translate app for Android. It’s called Conversation, and allows two people to have a conversation in different languages, using the phone as an intermediary.
To access the feature, open the app on your phone, then from the drop-down menu at the top choose “Conversation”, and also select the languages you will be translating between:
Next, pass the phone to the person who will speak first and have them tap the button at the bottom that corresponds to their language. They will be prompted to speak, after which the phone will display what it thinks they said in a text box. The text can then be edited using the keyboard if needed.
After pressing Enter, a speech bubble will be displayed with both the original text and the translated text. Just press the speech bubble to hear the translation spoken aloud by the phone. Then pass the phone to the next person, who can follow the same process to respond.
I must admit that I’ve never actually used this feature to have a conversation with someone in a language I couldn’t speak, so it’s difficult for me to accurately rate its success. Still, it’s a great feature to show off, and it was fun to play with, messing about in Chinese and English. If you have a success or failure story about using this feature of Google Translate, I’d love to hear it in the comments.