Interpreting and translating live speech is much trickier than processing written text. Unlike the human brain, a machine typically has to go through three separate phases to convert speech from one language to another. First, the speech is recognized and transcribed into text; that text is then translated into the target language; finally, the translation is fed into a text-to-speech engine to be spoken aloud. Although this cascaded process is transparent fo
Source: Google introduces ‘Translatotron’, a direct speech-to-speech translation technology
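
To make the three phases of the cascaded approach concrete, here is a minimal Python sketch. It is not Google's implementation or any real speech API: the class `CascadedTranslator` and the three `toy_*` functions are hypothetical stand-ins, and audio is represented as plain UTF-8 bytes so that the example runs without any speech models.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures (assumptions for illustration only).
Recognizer = Callable[[bytes], str]    # speech -> source-language text
Translator = Callable[[str], str]      # source text -> target-language text
Synthesizer = Callable[[str], bytes]   # target text -> speech


@dataclass
class CascadedTranslator:
    """Chains the three phases: recognition, translation, synthesis."""
    recognize: Recognizer
    translate: Translator
    synthesize: Synthesizer

    def __call__(self, source_audio: bytes) -> bytes:
        source_text = self.recognize(source_audio)   # phase 1: speech to text
        target_text = self.translate(source_text)    # phase 2: text to text
        return self.synthesize(target_text)          # phase 3: text to speech


# Toy stand-ins: "audio" here is just UTF-8 encoded text.
def toy_recognize(audio: bytes) -> str:
    return audio.decode("utf-8")


TOY_LEXICON = {"hola": "hello", "mundo": "world"}


def toy_translate(text: str) -> str:
    # Word-by-word lookup; unknown words pass through unchanged.
    return " ".join(TOY_LEXICON.get(word, word) for word in text.lower().split())


def toy_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = CascadedTranslator(toy_recognize, toy_translate, toy_synthesize)
result = pipeline(b"Hola mundo")
print(result)
```

Because each phase consumes the previous phase's output, a mistake made during recognition is passed on to translation and synthesis. A direct speech-to-speech model would replace the three calls with a single mapping from source audio to target audio.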