Deblackboxing Deep Learning Machine Translation

Major AI players in the domain, such as Google and Facebook, are moving toward deep learning. Deep learning approaches are especially promising because they learn from data and are not tied to any specific task. Deep learning architectures such as deep neural networks have been applied to many fields, including natural language processing.

A deep learning machine translation system is composed of an “encoder” and a “decoder”. The encoder converts the input sequence into a vector, a representation the computer can work with, and the decoder converts that vector back into a sequence, the output that users can read. The encoder and decoder are typically implemented as long short-term memory recurrent neural networks (LSTM-RNNs). The advantage of an LSTM-RNN is that it handles long sentences well: it deals with the complexity of a sentence by associating each word with the other words in the input. The main characteristic of the deep learning translation approach is that it rests on the hypothesis that words appearing in similar contexts tend to have similar meanings. The system thus tries to identify and group words appearing in similar translational contexts in what are called “word embeddings” (Poibeau, 2017). In other words, this approach represents a word through its context, which enhances the accuracy of translation.
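The word-embedding idea above can be sketched in a few lines. The vectors below are hand-made toy values (not learned, and not from the source) chosen only to illustrate the principle: words that occur in similar contexts end up with nearby vectors, which we can measure with cosine similarity.

```python
import numpy as np

# Toy, hand-made 4-dimensional "word embeddings" (hypothetical values;
# real systems learn vectors with hundreds of dimensions from data).
embeddings = {
    "cat": np.array([0.9, 0.8, 0.1, 0.0]),
    "dog": np.array([0.8, 0.9, 0.2, 0.0]),  # appears in contexts like "cat"
    "car": np.array([0.1, 0.0, 0.9, 0.8]),  # appears in very different contexts
}

def cosine(u, v):
    """Cosine similarity: close to 1.0 means similar direction (similar contexts)."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(embeddings["cat"], embeddings["dog"]))  # high: similar contexts
print(cosine(embeddings["cat"], embeddings["car"]))  # low: different contexts
```

A real embedding table is learned during training, but the geometry is the same: the translation system can treat "cat" and "dog" similarly because their vectors are close.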

I would like to use English-Chinese translation as an example. Translating Chinese can often be tricky because it uses a logographic writing system with very different grammar rules, and one character can have several different meanings and pronunciations. To address this, Google's neural machine translation system relies on eight-layer LSTM-RNNs to produce translations that better reflect the context.
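The "eight-layer" idea is simply stacking: each recurrent layer's output sequence becomes the next layer's input sequence, and the final hidden state is the vector summary that the decoder starts from. The sketch below uses plain tanh recurrence with random, untrained weights (a simplification of the LSTM cells Google actually uses; all sizes and values are toy assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8          # toy hidden size
NUM_LAYERS = 8   # Google's NMT stacks eight LSTM layers; we stack plain RNN cells

# One (input-weight, recurrent-weight) pair per layer; untrained, illustration only.
layers = [(rng.normal(size=(DIM, DIM)) * 0.1,
           rng.normal(size=(DIM, DIM)) * 0.1) for _ in range(NUM_LAYERS)]

def encode(word_vectors):
    """Run a sequence of word vectors through the stacked recurrent layers."""
    seq = word_vectors
    for W_in, W_rec in layers:
        h = np.zeros(DIM)
        out = []
        for x in seq:
            h = np.tanh(W_in @ x + W_rec @ h)  # simple recurrent update
            out.append(h)
        seq = out          # this layer's outputs feed the next layer
    return seq[-1]         # final hidden state: the sentence's vector summary

sentence = [rng.normal(size=DIM) for _ in range(5)]  # 5 toy "word" vectors
summary = encode(sentence)
print(summary.shape)  # an 8-dimensional sentence vector
```

Stacking layers lets later layers build on the patterns detected by earlier ones, which is what helps with context-dependent characters.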

Also, there is an interesting term called the “attention mechanism” that is used in deep learning machine translation. By training on large amounts of data, the model learns to decide which parts of the input to focus on while generating the translation. This mechanism helps to align the input with the output; in other words, it helps to make sure that the output captures the main meaning of the input. However, not every language follows the same word order as English, and this raises the difficulty of alignment. English-Japanese translation is a good example here, since Japanese has a quite different syntactic system.
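At its core, attention is a weighted average: the decoder scores each encoder state against its current state, turns the scores into weights with a softmax, and blends the encoder states accordingly. The sketch below shows dot-product attention with random toy vectors (shapes and values are assumptions for illustration, not Google's actual configuration).

```python
import numpy as np

def attention(query, encoder_states):
    """Dot-product attention: score each encoder state against the decoder's
    query, softmax the scores into weights, return the weighted blend."""
    scores = encoder_states @ query           # one score per input word
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                  # softmax: weights sum to 1
    context = weights @ encoder_states        # blend of the input states
    return weights, context

rng = np.random.default_rng(1)
states = rng.normal(size=(4, 6))  # encoder states for 4 input words (toy 6-dim)
query = rng.normal(size=6)        # decoder state while producing one output word
weights, context = attention(query, states)
print(weights)  # the largest weight marks the input word being "attended to"
```

For a word-order-divergent pair like English-Japanese, these weights are what let the decoder look back at an input word far from the current output position.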

Deep learning is a good tool for translation, but it is not perfect. It can still make significant errors that a human translator would never make, like mistranslating proper names or rare terms, and translating sentences in isolation rather than considering the context of the paragraph or page. So there is still a long way to go.



Ethem Alpaydin, Machine Learning: The New AI (Cambridge, MA: The MIT Press, 2016).

Thierry Poibeau, Machine Translation (Cambridge, MA: MIT Press, 2017).

How Google Translate Works: The Machine Learning Algorithm Explained (Code Emporium). Video.