Natural Language Processing + Google Translate

Language translation is more complex than simple word-for-word replacement. As seen in the readings and videos for this module, translating text into another language requires more context than a dictionary can provide. This “context” in language is known as grammar. Because computers do not understand grammar, they need a process for deconstructing sentences and reconstructing them in another language in a way that makes sense. Words can have several different meanings, and they also depend on their position within a sentence to make sense. Natural Language Processing (NLP) addresses this problem of complexity and ambiguity in language translation. The PBS Crash Course video breaks down how computers use NLP methods.

Deconstructing sentences into smaller pieces that can be easily processed:

  • In order for computers to deconstruct sentences, grammar is necessary
  • Phrase structure rules were developed to encapsulate the grammar of a language

Using these phrase structure rules, computers can construct parse trees

Image retrieved from: https://www.youtube.com/watch?v=fOvTtapxa9c

Parse Trees: link every word with a likely part of speech and show how the sentence is constructed

  • This helps computers process information more easily and accurately
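To make this concrete, here is a minimal sketch in Python using the NLTK library. The tiny grammar and example sentence are made up for illustration; they are not from the video.

```python
import nltk

# A toy set of phrase structure rules (a context-free grammar).
# S = sentence, NP = noun phrase, VP = verb phrase, Det = determiner.
grammar = nltk.CFG.fromstring("""
    S   -> NP VP
    NP  -> Det N
    VP  -> V NP
    Det -> 'the' | 'a'
    N   -> 'mongoose' | 'snake'
    V   -> 'eats'
""")

parser = nltk.ChartParser(grammar)
sentence = "the mongoose eats a snake".split()

# Each parse links every word to a likely part of speech and shows
# how the phrases nest together (the parse tree).
for tree in parser.parse(sentence):
    tree.pretty_print()
```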

The PBS video also explains that this is how Siri is able to deconstruct simple voice commands. Additionally, the speech recognition apps with the best accuracy use deep neural networks.

Looking at how Google Translate’s neural network works, the Code Emporium video describes a neural network as a problem solver. In the case of Google Translate, the neural network’s job, or problem to solve, is to take an English sentence (input) and turn it into a French translation (output).

As we learned from the data structures module, computers do not process information the way our brains do. They process information using numbers (vectors). So, the first step will always be to convert the language into this computer-friendly form. For this particular task, a recurrent neural network is used (a neural network designed for sequential data such as sentences).
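As a minimal sketch of this conversion step (the vocabulary and random vectors below are invented for illustration, not Google Translate’s actual values):

```python
import numpy as np

# Map each word to an index, then each index to a vector of numbers.
vocab = {"i": 0, "love": 1, "cats": 2}
sentence = "i love cats".split()
indices = [vocab[word] for word in sentence]   # [0, 1, 2]

# In a real system these vectors are learned; here they are random.
embeddings = np.random.rand(len(vocab), 4)     # one 4-number vector per word
vectors = embeddings[indices]                  # the sentence in "computer language"
print(vectors.shape)                           # (3, 4)
```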

Step 1. Take the English sentence and convert it into computer language (a vector) using a recurrent neural network

Step 2. Convert the vector into a French sentence (using another recurrent neural network)
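A hedged sketch of this two-step encoder-decoder setup in PyTorch (the sizes are arbitrary, and real systems use LSTM or GRU variants with many refinements):

```python
import torch
import torch.nn as nn

VOCAB_EN, VOCAB_FR, HIDDEN = 10_000, 12_000, 256   # arbitrary sizes

class Encoder(nn.Module):
    """Step 1: English word indices -> a single sentence vector."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_EN, HIDDEN)
        self.rnn = nn.GRU(HIDDEN, HIDDEN, batch_first=True)

    def forward(self, tokens):                  # tokens: (batch, seq_len)
        _, state = self.rnn(self.embed(tokens))
        return state                            # the sentence as a vector

class Decoder(nn.Module):
    """Step 2: sentence vector -> French words, one step at a time."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_FR, HIDDEN)
        self.rnn = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.out = nn.Linear(HIDDEN, VOCAB_FR)

    def forward(self, prev_tokens, state):
        output, state = self.rnn(self.embed(prev_tokens), state)
        return self.out(output), state          # scores over French words
```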

Image retrieved from: https://www.youtube.com/watch?v=AIpXjFwVdIE

According to research from a 2014 paper on neural machine translation, the encoder-decoder architecture pictured above works best for medium-length sentences of 15-20 words (Cho et al.). The Code Emporium video tested the LSTM-RNN encoder method on longer sentences and found that the translations did not work as well. This is due to the lack of complexity in this method. Recurrent neural networks use past information to generate the present information. The video gives the example:

“While generating the 10th word of the French sentence it looks at the first nine words in the English sentence.” The recurrent neural network is only looking at the past words, not at the words that come after the current word. In language, both the words that come before and the words that come after are important to the construction of a sentence. A bidirectional neural network is able to do just this: look at the words on both sides.

Image retrieved from: https://www.youtube.com/watch?v=AIpXjFwVdIE

Bidirectional neural network (looks at the words that come before and after each word) vs. unidirectional neural network (only looks at the words that come before it)
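The difference is easy to see in code. A minimal PyTorch sketch, with arbitrary sizes and random input standing in for an embedded nine-word sentence:

```python
import torch
import torch.nn as nn

HIDDEN = 256
embedded = torch.randn(1, 9, HIDDEN)   # a batch of one nine-word sentence

# Unidirectional: each output depends only on the words before it.
uni = nn.GRU(HIDDEN, HIDDEN, batch_first=True)

# Bidirectional: a second RNN reads right-to-left, so each output
# combines context from before AND after the current word.
bi = nn.GRU(HIDDEN, HIDDEN, batch_first=True, bidirectional=True)

out_uni, _ = uni(embedded)
out_bi, _ = bi(embedded)
print(out_uni.shape)   # torch.Size([1, 9, 256])
print(out_bi.shape)    # torch.Size([1, 9, 512]): forward + backward halves
```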

Using the bidirectional model – which words (in the original source) should be focused on when generating the translation?

Now, the translator needs to learn how to align the input and output. This alignment is learned by an additional unit called an attention mechanism, which determines which French words will be generated from which English words.
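A simplified sketch of one attention step. This uses plain dot-product scoring; the mechanism in the actual papers and in Google’s system is more elaborate, and the shapes here are invented:

```python
import torch
import torch.nn.functional as F

encoder_states = torch.randn(9, 512)   # one vector per English word
decoder_state = torch.randn(512)       # state while generating one French word

# Score how relevant each English word is to the French word being
# generated, then turn the scores into alignment weights that sum to 1.
scores = encoder_states @ decoder_state        # shape: (9,)
weights = F.softmax(scores, dim=0)

# The weighted mix of English word vectors is what the decoder "focuses" on.
context = weights @ encoder_states             # shape: (512,)
print(weights)
```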

This is the same process that Google Translate uses – on a larger scale.

Google Translate Process & Architecture / Layer Breakdown

Image retrieved from video: https://www.youtube.com/watch?v=AIpXjFwVdIE

The English sentence is given to the encoder, which converts the sentence into vectors (each word is assigned a set of numbers). An attention mechanism is then used to determine which English words to focus on as each French word is generated. Finally, the decoder produces the French translation one word at a time, focusing on the words selected by the attention mechanism.
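Putting the pieces together, here is a hedged sketch of that overall loop. The encode, attend, and decode_step callables are hypothetical stand-ins for the components sketched earlier, not Google’s actual API:

```python
from typing import Callable, List

def translate(english_ids: List[int],
              encode: Callable,       # hypothetical: sentence -> one state per word
              attend: Callable,       # hypothetical: picks English words to focus on
              decode_step: Callable,  # hypothetical: emits the next French word id
              start_id: int,
              end_id: int,
              max_len: int = 50) -> List[int]:
    """Greedy decoding loop matching the pipeline described above."""
    states = encode(english_ids)            # encoder: words -> vectors
    prev, dec_state = start_id, states[-1]
    french: List[int] = []
    for _ in range(max_len):
        context = attend(states, dec_state)                       # attention
        prev, dec_state = decode_step(prev, context, dec_state)   # decoder
        if prev == end_id:                  # stop at end-of-sentence
            break
        french.append(prev)
    return french
```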

Works Cited

CrashCourse. Data Structures: Crash Course Computer Science #14. YouTube, https://www.youtube.com/watch?v=DuDz6B4cqVc.
CrashCourse. Machine Learning & Artificial Intelligence: Crash Course Computer Science #34. YouTube, https://www.youtube.com/watch?v=z-EtmaFJieY.
CrashCourse. Natural Language Processing: Crash Course Computer Science #36. YouTube, https://www.youtube.com/watch?v=fOvTtapxa9c.
CS Dojo Community. How Google Translate Works – The Machine Learning Algorithm Explained! YouTube, https://www.youtube.com/watch?v=AIpXjFwVdIE.
Poibeau, Thierry. Machine Translation. Cambridge, MA: MIT Press, 2017. Selections.