“Challenges in natural language processing frequently involve speech recognition, natural language understanding, and natural-language generation” (Wikipedia). I started this week's readings with the question about pattern recognition that I proposed last week: how could a computer tell the difference between Japanese characters and Chinese characters in Machine Translation (MT) without their character codes? By looking only at the de-blackboxing of Optical Character Recognition (OCR) and Convolutional Neural Networks (ConvNets), I had ignored the fundamental rules of natural language that make language understanding possible. Natural languages have two components: “tokens (smallest units of language) and grammar (defines the ordering of tokens)” (CS Dojo Community, 2019). Once OCR has recognized the characters, the computer could readily tell Japanese text from Chinese text, since the two languages have distinct grammars, that is, distinct Phrase Structure Rules. Consider the same sentence, “I’ll go to the school,” in the two languages:
Although both sentences use the same token for “school” (学校), they follow different grammars. In Chinese, the word order is subject → predicate → object. In Japanese, the word order is subject → object → predicate, and the first-person subject (“I”) is habitually omitted. In other words, in the parse trees of the two languages, the verb appears at the end of the sentence in Japanese, while in Chinese it appears in the middle. Because MT has to reach coherent syntax and semantics, the computer can recognize the two different languages in a paragraph with little effort.
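To make the word-order contrast concrete, here is a minimal sketch in Python. It assumes the sentence has already been parsed into role labels (subject, predicate, object); the function name `word_order` and the two example role lists are hypothetical and only mirror the orders described above. A real MT system would learn such patterns from data rather than apply a hand-written rule.

```python
def word_order(roles):
    """Classify a sequence of role labels as 'SVO', 'SOV', or 'unknown'.

    `roles` is a hypothetical, already-parsed list such as
    ["subject", "predicate", "object"]. The subject may be missing.
    """
    if "predicate" not in roles or "object" not in roles:
        return "unknown"
    # Predicate before object: Chinese-style order. Otherwise: Japanese-style.
    if roles.index("predicate") < roles.index("object"):
        return "SVO"
    return "SOV"


# Role sequences matching the orders described in the text (assumed, not parsed).
chinese_like = ["subject", "predicate", "object"]
japanese_like = ["object", "predicate"]  # first-person subject omitted

print(word_order(chinese_like))
print(word_order(japanese_like))
```

The sketch only looks at where the predicate sits relative to the object, which is the one difference between the two orders discussed here.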
In the process of natural language understanding and natural language generation, an Encoder-Decoder Architecture with Recurrent Neural Networks (RNNs) is used (CS Dojo Community, 2019). The computer first converts a source-language sentence into a vector with a recurrent neural network (the encoder). Then, to convert this vector into a target-language sentence, a second neural network (the decoder) is introduced. The important point is that, to perform well, these algorithms need a huge amount of data to train on.
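The encoder-decoder idea can be sketched in a few lines of plain Python. This is only an illustration under strong assumptions: the weights are random and untrained, tokens are made-up integer ids, and the sizes (`HIDDEN`, `SRC_VOCAB`, `TGT_VOCAB`) and all function names are hypothetical. It shows the data flow (sentence → vector → sentence), not how Google Translate is actually implemented, and its output is meaningless until the weights are trained on a large dataset.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weight matrix; a trained model would learn these values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def rnn_step(w_in, w_hid, x, h):
    """One recurrent step: new hidden state from the input and the old state."""
    a = matvec(w_in, x)
    b = matvec(w_hid, h)
    return [math.tanh(p + q) for p, q in zip(a, b)]


def encode(token_ids, vocab_size, w_in, w_hid, hidden_size):
    """Encoder: read the source sentence and return one fixed-size vector."""
    h = [0.0] * hidden_size
    for token in token_ids:
        h = rnn_step(w_in, w_hid, one_hot(token, vocab_size), h)
    return h


def decode(context, w_in, w_hid, w_out, vocab_size, max_len, start_id=0, end_id=1):
    """Decoder: start from the encoder's vector and emit target token ids greedily."""
    h, token, output = context, start_id, []
    for _ in range(max_len):
        h = rnn_step(w_in, w_hid, one_hot(token, vocab_size), h)
        scores = matvec(w_out, h)
        token = max(range(vocab_size), key=lambda i: scores[i])
        if token == end_id:
            break
        output.append(token)
    return output


rng = random.Random(0)
HIDDEN, SRC_VOCAB, TGT_VOCAB = 8, 5, 6  # toy sizes, chosen arbitrarily
enc_in, enc_hid = make_matrix(HIDDEN, SRC_VOCAB, rng), make_matrix(HIDDEN, HIDDEN, rng)
dec_in, dec_hid = make_matrix(HIDDEN, TGT_VOCAB, rng), make_matrix(HIDDEN, HIDDEN, rng)
dec_out = make_matrix(TGT_VOCAB, HIDDEN, rng)

context = encode([2, 3, 4], SRC_VOCAB, enc_in, enc_hid, HIDDEN)
translation = decode(context, dec_in, dec_hid, dec_out, TGT_VOCAB, max_len=5)
print(len(context), translation)
```

The key design point is that the whole source sentence is squeezed into the single vector `context`, which is all the decoder sees.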
Moreover, we learned in previous sections how speech recognition works: the computer can encode and decode sound waves by assigning values to the acoustic signal along the vertical axis (according to its amplitude). To make this easier to see, we can use spectrograms to visualize the signal. As each language has a limited set of sounds, a model can be trained to recognize them, again using deep learning (DL) algorithms (CrashCourse #36).
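As a rough illustration of how sampled amplitudes become a spectrogram, the sketch below builds a test tone, cuts it into short frames, and measures how strong each frequency is in each frame. Everything here is an assumption made for the example: the 1000 Hz tone, the 8000 Hz sample rate, the frame size, and the function names. Real systems use a fast Fourier transform library and overlapping, windowed frames rather than this slow plain-Python transform.

```python
import cmath
import math


def sine_wave(freq_hz, sample_rate, n_samples):
    """Amplitude values of a pure tone, one value per sample."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(n_samples)]


def dft_magnitudes(frame):
    """Strength of each frequency bin in one frame (naive discrete Fourier transform)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(total))
    return mags


def spectrogram(samples, frame_size, hop):
    """List of frames over time; each frame is a list of frequency magnitudes."""
    return [
        dft_magnitudes(samples[start:start + frame_size])
        for start in range(0, len(samples) - frame_size + 1, hop)
    ]


SAMPLE_RATE = 8000  # samples per second (assumed)
FRAME = 64          # samples per frame (assumed)

signal = sine_wave(1000, SAMPLE_RATE, 256)
spec = spectrogram(signal, FRAME, hop=32)

# Find the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME
print(len(spec), "frames; strongest frequency:", peak_hz, "Hz")
```

Each row of `spec` is one moment in time and each column is one frequency, which is the grid a spectrogram image displays with brightness standing for magnitude.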
According to Professor Liang, “there’s probably a qualitative gap between the way that humans understand language and perceive the world and our current models.” Could this gap be closed with more advanced techniques alone? Or do we still need a new linguistic theory to systematize human language? Is the current mainstream linguistic theory enough for NLP?
CrashCourse. 2017. Natural Language Processing: Crash Course Computer Science #36. https://www.youtube.com/watch?v=fOvTtapxa9c.
CrashCourse. 2017. Machine Learning & Artificial Intelligence: Crash Course Computer Science #34. https://www.youtube.com/watch?v=oi0JXuL19TA.
CS Dojo Community. 2019. How Google Translate Works – The Machine Learning Algorithm Explained! https://www.youtube.com/watch?v=AIpXjFwVdIE.
“Machine Translation.” 2021. In Wikipedia. https://en.wikipedia.org/w/index.php?title=Machine_translation&oldid=999926842.
“Natural Language Processing.” 2021. In Wikipedia. https://en.wikipedia.org/w/index.php?title=Natural_language_processing&oldid=1009043213.
“The Technology behind OpenAI’s Fiction-Writing, Fake-News-Spewing AI, Explained.” n.d. MIT Technology Review. Accessed March 6, 2021. https://www.technologyreview.com/2019/02/16/66080/ai-natural-language-processing-explained/.