
AI Algorithm Interpretation: Possible or Not?

Jillian Wu

Abstract:
How to interpret algorithms is a growing problem that involves governments, organizations, and individuals. Motivated by the GDPR, this article discusses whether all algorithms are explainable and understandable. It introduces different methods for interpreting AI programs and examines the black box that remains unexplained in certain algorithms.

Introduction 

In recent years, AI has developed extremely quickly. Combined with advances in Big Data (BD), Computer Vision (CV), Machine Learning (ML), and Deep Learning (DL), AI algorithms have made machines highly capable: they can drive cars, detect faces, and interact with humans. However, this rapid growth has also brought many problems. Around the world there are cases in which algorithms are tied to financial fraud, racism, and privacy violations. In China, it has been reported that users get a cheaper deal when they order rides online from Android phones rather than Apple ones (Wu, 2021). A widely used health-care algorithm gave dark-skinned patients less care than white patients because it scored them as having lower health risks (Ledford, 2021). A self-driving Uber crashed into and killed a pedestrian (Gonzales, 2019). Such algorithms harm both individuals and society. People cannot see the principles by which these programs work, and the terminology of the AI domain is complicated and chaotic, so people begin to fear AI and place it in an overstated position. The reasons are complex: big companies may intend to maintain their monopolies, media agencies may try to profit from the hype, and people hold many fantasies about the future of artificial intelligence (though current AI is still weak AI, far from the strong AI portrayed in movies). How can these potential threats be addressed? Governments can help. The European Union acted first on these issues: it put the General Data Protection Regulation (GDPR) into effect in May 2018 and has since proposed seven principles to regulate future AI, including a right to explanation of automated decisions (Vincent, 2019).

This article focuses on the explanation of automated decisions. With this regulation, people could interpret how AI algorithms work and why the data fed to AI systems perpetuate biases and discrimination. If a decision is explainable, it becomes easier to assess an algorithm's pros, cons, and risks, so that people can decide to what extent and on what occasions the algorithm can be trusted. Practitioners will also know which aspects they should work on to improve their algorithms.

What is Interpretability?

There is no universal definition of interpretability. According to Miller, "interpretability is the degree to which a human can understand the cause of a decision" (Miller, 2018). Kim et al. similarly state that "interpretability is the degree to which a human can consistently predict the model's result" (Kim et al., 2016).

People used to instinctively absorb data from the real world and process them with their brains. Through those activities, humans could easily understand the world and make decisions.

However, as humanity has developed, the demand for data has increased significantly: data have become massive. Tools to collect and handle them followed, and humans got computers. The process of interpretation has therefore changed. People first digitize the data and then use algorithms (the black box) to process them. Even though human beings remain the final destination (Molnar, 2019), people are eager to know what happens inside the black box. What is its logic for making decisions? (Humans can explain the logic behind their own decisions.) They would prefer all algorithms to be interpretable. Nevertheless, for now, people cannot explain all the black boxes. Many models and applications are uninterpretable because of an insufficient understanding of the tasks and targets; the more the modeler learns about the mission of an algorithm, the better the algorithm will perform.

How to interpret Algorithms?

Since algorithm interpretability is a relatively new topic, researchers have built different systems and standards for assessing it. This article introduces two classifications. The first, from Kabul, divides interpretation into three stages and is more approachable for people who are not experts in AI and ML.

 1) Pre-modeling

"Understanding your data set is very important before you start building models" (Kabul, 2017). Interpretability before modeling mainly involves data preprocessing and data cleansing. Machine learning is designed to discover knowledge and rules from data, and an algorithm will not work well if the modeler knows little about those data. The key to interpretation before modeling is to understand the data's distribution characteristics comprehensively, which helps the modeler anticipate potential problems and choose the most reasonable model or approach for the best possible solution. Data visualization is an effective pre-modeling interpretability method (Kabul, 2018). Some may regard visualization as the last step of data mining, used to present analysis results, but when a programmer starts an algorithm project, the data come first. It is necessary to establish a sound understanding of the data through visualization, especially when the data volume is large or the data dimension is wide. With visualization, the programmer fully understands the data, which is highly instrumental for the coding that follows.
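
As a minimal sketch of this pre-modeling step (the file name "patients.csv" and its columns are hypothetical placeholders, not from Kabul's post), a few lines of pandas and matplotlib are enough to expose shapes, missing values, and distributions before any model is chosen:

```python
# A minimal pre-modeling exploration sketch; the CSV file and its columns
# are hypothetical placeholders, not data from the article.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("patients.csv")           # load the raw data set

print(df.shape)                            # how many rows and columns
print(df.dtypes)                           # which features are numeric or categorical
print(df.isna().mean().sort_values())      # share of missing values per column
print(df.describe())                       # basic distribution statistics

df.hist(figsize=(10, 8), bins=30)          # visualize every numeric distribution
plt.tight_layout()
plt.show()
```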

2) Modeling

Kabul categorizes models as "white box (transparent) and black box (opaque) models based on their simplicity, transparency, and explainability" (Kabul, 2017). It is easier to interpret white-box algorithms such as decision trees than black-box algorithms such as deep neural networks, since the latter have a huge number of parameters. In a decision tree model, the movement of data can be clearly traced, so accountability can easily be built into the program. For example, one can see directly that "there is a conceptual error in the 'Proceed' calculation of the tree shown below; the error relates to the calculation of 'costs' awarded in a legal action" (Wikipedia, n.d.).
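
The referenced figure is not reproduced here. As a stand-in, the following minimal sketch (using scikit-learn's bundled iris data set, which is not from the article) shows why decision trees count as white boxes: every learned rule can be printed and audited line by line.

```python
# A white-box illustration: a small decision tree whose learned rules can be
# read and checked directly (iris is used only as an example data set).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(data.data, data.target)

# Every prediction path is an explicit chain of feature thresholds.
print(export_text(tree, feature_names=list(data.feature_names)))
```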

 

3) Post-modeling

Explanation at this stage is used to "inspect the dynamics between input features and output predictions" (Kabul, 2017). Since each model has its own distinct rules, the interpretation methods here are model-specific. This stage is essentially the same as post-hoc interpretability, so it is discussed further in the following section.

The second classification contains two groups of interpretability techniques and "distinguishes whether interpretability is achieved by restricting the complexity of the machine learning model (intrinsic) or by applying methods that analyze the model after training (post hoc)" (Molnar, 2019); these groups can be further subdivided (Du et al., 2019). This classification is the more technical way of explaining algorithms.

1) Intrinsic interpretability

Intrinsic interpretability combines interpretability with the algorithm itself: the self-explanatory model is embedded in its structure. It is simpler than the post-hoc kind and includes programs such as decision trees and rule-based models (Molnar, 2019), which were explained in the former section.

2) Post-hoc interpretability

Post-hoc interpretability is flexible. Programmers can use any preferred method to explain different models, and the same model can have multiple explanations. This brings three advantages: a) the explanatory models can be applied to different DL models; b) more comprehensive interpretations can be obtained for certain learning algorithms; c) it can be used with all data forms, such as vectors (Ribeiro et al., 2016a). However, it has shortcomings as well. "The main difference between these two groups lies in the trade-off between model accuracy and explanation fidelity" (Du et al., 2019). Explaining by means of external models or structures is not only arduous but can also lead to fallacies. A typical example is Local Interpretable Model-agnostic Explanations (LIME), a third-party method for explaining DL algorithms that focuses on "training local surrogate models to explain individual predictions" (Ribeiro et al., 2016b). In LIME, the modeler perturbs the input data and analyzes how the predictions change accordingly. For a diagnostic DL program, LIME may delete some data columns to see whether the results differ from a human decision; if the results change, the removed data may be vital for the algorithm, and vice versa. LIME can be used for tabular data, text, and images, so it has become popular recently. Nonetheless, it is not perfect. Some argue that it only helps practitioners pick better data, and that the supervised learning used inside LIME does little useful work: it cannot reveal how decisions are made or how decisions incentivize behaviors.
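
To make the surrogate idea concrete, here is a rough sketch of the local-surrogate principle behind LIME, not Ribeiro et al.'s implementation: perturb a single instance, weight the perturbations by proximity, and fit a simple weighted linear model whose coefficients act as the explanation. The data set and models below are used purely for illustration.

```python
# A rough sketch of the local-surrogate idea: the random forest is the opaque
# model, and a proximity-weighted linear model explains one of its predictions.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

X, y = load_breast_cancer(return_X_y=True)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

x0 = X[0]                                                  # the prediction to explain
perturbed = x0 + np.random.normal(scale=X.std(axis=0) * 0.1,
                                  size=(500, X.shape[1]))  # local samples around x0
preds = black_box.predict_proba(perturbed)[:, 1]           # black-box outputs

# Closer perturbations get larger weights (an exponential proximity kernel).
dist = np.linalg.norm(perturbed - x0, axis=1)
weights = np.exp(-(dist ** 2) / (2 * dist.std() ** 2))

surrogate = Ridge(alpha=1.0).fit(perturbed, preds, sample_weight=weights)
top = np.argsort(np.abs(surrogate.coef_))[::-1][:5]
print("Locally most influential feature indices:", top)
```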

 

Although methods to explain ML programs are currently booming, it is still difficult to interpret some Deep Learning algorithms, especially Deep Neural Network (DNN) algorithms. This fact relates to reframing the AI discourse: "machine autonomy" is not equal to human autonomy. Although designers set patterns for an AI system, the AI becomes an entity, run by rules that may behave unexpectedly when it encounters real problems. This does not mean the AI can determine where it will go by itself, but it does become an independent program if there is no intervention.

Black box in DNNs Algorithms

After the EU released the GDPR, Pedro Domingos, a professor of computer science at the University of Washington, said on Twitter that "GDPR makes Deep Learning illegal." From his perspective, DL algorithms are unexplainable.

The black box is still there. In 2020, Twitter's image-cropping algorithm was found to be racist: it automatically favored white faces over Black faces (Hern, 2020). Twitter soon apologized and released an investigation and improvement plan. However, in that investigation, the modelers stated that their "analyses to date haven't shown racial or gender bias" (Agrawal & Davis, 2020), meaning they could not figure out what led to the bias or tell where the potential harm came from. Going forward, they intend to change the design principle to "what you see is what you get" (Agrawal & Davis, 2020); in other words, they are abandoning the unexplainable algorithm and choosing an intrinsically interpretable design. This is not the only example. According to the ACLU, Amazon Rekognition shows a strong racial bias. Although Amazon responded that the ACLU misused and misrepresented its algorithm, "researchers at MIT and the Georgetown Center on Privacy and Technology have indicated that Amazon Rekognition is less efficient at identifying people who are not white men" (Melton, 2019).

All of those cases involve DL algorithms. The black box in DNN algorithms comes from the way they work: they imitate the human brain by building neurons connected into artificial neural networks with several layers, so that the learning algorithm can develop recognition of what has been learned and processed (Alpaydin, 2016). "They are composed of layers upon layers of interconnected variables that become tuned as the network is trained on numerous examples" (Dickson, 2020). The theory behind DL algorithms is not difficult to explain; in this video, the host thoroughly describes its principles and introduces simple DL algorithms.
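
To make "layers of interconnected variables" concrete, the following toy forward pass (a NumPy sketch with random weights, not any production network) shows how even a minuscule two-layer network already hides its reasoning inside dozens of tuned numbers:

```python
# A toy two-layer network: the weights below are random, standing in for the
# millions of parameters that training would tune in a real DNN.
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(4)                                # a 4-feature input

W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)    # first hidden layer
W2, b2 = rng.normal(size=(1, 8)), np.zeros(1)    # output layer

h = np.maximum(0, W1 @ x + b1)                   # ReLU activation in the hidden layer
y = 1 / (1 + np.exp(-(W2 @ h + b2)))             # sigmoid output, e.g. a probability

print(y)   # which of the 40 weights drove this number? That is already a black box.
```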

However, real cases are much more complex. In 2020, Microsoft released what was then the largest DL language model, Turing Natural Language Generation (T-NLG), which contains 17 billion parameters. Other algorithms, such as Nvidia's Megatron-LM and OpenAI's GPT-2, also contain billions of parameters. How these large algorithms use and combine their parameters to make decisions is currently impossible to explain. "A popular belief in the AI community is that there's a tradeoff between accuracy and interpretability: At the expense of being uninterpretable, black-box AI systems such as deep neural networks provide flexibility and accuracy that other types of machine learning algorithms lack" (Dickson, 2020). A vicious circle therefore forms: people keep building DNN algorithms to solve complicated problems but cannot clearly explain how those programs make decisions, so they then start building new models to try to interpret the algorithms. These explanation models for DNN black boxes are themselves fiercely disputed. Professor Rudin argues that the approach is fundamentally flawed: the explanation model is guessing rather than deducing. "Explanations cannot have perfect fidelity with respect to the original model. If the explanation was completely faithful to what the original model computes, the explanation would equal the original model, and one would not need the original model in the first place, only the explanation" (Rudin, 2019). It therefore remains hard to de-blackbox DL algorithms. Moreover, the black box also exists in proprietary algorithms, like the ones mentioned above from Twitter and Amazon. Companies hide their code to keep an edge over competitors (Dickson, 2020), which puts de-blackboxing out of reach. Even though they are tending to their own businesses, the potential risks are not automatically eliminated.

Conclusion

Interpretability (de-blackboxing) is needed by everyone. Companies need it to improve the quality of their algorithms so they can make more profit. Individuals need it to ensure that their rights are not harmed and that they are treated equally. Governments need it to build more reliable institutions for people and society. Although many methods exist to interpret algorithms, none can be used universally, so how to make all algorithms interpretable still has to be explored. Governments and corporations should think harder before using DL algorithms, and a consensus about the role algorithms play in shaping society should be reached.

 

Bibliography

Agrawal, P., & Davis, D. (2020, October 1). Transparency around image cropping and changes to come. https://blog.twitter.com/en_us/topics/product/2020/transparency-image-cropping.html

Alpaydin, E. (2016). Machine learning: The new AI. MIT Press.

Dickson, B. (2020, August 6). AI models need to be ‘interpretable’ rather than just ‘explainable.’ The Next Web. https://thenextweb.com/news/ai-models-need-to-be-interpretable-rather-than-just-explainable#:~:text=Interpretable%20AI%20are%20algorithms%20that,features%20of%20their%20input%20data.

Du, M., Liu, N., & Hu, X. (2019). Techniques for interpretable machine learning. Communications of the ACM, 63(1), 68–77. https://doi.org/10.1145/3359786

Gonzales, R. (2019, November 7). Feds Say Self-Driving Uber SUV Did Not Recognize Jaywalking Pedestrian In Fatal Crash. NPR.

Hern, A. (2020, September 21). Twitter apologises for “racist” image-cropping algorithm. The Guardian. https://www.theguardian.com/technology/2020/sep/21/twitter-apologises-for-racist-image-cropping-algorithm

Kabul, I. K. (2017, December 18). Interpretability is crucial for trusting AI and machine learning. The SAS Data Science Blog. https://blogs.sas.com/content/subconsciousmusings/2017/12/18/interpretability-crucial-trusting-ai-machine-learning/

Kabul, I. K. (2018, March 9). Understanding and interpreting your data set. The SAS Data Science Blog. https://blogs.sas.com/content/subconsciousmusings/2018/03/09/understanding-interpreting-data-set/#prettyPhoto

Kim, B., Koyejo, O., & Khanna, R. (2016). Examples are not enough, learn to criticize! Criticism for Interpretability. Neural Information Processing Systems.

Ledford, H. (2021, May 8). Millions of black people affected by racial bias in health-care algorithms. Nature. https://www.nature.com/articles/d41586-019-03228-6

Melton, M. (2019, August 13). Amazon Rekognition Falsely Matches 26 Lawmakers To Mugshots As California Bill To Block Moves Forward. Forbes. https://www.forbes.com/sites/monicamelton/2019/08/13/amazon-rekognition-falsely-matches-26-lawmakers-to-mugshots-as-california-bill-to-block-moves-forward/?sh=7c5f602d7350

Miller, T. (2018). Explanation in Artificial Intelligence: Insights from the Social Sciences. ArXiv:1706.07269 [Cs]. http://arxiv.org/abs/1706.07269

Molnar, C. (2019). Interpretable machine learning. A Guide for Making Black Box Models Explainable. https://christophm.github.io/interpretable-ml-book/

Ribeiro, M. T., Singh, S., & Guestrin, C. (2016a). Model-Agnostic Interpretability of Machine Learning. ArXiv:1606.05386 [Cs, Stat]. http://arxiv.org/abs/1606.05386

Ribeiro, M. T., Singh, S., & Guestrin, C. (2016b). “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. ArXiv:1602.04938 [Cs, Stat]. http://arxiv.org/abs/1602.04938

Vincent, J. (2019, April 8). AI systems should be accountable, explainable, and unbiased, says EU. The Verge. https://www.theverge.com/2019/4/8/18300149/eu-artificial-intelligence-ai-ethical-guidelines-recommendations

Wikipedia. (n.d.). Decision tree. https://en.wikipedia.org/wiki/

Wu, T. (2021, April 21). Apple users are charged more on ordering a cab than Android users. The Paper. https://www.thepaper.cn/newsDetail_forward_12314283

How to Contribute to a Better Future

Before this course, artificial intelligence was, in my mind, an untouchably mysterious domain that ordinary people without a technological background could not get involved in. Even some of my friends studying data science or information systems do not know the rules behind the computer or the algorithm. We all know technology is dominant and that AI will change the world, but we do not know how it will change our world and lives. We absorb information from media stories, so we place AI in an overstated position. Individuals do not see how these things work or their principles; they choose to understand AI through misleading terms, so some are afraid of emerging technologies like DL and ML. The reasons such terms appear are complicated: big companies may intend to maintain their monopolies, media agencies may try to profit from them, and people hold many fantasies about the future of artificial intelligence. In reality, however, we are still in the weak AI phase and far from the strong AI performed in movies. Through this course, we learned how to de-blackbox AI and convey the concepts of the domain in an accessible way.

According to Dr. Irvine, we de-blackbox AI from four perspectives: "the systems view," "the design view," "the semiotic systems view," and "the ethics and policy view." From those perspectives, we studied the software, the hardware, programming systems, semiotic systems like Unicode, and why things are presented in specific ways. Furthermore, considering "the dependencies for socio-technical systems" along with ethics and policies, we asked how those techniques should be regulated to fit human society.

In this week's readings, we can see that both political and academic institutions are making efforts to avert the pessimistic predictions about AI. For example, the EU is involved in promoting the General Data Protection Regulation (GDPR) and has proposed seven principles to regulate future AI. Universities, meanwhile, are studying how to get better predictions from AI algorithms by adjusting their parameters.

However, I believe that the effects of AI depend on the rules of human society. In the MIT Media Lab research featured in the documentary Coded Bias, the high error rate of facial recognition on people of color may intensify bias toward minorities. In another case, the algorithm itself may be racist: health-care costs differ markedly across races, so the former case contributes to inequities and those inequalities feed the latter. That can easily trap us in a vicious circle and lead to more severe social injustice. With or without AI, predictions will be full of biases; after all, ML is based on human knowledge and incidents. I certainly expect a better future with more intelligent AI and believe our efforts will work to some extent, but I also think that to improve the quality of our programs' results we must improve our society first.

References

Brandom, R. (2018, May 25). Everything you need to know about GDPR. The Verge.

Gelman, A. (2019, April 3). From Overconfidence in Research to Over Certainty in Policy Analysis: Can We Escape the Cycle of Hype and Disappointment? The Ethical Machine.

Irvine, M. (n.d.). CCTP-607: Leading Ideas in Technology: AI to the Cloud. Retrieved April 17, 2021, from https://drive.google.com/file/d/1Hk8gLXcgY0G2DyhSRHL5fPQn2Z089akQ/view

Lipton, Z. C., & Steinhardt, J. (2018). Troubling Trends in Machine Learning Scholarship. ArXiv:1807.03341 [Cs, Stat]. http://arxiv.org/abs/1807.03341

Vincent, J. (2018, July 26). The tech industry doesn’t have a plan for dealing with bias in facial recognition. The Verge. https://www.theverge.com/2018/7/26/17616290/facial-recognition-ai-bias-benchmark-test

Vincent, J. (2019, April 8). AI systems should be accountable, explainable, and unbiased, says EU. The Verge. https://www.theverge.com/2019/4/8/18300149/eu-artificial-intelligence-ai-ethical-guidelines-recommendations

Avoid the Fallacies of Empiricism

As we enter the internet era, Big Data comes along with it. "The impact of digital data on society is very great and increasing. Social networks and big data determine what is noticed and acted upon" (Johnson et al., 2018). How should we use it? How do we use it correctly? How do we deal with its consequences? We need to consider more and more issues, especially as we increasingly rely on Big Data in the domains of AI and ML. How do we solve challenges such as capturing data, data storage, and data analysis when we use Big Data to train our AI/ML algorithms (Wikipedia)? On the way to de-blackboxing, those technical issues should not be ignored, but this week I want to put the emphasis on attitudes toward Big Data and its ethical problems.

Big data is not a simple term that can be understood by its literal meaning. "Big data is a field that treats ways to analyze, systematically extract information from, or otherwise deal with data sets that are too large or complex to be dealt with by traditional data-processing application software" (Wikipedia). In his book, Kitchin gives readers a better understanding of what data mean by building a knowledge pyramid and classification standards. He also points out that "how data are ontologically defined and delimited is not a neutral, technical process, but a normative, political, and ethical one that is often contested and has consequences for subsequent analysis, interpretation, and action" (Kitchin, 2014). This point also relates to the views proposed by Cathy O'Neil, which are, in common sense, predictive but too negative. She believes that data are neither objective nor accountable and that the "standards of success" are full of bias, and she strongly suggests issuing laws, proposing independent audits, and building moral norms for practitioners (O'Neil, 2016).

Her attitude to big data is overly pessimistic, supported by many negative examples. Data and algorithms are tools for human beings; it is we who determine how to use them. We should develop the ability to access and analyze the data instead of attributing the problems to the data themselves. Like the gender and ethics issues in her examples, the problems should be attributed to human society rather than to the traces of data our activities leave behind. What she advocates can be seen in data infrastructure, which contains social and cognitive structures: "Data infrastructures host and link databases into a more complex sociotechnical structure" (Kitchin, 2014). This is complicated and hard to achieve, but I believe it is the ultimate way to resolve all the unfair and biased cases appearing in O'Neil's book. We need to overcome the fallacies of empiricism to speed up the completion of the industry's criteria. The point is the same for the ethical and moral part: it is the person who decides what to do, not the algorithm. There will certainly be many negative and harmful cases during the development process, but over time I believe we can build a reasonable system for Big Data and AI/ML, exactly as we have been building social systems since ancient times.

Reference

Johnson, J., Denning, P., Delic, K. A., & Sousa-Rodrigues, D. (2018). Ubiquity Symposium: Big data: Big data or big brother? That is the question now. Ubiquity, Volume 2018(Number August (2018)), Pages 1-10. https://doi.org/10.1145/3158352

Kitchin, R. (2014). The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences. Sage Publications.

O’Neil, C. (2016). Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Crown Publishers.

“Big data.” In Wikipedia, April 10, 2021. https://en.wikipedia.org/wiki/Big_data.

 

Is Cloud Computing a Familiar System?

"Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources" (NIST). Leading companies such as Amazon, IBM, Microsoft, and Google provide cloud computing services: SaaS, PaaS, and IaaS. For people who have not learned the underlying notions and concepts, it is really a black box. While I was reading, the second writing topic reminded me of Apple's closed ecosystem, and I find that understanding a subject by comparing it with a case I am familiar with is a great way to de-blackbox it. Millions of customers, including me, use the unified architecture of iOS. On the pro side, it is highly effective and convenient when all of your electronic devices are Apple products. The products are highly compatible with each other through iCloud, AirDrop, Sidecar, and so on. For example, you can store information on your Mac and access it from other Apple devices via iCloud; the iPad acts as an extension of the iPhone and Mac, giving you access to the same information. Thanks to AirDrop, iMessage, and FaceTime on macOS, functions like unlocking a Mac laptop with an Apple Watch or auto-pairing and locating missing AirPods are all available within the Apple ecosystem. On the con side, because technologies change fast, the cost of moving to another ecosystem, for both hardware and software, can be prohibitive.

Something similar happens when a company runs its business on a unifying architecture provided by a specific vendor such as Microsoft, which offers SaaS, PaaS, and IaaS through Microsoft Azure. If a company chooses Microsoft, then for SaaS it can access all the modules or applications it needs through one account: employees communicate through Outlook, hold meetings on Teams, and work in Office 365. Through a supplier like Microsoft, the company can improve efficiency by mobilizing its workforce, and its employees can access application data everywhere. For PaaS, Microsoft offers a platform that lets developers customize cloud-based applications using built-in software components and "allow organizations to analyze and mine their data, finding insights and patterns and predicting outcomes to improve forecasting, product design decisions, investment returns, and other business decisions" (What is PaaS?, Microsoft). This improves the firm's efficiency by cutting coding time and adding development capabilities without adding staff. For IaaS, the corporation can customize its own servers and infrastructure according to its demands for storage, security, and data centers, which enhances the flexibility with which it runs its business. The cons are also similar: if a company buys all the services it needs from Microsoft, a future change would be hard and expensive, and the risks of relying on a single supplier are high.

Reference

Cloud Computing – NIST. Retrieved April 2, 2021, from https://csrc.nist.gov/projects/cloud-computing

Ruparelia, N. (2016). Cloud computing. The MIT Press.

What Is PaaS? – Microsoft Azure. Retrieved April 3, 2021, from https://azure.microsoft.com/en-us/overview/what-is-paas/#:~:text=Platform%20as%20a%20service%20(PaaS,%2C%20cloud%2Denabled%20enterprise%20applications.

Weekly Takeaways

Jillian

The virtual assistant (VA) is now everywhere in our daily lives. "It is a software agent that can perform tasks or services for an individual based on commands or questions" (Wikipedia). The well-known VAs, such as Alexa, Siri, and Cortana, have different focuses. This week I concentrate on the chatbot function of Amazon Lex and attempt to de-blackbox it.

"Amazon Lex is a service for building conversational interfaces into any application using voice and text, which powers the Amazon Alexa virtual assistant" (Wikipedia). According to Amazon Web Services (AWS), "Amazon Lex is a service for building conversational interfaces into any application using voice and text." It offers advanced deep learning functionality: automatic speech recognition (ASR) to convert speech to text and natural language understanding (NLU) to interpret text, so that developers can build applications with highly immersive user interfaces and lifelike conversational interactions. It usually works with other programs to form a well-functioning application architecture, as with Echo and Alexa.

As noted above, Lex involves both ASR and NLU. For the speech (ASR) part, users speak to the software via an audio feed, and the computer creates a waveform of the words, which is cleaned by removing background noise and normalizing the volume. The filtered waveform is broken down into small parts called phonemes. Each phoneme is like a link in a chain, and by analyzing them in sequence from the first phoneme onward, the ASR algorithm (the RNN we learned about last week is one such algorithm) uses statistical probability to deduce whole words and, from there, complete sentences. Once the program knows the sentence, it provides a reasonable response to the user based on its data set. For the text (NLU) part, still focusing on the RNN encoder-decoder architecture, users input words or sentences, which the algorithm converts into numeric values (vectors) the computer can work with. Again, once the program understands the sentence, it responds based on its data set. The data set can be set up by developers through supervised learning and refined by unsupervised learning.
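
As a hedged illustration of where this pipeline surfaces for a developer (the bot name, alias, and user id below are placeholders, not AWS's), a Lex (V1) bot can be called from Python with boto3; the response exposes the intent that the NLU layer recognized and the reply the bot built from it:

```python
# A minimal sketch of sending one text utterance to a Lex V1 bot via boto3.
# The bot must already exist in the AWS account; names here are hypothetical.
import boto3

lex = boto3.client("lex-runtime", region_name="us-east-1")

response = lex.post_text(
    botName="CustomerServiceBot",    # hypothetical bot
    botAlias="prod",
    userId="demo-user-001",
    inputText="I would like to check my order status",
)

print(response["intentName"])   # which intent the NLU layer recognized
print(response["message"])      # the bot's reply built from that intent
```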

I recommend checking this workshop case on Amazon Lex collaborating with other APIs (https://github.com/aws-samples/amazon-lex-customerservice-workshop). It is an easy implementation of AWS modules.

 

NLP in Languages

"Challenges in natural language processing frequently involve speech recognition, natural language understanding, and natural-language generation" (Wikipedia). I started this week's readings with a question related to pattern recognition that I had proposed last week: how could a computer tell the difference between Japanese characters and Chinese characters in Machine Translation (MT) without their codes? I had ignored the fundamental rules by which natural language understanding is achieved, focusing only on de-blackboxing Optical Character Recognition (OCR) and Convolutional Neural Networks (ConvNets). Natural languages have two components: "tokens (smallest units of language) and grammar (defines the ordering of tokens)" (CS Dojo Community, 2019). In OCR, a computer can easily tell Japanese characters from Chinese characters because the two languages have distinct grammars, or phrase structure rules. Here is an example of the same meaning, "I'll go to the school," in the two languages:

  • 我要去学校。(Chinese)
  • 学校へ行きます。(Japanese)

Although they share the same token for "school" (学校), they follow different grammars. In Chinese, the syntax follows the order subject → predicate → object, while in Japanese it follows subject → object → predicate (with the habitual omission of the first-person "I"). In other words, in the parse trees of the two languages, verbal phrases appear at the end of sentences in Japanese but in the middle of sentences in Chinese. To reach coherent syntax and semantics in MT, the computer can therefore recognize the two different languages in a paragraph effortlessly.

For natural language understanding and natural language generation, an encoder-decoder architecture built from Recurrent Neural Networks (RNNs) is used (CS Dojo Community, 2019). The computer first converts a source-language sentence into a vector with a recurrent network; then another neural network converts this vector into a target-language sentence. The important point is that, to approach perfection, algorithms need huge data sets to train on.
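
A minimal encoder-decoder sketch (written in PyTorch with arbitrary vocabulary sizes and dimensions, purely for illustration and not Google Translate's actual model) shows the two-step shape of the architecture: the encoder compresses the source sentence into a vector, and the decoder unfolds that vector into scores over target-language tokens.

```python
# Toy encoder-decoder: source tokens -> sentence vector -> target-token scores.
import torch
import torch.nn as nn

SRC_VOCAB, TGT_VOCAB, EMB, HID = 1000, 1000, 64, 128

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(SRC_VOCAB, EMB)
        self.gru = nn.GRU(EMB, HID, batch_first=True)
    def forward(self, src):                 # src: (batch, src_len) token ids
        _, h = self.gru(self.emb(src))
        return h                            # the sentence vector

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(TGT_VOCAB, EMB)
        self.gru = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, TGT_VOCAB)
    def forward(self, tgt, h):              # tgt: (batch, tgt_len) token ids
        o, h = self.gru(self.emb(tgt), h)
        return self.out(o), h               # scores over the target vocabulary

enc, dec = Encoder(), Decoder()
src = torch.randint(0, SRC_VOCAB, (1, 5))   # a dummy 5-token source sentence
tgt = torch.randint(0, TGT_VOCAB, (1, 6))   # a dummy 6-token target prefix
logits, _ = dec(tgt, enc(src))
print(logits.shape)                         # torch.Size([1, 6, 1000])
```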

Moreover, as we learned in previous sections on speech recognition, the computer can encode and decode sound waves by assigning values to the acoustic signal vertically, according to its amplitude. To make this more visible, spectrograms are used to visualize it. Since each language has a certain inventory of sounds, it can again be trained with DL algorithms (CrashCourse #36).
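
The spectrogram step can be illustrated in a few lines of Python; a synthetic tone stands in for recorded speech here, so this is only a sketch of the representation, not of any particular ASR system.

```python
# Turn a waveform into the time-frequency picture that speech models consume.
import numpy as np
from scipy.signal import spectrogram
import matplotlib.pyplot as plt

fs = 16000                                   # 16 kHz, a common speech sampling rate
t = np.arange(0, 1.0, 1 / fs)
wave = np.sin(2 * np.pi * 440 * t)           # a 440 Hz tone as a stand-in for speech

f, times, Sxx = spectrogram(wave, fs)
plt.pcolormesh(times, f, 10 * np.log10(Sxx + 1e-10))   # power in dB
plt.xlabel("Time [s]")
plt.ylabel("Frequency [Hz]")
plt.show()
```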

Questions:

According to Professor Liang, "there's probably a qualitative gap between the way that humans understand language and perceive the world and our current models." Can this gap be closed with more advanced techniques alone, or do we still need a new linguistic theory to systematize human language? Is the current mainstream linguistic theory enough for NLP?

Reference

CrashCourse. 2017. Natural Language Processing: Crash Course Computer Science #36. https://www.youtube.com/watch?v=fOvTtapxa9c.

CrashCourse. 2017. Machine Learning & Artificial Intelligence: Crash Course Computer Science #34. https://www.youtube.com/watch?v=oi0JXuL19TA.

CS Dojo Community. 2019. How Google Translate Works – The Machine Learning Algorithm Explained! https://www.youtube.com/watch?v=AIpXjFwVdIE.

“Machine Translation.” 2021. In Wikipedia. https://en.wikipedia.org/w/index.php?title=Machine_translation&oldid=999926842.

“Natural Language Processing.” 2021. In Wikipedia. https://en.wikipedia.org/w/index.php?title=Natural_language_processing&oldid=1009043213.

“The Technology behind OpenAI’s Fiction-Writing, Fake-News-Spewing AI, Explained.” n.d. MIT Technology Review. Accessed March 6, 2021. https://www.technologyreview.com/2019/02/16/66080/ai-natural-language-processing-explained/.

Fascinating and Powerful Pattern Recognition

This week a practical application of ML/AI is introduced: pattern recognition. Recently, Convolutional Neural Networks (ConvNets), a deep learning architecture, have become the most popular way to achieve pattern recognition. In his blog, Karpathy used a ConvNet to decide what makes a good selfie. Following the classification workflow in Dougherty's Pattern Recognition and Classification, Karpathy first found a massive database of 500 million images with the hashtag #selfie. Then, for pre-processing, he used another ConvNet to keep 200 million images that contain at least one face. He defined the standard of a good selfie (can we say the standard he put in is a hyperparameter?) in order to extract features/kernels. It is worth noting that the standard he used, which I was skeptical about before reading the article, is a fair ranking that weights audience, likes, and followers (potentially useful for influencers on social media; and since time may be a critical factor, could we also add it to the standard?). After those steps, he had a sufficient data set to train his ConvNet model with Caffe, a deep learning framework. The model then processed the data set in its hidden layers to produce the classification results.
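
A toy ConvNet of the same family (PyTorch here rather than Caffe, with arbitrary layer sizes, so only an illustration of the idea, not Karpathy's network) shows how stacked convolutions extract visual features before a final layer scores "good selfie" versus "bad selfie"; the labels would come from the ranking step described above.

```python
# A minimal binary image classifier in the ConvNet style.
import torch
import torch.nn as nn

class TinySelfieNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 56 * 56, 2)   # assumes 224x224 RGB input

    def forward(self, x):
        x = self.features(x)                # convolution layers extract features
        return self.classifier(x.flatten(1))

model = TinySelfieNet()
dummy = torch.randn(1, 3, 224, 224)         # one fake RGB photo
print(model(dummy).shape)                   # torch.Size([1, 2]): good vs. bad scores
```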

This experiment demonstrates the point from Crash Course #35: "abstraction is the key to building complex systems, and the same is true in computer vision." The abstraction in this experiment is complicated, since a photo has far too many features to consider, and that is truly fascinating. Since, according to Dougherty, document recognition is also a part of pattern recognition, I wonder how a translation algorithm could tell the difference between Japanese characters and Chinese characters without their codes (for example, "学生" means "student" in both Chinese and Japanese). I know pattern recognition algorithms are context-sensitive, but does that mean a translation algorithm needs to be trained on both Chinese and Japanese data sets?

How Data Works over the Level of E-Information

In digital computing discourse, the term "data," like "information," differs from its traditional sense. According to Professor Irvine, there are "no data without representations." Whatever its type, data must be interpretable in every software layer, processable by the computing system, and storable as files in memory. That is, "any form of data representation must be computable; anything computable must be represented as a type of data." This concept also connects to information theory. As we learned last week, information in the digital computing context is a physical concept: it is encoded into electronic signals that are communicable in the transmitting channel but unobservable. In this sense, "data" is closer to meta-information (information in the generic sense).

On that basis, it can be explained why "data" sits a level above "information." E-information, at its level, structures "the code for data at the next level up and code for operations, interpretations, and transformation of, or over, the representations." Information can be understood as playing a functional role over the physical computer-system level with its strings of binary code, so that data, at the next level up, can be interpretable by humans (as representations) and computable by the computer.

All formats, including text (such as TXT) and images (such as JPEG), work the same way: they are "long lists of numbers, stored as binary on a storage device" and encoded as "digital data." In text formats, words are coded by Unicode through different character encodings so that words (or visual symbols) can be represented on the computer in different languages. Take emojis as an example. In the emoji system, each emoji has a unique codepoint, and codepoints can be combined to form a new emoji. For example, the code of "👶" (baby) is U+1F476 (it is ultimately turned into binary so the computer can process it). Combined with a skin-tone code such as U+1F3FB, it yields a baby with a light skin tone, "👶🏻" (1F476 + 1F3FB). Other formats, such as images, work in a similar way. Images are formed of pixels, each a combination of three colors: red, green, and blue. "An image format starts with metadata (key values for image), such as image width, image height, and image color." The color of each pixel is divided into red, green, and blue components, each stored in at most 8 bits (1 byte), so each channel's value ranges from 0 to 255. For example, (0, 0, 0) is black, meaning zero intensity of red, green, and blue, while (255, 255, 255) is white. With a value for every pixel, we get an image with a certain number of pixels; in the process, the code for each pixel is also transferred into binary code for the computer to interpret.
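
These codepoint and pixel claims are easy to verify in a few lines of Python (the specific pixel values below are only examples):

```python
# A quick check of the codepoint arithmetic above; Python's chr/ord operate on
# Unicode scalar values directly.
baby = chr(0x1F476)            # U+1F476 BABY
tone = chr(0x1F3FB)            # U+1F3FB EMOJI MODIFIER FITZPATRICK TYPE-1-2
print(baby, baby + tone)       # 👶  👶🏻  (the modifier combines with the base emoji)
print(hex(ord(baby)))          # 0x1f476

# The same idea for one image pixel: three channels, one byte each, 0-255.
black = (0, 0, 0)              # zero intensity on every channel -> black
white = (255, 255, 255)        # full intensity on every channel -> white
print([format(c, "08b") for c in white])   # the binary the machine actually stores
```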

Reference

Irvine, M. Introduction to Computer System Design, 2020.

Kelleher, J. D., and B. Tierney. Data Science. MIT Press, 2018. 

Irvine, M. Introduction to Data Concepts and Database Systems, 2021.

“Unicode.” In Wikipedia, February 21, 2021. https://en.wikipedia.org/w/index.php?title=Unicode&oldid=1008164095.

CrashCourse. 2017c. Files & File Systems: Crash Course Computer Science #20. https://www.youtube.com/watch?v=KN8YgJnShPM&list=PL8dPuuaLjXtNlUrzyH5r6jN9ulIgZBpdo&index=21.

 

Question:

Can "data" be interpreted as a term similar to meta-information in the digital computing sense?

 

 

Information Theory Model and Symbolic System

In information theory, Shannon defines a message as a choice. In these terms, information is separated from its semantic content; that is, information moves from the psychological environment to the physical one. To interpret signal transmission in information theory, according to Floridi, one must first understand how Shannon dealt with signals: "he treated the signal as a string of discrete symbols." This combines with several features of information: information is closely associated with uncertainty; conveying information involves probabilities; it is difficult to transmit information from one point to another; and information is entropy. A sender can overcome noise by using extra symbols for error correction instead of boosting the power. In addition, Shannon focused on how much of a message influences the probability of the next symbol, given the statistical structure of language, which allows a saving of time or channel capacity. Accordingly, he gave the formula and a new unit of measure, the bit, and worked out how to calculate the redundancy in a piece of information. On that basis, signal transmission (encoding) can be made more efficient.
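
Shannon's measure can be illustrated with a short calculation of entropy in bits, H = -sum(p * log2 p); the probability values below are arbitrary examples, not from the readings.

```python
# Entropy in bits for a source with given symbol probabilities.
import math

def entropy_bits(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy_bits([0.5, 0.5]))      # 1.0 bit: a fair coin, maximum uncertainty
print(entropy_bits([0.9, 0.1]))      # ~0.47 bits: a biased, more predictable source
print(entropy_bits([0.25] * 4))      # 2.0 bits: four equally likely symbols
```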

According to Professor Irvine in Using the Principle of Levels for Understanding “Information,” “Data,” and “Meaning,” “‘information’ in electronic systems are designed as units of ‘structure-preserving structures,’ which use the physics of electricity as a medium.” Information is encoded to the electronic signals that are communicable in the transmitting channel but unobservable. The traditional cultural or social meanings are not embodied in messages but shared by the group of people communicating. Therefore, those meanings (meta-information) cannot be represented in signal information.

In this theory, information is not content (it technically has no meaning). What we do is use it to achieve more and faster communication within our symbolic system. It cannot extend beyond the sender, channel, and receiver. Also, because it operates through energy fields, electronic pulses, and signals used as binary representations that we cannot observe, it can only be regarded as a tool. “A fuller semiotic systems model provides a model of other levels of abstraction that explains how we use, express, perceive, understand, and interpret instances of tokens (physical-perceptible instances) of all our symbolic structures and media (language, writing, abstract symbols, graphs and diagrams, images, sounds, music, and video/film).” To be more specific, the information theory model is a subsystem of the symbolic system: it provides a level of abstraction for discrete symbols, is not itself interpreted in daily communication, and becomes a new model within the symbolic system.

Questions:

Is a token an observable unit in our symbolic system that is not discrete? And is it drawn to become the counterpart of the signal in the information theory model?

Magical DL, and How to Plan for the Future? – Jianning Wu

Several conceptions in this week's readings were new to me. From Alpaydin and Kelleher, we learn that Deep Learning imitates the human brain by building neurons and arranging artificial neural networks in several layers, so that the learning algorithm can develop recognition of what has been learned and processed. Meanwhile, from the introductory essay by Prof. Irvine, we learn how modern computers work on the basis of the binary system (0 and 1), which makes DL even more magical since, as Alpaydin mentions in Machine Learning: The New AI, it learns with a “hidden layer combining the values (which are not 0 or 1 but continuous allows a finer and graded representation of similar inputs).” In other words, DL is based on the binary system yet surpasses it, which means DL supports the development of AI toward learning more abstract things (more like humans).

In addition, regarding the discourse of AI and ML, I also learned several clarifications. According to the video Techniques for Interpretable Machine Learning released by the Association for Computing Machinery, “powered by complex models and deep neural networks, interpretable ML is progressing at an astounding rate; however, despite the successes, ML has its limitations and drawbacks- some decisions made by algorithms with ML are hard to interpret.” This fact relates to Reframing AI Discourse. ‘Machine autonomy’ is not equal to human autonomy. Although designers set patterns for the AI system, the AI becomes an entity, run by rules that may behave unexpectedly when it encounters real problems. This kind of entity does not mean the AI can determine where it will go by itself, but it does become an independent program if there is no intervention. However, this also exposes practical problems: how should we make regulations for an AI system; how can we evaluate the purposes of designers (can we get help from the six principles proposed by Alpaydin?); how should we write the guide for AI practice? Assumptions are made so that we can predict the future, but questions are asked so that they can be solved. Although Johnson and Verdicchio say that popular AI concepts are futuristic and too hard to achieve, we still need to plan for the future.

What is AI for? And the Ethics in Data – Jianning Wu

As I gained from this week's reading, Artificial Intelligence, with inspiration from the brain, is a designed practical computer/machine that can mimic humans and make decisions as humans do. It is not an easy job and requires supportive hardware, learning abilities, and adaptive programs/algorithms. Although current Artificial Intelligence is not as intelligent as the universal imagination, with its “aspects of hype, hope, and hysteria” (cited from Prof. Irvine), it has taken us a long time to achieve, and it is everywhere. But what is the ultimate purpose or goal of artificial intelligence? When Herbert A. Simon discussed the functional or purposeful aspect of artificial things, he mentioned that “fulfillment of purpose or adaptation to a goal involves a relation among three terms: the purpose or goal, the character of the artifact, and the environment in which the artifact performs.” The latter two terms interact and together serve the purpose of an artifact. Is AI just its literal meaning, making an intelligence like a human? Or is it allowing computers and machines to function in an intelligent manner? Different goals lead to different consequences. Under the former reading, AI will become a moral challenge in the future, when computing techniques are mature enough to construct a real AI. Under the latter, AI will assist us in making progress whether or not it is fully ripe.

In addition, in the data section of Machine Learning: The New AI, Ethem Alpaydin demonstrates the importance of data: “data starts to drive the operation; it is not the programmers anymore but the data itself that defines what to do next.” Though the author shows great ways of using data, such as helping to build better structures in retailers' supply chains, this makes me curious and even worried about how we can guarantee that data will be used acceptably. How can we protect our private information? How can we ensure that our data will not be exploited to achieve a particular party's goals (as in an election or a business competition)? How can media stay equitable and neutral while they have to filter a huge amount of data and information in the meantime? There are dozens of such questions about data, and the most critical one is how we can maintain the balance between ethics and the utilization of data.