Category Archives: Final Project

Design/Ethical Implications of Explainable AI (XAI)

Abstract

This paper addresses the research question: what are the design and ethical implications of explainable AI? It argues that there are three main reasons why XAI is necessary for user trust, pertaining to accountability/trustworthiness, liability/policy evaluation, and human agency/authority. Using de-blackboxing methods and an analysis of current research and examples, the paper uncovers how XAI is defined, why it is necessary, and the major benefits and criticisms of XAI models. With supporting evidence from Miller et al., it argues that defining explainability to include human explanation models (from cognitive psychology and the cognitive sciences) will be significant to the development of XAI discourse.

Artificial Intelligence applications that use neural networks are able to produce results (e.g., image classifications) with high accuracy but without explanation for human end users, which classifies them as black box systems (Abdallat, 2019). Many articles claim that AI should be explainable but are not clear about how “explainable” is defined. This paper will de-blackbox explainable AI (XAI) by looking at how it is defined in AI research, why we need it, and specific examples of XAI models. Finally, it will address one of the major gaps in current XAI models by arguing that explainable AI research should adopt an interdisciplinary approach that builds on frameworks of explanation from the social sciences (Miller, Howe, & Sonenberg, 2017).

XAI Types & Definitions

Opaque Systems

An opaque system’s inner workings are invisible to the user. The system takes in information and outputs new information or predictions without clear evidence of why or how the output was chosen. When an algorithm cannot provide the programmer with the reasoning behind its decision-making process, it is considered a “black box” approach and classified as opaque. Additionally, opaque systems often emerge when closed-source AI is licensed by an organization and therefore hidden from the public to protect intellectual property (Doran, Schulz, & Besold, 2017).

Interpretable Systems

An interpretable system is a transparent model that allows the user to understand how inputs are mathematically mapped to outputs. One example is a regression model, which is linear and uses weights to rank the importance of each feature to the mapping. Deep neural networks, by contrast, learn input features through non-linearities and therefore would not be considered interpretable models (Doran, Schulz, & Besold, 2017).
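To make the contrast concrete, here is a minimal sketch (our illustration, not from the cited papers) of why a linear model counts as interpretable: its learned weights directly rank how much each feature contributes to the output. The data and feature names are fabricated.

```python
# A minimal sketch of an interpretable model: a linear regression whose
# learned weights directly expose the input-to-output mapping.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                       # three hypothetical features
y = 2.0 * X[:, 0] - 0.5 * X[:, 1] + 0.1 * X[:, 2]   # known ground-truth weights

model = LinearRegression().fit(X, y)

# The coefficients *are* the explanation: each weight says how the output
# moves per unit change in that feature.
for name, w in zip(["feature_a", "feature_b", "feature_c"], model.coef_):
    print(f"{name}: weight = {w:+.2f}")
```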

Comprehensible Systems

Comprehensible systems “emit symbols enabling user-driven explanations of how a conclusion is reached. These symbols (most often words, but also visualizations, etc.) allow the user to relate properties of the inputs to their output. The user is responsible for compiling and comprehending the symbols, relying on her own implicit form of knowledge and reasoning about them” (Doran, Schulz, & Besold, 2017).

Why Do We Need XAI?

The three main reasons we need XAI are as follows:

  1. Accountability + Trustworthiness
  2. Liability and Policy Evaluation
  3. Human Agency and Authority

Accountability, Liability and Policy Evaluation

Explainable AI is especially important in cases dealing with human health, safety, and liability. In these cases, it is ethical to hold someone accountable for incorrect or discriminatory outcomes. Additionally, explainability is a factor that can inform policy on whether AI should be incorporated into certain sensitive fields (Paudyal, 2019). For example, should a process like driving a motor vehicle be automated? These questions illuminate the importance of critical discourse that asks hard questions such as: what are we willing to sacrifice as a society for automation and convenience? In 2018, a self-driving car knocked down and killed a pedestrian in Tempe, Arizona (Paudyal, 2019). “Issues like who is to blame (accountability), how to prevent this (safety) and whether to ban self-driving cars (liability and policy evaluation) all require AI models used to make those decisions to be interpretable” (Paudyal, 2019). In this case, I argue that when the safety of the public is concerned, it is clear that XAI is necessary.

Trust

Trusting a neural network to make decisions will have different implications depending on the task required. One of the strongest arguments for XAI is within the medical domain. If a neural network is built to predict health outcomes for a patient (risk of cancer or heart disease) based on their records but cannot provide reasoning for its decision, is it ethical to trust it? The lack of transparency is a problem for the clinician who wants to understand the model’s process, as well as for the patient who is interested in the proof and reasoning behind the prediction (Ferris, 2018). According to Ferris, empathy is a strong component of the patient-physician relationship that should be taken into account when implementing these systems. In the case of medical predictions, I argue that XAI is necessary to ensure a level of trust between patients and their physicians. The point of predictive models and algorithms is to advance the user experience (as well as the experience and knowledge of the experts). In the patient-physician relationship, trust should be prioritized and XAI methods should be incorporated to support it.

XAI Examples

Reversed Time Attention Model (RETAIN)

The RETAIN explanation model was developed at the Georgia Institute of Technology by Edward Choi et al. (2016). The model was designed to predict whether a patient is at risk for heart failure using patient history (including recorded events of each visit). It aims to address the performance vs. interpretability issue (discussed in the criticism section below). “RETAIN achieves high accuracy while remaining clinically interpretable and is based on a two-level neural attention model that detects influential past visits and significant clinical variables within those visits (e.g. key diagnoses). RETAIN mimics physician practice by attending the EHR data in a reverse time order so that recent clinical visits are likely to receive higher attention” (Choi et al., 2016).

[Figure: the RETAIN model’s input split into two recurrent neural networks with attention]

By splitting the input into two recurrent neural networks (pictured above), the researchers were able to use attention mechanisms to understand what each network was focusing on. The model was “able to make use of the alpha and beta parameters to output which hospital visits (and which events within a visit) influenced its choice” (Ferris, 2018).
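The arithmetic behind that readout can be sketched in a few lines. The toy numpy code below is our illustration, not the authors’ implementation: it skips the two RNNs and uses stand-in random weights, but shows how scalar alpha weights over visits and per-variable beta weights combine into a prediction whose attention values are directly readable.

```python
# A toy sketch of RETAIN-style two-level attention: alpha scores whole
# visits, beta scores clinical variables within each visit, and the
# prediction uses the attention-weighted sum of visit embeddings.
import numpy as np

rng = np.random.default_rng(1)
n_visits, n_vars = 5, 8
visits = rng.normal(size=(n_visits, n_vars))   # embedded EHR visits, oldest first

# In RETAIN these come from two RNNs run in *reverse* time order;
# here they are stand-in random values.
alpha_scores = rng.normal(size=n_visits)
alpha = np.exp(alpha_scores) / np.exp(alpha_scores).sum()  # softmax over visits
beta = np.tanh(rng.normal(size=(n_visits, n_vars)))        # per-variable weights

context = (alpha[:, None] * beta * visits).sum(axis=0)     # weighted patient summary
w = rng.normal(size=n_vars)
risk = 1 / (1 + np.exp(-context @ w))                      # logistic output

# The alphas are directly readable as "which visits mattered most".
print("visit attention:", np.round(alpha, 2))
print("predicted risk:", round(float(risk), 3))
```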

Local Interpretable Model-Agnostic Explanations (LIME)

Post-hoc models provide explanations after decisions have been made. The key concept in LIME is perturbing the inputs and analyzing the effect on the model’s outputs (Ferris, 2018). LIME is a model-agnostic method, meaning the process can be applied to any model to produce explanations. By looking at the outputs, one can infer what aspects of the input the model is focusing on. Ferris uses the example of CNN image classification to demonstrate how the method works in four steps.

Step 1. Begin with a normal image and use the black-box model to produce a probability distribution over the classes.

Step 2. Alter the image slightly (ex. hiding pixels), then run the black-box model again to determine what probabilities changed.

Step 3. Use an explainable model (such as a decision tree) on the dataset of perturbations and probabilities to extract the key features which explain the changes. “The model is locally weighted (we care more about the perturbations that are most similar to the original image)” (Ferris, 2018).

Step 4. Output the features (in this case, pixels) with the greatest weights as the explanation (Ferris, 2018).
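The four steps above can be condensed into a short sketch. In the toy code below, the “image” is reduced to a vector of segments and black_box() is a hypothetical opaque classifier; both are our assumptions, not Ferris’s code.

```python
# A condensed sketch of the four LIME steps described above.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_segments = 10

def black_box(x):
    # Hypothetical opaque model: secretly cares mostly about segments 2 and 7.
    return 1 / (1 + np.exp(-(3 * x[2] + 2 * x[7] - 2)))

original = np.ones(n_segments)                        # Step 1: full image
p_original = black_box(original)
print("original class probability:", round(float(p_original), 3))

masks = rng.integers(0, 2, size=(500, n_segments))    # Step 2: hide segments
probs = np.array([black_box(m) for m in masks])

# Step 3: locally weighted surrogate; perturbations closer to the
# original image get more weight.
similarity = np.exp(-np.sum((masks - original) ** 2, axis=1) / n_segments)
surrogate = Ridge(alpha=1.0).fit(masks, probs, sample_weight=similarity)

# Step 4: the largest weights are the explanation.
top = np.argsort(-np.abs(surrogate.coef_))[:3]
print("most influential segments:", top)
```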

Criticism and Challenges of XAI

  1. Complexity Argument

There are a few major criticisms of explainable artificial intelligence to consider. First, neural networks and deep learning models are multi-layered and therefore complex and overwhelming to understand. One of the benefits of neural networks is their ability to store and classify large amounts of data (human brains could not process information in this way). According to Paudyal, “AI models with good performance have around 100 million numbers that were learned during training” (2019). With this in mind, it is unrealistic to track and understand each layer and process of a neural network in order to find a valid source for explanation.

GFLOPs vs. accuracy for various models | Image source: Paudyal, 2019

  2. Performance vs. Explainability Argument

The second main criticism of XAI is that the more interpretable a model is, the more its performance lags. The ethical implication is that efficiency may take precedence over explanation, which could lead to accountability issues.

“Machine learning in classification works by: 1) transforming the input feature space into a different representation (feature engineering) and 2) searching for a decision boundary to separate the classes in that representation space (optimization). Modern deep learning approaches perform 1 and 2 jointly via hierarchical representation learning” (Paudyal, 2019).

Image Source: Paudyal, 2019

Performance is the top concern in the advancement of the field; therefore, explainable models are not favored when performance is affected. This factor supports the need for non-technical stakeholders to be part of the conversation surrounding XAI (Miller, Howe, & Sonenberg, 2017). If the only people with a voice are concerned with performance, the focus could shift to short-term outcomes rather than the longer-term implications for human agency, trustworthiness of AI, and policy.

An Alternative Method: Incorporating XAI Earlier in the Design Process

In contrast to most current XAI models, Paudyal argues that deciding whether an application needs explanation should happen early enough to be incorporated into the architectural design (2019).

Image Source: Paudyal, 2019

As an alternative to using simpler but explainable models with low performance, he proposes that (1) creators should learn what explanations are desired through consultation with stakeholders, and (2) the architecture of the learning method should be designed to give intermediate results that pertain to these explanations (2019). This decision process will require an interdisciplinary approach, because defining and understanding what type of explainability is needed for a specific application requires discussion across disciplines (computer science, philosophy, cognitive psychology, sociology). “Explainable AI is more likely to succeed if researchers and practitioners understand, adopt, implement, and improve models from the vast and valuable bodies of research in philosophy, psychology, and cognitive science; and if evaluation of these models is focused more on people than on technology” (Miller, Howe, & Sonenberg, 2017). These disciplines should work together to discover which systems require explanation and for what reasons, before implementation and testing begin. In the next section, I will de-blackbox this method further by noting its limitations and illustrating it with an example.

Example & Limitations

Paudyal acknowledges that, under this method, different applications will require different explanations (a loan application vs. a face identification algorithm, for example). Although the method would not be model-agnostic, it reflects the fact that complex systems cannot be explained with simple, one-size-fits-all approaches. Addressing this challenge is important for developing realistic XAI models that build socio-political and ethical implications into the design.

Example

The following case considers a system designed to incorporate explanations into an AI application that teaches sign language words. In a normal black-box application, the AI would identify an incorrect sign but would not be able to give feedback. In this case, explanation is equivalent to feedback about what was wrong in the sign. Paudyal’s research found that “Sign Language Linguists have postulated that the way signs differ from each other is either in the location of signing, the movement or the hand-shape” (2019).

Image Source: Paudyal, 2019

With this information, AI models can be trained to focus on these three attributes (location of signing, movement, and hand-shape). When a new learner makes a mistake, the model will be able to identify which mistake was made and provide appropriately specific feedback (Paudyal, 2019).

Image Source: Paudyal, 2019

The main insight from this example is that AI models which incorporate the possible outcomes into the design of the application are easier to understand, interpret, and explain. This is because the human designers know what the application will be trained for. The example also supports the earlier point that the design will be specific to the application (this process is specific to a sign-language CNN).
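A hedged sketch of this design idea follows: separate checks for the three linguistic attributes let the system report which attribute was wrong. The per-attribute models, feature vectors, and tolerance below are placeholders, not Paudyal’s implementation.

```python
# A sketch of attribute-wise feedback for sign language learning:
# one recognizer per attribute (location, movement, hand-shape), so a
# mistake can be explained by naming the attribute that was off.
import numpy as np

ATTRIBUTES = ["location", "movement", "hand-shape"]

def check_attribute(attempt_features, reference_features, tolerance=0.5):
    # Stand-in for a per-attribute classifier: compare feature distances.
    return np.linalg.norm(attempt_features - reference_features) < tolerance

def explain_sign(attempt, reference):
    feedback = []
    for i, name in enumerate(ATTRIBUTES):
        if not check_attribute(attempt[i], reference[i]):
            feedback.append(f"Your {name} was off; check the {name} of the sign.")
    return feedback or ["Correct sign!"]

# Toy 2-d feature vectors per attribute for a reference sign and an attempt.
reference = np.array([[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]])
attempt   = np.array([[0.1, 0.0], [1.0, 0.9], [1.5, 1.6]])  # hand-shape is off

print(explain_sign(attempt, reference))
```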

Conclusion

This paper examined several issues with the lack of transparency in machine learning and the utilization of deep neural networks, specifically in scenarios where responsibility is hard to determine and analyze for policy. These challenges have prompted efforts to create explainable methods and models, which introduced another significant challenge: defining explainability. Through the examples and cases discussed, it is clear that explainability will have different meanings depending on factors including the user’s comprehension, background, and industry. For this reason, I argue (with support from Paudyal) that explainability should be discussed in the first stages of the design process. Doing so makes the process clearer, and it is easier to develop XAI from the beginning of application design than after the fact. This brings authority and agency back into the hands of humans and addresses the argument that explainability will degrade performance. Although incorporating explanation earlier in the design has limitations, it may ultimately lead to better design practices that do not focus on short-term outcomes. Lastly, I close by arguing that explainability calls for interdisciplinary collaboration. “A strong understanding of how people define, generate, select, evaluate, and present explanations” is essential to creating XAI models that will be understood by users (and not just AI researchers) (Miller, 2017). Further research might explore the questions: who is defining XAI, whom is XAI designed to appease, and why aren’t experts in human explanation models at the forefront of approaching these questions?

References

Abdallat, A. J. (2019, February 22). Explainable AI: Why We Need To Open The Black Box. Retrieved from Forbes website: https://www.forbes.com/sites/forbestechcouncil/2019/02/22/explainable-ai-why-we-need-to-open-the-black-box/

Choi, E., Bahadori, M. T., Kulas, J. A., Schuetz, A., Stewart, W. F., & Sun, J. (2016). RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism. ArXiv:1608.05745 [Cs]. Retrieved from http://arxiv.org/abs/1608.05745

Doran, D., Schulz, S., & Besold, T. R. (2017). What Does Explainable AI Really Mean? A New Conceptualization of Perspectives. ArXiv:1710.00794 [Cs]. Retrieved from http://arxiv.org/abs/1710.00794

Ferris, P. (2018, August 27). An introduction to explainable AI, and why we need it. Retrieved from freeCodeCamp.org website: https://medium.freecodecamp.org/an-introduction-to-explainable-ai-and-why-we-need-it-a326417dd000

Miller, T., Howe, P., & Sonenberg, L. (2017). Explainable AI: Beware of Inmates Running the Asylum. In IJCAI 2017 Workshop on Explainable AI (XAI).

Miller, T. (2017). Explanation in Artificial Intelligence: Insights from the Social Sciences. ArXiv:1706.07269 [Cs]. Retrieved from http://arxiv.org/abs/1706.07269

Paudyal, P. (2019, March 4). Should AI explain itself? or should we design Explainable AI so that it doesn’t have to. Retrieved from Towards Data Science website: https://towardsdatascience.com/should-ai-explain-itself-or-should-we-design-explainable-ai-so-that-it-doesnt-have-to-90e75bb6089e

Samek, W., Wiegand, T., & Müller, K.-R. (2017). Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models. ArXiv:1708.08296 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1708.08296

Music To My Ears: De-Blackboxing Spotify’s Recommendation Engine

Proma Huq
CCTP-607 Big Ideas in Tech: AI to the Cloud
Dr. Martin Irvine
Georgetown University 2019

Abstract

The average individual makes over 70 conscious decisions a day (Iyengar, 2011). To steer consumers through this maze of choices, recommendation system algorithms are perhaps one of the most ubiquitous applications of machine learning for online products and services (McKinsey, 2013). A prime example is Spotify’s recommendation engine, which harnesses deep learning systems and neural networks to map accurate content suggestions for customized playlists, such as its “Discover Weekly” series. To explore the paradigm of recommendation systems, the research question for this case study is: “How does Spotify’s recommendation algorithm provide accurate content?” Key insights include a de-blackboxing of the algorithm and of the processes of collaborative filtering and matrix factorization, providing a deeper understanding of how Spotify gathers “taste analysis data” and thereby delivers a positive user experience.

Introduction

On any given day, the average individual makes a range of conscious decisions about their media consumption. As we navigate these omnipresent choices in our increasingly interconnected world, the recommendation system algorithms that are now ubiquitous in our lives steer us through them, like invisible Internet elves enabling our every whim or flight of fancy. Would you like to watch this show on Netflix? Perhaps you’d like to buy this on Amazon? Or the ever-present “based on your interest in X, how about Y?” These helpful suggestions are part of an overarching “data-driven and analytical” approach to consumer decision-making (Ramachandran, 2018), fueled by recommendation engines. This is especially salient when it comes to media consumption, as these algorithms allow users to find relevant and enjoyable content, thereby increasing user satisfaction and driving engagement for a given product or service (Johnson, 2014). The manner in which a successfully designed recommendation algorithm drives growth in an organization is evident in the case of Spotify.

With over 200 million users worldwide, 96 million of whom are premium subscribers (Spotify, 2019; Fortune, 2019), Spotify is a treasure trove of big data, with a recommendation algorithm that is ripe for de-blackboxing. Considering the current media affordances of streaming music technology, Spotify and its recommendation algorithms are changing the way we discover, listen to, and interact with music. Based on this, the primary research question for this paper is: “How does Spotify’s recommendation algorithm provide accurate content?”

Spotify in Numbers

Launched in 2008, Spotify is a Swedish start-up – an audio streaming platform that offers a “freemium” tiered subscription service, earning revenue by means of advertising and premium subscription fees (Spotify, 2019). The free version, aptly named “Spotify Free,” only allows users to shuffle-play songs, and listening is interspersed with ads. At $9.99/month, Spotify Premium gives users freedom of choice, ad-free unlimited access, and higher-quality audio. The company offers over 40 million songs with an estimated 5 million playlists that are curated and edited (Spotify, 2018).


Spotify boasts a wide range of genre-, mood-, and even occasion-based playlists. In fact, as I write this paper, I’m listening to “Chill Lofi Study Beats,” a Spotify-curated playlist that has 422,812 followers and is one of several playlists in the “Study” genre offered by Spotify.

Figure 1: Playlist Screenshot

In July 2015, Spotify launched its Discover Weekly playlist. As its self-evident title suggests, Discover Weekly is an algorithm-generated playlist that is released (or, in colloquial music terms, “dropped”) every Monday, bringing listeners up to two hours of custom, curated music recommendations. Spotify also offers other customized recommendations in Daily Mixes, Release Radar, and Recommended suggestions of playlists or artists. Users claimed it was “scary” how well Spotify was able to discern their musical tastes and that the platform “knew” or “got” them. According to Spotify, by 2016 Discover Weekly had “reached nearly 5 billion tracks streamed” since its launch, a clear sign of its success as an algorithmic product offering.

Source: Vox Creative

De-blackboxing The Recommendation Algorithm 

The primary aim of recommendation algorithms is to analyze user data in order to provide personalized recommendations. In Spotify’s case, Discover Weekly and other playlists are created using collaborative filtering based on the user’s listening history, in tandem with songs enjoyed by users who seem to have a similar history. Additionally, Spotify uses “taste analysis data” to establish a Taste Profile. This technology, developed by Echo Nest (Titlow, 2016), groups the music users frequently listen to into clusters rather than genres, as the human categorization of music is largely subjective. Examples of this are evident in Spotify’s Discover Weekly and Daily Mix playlist suggestions. Clustering algorithms like Spotify’s group data based on similarities. Alpaydin describes clustering as an “exploratory data analysis technique where we identify groups naturally occurring in the data” (Alpaydin, 2016, p. 115). Services like Spotify can cluster songs, genres, and even playlist tones in order to train machine learning algorithms to predict preferences and future listening patterns.

Figure 2: How Discover Weekly Works. Source: Pasick, 2015.
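As a rough illustration of the clustering idea (our sketch, not Spotify’s code), the snippet below groups songs by a few hypothetical audio features and treats a seed track’s cluster-mates as recommendation candidates.

```python
# A minimal sketch of clustering songs by audio features so that
# nearby songs can seed recommendations. Features are fabricated.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Hypothetical per-song features, e.g. [tempo, energy, acousticness]
songs = rng.normal(size=(300, 3))

kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(songs)

# Songs in the same cluster as a seed track become recommendation candidates.
seed = songs[0]
seed_cluster = kmeans.predict(seed.reshape(1, -1))[0]
candidates = np.where(kmeans.labels_ == seed_cluster)[0]
print(f"{len(candidates)} songs share cluster {seed_cluster} with the seed track")
```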

Machine learning algorithms in recommender systems are typically classified under two main categories — content-based and collaborative filtering (Johnson, 2014). Traditionally, Spotify has relied primarily on collaborative filtering approaches for its recommendations. This works well for Spotify, as the strategy revolves around determining user preference from historical behavioral data patterns: if two users listen to the same sets of songs or artists, their tastes are likely to align. Christopher Johnson, former Director of Data Science at Spotify, who worked on the launch of Discover Weekly, outlines the differences between the two approaches in his paper on Spotify’s algorithm. According to Johnson, a content-based strategy relies on analyzing factors that are directly associated with the user or product, such as the age, sex, and demographic of the user, or a song’s genre or period (music from the ’70s or ’80s, for example). Recommendation systems based on collaborative filtering instead take consumer behavior data and utilize it to predict future behavior (Johnson, 2014).

This consumer behavior leaves a trail of data, generated through implicit and explicit feedback (Ciocca, 2017). Unlike Netflix, which from its nascence used a 5-star rating system (WSJ, 2018), Spotify relied primarily on implicit feedback to train its algorithm. Examples of user data based on implicit feedback include playing a song on repeat or skipping it entirely after the first 10 seconds. User data is also gleaned from explicit feedback (Pasick, 2015), such as the heart button on Discover Weekly or liked songs that automatically save to the library and the “Liked from Radio” playlist. The myriad ways in which collaborative filtering and recommendation algorithms can be approached are evident in the diagram below (see Figure 3). Spotify uses an amalgamation of four approaches: attribute-based, CF (item by item), CF (user similarity), and model-based.

Figure 3: Recommender Algorithm Approaches. Source: Zheng, 2015.
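The collaborative filtering logic on implicit feedback can likewise be sketched in toy form. All of the play counts below are fabricated; the point is only that songs co-played by the same users score as similar, and unheard songs are then ranked by similarity to a user’s history.

```python
# A toy sketch of item-item collaborative filtering on implicit
# feedback (play counts).
import numpy as np

# rows = users, columns = songs; entries = play counts (implicit feedback)
plays = np.array([
    [12, 0, 3, 0],
    [10, 1, 0, 0],
    [0, 8, 0, 5],
    [0, 6, 1, 7],
], dtype=float)

# Cosine similarity between song columns: songs co-played by the same
# users score high.
norms = np.linalg.norm(plays, axis=0, keepdims=True)
sim = (plays / norms).T @ (plays / norms)

# Recommend for user 0: score unheard songs by similarity to heard ones.
user = plays[0]
scores = sim @ user
scores[user > 0] = -np.inf          # don't re-recommend songs already played
print("recommended song index:", int(np.argmax(scores)))
```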

Spotify further analyzes and applies user data using a matrix decomposition method, also known as matrix factorization. Matrix factorization aims to find answers by ‘decomposing’ the data into two separate segments (Alpaydin, 2016). The first segment defines the user in terms of marked factors, each of which is weighted differently. The second segment maps between factors and products, which in the Spotify universe are songs, artists, albums, and genres, thus defining a factor in terms of the products offered. In his book, Alpaydin elaborates with an example applied to movies. In the following diagram (see Figure 4), each customer has watched only a small percentage of the movies, and each movie has been watched by only a small percentage of the customers. Based on these assumptions, the learning algorithm needs to be able to generalize and predict successfully.

Figure 4: Matrix decomposition for movie recommendations.
Source: Alpaydin, 2016.

Each row of the data matrix X contains a customer’s ratings of movies (or, for the scope of this case study, music). Most of this data, however, is missing, as the customer has not yet watched many of the movies, which is where the recommendation system comes in. The matrix is then factored into two parts, F and G, where F relates customers to factors and G relates factors to movies/music. Spotify uses a matrix factorization application called Logistic Matrix Factorization, or Logistic MF, to generate lists of related artists, for example for artist ‘Radio’ playlists, based on binary preference data (Johnson, 2014). The matrix is built by calculating millions of recommendations based on millions of other users’ behaviors and preferences, an example of which can be seen below in Figure 6.


Each row of this sample matrix represents one of Spotify’s 200 million users, and each column represents one of the 40 million songs in their database.


Figure 6: Spotify Matrix Snapshot. Source: Ciocca, 2017.

The data is then run through a matrix factorization formula, resulting in two sets of vectors, identified in this diagram (see Figure 7) as X and Y. In Spotify’s terms, each X vector represents a user and their preferences, while each Y vector embodies a single song’s profile (Ciocca, 2017).

Figure 7: User/Song Matrix. Source: Johnson, 2015.
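A minimal sketch of the factorization itself is below (our toy gradient-descent version; Spotify’s Logistic MF is more involved). It learns user vectors X and song vectors Y so that X @ Y.T approximates a fabricated binary play matrix, with observed plays weighted more heavily in the spirit of implicit-feedback models.

```python
# A toy matrix factorization: learn user vectors X and song vectors Y
# so their product approximates the (mostly empty) play matrix R.
import numpy as np

rng = np.random.default_rng(0)
n_users, n_songs, k = 20, 30, 4

# Fabricated binary play matrix: 1 = the user streamed the song.
R = (rng.random((n_users, n_songs)) < 0.15).astype(float)

X = 0.1 * rng.normal(size=(n_users, k))   # user factor vectors
Y = 0.1 * rng.normal(size=(n_songs, k))   # song factor vectors

lr, reg = 0.05, 0.01
conf = 1 + 4 * R     # implicit feedback: observed plays get higher confidence
for _ in range(300):
    err = (X @ Y.T - R) * conf
    X -= lr * (err @ Y + reg * X)
    Y -= lr * (err.T @ X + reg * Y)

pred = X @ Y.T       # predicted affinity for every user/song pair
# Recommend the highest-scoring song the user hasn't already played.
print("top new song for user 0:", int(np.argmax(pred[0] - 1e9 * R[0])))
```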

Navigating Key Challenges: ConvNet & NLP

In previous years, Spotify encountered a “cold start problem” (Schrauwen, 2014): when no prior behavioral or user data was available, it was unable to use its existing CF-trained models. Consequently, faced with a “cold start,” Spotify was unable to provide recommendations for brand-new artists or for old or unpopular music. To navigate this, Spotify harnessed convolutional neural networks – known as CNNs or ConvNets – the same deep neural network technology used in facial recognition software. In Spotify’s case, the CNN is trained within the paradigms of audio, conducting raw audio data analysis instead of examining pixels. The audio frames pass through the convolutional layers of the neural network architecture and into a “global temporal pooling layer” (Dieleman, 2014), which computes statistics of the learned features across the course of a single track. By identifying a song’s key characteristics, such as time, tone, and tempo, the neural network “understands” the song, thereby allowing Spotify to identify and recommend similar songs and artists to targeted users (those who display the same past behavioral data), thus determining accuracy. Additionally, for further accuracy, Spotify uses NLP, or Natural Language Processing, analyzing the “playlist itself as a document” (Johnson, 2015), using each song title, artist, or other textual evidence as part of its machine learning recommendation algorithm.
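The kind of architecture described can be sketched, under stated assumptions, as a small 1-D convolutional network over spectrogram frames that ends in a global temporal pooling layer. The shapes and layer sizes below are hypothetical stand-ins; the production model (Dieleman, 2014) is far larger.

```python
# A hedged sketch of a content-based audio model: 1-D convolutions over
# spectrogram frames, global temporal pooling across the track, and a
# dense head predicting a latent song vector usable by the CF system.
import tensorflow as tf

n_frames, n_mel_bins, n_latent = 599, 128, 40   # hypothetical spectrogram shape

model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(64, kernel_size=4, activation="relu",
                           input_shape=(n_frames, n_mel_bins)),
    tf.keras.layers.MaxPooling1D(pool_size=4),
    tf.keras.layers.Conv1D(128, kernel_size=4, activation="relu"),
    # Global temporal pooling: collapse the time axis so the song is
    # represented by track-level feature statistics.
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(n_latent),
])
model.compile(optimizer="adam", loss="mse")
model.summary()
```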

Outliers: This is Not My Jam!

As a byproduct of this training, Spotify is smart enough to recognize and distinguish outliers. Alpaydin explains this as another application area of machine learning, termed outlier detection, “where the aim this time is to find instances that do not obey the general rule—those are the exceptions that are informative in certain contexts” (Alpaydin, 2016, p. 72). For example, let’s imagine I recently watched Bohemian Rhapsody, the Queen movie, and happened to listen to a song by the band once, deviating from my usual stream of microgenres such as nu-disco, house, and electro-funk. If Spotify, based on that outlier, now kept sending me recommendations for Queen or other ’70s bands, I as a user might not obtain high satisfaction from the service, losing interest in it and feeling that Spotify doesn’t “get” me. In the diagram below (see Figure 8), the user in question has a taste profile that primarily consists of funk/soul, indie folk, and folk. The outlier in this case is a kids’ song, perhaps played for the author’s daughter a few times. The algorithm must be trained to follow the trail of pattern recognition while setting aside such outliers in the recommendation algorithm.

Figure 8: Spotify core preference diagram. Source: Pasick, 2015
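A toy version of that outlier check might look like the following: flag plays that sit far from the user’s core taste region so a one-off song doesn’t steer recommendations. The data and the three-sigma threshold are fabricated for illustration.

```python
# A toy sketch of outlier detection on a listening history.
import numpy as np

rng = np.random.default_rng(0)
core = rng.normal(loc=0.0, scale=1.0, size=(100, 2))   # usual micro-genre region
kid_song = np.array([[8.0, 8.0]])                      # one-off play, far away
history = np.vstack([core, kid_song])

center = history.mean(axis=0)
dist = np.linalg.norm(history - center, axis=1)
threshold = dist.mean() + 3 * dist.std()               # simple three-sigma rule

outliers = np.where(dist > threshold)[0]
print("flagged outlier plays:", outliers)   # the kids' song should be flagged
```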

Ethical Implications 

Similar to the manner in which deep neural networks established paradigms for a “good” selfie by virtually eliminating people of color in a ConvNet training experiment (Karpathy, 2015), the defined parameters of recommendation algorithms can have a larger effect on music. The potential shortfalls of collaborative filtering become rampant when a machine learning design is trained only to exhibit certain results based on preexisting pattern recognition. Although Spotify aims to neutralize this by factoring in other methods of data analysis, machine learning recommendation algorithms can still bury other data, or, in Spotify’s case, other music, based on probabilistic inferences and predictions.

Conclusion

In tandem with advances in technology and media affordances, the future implications of machine learning include more personalized, immersive user experiences with progressively complex features. With recommendation algorithms choosing what content we watch, what we listen to, and even our romantic relationships (du Sautoy, 2018), guiding users towards certain choices and away from others eliminates free choice, so to speak, ‘pigeonholing’ users. It is important to remember, however, that ultimately these algorithms are trained and designed by people. Despite the often hyperbolic coverage it receives, the overarching umbrella of AI in and of itself relies heavily on machine learning, and therefore on ML fairness. Marcus claims that “the logic of deep learning is such that it is likely to work best in highly stable worlds” (Marcus, 2018). However, in today’s world of fluid musical genres, and especially when applying the concepts of pattern recognition, machine learning, and collaborative filtering, most user-generated data is still subjective: a microcosm of the larger sociotechnical system we live in.

Works Cited

Alpaydin, E. (2016). Machine Learning. Cambridge, Massachusetts. The MIT Press.

Ciocca, S. (2017). How Does Spotify Know You So Well? Medium. https://medium.com/s/story/spotifys-discover-weekly-how-machine-learning-finds-your-new-music-19a41ab76efe

Ek, D. (2019). The Path Ahead: Audio First. Spotify Blog. https://newsroom.spotify.com/2019-02-06/audio-first/

HBS (2018). Spotify May Know You Better Than You Realize. Harvard University. Retrieved from https://digit.hbs.org/submission/spotify-may-know-you-better-than-you-realize/

Johnson, C. (2014). Algorithmic Music Recommendations at Spotify.

Johnson, C. (2014) Logistic Matrix Factorization for Implicit Feedback Data. Spotify https://web.stanford.edu/~rezab/nips2014workshop/submits/logmat.pdf

Johnson, C. (2015). From Idea to Execution: Spotify’s Discover Weekly. Retrieved from https://www.slideshare.net/MrChrisJohnson/from-idea-to-execution-spotifys-discover-weekly

Karpathy, A. (2015) “What a Deep Neural Network Thinks About Your #selfie,”  http://karpathy.github.io/2015/10/25/selfie/

Iyengar, S. (2011). How to Make Choosing Easier. TED, New York. https://www.ted.com/talks/sheena_iyengar_choosing_what_to_choose

Marcus, G. (2018).”Deep Learning: A Critical Appraisal” ArXiv.Org.

McKinsey (2013). How Retailers Can Keep Up With Consumers.
https://www.mckinsey.com/industries/retail/our-insights/how-retailers-can-keep-up-with-consumers

Navisro Analytics (2012). Collaborative Filtering and Recommendation Systems https://www.slideshare.net/navisro/recommender-system-navisroanalytics

Pasick, A. (2015). The Magic That Makes Spotify’s Discover Weekly Playlists So Damn Good. Quartz. https://qz.com/571007/the-magic-that-makes-spotifys-discover-weekly-playlists-so-damn-good/

Ramachandran, S., & Flint, J. (2018). At Netflix, Who Wins When It’s Hollywood vs. the Algorithm? The Wall Street Journal.

Schrauwen, B., & Oord, A. van den (2014). Deep Content-Based Music Recommendation. Ghent University. https://papers.nips.cc/paper/5004-deep-content-based-music-recommendation.pdf

Zheng, Y. (2015). Matrix Factorization In Recommender Systems. DePaul University.


Gender Bias & Artificial Intelligence: Questions and Challenges

By Deborah Oliveros

  • Abstract
  • Introduction
  • A (Unnecessarily) Gendered Representation of Technology
  • A ‘Knowing’ Subject
  • Replicating Human Biases
  • Challenges of Embedded Bias in AI
  • Possible Approaches to Addressing Bias in AI
  • Conclusion
  • Bibliography

Abstract

This essay analyzes the different ways in which bias is embedded in AI, machine learning, and deep learning, addressing the challenges from a design perspective to understand how these systems succeed in some respects and fail in others. Specifically, the paper focuses on gender bias and on how the gendered representation of technology in mass media and marketing is an obstacle not only to understanding human interaction with these technologies but also a vehicle for replicating historical biases present in society. Lastly, the analysis presents possible approaches for addressing gender bias in AI.

Introduction

Over the last couple of years, prominent figures and big companies in Silicon Valley have participated in a public debate around the benefits of, and concerns about, artificial intelligence and machine learning and their possible consequences for humanity. Some embrace technological advancement openly, advocating for a non-limiting environment and arguing that anything else would prevent progress and innovation in the field. Others offer warnings similar to those of a sci-fi dystopian film; they argue that artificial intelligence could be an existential threat to humanity if -or more likely when- machines become smarter than humans. Although the possibility of a world controlled by machines like the ones presented in The Matrix (Wachowski sisters, 1999), Blade Runner (Scott, 1982), and 2001: A Space Odyssey (Kubrick, 1968) is unsettling and borderline terrifying, there are more urgent and realistic questions to address around AI and its impact on society.

2001: A Space Odyssey (Stanley Kubrick, 1968)

Machines learn based on sets of data that humans ‘feed’ them. If machines are learning how to imitate human cognitive processes, what kind of human behaviors, social constructions, and biases are these machines picking up and replicating based on the data provided?

There is a long history of technology designed with unnecessary, deterministic biases built into it: the famous low bridges in New York that prevented minorities from using public transportation to go to the beach; the long-perpetuated ‘flesh’ labelling of crayon colors, band-aids, paint, and ballerina shoes; and the famous case of Kodak’s Shirley Cards, used by photo labs to calibrate skin tones, shadows, and light when printing color film, which made it impossible to print the facial expressions and details of darker-skinned people, among others.

Kodak Shirley card, 1974

We could not expect anything different from the replication of these patterns when it comes to artificial intelligence and machine learning. In this case, both the design of the technology and the data we feed into the machines are primary factors. There is a systemic, systematic, racist, sexist, gendered, class-oriented -and other axes of discrimination- bias embedded in most data collected by humans, and those patterns and principles are picked up and replicated by the machines by design. Therefore, instead of erasing divisions through objectivity in decision making, this process exacerbates inequality in the workplace, the legal and judicial systems, and other spaces of public life in which minorities interact, making it even more difficult to escape.

The data fed to the machines is diverse: images, text, audio, etc. The decision of what data is fed to the machine, and how to categorize it, is entirely human. From this, the system builds a model of the world that it accepts as a unique and stable reality. That is, only what is represented by the data has meaning attached to it, without room for other ways of ‘being’ in the world. For example, a facial recognition system trained on data in which successful candidates for a job position are overwhelmingly white men will struggle to pick up on others who don’t fit those categories.

Police departments have also used data-driven systems to assess the probability of crime occurring in different areas of a city and, as discussed above, this data is polluted with systemic racism and class discrimination against minorities. The immediate consequence is over-policing of low-income areas and under-policing of wealthy neighborhoods. This creates and perpetuates a biased cycle but, more importantly, it creates a false illusion of objectivity and shifts responsibility from the human to the machine. As Crawford says, “predictive programs are only as good as the data they are trained on, and that data has a complex history” (Crawford 2016, June 26).

A (Unnecessarily) Gendered Representation of Technology

It is challenging to analyze how we perceive something that is invisible to us, not only physically but also cognitively. Two aspects need to be taken into account to get to the root of why the general public does not fully understand how these systems work: the lack of transparency from companies about how these systems make data-driven decisions, due to intellectual property and market competition; and the gendered marketing of these technologies to users, in combination with a gendered representation in pop culture media that is not only inaccurate but misleading. Let’s start by addressing the latter.

For decades, visually mediated spaces of representation, such as movies and TV in the sci-fi genre, have delved into topics of technology and sentient machines. Irit Sternberg states that these representations tend to ‘gender’ artificial intelligence as female and rebellious: “It goes back to the mother of all Sci-Fi, “Metropolis” (Lang, 1927), which heavily influenced the futuristic aesthetics and concepts of innovative films that came decades later. In two relatively new films, “Her” (Jonze, 2013) and “Ex-Machina” (Garland, 2014), as well as in the TV-series “Westworld” (2016), feminism and AI are intertwined” (Sternberg 2018, October 8).

Alicia Vikander in Ex Machina (Garland, 2014)

These depictions present a gender power struggle between AI and humans, which is at times problematic and at others empowering: “In all three cases, the seductive power of a female body (or voice, which still is an embodiment to a certain extent) plays a pivotal role and leads to either death or heartbreak” (Ibid). This personification of AI invites the viewer to associate the technology with a power struggle that already exists in our own historical context, which in turn makes it difficult for the general public to go beyond the superficial explanations of a typical tech news article that fails to address how these technologies work at a conceptual level. The over-generalizing, paranoid headline seems catchier than an informative analysis. On the other hand, the representation of agency in a female-gendered AI offers the imagined possibility that, through technology, systematic patriarchal oppression can be challenged and surpassed by the oppressed.

In spite of these manifestations of gender roles combined with AI, the reality is far from empowering: gender discrimination in algorithms is present in many spaces of social life. Even more problematic than the fictional representations are the non-fictional representations of technology, and in particular AI, as gendered.

AIs are marketed with feminine identities, names, and voices. Examples such as Alexa, Siri, and Cortana demonstrate this: even though they allow male voices, the fact that the default setting is female speaks loudly. Another example is the female humanoid robot Sophia, developed by Hanson Robotics in Hong Kong, built as a representation of a white, slender woman with no hair (enhancing her humanoid appearance) and, inexplicably, with makeup on her lips, eyes, and eyebrows. Sophia is the first robot to receive citizenship of any country (Saudi Arabia); it was also named the United Nations Development Programme’s first-ever Innovation Champion, making it the first non-human to be given any United Nations title.

Sophia The Robot.

These facts are mind-boggling. As Sternberg asks, “why is it that a feminine humanoid is accepted as a citizen in a country that would not let women get out of the house without a guardian and a hijab?” (Sternberg 2018, October 8). What reaction do engineers and builders assume the female presence and identification generates during human-machine interaction?

Sternberg says that fictional and real decisions to choose feminine characters are replicas of gender relations and social constructs that already exist in our society: “does giving a personal assistant feminine identity provide the user (male or female) with a sense of control and personal satisfaction, originating in the capability to boss her around?” (Ibid). As a follow-up question: is that what we want the machines to learn and replicate?

A ‘Knowing’ Subject

Artificial intelligence (with machine learning and deep learning as subcategories) is built and designed to acquire and process human knowledge and to improve its categorization decisions over time.

Gary Marcus says, “Deep learning systems are most often used as classification system in the sense that the mission of a typical network is to decide which of a set of categories (defined by the output units on the neural network) a given input belongs to. With enough imagination, the power of classification is immense; outputs can represent words, places on a Go board, or virtually anything else. In a world with infinite data, and infinite computational resources, there might be little need for any other technique” (p. 4).

However, the data in our world is never infinite, and it does not necessarily have a definite, unchanging meaning or interpretation, which limits the scope of AI and machine learning and their accuracy in representing the reality of that world: “Instead, systems that rely on deep learning frequently have to generalize beyond the specific data that they have seen, whether to a new pronunciation of a word or to an image that differs from one that the system has seen before, and where data are less than infinite, the ability of formal proofs to guarantee high-quality performance is more limited” (Ibid).

As stated before, these systems will know what we teach them, and the nature of that knowledge, and the power dynamics surrounding it, are inherently problematic. Early feminist theorists and social critics raised questions about how knowledge informs the identity and ‘world view’ of the ‘knowing subject’, offering contrasting takes on gender, class, and racial determinism while also presenting the possibility of “un-situated gender-neutral knowledge (“a view from nowhere”) or lack thereof” (Sternberg 2018, October 8).

Critics also pointed out how ambitious projects designed around mastering expertise and knowledge about a topic might be tainted by said ‘expertise’, taking into consideration the origin of the ‘expert’ knowledge being fed to the machines: “the role of the all-male-all-white-centuries-old-academia in defining what knowledge is valuable for a machine to master and what is expertise altogether” (Ibid).

All of these characteristics must be put in conversation with the fact that we are at the very early stages of AI. Yet even in its infancy, AI and machine learning are already impacting the way we function as a society, not only technologically but also in social, health, military, and employment contexts.

Replicating Human Biases

A group of researchers from Princeton University and the University of Bath conducted a study testing how ordinary human language, applied to machine learning, results in human-like semantic biases. For this experiment, the authors replicated a set of historically known biased associations between terms, “using a […] purely statistical machine-learning model trained on a standard corpus of text from the Web” (Caliskan, Bryson & Narayanan 2017, p. 2). “Our results (fig. 1) indicate that text corpora [the machine learning system that was tested] contain recoverable and accurate imprints of our historic biases, whether morally neutral as towards insects or flowers, problematic as towards race or gender, or even simply veridical, reflecting the status quo distribution of gender with respect to careers or first names” (Ibid).

They tested various dichotomous terms that are considered systematically stereotypical, demonstrating that the terms carry an underlying historical and contextual weight that the machine may not process as such but nevertheless replicates: “these results lend support to the distributional hypothesis in linguistics, namely that the statistical contexts of words capture much of what we mean by meaning. Our work suggests that behavior can be driven by cultural history embedded in a term’s historic use. Such histories can evidently vary between languages” (p. 9).

(Fig. 1) Caliskan, A, Bryson, JJ & Narayanan, A (2017)
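The core of the association test can be illustrated in a few lines: compare cosine similarities between a target word and two attribute words. The 3-d “embeddings” below are fabricated toy vectors arranged to mimic the male-association pattern the study recovered from real web-trained embeddings; they are our illustration, not the authors’ data.

```python
# A simplified sketch of the word-association test idea: a word's bias is
# measured as the difference of its cosine similarities to two attribute
# words. Toy vectors only.
import numpy as np

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

emb = {
    "he":         np.array([1.0, 0.1, 0.0]),
    "she":        np.array([-1.0, 0.1, 0.0]),
    "programmer": np.array([0.8, 0.5, 0.1]),   # leans toward the "he" direction
    "nurse":      np.array([-0.7, 0.6, 0.1]),  # leans toward the "she" direction
}

for word in ["programmer", "nurse"]:
    bias = cos(emb[word], emb["he"]) - cos(emb[word], emb["she"])
    print(f"{word}: male-association score = {bias:+.2f}")
```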

Machine learning technologies are already being used in many contexts in which these biases deeply impact minorities, and specifically women. They are used for resume screening, reproducing cultural stereotypes and prejudiced outcomes around gender-profession perceptions. Another study, from Carnegie Mellon University, found that women were less likely than men to be shown Google ads for highly paid jobs (Amit Datta, Michael Carl Tschantz, Anupam Datta, 2015).

Karen Hao, for the MIT Technology Review, looks at a study by Muhammad Ali and Piotr Sapiezynski at Northeastern University that analyzed how variations in an ad affect the audience it reaches, finding that those variations do have an impact and that, unsurprisingly, the decision of who is shown each ad is biased.

Hao says, “bias occurs during problem framing when the objective of a machine-learning model is misaligned with the need to avoid discrimination. Facebook’s advertising tool allows advertisers to select from three optimization objectives: the number of views an ad gets, the number of clicks and amount of engagement it receives, and the quantity of sales it generates. But those business goals have nothing to do with, say, maintaining equal access to housing. As a result, if the algorithm discovered that it could earn more engagement by showing more white users homes for purchase, it would end up discriminating against black users” (Hao 2019, February 4).

However, Hao also explains that the problem cannot be reduced to biased data: “bias can creep in long before the data is collected as well as at many other stages of the deep-learning process” (Hao 2019, April 5). She refers specifically to three stages:

  1. Framing the problem: the goal that the designer plans to achieve and its context might not take into account fairness or discrimination
  2. Collecting the data: “either the data you collect is unrepresentative of reality, or it reflects existing prejudices” (Ibid)
  3. Preparing the data: “selecting which attributes you want the algorithm to consider” (Ibid)

Challenges of Embedded Bias in AI

Gary Marcus offers a very detailed critique of the field in “Deep Learning: A Critical Appraisal”. His article is an intentionally self-introspective snapshot of the current state of deep learning, looking not only at how much has been accomplished but also at where the approach has failed and what that suggests for future approaches to deep learning.

He says, “deep learning currently lacks a mechanism for learning abstractions through explicit, verbal definition, and works best when there are thousands, millions or even billions of training examples, as in DeepMind’s work on board games and Atari. As Brenden Lake and his colleagues have recently emphasized in a series of papers, humans are far more efficient in learning complex rules than deep learning systems are (Lake, Salakhutdinov, & Tenenbaum, 2015; Lake, Ullman, Tenenbaum, & Gershman, 2016).” (p. 7)

As mentioned many times before, deep learning struggles to produce outputs that accurately reflect complex human concepts that are difficult to represent computationally as a set of yes-and-no answers, whether translating languages or representing more abstract concepts such as justice. Referring to my personal favorite example, open-ended natural language, Marcus says, “In a problem like that, deep learning becomes a square peg slammed into a round hole, a crude approximation when there must be a solution elsewhere” (p. 15).

Another observation in Marcus’ analysis concerns approaches that treat the real world as a ‘set in stone’ reality: “deep learning presumes a largely stable world, in ways that may be problematic: The logic of deep learning is such that it is likely to work best in highly stable worlds, like the board game Go, which has unvarying rules, and less well in systems such as politics and economics that are constantly changing” (p. 13).

Not only are the world and our knowledge of it constantly changing, but our representation of that reality through data is, most of the time, inaccurate at best and skewed at worst. To what extent, and in what different ways, can we see the impact of such flawed outputs? Sternberg presents two aspects to take into consideration:

  • What exists in the data might be a partial representation of reality:

Even that partial representation might not be entirely accurate. For example, the previously mentioned inability of Kodak’s film to capture non-white skin tones efficiently is mirrored in facial recognition systems. More recent controversial cases include systems mistaking pictures of Asian people as ‘blinking’ and identifying Black people as gorillas. The social cost of a mistake in an AI system used by police for decision-making is higher, and such a system is more likely to produce less accurate results for minorities, since they were underrepresented and misrepresented in the data set: “This also calls for transparency regarding representation within the data-set, especially when it is human data, and for the development of tests for accuracy across groups” (Sternberg 2018, October 8).

  • Even if the data does represent reality quite truthfully, our social reality is not a perfectly-balanced and desired state that calls for perpetuation:

Gender and racial biases present in this binary terminology are, after all, based on statistics from the offline world, well-documented in history. However, here Sternberg offers an optimistic perspective: “our social reality is not a perfectly-balanced and desired state that calls for perpetuation” (Ibid). That is, in this process we are treating the data as deterministic when these are not the ideas, concepts, and human values we should be preserving or basing our technology on.

In that regard, Sternberg criticizes absolute faith in the outcomes of these systems and the tendency to regard them as more objective than humans: “Sexism, gender inequality and lack of fairness arise from the implementation of such biases in automation tools that will replicate them as if they were laws of nature, thus preserving unequal gender-relations, and limiting one’s capability of stepping outside their pre-defined social limits” (Ibid).

Marcus’ premise agrees with Sternberg’s, focusing on the problem of treating deep learning as the only tool available for understanding and digitizing the world when, in reality, this tool might not fit every problem we want to fix: “the real problem lies in misunderstanding what deep learning is, and is not, good for. The technique excels at solving closed-end classification problems… And some problems cannot, given real world limitations, be thought of as classification problems at all” (p. 15).

What is most thought-provoking about Marcus’ article is the proposal to see deep learning beyond the fixed box through which every human problem or thought must be filtered: to understand that we must develop other, hybrid ways of analyzing these problems beyond classification, instead of trying to make the “square peg fit into the round hole” (p. 15).

Possible Approaches to Addressing Bias in AI

Now we’ll address the remaining challenge presented a few sections above: it is difficult to analyze our perception of, and consequently our actions regarding, something that is invisible to us. The lack of transparency from companies about how these systems make data-driven decisions, due to intellectual property and market competition, is one of the main reasons why we don’t have access to this knowledge.

However, even if companies were coerced into sharing this information with the general public or authorities, the reality is that artificial intelligence is not only extremely young but evolving as we speak. It can therefore be said that the use of these technologies is a process of creation and discovery at the same time. Based on what is public, engineers don’t fully know or understand how artificial intelligence learns and evolves over time. And whatever they do know, they are not willing to share because of the conditions of the market in which they operate.

Crawford explains, in regard to the case of women not seeing ads for high-paying jobs: “the complexity of how search engines show ads to internet users makes it hard to say why this happened — whether the advertisers preferred showing the ads to men, or the outcome was an unintended consequence of the algorithms involved. Regardless, algorithmic flaws aren’t easily discoverable: How would a woman know to apply for a job she never saw advertised? How might a black community learn that it was being overpoliced by software?” (Crawford 2016, June 26).

Among the social actors invested in and able to influence how these technologies are managed, governments, NGOs, and other entities all have a stake in the outcomes of artificial intelligence and machine learning. Unfortunately, in the environment of information scarcity described above, they all pretty much operate ‘in the dark’ or, at least, at various levels of ‘darkness’.

A great example of how unprepared our politicians are to deal with this reality and to attempt to hold tech companies accountable came a few months ago in the House Judiciary Committee. During the hearing of Google CEO Sundar Pichai, the members of the committee spent more time passive-aggressively asking embarrassingly ignorant questions with a clear partisan tone than asking urgent and appropriate questions about Google’s data policies and privacy practices. At one point, Pichai had to repeatedly explain that the iPhone is a product of Apple, a different company from Google, and the collective groan of humanity could be heard across the globe.

It is clear that regulation and outside audits are necessary to address gender bias in AI. However, it seems unlikely that anything even remotely close to a proposal will make its way to Congress anytime soon, let alone pass as a bill. There is therefore a need to find alternative ways for actors to collaborate and share information toward the common goal of fixing and preventing the perpetuation of historical bias in AI. Evidently, the ones with the most power to enact change are the engineers and the companies themselves.

The authors of the language-based study from Princeton University and the University of Bath offer: “we recommend addressing this through the explicit characterization of acceptable behavior. One such approach is seen in the nascent field of fairness in machine learning, which specifies and enforces mathematical formulations of non-discrimination in decision-making (19, 20). Another approach can be found in modular AI architectures, such as cognitive systems, in which implicit learning of statistical regularities can be compartmentalized and augmented with explicit instruction of rules of appropriate conduct (21, 22)” (Caliskan, A, Bryson, JJ & Narayanan, A 2017).

But how can a solution like this be enforced and regularly supervised? We need organizations that address issues of technology and human rights to serve as intermediaries between companies and civil society, as they have done since the creation of the internet.

If machines are going to replicate humans, what kind of human do we need them to be? This threat is more present, and already underway, than the dystopian apocalypse of the old Frankenstein tale, in which humanity is decimated by its own creation. As Kate Crawford wrote in the New York Times, the existential threat of a world overtaken by machines rebelling against humans might be frightening to the white male elite that dominates Silicon Valley, “but for those who already face marginalization or bias, the threats are here” (Crawford 2016, June 26).

 Conclusion

Gender bias in AI, machine learning, and deep learning is the result of the replication by design of deeply systemic, systematic biases (racist, sexist, gendered, class-oriented, and along other axes of discrimination) embedded in most data collected by humans. Instead of erasing divisions through objectivity in decision making, this process exacerbates inequality in the workplace, in the legal and judicial systems, and in other spaces of public life in which minorities interact. It is compounded by an inaccurate and gendered representation of technology in pop culture media and in marketing, which makes it more difficult for the general public to become aware of and understand how these technologies work and their impact on our day-to-day lives. Bias can be introduced by how the problem is framed, how the data is collected, and what meanings are attributed to that data (Hao 2019, April 5). Fixing gender bias in AI is a complex issue that requires the participation of all stakeholders: companies, designers, marketing teams, tech reporters, intermediary collective organizations that advocate for civil society, and politicians. The major challenges, however, boil down to the lack of transparency about how these systems make decisions and to treating them as the only filter through which every abstract human problem can be solved.

Bibliography

Machine Learning & Algorithmic Music Composition

Abstract

Machine learning and algorithmic systems are not foreign to the field of music composition. Researchers, musicians, and aspiring artists have used algorithmic music composition as a tool for music production for many years, and as technology advances, so do our understandings of the art that algorithms output and the implications that come with it. Who owns the art? Is it creative? This research paper explores the way machine learning and algorithms are used to implement music compositional systems, as well as the discourse that exists, and will grow in the coming years, around the accessibility of this technology. Case studies such as Magenta’s NSynth system and Amper’s system will be examined to narrate both the support for and disapproval of algorithms for music composition.

Introduction

The process and study of algorithmic music composition has been around for centuries; algorithmic music is understood as a set of rules or procedures followed to put together a piece of music (Simoni). These algorithms can be simple or complicated, and historically they were manual, predictive styles of music composition. More recently, however, research on computational, machine-learned processes of music creation has become prevalent. How is machine learning applied to the field of music production? For the purposes of this paper, concepts of music theory and musical genre will be set aside, because the discourse being expanded upon is algorithmic music composition in relation to technology. Computer processing has made the methodical procedures of algorithmic composition more sophisticated and complex (“Algorithms in Music”).

Technical Overview

Though algorithmic music composition systems differ in function based on how the technology is used (e.g., a tool for creation versus a system that generates a piece at random), they share the same internal architecture. Machine learning is a set of techniques and algorithms that carry out certain tasks within the artificial intelligence system they are designed for, and machine learning researchers are primarily interested in data-driven algorithms. In technological-algorithmic music composition, defined as the creation of methodological procedures (“Algorithms in Music”), data is collected from hundreds of types of music and coded into different categories so that data retrieval on the machine’s end is organized and automated once an input requests the machine to output content. This process is classification: a type of algorithm that sorts features of data into categories. If an algorithm were implemented for an artist to use technology as a tool to create a machine-generated melody, the algorithm would look through the scanned and classified data it has on melodies and produce a melody that not only borrows sonic elements from the data it has, but is also sonically representative of a combination of the data it has learned.
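
To make the classification step concrete, here is a minimal sketch in Python, assuming hypothetical hand-picked features (tempo, average pitch, note density) and scikit-learn’s k-nearest-neighbors classifier. Real systems learn far richer representations from raw audio; this illustrates the idea only, not any particular system’s pipeline.

```python
# Toy illustration of the "classification" step described above:
# audio excerpts are reduced to numeric features and sorted into
# categories so the system can later retrieve, say, "melodies".
# Features and labels are invented for demonstration.
from sklearn.neighbors import KNeighborsClassifier

# Each row: [tempo_bpm, avg_pitch_hz, note_density] -- made-up values.
training_features = [
    [120, 440.0, 0.8],   # excerpt labeled as a melody
    [124, 392.0, 0.7],
    [90,  110.0, 0.2],   # excerpt labeled as a bassline
    [85,  98.0,  0.3],
]
training_labels = ["melody", "melody", "bassline", "bassline"]

model = KNeighborsClassifier(n_neighbors=3)
model.fit(training_features, training_labels)

# A new, unlabeled excerpt is assigned to the nearest category.
print(model.predict([[118, 415.0, 0.75]]))  # -> ['melody']
```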

One architecture for generating new data of this kind is the generative adversarial network (GAN), a deep neural network made up of two other networks that feed off of one another (“A Beginner’s Guide to GANs”). One network, the generator, generates the new data, while the other, the discriminator (a discriminative algorithm), evaluates an input for authenticity (“A Beginner’s Guide to GANs”). The generating network begins by producing the requested content (in this case, a musical component) at random; soon enough, the discriminator network feeds information back to the generating network by critiquing what is being produced (“A Beginner’s Guide to GANs”). From there, the generating network fine-tunes what it generates until the discriminator’s critiques lessen, which suggests the generator has produced something well-formed enough that the discriminator can no longer reliably tell it apart from real work (“A Beginner’s Guide to GANs”).
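
The adversarial loop itself can be sketched in a few dozen lines. The following toy illustration in PyTorch assumes made-up 1-D “audio” vectors (noisy sine waves) and invented layer sizes; it shows the generator/discriminator dynamic the paragraph describes, not the architecture of any production music system.

```python
# Minimal GAN training loop on toy 1-D "audio" vectors.
import torch
import torch.nn as nn

SAMPLE_LEN, NOISE_DIM = 64, 16

generator = nn.Sequential(
    nn.Linear(NOISE_DIM, 128), nn.ReLU(), nn.Linear(128, SAMPLE_LEN), nn.Tanh()
)
discriminator = nn.Sequential(
    nn.Linear(SAMPLE_LEN, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1), nn.Sigmoid()
)

loss_fn = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def real_batch(n=32):
    # Stand-in for real training audio: noisy sine waves.
    t = torch.linspace(0, 6.28, SAMPLE_LEN)
    return torch.sin(t) + 0.05 * torch.randn(n, SAMPLE_LEN)

for step in range(2000):
    # 1) Discriminator learns to tell real samples from generated ones.
    real = real_batch()
    fake = generator(torch.randn(32, NOISE_DIM)).detach()
    d_loss = loss_fn(discriminator(real), torch.ones(32, 1)) + \
             loss_fn(discriminator(fake), torch.zeros(32, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # 2) Generator adjusts until the discriminator is fooled --
    #    the "lessening critiques" the text describes.
    fake = generator(torch.randn(32, NOISE_DIM))
    g_loss = loss_fn(discriminator(fake), torch.ones(32, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```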

 

Functionality and Utility

Visual of raw audio data and the hundreds of components that make up each sonic component within a 1-second audio file (“WaveNet”)

As an application to automated music production, there has been heavy discourse regarding GANs and their approachability in research on audio specifically, in comparison to the more common case of digital images (“The NSynth Dataset”). Audio signals are harder to code and classify because of the properties that make up sound; previous efforts to synthesize data-driven audio were limited by subjective forms of categorization, such as textural sonic make-up, or by training small parametric models (“The NSynth Dataset”). Researchers on Google’s Magenta team created NSynth, an open-source audio dataset that contains over 300,000 musical notes, each representing different pitches, timbres, and frequencies (“The NSynth Dataset”). The creation of NSynth was Magenta’s attempt at making audio datasets for generative models as approachable and accessible as possible, without the usual technical limitations (“The NSynth Dataset”). By making this technology more accessible, the developers at Magenta were also developing new ways for humans to use technology as a tool for human expression (“NSynth”). NSynth uses deep neural networks to create sounds as authentic and original as human-synthesized sounds by mimicking WaveNet, a deep-learning generative model of raw audio waveforms which generates sound (speech or music) that mimics the original source. WaveNet works as a convolutional neural network in which the input passes through various hidden layers to generate an output as close as possible to the input (“WaveNet”).
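
The phrase “various hidden layers” glosses over WaveNet’s key trick: dilated causal convolutions, in which each output sample may depend only on past samples and the receptive field doubles at each layer. A rough sketch follows, with illustrative channel counts and depths that are not DeepMind’s actual configuration.

```python
# Sketch of stacked dilated *causal* 1-D convolutions: left-padding
# ensures no future sample leaks into the present, and doubling
# dilations grow the receptive field exponentially with depth.
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    def __init__(self, channels, dilation):
        super().__init__()
        self.pad = dilation  # left-pad so the convolution stays causal
        self.conv = nn.Conv1d(channels, channels, kernel_size=2,
                              dilation=dilation)

    def forward(self, x):
        x = nn.functional.pad(x, (self.pad, 0))
        return torch.relu(self.conv(x))

# Dilations 1, 2, 4, 8 give a receptive field of 16 past samples.
layers = nn.Sequential(*[CausalConv1d(8, d) for d in (1, 2, 4, 8)])
waveform = torch.randn(1, 8, 16000)   # (batch, channels, samples)
print(layers(waveform).shape)         # torch.Size([1, 8, 16000])
```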

Demonstration of the WaveNet system of inputting and outputting media (“WaveNet”)

Magenta describes its WaveNet-inspired model as a compression of the original data, whose output is as similar as possible to the input (“NSynth”). Below are images provided by Magenta that demonstrate the process of input audio being coded, classified, and compressed, and then output as a reconstructed sound that resembles the original:

The process of GANs at work to reproduce an inputted sound (“NSynth”)

Here is a clip of the original bass audio (“NSynth”).

In this audio clip, the original bass audio is embedded, compressed, and then reconstructed as the following output (“NSynth”).
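
A toy autoencoder makes the compress-then-reconstruct idea concrete: encode each frame to a small code, decode it back, and train to minimize the difference between input and output. Everything below (layer sizes, stand-in data, training loop) is invented for illustration and is not NSynth’s actual architecture.

```python
# Encode to a small code, decode back, and train so the output is
# "as similar as possible to the input", as the text puts it.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(64, 8))    # 64 samples -> 8-dim code
decoder = nn.Sequential(nn.Linear(8, 64))    # code -> reconstruction
opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()],
                       lr=1e-3)

x = torch.randn(256, 64)                     # stand-in audio frames
for _ in range(500):
    recon = decoder(encoder(x))
    loss = nn.functional.mse_loss(recon, x)  # reconstruction error
    opt.zero_grad(); loss.backward(); opt.step()
```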

Algorithmic music composition can be used in various ways, just as other forms of art can be produced by technology depending on how the technology is implemented for content creation. Some algorithmic music composition systems are used as tools to create sounds generated from a pool of data reflecting other sounds, similarly to the way Magenta’s NSynth works, while others function as stand-alone systems that output an entire generated track. Researchers at Sony used Flow Machines software, which houses 13,000 pieces of musical track data, to create a song that mimics the work of The Beatles (Vincent). The track “Daddy’s Car” was produced by composer Benoît Carré, who input the desired style of music and wrote the lyrics (Vincent). Though multi-layered and complex, Sony’s experiment sets a standard for both the best and worst of algorithmic music composition. On one hand, it demonstrates that a machine-capable music creator is possible. On the other, legal questions arise regarding copyright and artistic autonomy: an AI that generates a track from scratch, simply from a requested genre, draws on data learned from many artists, threatening artists’ rights to their songs and making it difficult to determine the origins of the borrowed or influential musical components of the newly composed AI track.

Regarding the generative adversarial networks that algorithmic music composition systems run on, there comes a point at which the data collected in the machine learning and classification process is only representative of music that is similar to itself. In other words, a computer scientist can build an algorithm for an AI to produce a song but feed it musical data from only 1,000 songs, all of which are sonically similar. Even if those 1,000 songs are randomly selected for classification and deep learning, if they are similar in genre, whatever song is machine-produced will only reflect what the GAN’s generator takes music to be, a notion based on what the network most commonly observes as the structure of music. More avant-garde tracks will not be represented in what algorithmic music composition systems are capable of. On this understanding, a general ‘standard’ for what music is, set by how the generator in the GAN functions, will produce tracks that are broadly similar to one another. This is evident not only in how algorithmic music composition systems can be generic and lack substance (Deahl), but also in how more avant-garde, non-generic tracks are more creative and multi-dimensional in the quality of the art and the meaning behind it.

Ethical Considerations

Many ethical considerations surround the discourse of creativity and artificial intelligence. Art is, and has been, considered a personal, human-to-human experience in which an artist creates a body of work to express an idea or emotion to an audience. From there, the art finds its place in a broader cultural context, where it provides meaning and metaphorical reflection. This is creativity, defined as implementing original ideas based on one’s imagination. The term “imagination” generally denotes the human experience and the human mind. Once technology is brought into the mix of artistic creation, whether as a tool for the artist or as a stand-alone machine that generates content through algorithms, cultural discussions outside the realm of computer science begin to question whether technology-influenced art is “creative” or “art.” Much discussion of AI and machines already involves misunderstandings and misconceptions, with media portrayals, often by people unfamiliar with how AI is developed and how it functions, casting AI as autonomous beings that will take over the human race. Though the threat of automation does exist, people fail to understand that AI and machines are not autonomous. They are not self-thinking beings. The actions performed by machines are a product of human development and of language and actions coded into algorithms as a reflection of the human experience. Perhaps people are afraid to approach the fundamental aspects of artificial intelligence, or find de-blackboxing it unappealing. In the past 20+ years, however, our knowledge of natural language processing, machine learning, and artificial intelligence has expanded at such an exponential rate that society now seems to be trying to catch up to the development and growth of technology.

Computer-generated music is a product of humans, not of the machines themselves, because machines are not autonomous, self-thinking beings. Through generative adversarial networks, artificially intelligent machines learn to classify data so that, once a human gives an input, they can produce or reproduce the desired result. But if the machine isn’t responsible for creating the music, who is? Computer scientists are not the ones producing the data that is fed to the algorithms; they merely create the algorithms. Nor can anyone cite the music that a machine generates, because what is generated is an inspired mix of data from hundreds and hundreds of other artists’ works. The criteria for creativity and aesthetics, especially in the art world, are subjective to the artist and audience (Simoni). Earlier models of generative algorithmic composition are now outnumbered by more recent ones, and critics worry that these systems are dulling the creative process (Simoni). The dulling of creativity is an ethical debate within the art world, where people fear not only that machine-composed art will oversaturate the field, but also that it will lower expectations of creativity itself. Already, when reviewing AI-generated bodies of work, critics who are told the work is AI-produced criticize it harshly, calling it one-dimensional, boring, and unimaginative, as if it were a knock-off of existing artists (GumGum Insights). Projects such as GumGum’s self-painting machine sparked controversy over whether creating an image from a collection of data in a GAN can be considered “creative,” with some arguing it is not creative at all because no artist was directly involved in the creative process (“ART.IFICIAL”). Another consideration is the lack of source credibility in citing where the artistic inspiration comes from. It is hard to determine not only which works influenced a machine’s output, but also how much of whose work served as an influential source for the GAN-generated art. The same arguments apply to algorithmic music composition. Depending on the algorithm implemented to produce musical content, there is evidently a pool of data collected and classified for the generative adversarial network to work from. Technically speaking, the collected data is visible and citable; the output of algorithmic music composition, however, is not entirely traceable back to its original sources. Music production software like Logic is already readily accessible to consumers and allows producers to generate auto-populated drum patterns unique to each user, thanks to deep neural networks that rely on large amounts of data to output what the user requests (Deahl).

For the many arguments made against algorithmic composition, there are many made in support of it. Many advocates urge using algorithmic music composition programs as tools both to enable artists to create more and to make music composition more accessible for non-musicians (Deahl). Amper, an AI music composition tool, is easier to use than Google’s NSynth and allows music to be generated automatically in fewer than three manual commands, through a generative adversarial network that creates a unique sound every time (Deahl). Artists such as Taryn Southern use tools like this to produce meaningful music, letting machine-powered algorithms draw musical inspiration from their data to produce a unique track, rather than to harm the art industry (Deahl). Southern’s music production practices tie into the discourse surrounding remix: what about a remix is original or stolen? At what point does the publisher of an art piece have to cite the original content it draws inspiration from? For algorithmic music composition, should we cite the inspiring sources, and how? Should there be a way for AI-generated music to be identifiable? With quick developments in technology, a new standard of music production may emerge around the awareness (or lack thereof) of algorithmic music composition.

Conclusion

Algorithmic music composition is a tool that has been around for quite some time. It is the use of algorithmic music composition systems, as tools or as stand-alone music production machines, that is rapidly evolving. These systems primarily function through generative adversarial networks, in which a gathered body of data is learned by the system. From there, once an input is requested (e.g., “produce rock music”), the system identifies previously classified sounds coded as “rock music,” compresses them, and then outputs a sound that borrows from the copious data it has learned and stored. Current research on algorithmically composed music demonstrates that there are positives and negatives to such systems: they allow algorithmic music composition to become a tool for creative expansion and accessibility for aspiring artists, yet they also hinder creative development by limiting source credibility and sonic uniqueness. Machine learning and algorithmic computational systems are embedded in the process of algorithmic music composition, but the ongoing debate over whether the work they produce is creative will remain subjective until legal precautions bring clarity to who owns AI-influenced music.

Works Cited:

“A Beginner’s Guide to Generative Adversarial Networks (GANs).” Skymind, https://skymind.ai/wiki/generative-adversarial-network-gan.

“Algorithms in Music.” NorthWest Academic Computing Consortium, http://musicalgorithms.ewu.edu/musichist.html.

“ART.IFICIAL: How Artificial Intelligence Is Paving the Way for the Future of Creativity.” GumGum, https://gumgum.com/artificial-creativity.

Deahl, Dani. “How AI-Generated Music Is Changing the Way Hits Are Made.” The Verge, 31 Aug. 2018, https://www.theverge.com/2018/8/31/17777008/artificial-intelligence-taryn-southern-amper-music.

“NSynth: Neural Audio Synthesis.” Magenta, 6 Apr. 2017, https://magenta.tensorflow.org/nsynth.

Simoni, Mary. “Chapter 2: The History and Philosophy of Algorithmic Composition.” Algorithmic Composition: A Gentle Introduction to Music Composition Using Common LISP and Common Music, MI: Michigan Publishing, 2003, https://quod.lib.umich.edu/s/spobooks/bbv9810.0001.001/1:5/–algorithmic-composition-a-gentle-introduction-to-music?rgn=div1;view=fulltext.

“The NSynth Dataset.” Magenta, 5 Apr. 2017, https://magenta.tensorflow.org/datasets/nsynth.

Vincent, James. “This AI-Written Pop Song Is Almost Certainly a Dire Warning for Humanity.” The Verge, 26 Sept. 2016, https://www.theverge.com/2016/9/26/13055938/ai-pop-song-daddys-car-sony.

“WaveNet: A Generative Model for Raw Audio.” DeepMind, https://deepmind.com/blog/wavenet-generative-model-raw-audio/.

Chatting their Way to High Brand Equity

Dominique Haywood

CCTP 607 Final Project

 

Abstract:

In many technology communities, 2016 was known as the year of the chatbot. Facebook released an API that made building branded chatbots extraordinarily simple for brands large and small alike (Constine, Josh. 2016). Microsoft released, then quickly silenced, its chatbot Tay after it learned white supremacist rhetoric from Twitter users (Victor, Daniel 2016). For better or for worse, brands now have a new direct communication channel to manage customer service, target marketing and facilitate sales. Both customers and businesses like chatbots for one main reason: chatbots make customer service less arduous. Customers have 24-hour access to brand representatives that they don’t have to call, and businesses have a simple solution for customer engagement that can be designed to meet the business’ standards. This paper will analyze the history of chatbots, the technology that drives chatbots and how the design of chatbots impacts the brand equity of those who use them.

 

Introduction:

 

Since the launch of Facebook Messenger chatbots in 2016, companies have quickly taken advantage of chatbots as a new communication channel to customers. Chatbots are interactive digital agents which provide real-time conversational interfaces for organizations. There are currently over 30,000 chatbots active on Facebook Messenger, and it is expected that 80% of customer engagement will be done through chatbots by 2020. Utilizing chatbots is a way for businesses to provide consistent and reliable customer service, and it has already proven successful for companies like 1800 Flowers and Tommy Hilfiger, which have experienced monetary returns on chatbot investments. This new communication channel has also benefitted customers by providing 24-hour access to answers from brands; 77% of customers who have interacted with a business’s chatbot have reported improved perception of that business (Wertz, Jia 2018). Chatbots are simple solutions to many customer service complaints, but come with their own technical and security challenges. Through an analysis of the design of chatbots and the history of the technology, I will assess how the implementation of chatbots can impact an organization’s brand equity using the Customer Based Brand Equity (CBBE) framework. Usually, this framework is used to help an organization analyze its current brand and build a stronger, customer-focused one step by step. In this paper, however, I will use the framework to show how four brands have utilized chatbots to impact a stage of the CBBE framework.

 

 

What are Chatbots?

Chatbots are known by many different names, including chatterbots, interactive agents, conversational AI, or artificial conversational entities. Despite the multitude of names, all chatbots effectively perform the same task: conducting conversations in natural language by following designed protocols. Most modern-day chatbots are designed with AI to accurately process and respond to human inputs, whether vocal or textual. Other chatbots, especially the earlier ones, are designed simply to follow a set of rules that produce replies from a designed script. This paper will focus on chatbots that are designed with AI and operate on chat websites and messaging applications; however, the moniker “chatbot” can also be applied to virtual assistants and automated voice chatbots (Wikipedia.com).

 

A Brief History of Chatbots

Although chatbots seem like a relatively new phenomenon, chatbot applications have been around since the early 1960s. ELIZA, designed in 1964, was a computer program developed by Joseph Weizenbaum at the MIT Artificial Intelligence Laboratory. Running a script called DOCTOR, ELIZA was designed to mimic the retorts of a Rogerian psychotherapist to a client on his or her first visit. ELIZA was initially designed to reveal the superficial nature of human and computer interactions, but through usage it unveiled the emotional attachment developed by ELIZA users (Weizenbaum, Joseph. 1966).

The emotional attachments formed primarily because of ELIZA’s responses, enabled by the DOCTOR script. The DOCTOR script processed users’ inputs through simple characterization and substitution of terms to deliver predesigned templates and encoded phrases, created to parody a Rogerian therapist’s penchant for answering questions with more questions. By designing ELIZA with an if-else protocol, Weizenbaum avoided building a complex natural language processing system (Weizenbaum, Joseph. 1966). The difference between ELIZA and the 30,000 chatbots on Facebook Messenger is that today’s chatbots are designed and trained with knowledge of language.
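
A few lines of Python can convey the flavor of this if-else, pattern-substitution design. The two rules below are hypothetical stand-ins; the real DOCTOR script contained many more patterns and reassembly rules.

```python
# Match a keyword pattern, reflect the user's own words back as a
# question; fall through to a canned prompt when nothing matches.
import re

RULES = [
    (re.compile(r"i am (.*)", re.I), "How long have you been {0}?"),
    (re.compile(r"i want (.*)", re.I), "Why do you want {0}?"),
]

def respond(message: str) -> str:
    for pattern, template in RULES:
        match = pattern.search(message)
        if match:
            return template.format(match.group(1))
    return "Please tell me more."  # default when no rule fires

print(respond("I am feeling anxious"))
# -> How long have you been feeling anxious?
```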

Another notable bot is Alice (Artificial Linguistic Internet Computer Entity), or Alicebot, which was inspired by ELIZA and created by Richard Wallace in 1995 using Java. Though Alice is a complex and award-winning chatbot, it still cannot pass a Turing test (Wikipedia.com). A Turing test is a test of a machine’s intelligence and of whether the machine can pass as a human.

The first chatbot said to pass a Turing test was Eugene Goostman in 2014. The Russian-built chatbot was designed to communicate like a 13-year-old Ukrainian whose first language was not English (Gonzalez, Robert T. 2015). The validity of this milestone has been fervently questioned, and the debate is fueled not by the advanced technological nature of Eugene Goostman but by the believability of Eugene Goostman’s personal history. Passing the Turing test raises questions about how users perceive a person or bot to converse, and about the accuracy of those perceptions. Other notable chatbots include Clippy, the loathed bot that peppered the margins of Microsoft Word from 1997 to 2003.

Eugene Goostman, like ELIZA, was designed with a specific backstory that fostered trust in some of the humans the bot interacted with. ELIZA was never said to be more than a computer program, yet the responses designed for ELIZA still evoked a one-sided relationship from humans. These two bots are prime examples of how brands can overcome subpar bots with realistic “personalities.” The personalities designed into branded bots provide users with trust and familiarity, which may inadvertently deceive humans and engender trust. Depending on the brand’s identity, branded bots should utilize colloquialisms or vernacular to convey the bot’s persona and align it with the brand. Successfully designing a bot with a personality requires training the bot to have knowledge of language; this can be done through AI, specifically natural language processing and natural language understanding.

 

How Chatbots Work

 

Chatbots are designed like applications, with multiple layers of functionality including the presentation layer, the machine learning layer and the data layer. Natural Language Processing, Natural Language Understanding and Natural Language Generation are designed to facilitate accurate responses to queries by sending data through the layers of the chatbot (Figure 1). Natural Language Processing (NLP) is the overarching system of neural networks that facilitates end-to-end communication between humans and machines in the human’s preferred language. Essentially, NLP provides the machine with the knowledge of language used by human interlocutors (Chatbots Magazine 2018). Chatbots designed with NLP convert a user’s message into structured data so that a relevant answer can be produced. Natural Language Understanding (NLU) is designed to manage the unstructured and flexible nature of human language. Designing NLU for chatbots requires a combination of rules and statistical models to create a methodology for handling unknown or unrecognizable inputs (Lola.com 2016). At its core, NLU gives chatbots the ability to accurately process and respond to colloquialisms and quirks of human language. Natural Language Generation (NLG) creates the message with which the chatbot answers the original query (a toy sketch of this split follows Figure 1).

Figure 1: Fernandes, Anush. “NLP, NLU, NLG and How Chatbots Work.” Chatbots Life, Chatbots Life, 15 Nov. 2017, chatbotslife.com/nlp-nlu-nlg-and-how-chatbots-work-dd7861dfc9df.
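
To see why the understanding/generation split matters, consider a deliberately tiny sketch: “understanding” reduces free text to a structured intent label, and “generation” maps the label back to a surface reply. The intents, keywords, and responses here are invented for illustration; production systems use trained statistical models rather than keyword sets.

```python
# Minimal NLU -> NLG pipeline with hypothetical intents.
INTENT_KEYWORDS = {
    "order_status": {"order", "shipped", "tracking", "delivery"},
    "store_hours": {"open", "hours", "close", "closing"},
}
RESPONSES = {
    "order_status": "I can help with that. What is your order number?",
    "store_hours": "We are open 9am-6pm, Monday through Saturday.",
    "fallback": "Sorry, I did not catch that. Could you rephrase?",
}

def understand(text: str) -> str:
    """NLU: reduce unstructured text to a structured intent label."""
    words = set(text.lower().split())
    for intent, keywords in INTENT_KEYWORDS.items():
        if words & keywords:
            return intent
    return "fallback"

def generate(intent: str) -> str:
    """NLG: produce the surface reply for the chosen intent."""
    return RESPONSES[intent]

print(generate(understand("When do you close today?")))
```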

 

Popularity of Chatbots

Over the last five years, interest in chatbots has increased exponentially (Figure 2). This interest is due to several factors, including the simplicity with which chatbots can be integrated into mobile devices, with which they share a core feature: messaging. Of the 7.3 billion people in the global population, 6.1 billion use an SMS-capable mobile device and 2.1 billion use messaging applications. Facebook Messenger alone has 1 billion users, solidifying chatbots as an easily integrated and non-disruptive mobile technology (Wertz, Jia. 2018).

Another reason for the widespread interest in and adoption of chatbots is that they are designed to be used by people of all age groups. This makes the technology less prohibitive than other, more complex technologies. Chatbots also give customers opportunities to ask questions they would otherwise be embarrassed to ask (Brandtzaeg P. 2017). Overall, chatbots are effective because they are simple to engage with and relatively easy to design.

William Meisel, a renowned technologist in the chatbot world, has predicted that global revenues from chatbots will soon reach $623 billion (Dale, Robert. 2016). In the global market, 45% of end users prefer to engage with chatbots for customer service inquiries. Results from a survey conducted by LivePerson, with 5,000 respondents, showed that the majority of users are indifferent to chatbots as long as the user’s problem is resolved following a conversation with one. The second-largest group of respondents (33%) felt positively about chatbots (Nguyen, Mai-Hanh. 2017). Automating customer service through chatbots allows businesses to gain consistency and speed, which is often lacking in human customer service. Studies have shown that the majority of customers have improved perceptions of a brand after conversing with branded chatbots (Wertz, Jia. 2018). Gartner predicts that by 2020, 85% of customer engagement will be done through non-human entities (Moore, Susan. 2018).

Businesses benefit from chatbots by cutting costs and gaining knowledge about consumer behavior. In a survey conducted by Oracle in 2016, 80% of respondents said they have used or plan to use chatbots by 2020. The increased integration of chatbots into businesses follows the trend of increased automation; chatbots will soon be used across marketing, sales and customer service. Though complete automation through chatbots is not feasible, or even holistically beneficial for an organization, chatbots designed for customer service are predicted to replace 29% of customer service jobs (Business Insider 2016).

 

Figure 2: Raj, Building Chatbots with Python: Using Natural Language Processing and Machine Learning (2019).

Chatbots and Data Security

 

Chatbots, like all technology, carry risks for users, especially as they relate to data privacy. Personalized chatbots, in particular, need to be designed with safeguards for data. Without proper security measures in place, both businesses and users can suffer. Chatbots rely on HTTP and other communication protocols, as well as SQL queries for data retrieval, which can be targeted by hacks (Bozic J. 2018). Most concerns about data privacy in the chatbot world focus on financial services chatbots; however, most financial services already transfer user data from databases via HTTPS (Chatbots Magazine. 2017).

There are two methods normally designed into chatbots to ensure security: authentication and authorization. Authentication verifies the human’s identity, and authorization gets the user’s permission to complete a task. The technology used to develop chatbots is not new, which means existing security measures have already been designed to combat these threats. However, it is important to remember that data security is the onus of the developers as well as of the platforms the bots run on (Chatbots Magazine. 2017).
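
In code, the two checks are distinct steps in the message handler. The sketch below is a minimal illustration under invented names (SESSION_TOKENS, PERMISSIONS); real platforms use their own token issuance and permission APIs.

```python
# Authentication (is this really user X?) must pass before
# authorization (did user X permit this action?).
import hmac

SESSION_TOKENS = {"alice": "3f9a-example-token"}  # issued after login
PERMISSIONS = {"alice": {"check_balance"}}        # actions the user approved

def handle(user: str, token: str, action: str) -> str:
    # Authentication: constant-time comparison of the session token.
    expected = SESSION_TOKENS.get(user, "")
    if not hmac.compare_digest(token, expected):
        return "Please log in again."
    # Authorization: the action must be one the user has consented to.
    if action not in PERMISSIONS.get(user, set()):
        return "You have not granted permission for this action."
    return f"OK, running {action} for {user}."

print(handle("alice", "3f9a-example-token", "check_balance"))
```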

 

Brand Equity and Branded Chatbots

 

According to Keller’s Customer Based Brand Equity (CBBE) model (Figure 3), brand equity is composed of four major parts: identity, meaning, response and relationships. Brand equity, in this context, represents the valuation of a brand in the eyes of the customer. High brand equity represents strong customer loyalty and can protect a company from volatility in the market (Keller, Kevin Lane).

Figure 3: “Keller’s Brand Equity Model: Building a Powerful Brand.” Strategy Tools from MindTools.com, www.mindtools.com/pages/article/keller-brand-equity-model.htm.

 

Salience

Building brand salience is integral to defining brand identity and engendering brand awareness. A brand’s identity is composed of more than just the logo and name; several other decisions about the brand’s appearance, including packaging, font, and color scheme, impact its identity (Forbes. 2017). Brand awareness is more than a customer’s recognition of a brand’s name and logo; it is also the connection between a brand’s product and the needs it will fulfill for a customer. Essentially, brand salience is what a company wants consumers to think about the brand when the consumer needs a product in the brand’s category. Well-designed brand salience helps companies stand out in their industries and is integral to brand equity (Keller, Kevin Lane).

The floral industry is struggling to remain profitable, and industry tactics aimed at countering heightened preferences for succulents have birthed several flower delivery startups. As the industry wavers, brands like 1800 Flowers are attempting to maintain their salience by staying ahead of the technological curve of e-commerce (Kelleher, Katy. 2018). 1800 Flowers launched one of the first chatbots on Facebook Messenger in 2016. Its success was both nominal and practical, with Mark Zuckerberg demoing the 1800 Flowers bot at the F8 conference and the bot bringing in new customers and sales. Not long after launch, the president of 1800 Flowers, Chris McCann, lauded the bot for facilitating 70% of sales and for acquiring a younger market (Caffyn, Grace, et al. 2016). Historically, 1800 Flowers has had a relatively high percentage of millennial customers, 29% as of 2014, due to the organization’s commitment, through brand salience, marketing and strategic partnerships, to keeping this younger market purchasing items that millennials characteristically do not (Stambor, Zak. 2017). Consistency is integral to salient brands because it keeps the brand’s place in the market stable over time, but the ability to change is also vital to keeping the brand relevant.

Meaning

Meaning is how a brand communicates with its customers and conveys its ethical values to them. This is one clear area where a well-designed chatbot can effectively impact a brand’s equity. Meaning comprises two components: imagery and performance. Imagery is how a brand satisfies a customer’s social and psychological expectations, whether through digital targeted marketing or physical engagement with the customer in store. Performance is how well a brand meets a customer’s needs (mindtools.com).

Despite the growth of online sales, customers still spend more on in-store purchases than online ones. Chatbots not only provide opportunities for marketing and customer service, but can also be used to drive customers back into the store. Regardless of the type of business, companies need to strike a balance between online presence and in-store experience to stay competitive. Tommy Hilfiger launched a chatbot on Facebook Messenger called TMY.GRL to converse with customers about the Fall 2016 Tommy X Gigi Hadid collection (Figure 4). At its earliest launch, TMY.GRL was an informational chatbot that provided users with product suggestions and information. With further integration of e-commerce on Facebook Messenger, TMY.GRL is able to link product information with purchase opportunities. TMY.GRL provides solutions for both imagery and performance for Tommy Hilfiger by offering an entertaining commerce channel that aligns with the brand identity (Arthur, Rachel. 2016). It meets the expectations and needs of customers by providing speedy and relevant product suggestions, and it balances online presence with in-store engagement by alerting customers to sales and events.

Figure 4: Arthur, Rachel. “Tommy Hilfiger Launches Chatbot On Facebook Messenger To Tie to Gigi Hadid Collection.” Forbes, Forbes Magazine, 12 Sept. 2016, www.forbes.com/sites/rachelarthur/2016/09/11/tommy-hilfiger-launches-chatbot-on-facebook-messenger-to-tie-to-gigi-hadid-collection/#3a5a42ab2238

Response

Response is how consumers respond to engagement with the brand; these responses can be categorized in two segments: judgements and feelings. Brand judgements are the assumptions customers make based on the performance and imagery of a brand, and they generally fit into four major categories: brand quality, brand credibility, brand consideration and brand superiority. Brand feelings relate to the social currency of the brand and sometimes conjure emotional responses or reactions. The six brand feelings are warmth, fun, excitement, security, social approval and self-respect. Brand response encompasses the head and heart reactions a customer experiences when interacting with a brand or making a purchase from it (Keller, Kevin Lane).

Cleo is a financial services chatbot designed to replace individual banking apps. Users can get insights about spending habits and trends across multiple debit and credit accounts. Rather than becoming a bank, Cleo’s founder is committed to improving the user experience of financial services (O’Hear, Steve. 2018). Cleo has a sassy, comedic personality designed to take the formality out of banking and make millennials more comfortable communicating with it. This decision is clearly aimed at inducing warmth and fun in users, emotions typically left out of banking. Cleo also emphasizes the security of the business to lend the bot credibility. The personality of Cleo, however, has come under fire for being too informal and making inappropriate jokes (Sentance, Rebecca). Since the Cleo bot is central to the Cleo startup, any negative reflection on the bot impacts the overall business. Though the informal nature of the bot is meant to make the brand relatable, it is clearly risky to design a bot, the business’s only tangible product, with a tongue-in-cheek “personality.” The judgements that result from that kind of personality can leave customers with positive or negative opinions, which could be dangerous for the brand’s equity.

Relationship

The last and arguably most important part of CBBE is the relationship between brands and their customers: brand resonance. Brand resonance represents the extent to which customers feel they are in sync with the brand. It can be measured by repeat purchases and frequent engagement with the brand. The four categories of brand resonance are behavioral loyalty, attitudinal attachment, sense of community and active engagement (Keller, Kevin Lane).

Wysa, Woebot and Youper are three of the top therapy chatbots currently on the market. A 2014 study discovered that participants were more open with AI psychologist bots than with ones they believed were operated by humans. These findings are not particularly surprising, especially given the historical experience with ELIZA. Consumers’ penchant for using bots for therapy highlights the many prohibitive factors of human-to-human therapy: cost, time, and general lack of access. It also fosters attitudinal attachment to less expensive, yet still effective, chatbots. Nevertheless, there are ambiguities about the safety of mentally ill patients who use bots for therapy rather than human therapists (Mikael-Debass, Milena). Therapeutic chatbots have an advantage in brand equity over other kinds of chatbots because therapy is deeply relational. E-commerce, retail and banking chatbots replace the semi-anonymous customer service that consumers experience in store, online and on the phone. Therapy, however, is done one-on-one or in groups, where relationships form between participants based on an expectation of trust and privacy. The nature of therapy also leads to behavioral loyalty, which further connects a therapy bot to a human user. Despite the ethical and practical uncertainties about therapy bots, they stand to gain the strongest level of brand equity, regardless of design complexity.

 

Conclusion

Since the boom of chatbots in 2016, the number of chatbots online has continued to grow. Chatbots are a low-cost, easy-to-develop technology, especially on Facebook Messenger. Advances in NLP and AI have made chatbots more adept at communicating with humans and have broadened their capabilities. It is clear that bots can be built to build brand equity at every stage of the CBBE framework. It is also clear that for certain companies, bots can be profitable at the center of the business, whereas in other companies, bots support a larger product offering. Inc.com lists the five industries with the most to gain from chatbots as hospitality, banking, retail, service businesses and publishing; these industries stand to benefit from chatbots in both the productivity of the organization and the level of service the organization can provide (Harrison, Kate L). Chatbots in the healthcare sector are also forecasted to grow because they will be beneficial in those same arenas. Customer service in healthcare is complicated to automate because of the risks privately held companies take when providing medical advice and services, but for simple communication, chatbots have already proven useful for answering questions in the medical field. Therapy bots have found their place within the market and have acquired loyal customers, not unlike ELIZA, the “mother” of chatbots. It is striking that chatbots have come a long way since 1966, yet still operate in much the same way and still evoke emotional responses from users.

For startup businesses, chatbots are a lean tool that can stand alone or be built upon for other, more complex businesses. Regardless of the industry, chatbots should be designed with brand equity in mind, to limit the negative impacts of a pert bot and heighten the brand’s salience in an ever-changing online marketplace. It is not unlikely that chatbots will one day be part of the foundations of a brand’s identity; the same thoughtfulness that goes into the other aspects of a brand’s identity needs to be considered when designing a branded chatbot.

References:

Arthur, Rachel. “Tommy Hilfiger Launches Chatbot On Facebook Messenger To Tie To Gigi Hadid Collection.” Forbes, Forbes Magazine, 12 Sept. 2016, www.forbes.com/sites/rachelarthur/2016/09/11/tommy-hilfiger-launches-chatbot-on-facebook-messenger-to-tie-to-gigi-hadid-collection/#3a5a42ab2238.

“Artificial Linguistic Internet Computer Entity.” Wikipedia, Wikimedia Foundation, 20 Apr. 2019, en.wikipedia.org/wiki/Artificial_Linguistic_Internet_Computer_Entity.

Bozic J., Wotawa F. (2018) Security Testing for Chatbots. In: Medina-Bulo I., Merayo M., Hierons R. (eds) Testing Software and Systems. ICTSS 2018. Lecture Notes in Computer Science, vol 11146. Springer, Cham

Brandtzaeg P.B., Følstad A. (2017) Why People Use Chatbots. In: Kompatsiaris I. et al. (eds) Internet Science. INSCI 2017. Lecture Notes in Computer Science, vol 10673. Springer, Cham

“Building Brand Equity.” Forbes, Forbes Magazine, 24 July 2017, www.forbes.com/sites/propointgraphics/2017/07/08/building-brand-equity/#df793e6e8f85.

Caffyn, Grace, et al. “Two Months in: How the 1-800 Flowers Facebook Bot Is Working Out.” Digiday, 29 June 2016, digiday.com/marketing/two-months-1-800-flowers-facebook-bot-working/.

“Chatbot.” Wikipedia, Wikimedia Foundation, 2 May 2019, en.wikipedia.org/wiki/Chatbot.

Constine, Josh. “Facebook Launches Messenger Platform with Chatbots – TechCrunch.” TechCrunch, TechCrunch, 12 Apr. 2016, techcrunch.com/2016/04/12/agents-on-messenger/.

Constine, Josh, and Sarah Perez. “Facebook Messenger Now Allows Payments in Its 30,000 Chat Bots – TechCrunch.” TechCrunch, TechCrunch, 12 Sept. 2016, techcrunch.com/2016/09/12/messenger-bot-payments/.

Dale, Robert. “Industry Watch: The Return of the Chatbots.” Cambridge University Press, 10 Aug. 2016.

Gonzalez, Robert T., and George Dvorsky. “A Chatbot Has ‘Passed’ The Turing Test For The First Time.” io9, io9, 16 Dec. 2015, io9.gizmodo.com/a-chatbot-has-passed-the-turing-test-for-the-first-ti-1587834715.

Harrison, Kate L. “These 5 Industries Have the Most to Gain from Chatbots.” Inc.com, Inc., 9 Oct. 2017, www.inc.com/kate-l-harrison/these-5-industries-have-most-to-gain-from-chatbots.html.

“How Secure Are Chatbots?” Chatbots Magazine, Chatbots Magazine, 23 Jan. 2017, chatbotsmagazine.com/how-secure-are-chatbots-2a76f115618d.

Business Insider Intelligence. “80% Of Businesses Want Chatbots by 2020.” Business Insider, 14 Dec. 2016, www.businessinsider.com/80-of-businesses-want-chatbots-by-2020-2016-12.

Fernandes, Anush. “NLP, NLU, NLG and How Chatbots Work.” Chatbots Life, Chatbots Life, 15 Nov. 2017, chatbotslife.com/nlp-nlu-nlg-and-how-chatbots-work-dd7861dfc9df.

Kelleher, Katy. “Can Instagram Save the Flower Industry?” Observer, Observer, 2 Oct. 2018, observer.com/2018/10/can-instagram-save-the-flower-industry/.

Keller, Kevin Lane. Building customer-based brand equity: A blueprint for creating strong brands. Cambridge, MA: Marketing Science Institute, 2001.

“Keller’s Brand Equity Model: Building a Powerful Brand.” Strategy Tools from MindTools.com, www.mindtools.com/pages/article/keller-brand-equity-model.htm.

Lola.com. “NLP vs. NLU: What’s the Difference?” Medium, Medium, 5 Oct. 2016, medium.com/@lola.com/nlp-vs-nlu-whats-the-difference-d91c06780992.

Mikael-Debass, Milena. “Will Chatbots Replace Therapists? We Tested It out.” VICE News, VICE News, 17 Dec. 2018, news.vice.com/en_us/article/nep53m/will-chatbots-replace-therapists-we-tested-it-out.

Moore, Susan. “Gartner Says 25 Percent of Customer Service Operations Will Use Virtual Customer Assistants by 2020.” Gartner, Feb. 2018, www.gartner.com/en/newsroom/press-releases/2018-02-19-gartner-says-25-percent-of-customer-service-operations-will-use-virtual-customer-assistants-by-2020.

“Natural Language Processing (NLP) & Why Chatbots Need It.” Chatbots Magazine, Chatbots Magazine, 25 May 2018, chatbotsmagazine.com/natural-language-processing-nlp-why-chatbots-need-it-a9d98f30ab13.

Nguyen, Mai-Hanh. “The Latest Market Research, Trends & Landscape in the Growing AI Chatbot Industry.” Business Insider, Business Insider, 20 Oct. 2017, www.businessinsider.com/chatbot-market-stats-trends-size-ecosystem-research-2017-10.

O’Hear, Steve. “Cleo, the Chatbot That Wants to Replace Your Banking Apps, Has Stealthily Entered the U.S. – TechCrunch.” TechCrunch, TechCrunch, 20 Mar. 2018, techcrunch.com/2018/03/20/cleo-across-the-pond/.

Raj. Building Chatbots with Python: Using Natural Language Processing and Machine Learning. 2019.

Suchman, Lucy A. Plans and Situated Actions: The problem of human-machine communication (Cambridge University Press, 1987)

Sentance, Rebecca. “Cleo, a Chatbot Case Study: Why Brands Need to Be Cautious with Comedy Personas – Econsultancy.” Econsultancy, 22 Feb. 2019, econsultancy.com/cleo-chatbot-financial-services-persona-marketing/.

Stambor, Zak. “How 1-800-Flowers Attracts Millennials.” Digital Commerce 360, 3 Mar. 2017, www.digitalcommerce360.com/2016/03/03/how-1-800-flowers-attracts-millennials/.

Victor, Daniel. “Microsoft Created a Twitter Bot to Learn From Users. It Quickly Became a Racist Jerk.” The New York Times, The New York Times, 24 Mar. 2016, www.nytimes.com/2016/03/25/technology/microsoft-created-a-twitter-bot-to-learn-from-users-it-quickly-became-a-racist-jerk.html.

Wertz, Jia. “Why Chatbots Could Be The Secret Weapon To Elevate Your Customer Experience.” Forbes, Forbes Magazine, 18 Jan. 2019, www.forbes.com/sites/jiawertz/2018/12/23/why-chatbots-could-be-the-secret-weapon-to-elevate-your-customer-experience/#1b17ea384645.

Weizenbaum, Joseph. Computer Power and Human Reason (New York: Freeman, 1976)

Weizenbaum, Joseph. “ELIZA – A Computer Program for the Study of Natural Language Communication between Man and Machine,” Communications of the Association for Computing Machinery 9 (1966): 36-45.

 

From AI to Straight A’s: Artificial Intelligence Within Education

Zachery Omer

Abstract

As U.S. students’ test scores in math, reading, and science continue to land in the middle of the pack worldwide, it has become apparent that a change to our education system is necessary. I believe that the system could be bolstered by recent advancements in artificial intelligence technology, such as automation and adaptive learning, including gamification and knowledge monitoring. I also think that the implementation of these technologies could lead to significant changes in the profession of teaching, such as the methods, curriculum, environment, and materials needed to do the job effectively. This essay will pursue the research question: how can these AI technologies best be implemented and integrated into our education system without putting too much pressure on teachers, students, or the technologies themselves? I will explore the writings and musings of professionals and thought leaders across the fields of technology and education, as well as anecdotal evidence and several case studies from the past few years.

Introduction

When I think back on my own educational experience, I can distinctly remember the impact of technology as the years passed. In early elementary school, when a television would be rolled into the classroom on a cart and we realized that we’d get to watch a VHS episode of Bill Nye the Science Guy or Reading Rainbow with Levar Burton, the excitement was palpable.

By 5th grade, we had all survived Y2K, and the digital revolution had officially begun. Our class had a small set of Alphasmart 2000 keyboard devices so we could begin learning how to type. In 6th grade, our classroom was the site of the school’s first SMART Board, and there was a bulky desktop computer for every 2-3 students. For a small old school building in rural Missouri that didn’t even have air conditioning in every room, this felt very cutting edge, futuristic, and exciting.

Throughout middle and high school, there were still plenty of dry erase markers and overhead projectors, but SMART Boards became more and more common, and course materials became increasingly digital. First it was correspondence: emails, syllabi, grades, etc., but eventually class notes, presentations, modules, and assignments moved online as well. I got my first smartphone for my 18th birthday, and my first laptop as a high school graduation gift.

In college I became familiar with a learning management system (LMS) for the first time, using Blackboard as a central location for assignments, blog posts, course materials, grades, and more for almost all of my classes. This LMS experience had mixed results, largely depending on the professor using the system and their relationship with technology. Nearly all assignments, quizzes, and essays were submitted electronically, and I even took several courses that were entirely online, where I never physically met my professor or classmates.

Upon graduation, I began working at a public high school, and found that the technological environment had continued its drastic development. Every classroom had a SMART Board, most classes had their own set of Google Chromebook laptops, and students were allowed, although not encouraged, to keep their personal smartphones out in plain view on their desks. As I observed students listening to music, texting, playing games, taking photos, and browsing social media throughout the school day, I couldn’t help but think of my own high school experience (only five years prior), where if you were caught with your phone out during class, it was taken for the remainder of the day. Nearly all of my coworkers at the school agreed that these devices were a distraction, and many had different methods of trying to govern or regulate their use during class, but very few were willing to endure the inevitable revolt that would accompany an outright ban on phones in class. Simply put, most students expected perpetual connectivity through their smartphones, and depriving them of that, even for a few hours, led to feelings of isolation and irritability.

In only 20 years, the education system underwent a colossal change, on a scale that has likely never been seen before. In parallel with society, it was a shift from primarily analog activity to almost exclusively digital. Along with that shift came massive changes in pedagogy and even epistemology. As we look forward from our current point in time, the possibilities appear endless for technology within education. Artificial intelligence, automation, and machine learning have become quite the buzzwords across all industries, and education is no exception. Can artificial intelligence help to produce real intelligence in the classroom? Can deep-learning algorithms produce deep-learning students? How can these technologies best be implemented and integrated into our education system without putting too much pressure on teachers, students, or the technologies themselves?  

According to Pew Research data, students in the United States rank near the middle of the pack in math, science, and reading, and are below many other industrialized nations in those categories. According to the 2015 study, among 71 participating countries, the US ranked 38th in math and 24th in science (Desilver, 2017).

In an attempt to help counter this disappointing educational mediocrity, I have researched several different aspects of AI and machine learning to discern how these readily available technologies could be utilized effectively in schools. Over the course of this essay, I will explore the potential role(s) of artificial intelligence in our education system, and discuss the changing role of educators alongside these new AI technologies, to effectively prepare and equip our students (and teachers) for the inevitable advancement of the Digital Age.   

 

Section 1: The Role of Artificial Intelligence in Education

While artificial intelligence seems like a product of the 21st century, the concept was actually conceived back in 1936 by Alan Turing, and the term was coined in 1956 at Dartmouth College (Computer History Museum, 2014). Since its ideation, AI has undergone multiple cycles of hope, hype, and hysteria, where people marvel at its potential, get excited at its release, and become concerned that it will somehow destroy us. The terms “artificial intelligence” and “AI” have been eagerly, and broadly, adopted by companies and media outlets without realizing the full meaning behind them, causing a rift in the public’s understanding of these technologies. According to Margaret Boden (2016) in her book AI: Its Nature and Future, “Intelligence isn’t a single dimension, but a richly structured space of diverse information-processing capacities. Accordingly, AI uses many different techniques, addressing many different tasks” (p. 12). For the purposes of this essay, I will focus on the areas of automation and adaptive learning within artificial intelligence, and how those concepts may be applied to the field of education.

  • Automation

Schools have already begun implementing automation in several capacities, such as machine-graded Scantron tests and automated class registration, but the further potential applications are vast. Automation can fast-track many of the tedious, repetitive, paper-heavy administrative tasks that are necessary for the system but have burdened educators for ages, such as “creating class schedules, keeping student attendance, processing grades and report cards, as well as helping to admit new students” (Ostdick, 2016). School support staff can also benefit from automation. Librarians, for example, are utilizing specialized search portals, streamlined shelving navigation, and automated self-checkout more commonly; this frees the staff from “repetitive and low-value tasks so they can help students with more educational inquiries, while giving students more autonomy through technology” (Kinson, 2018).

For teachers, with routine tasks like grading, attendance, and scheduling increasingly outsourced to automation technologies, more time will be available to concentrate on relationship-building with students and on pedagogical strategy. Automated grading software can already handle multiple choice assignments and exams, and most fill-in-the-blank exercises, but with advancements in natural language processing, the practice of essay grading will also soon be in the hands of artificially intelligent software (TeachThought Staff, 2018). One example of this type of software is Gradescope, an AI-based grading system already used by universities like Stanford and UC-Berkeley (Rdt, 2018). Simon Rdt of Luminovo AI, writing on Medium, describes the effectiveness of these automated essay scoring (AES) programs:

One approach to AES is finding objective measures such as the word length, the number of spelling mistakes, and the ratio of upper case to lower case letters. However, these obvious and quantifiable measures are not insightful for evaluating crucial aspects of an essay such as the argument strength or conclusiveness. (Rdt, 2018).

Despite this glaring flaw, back in 2012 when these types of technologies were first being introduced, the William and Flora Hewlett Foundation organized a competition to compare the grading of AES programs and real teachers. According to Rdt (2018), “the output of the winning team was in 81% agreement with the teachers’ gradings, an impressive result that marked a turning point in teachers’ perceptions towards education technology.”
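The surface measures Rdt describes are easy to make concrete. Below is a minimal sketch in Python of this style of feature extraction; the tiny vocabulary, the feature list, and the example essay are illustrative assumptions, not any real AES system:

```python
# A minimal sketch of the kind of "objective measures" described above for
# automated essay scoring (AES). The tiny vocabulary is a hypothetical
# placeholder for a real dictionary; this is not a real grading model.

import re

SAMPLE_VOCABULARY = {"the", "a", "and", "of", "to", "in", "that", "it",
                     "essay", "argument", "evidence", "because", "students"}

def surface_features(essay: str) -> dict:
    words = re.findall(r"[A-Za-z']+", essay)
    letters = [c for c in essay if c.isalpha()]
    return {
        "word_count": len(words),
        "avg_word_length": sum(len(w) for w in words) / max(len(words), 1),
        # Naive "spelling mistake" proxy: words missing from the wordlist.
        "misspelled": sum(w.lower() not in SAMPLE_VOCABULARY for w in words),
        # Ratio of upper-case to lower-case letters, as in the quote above.
        "upper_lower_ratio": sum(c.isupper() for c in letters)
                             / max(sum(c.islower() for c in letters), 1),
    }

print(surface_features("The essay argues that students benefit because the evidence supports it."))
```

A real AES model would feed features like these into a trained scoring function, which is exactly why such systems can miss argument strength and conclusiveness: none of these numbers look at meaning.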

With this kind of technological assistance, students will no longer need to wait days or weeks to receive grades and feedback on their work; instead, this will be done within moments of submitting. Advanced progress monitoring will allow for faster identification of gaps in the class material and the need for more focused personal intervention (Ostdick, 2016). This opens the door for a more individualized learning experience for students, and a more reflective and purposeful teaching experience for educators.

Furthermore, the potential of natural language processing (NLP) within education can go far beyond just grading essays. Automated virtual assistants, such as Alexa and Siri, use NLP to receive spoken commands and questions, and to react accordingly, often serving as a convenient and efficient source of knowledge, information and feedback. These types of technology could be extremely useful in orally centered educational environments, such as speech pathology and foreign language courses. Several schools, such as Saint Louis University, have even begun installing specialized Amazon Echo devices equipped with Alexa in campus dormitories and other living spaces (Saint Louis University, 2018).

These types of automated technology will ideally help to cut down on educational bureaucracy and free up time for more creative and engaging instruction, and a more autonomous learning experience for students. In doing so, automation will also help to appease the insatiable desire for instant gratification that has been fostered by the immediacy of the Digital Age.

  • Adaptive Learning

In the same way that social media, online shopping sites, and media streaming platforms can observe our behavior and cater to our interests, preferences, and abilities, so too could educational materials. Perhaps the most impactful implementation of artificial intelligence in education will come in the form of adaptive learning, specifically in the areas of gamification and knowledge monitoring. One head of product management at Google expects that AI adaptive learning will lead to personalized instruction for students “by suggesting individual learning objectives, selecting instructional approaches and displaying exercises that are based on the interests and skill level of every student” (Rdt, 2018).

Some of the earliest conceptualizations of adaptive learning stemmed from the notion of cybernetics and the work of Warren McCulloch. According to Boden (2016), McCulloch’s “knowledge of neurology as well as logic made him an inspiring leader in the budding cybernetics movement of the 1940s” (p. 13). Boden goes on to explain the primary themes of the field of cybernetics:

Their central concept was “circular causation,” or feedback. And a key concern was teleology, or purposiveness. These ideas were closely related, for feedback depended on goal differences: the current distance from the goal was used to guide the next step. (Boden, 2016, p. 13)

As evidenced by the above quote, these ideas of cybernetics are imperative to the implementation of adaptive learning in schools. Using continuous feedback to guide the student toward desired learning goals can be achieved fairly easily through artificial intelligence software. A crucial element of this process is identifying the student’s Zone of Proximal Development (ZPD), which is the cognitive area “between a student’s comfort zone and their frustration zone. It’s the area where students are not repeating material they’ve already mastered nor challenging themselves at a level so challenging that they become frustrated, discouraged, and reluctant to keep learning” (Lynch, 2017). Progress often occurs at the edge of our comfort zones, so by effectively targeting the ZPD, adaptive learning programs will better prepare students to master the course material and develop the creative, critical problem-solving abilities that can benefit them inside and outside of the classroom (Lynch, 2017). This process could also feature an element of “scaffolding,” where the educator (and/or AI program) “gives aid to the student in her/his ZPD as necessary, and tapers off this aid as it becomes unnecessary, much as a scaffold is removed from a building during construction” (Culatta, 2011). Module-based, goal-oriented online education programs often utilize this method.
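To make that cybernetic feedback loop concrete, here is a toy sketch of difficulty adjustment driven by recent performance; the target success band standing in for the ZPD and the step size are invented for illustration and do not come from any real adaptive-learning product:

```python
# Toy illustration of circular causation in adaptive learning: the gap
# between recent performance and a target success band (a stand-in for the
# student's ZPD) guides the next difficulty step. All numbers are invented.

TARGET_LOW, TARGET_HIGH = 0.6, 0.8   # hypothetical "ZPD" success band
STEP = 0.1                           # hypothetical adjustment size

def next_difficulty(difficulty: float, recent_success_rate: float) -> float:
    """Nudge difficulty so the student stays inside the target band."""
    if recent_success_rate > TARGET_HIGH:    # too easy -> raise difficulty
        difficulty += STEP
    elif recent_success_rate < TARGET_LOW:   # too hard -> lower difficulty
        difficulty -= STEP
    return min(max(difficulty, 0.0), 1.0)    # clamp to [0, 1]

print(next_difficulty(0.5, 0.9))  # succeeding 90% of the time -> 0.6
```

Scaffolding behaves the same way in reverse: as success stabilizes inside the band, the program can taper the amount of aid it supplies.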

Gamification has been a popular concept in education for years, and has seen mixed results. Researchers who have studied this method have found that “The use of educational games represents a shift from ‘learning by listening’ to ‘learning by doing’ model of teaching. These educational games offer different ways for representing complex themes” (Peixoto et al., 2018, p. 158). However, with the exception of a few early-learning games that are cartoon-ified and fun (like Schoolhouse Rock, which I loved as a child and still remember playing to this day), many module-based online learning platforms fail to engage students or achieve their desired outcomes because the material is still presented as it might be on a worksheet or set of lecture slides.

Why do we stick to this outdated method of delivery in the classroom, when the games that kids are playing at home (or in their pockets) are infinitely more fun and engaging? For example, the Assassin’s Creed franchise has been around for over a decade, and while the actual objectives can be a bit dark and gory, the entire premise of the game is traveling back to different (historically accurate) time periods and exploring their cities and culture to solve mysteries and track down your targets. Each game covers a different historical era, such as the Ottoman Empire, Ancient Rome, Industrial London, the Italian Renaissance, Revolutionary America, and many more. Simply by playing these games and achieving their objectives, users can gain a deeper understanding of the culture, architecture, clothes, events, and main characters of these important time periods in our world’s history.

In theory, similar styles of games could be adapted for classroom use, although research and development for educational implementation has been relatively minimal thus far. These types of games would utilize elements such as quest/saga-based narratives, continuous performance feedback, instant gratification in the form of progress-tracking and/or rewards systems, objective-based progression, and even an adaptive CPU that gets harder or easier based on a student’s performance. All of these criteria could be met while still delivering the stunning graphics, dynamic gameplay, customizable features, and even theatrical cuts that users have come to expect.

With that said, studies have been conducted into the different effects of gamified learning within education, including potential negative effects. Toda et al. (2018) completed one such experiment, and identified four negative effects of gamification: indifference, loss of performance, undesired behavior, and declining effects (p. 150). Most of these effects are closely related; for example, the main difference between loss of performance and declining effects was the factor of motivation and engagement, and the researchers found that declining effects often led to loss of performance (Toda et al., 2018, p. 151). A few common aspects of gamification that they found particularly problematic, and that contributed to these negative effects, were leaderboards, badges, points, and levels (p. 152). The researchers noted that most of these negative effects could be remedied with more efficient game design and instruction (p. 153).

While the notion of gamified learning is often met with resistance or labeled as “edu-tainment,” we must face the fact that we now live, and modern students were raised from birth, in a society that completely revolves around entertainment. Our phones are always buzzing, social media feeds are always scrolling, TVs are always flashing in the background, headphones are always in, sporting events and other ceremonies are always being covered, and the current President of the United States is a former reality television star. That pervasive entertainment-based lifestyle of perpetual stimulation isn’t the healthiest option for anyone’s brain, let alone a developing child’s, but to completely exclude these modern technologies and platforms from the classroom creates a foreign environment of regressive isolation and uncomfortable disconnection for students. Of course a balance must be struck between screen-based learning and interpersonal interaction, but at the moment the screen-based learning being implemented is often inefficient and disengaging for students. If we could responsibly harness the technology that drives the rest of our daily entertainment wants and needs, we may see those aforementioned mediocre educational rankings for the United States begin to rise.

Knowledge monitoring is another key to adaptive learning technologies. While the AI-powered program will track progress and skills gained, and the educator will track comprehensive retention and practical application, the student must also be aware of and responsible for their own knowledge monitoring. The rationale is fairly straightforward:

If students fail to differentiate what they know or have learned previously from what they do not know or need to learn (or relearn), they are not expected to engage more advanced metacognitive strategies, such as evaluating their learning in an instructional setting, or employing more efficient learning and studying strategies. (Tobias & Everson, 2009)

According to a study by Kautzmann & Jaques (2018), effective knowledge monitoring provides long-term metacognitive benefits to students throughout their academic careers by helping them become self-regulated learners. These researchers claim that “Self-regulated learners are proactive in setting goals, planning and deploying study strategies, monitoring the effectiveness of their actions and taking control of their learning and more prone to achievement in college” (Kautzmann & Jaques, 2018, p. 124). Therefore, it is vital that knowledge monitoring take place throughout the AI-supported learning process: by the adaptive learning program, by the educator, and, perhaps most importantly, by the students themselves, to ensure that the material being taught is understood, contextualized, and applied in a practical way.
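One simple way to quantify knowledge monitoring in the spirit of the Tobias and Everson framework is to score the agreement between what a student predicts they know and what they then demonstrate on a test. The sketch below uses a plain agreement coefficient chosen for illustration; it is not the exact measure from their instrument:

```python
# Sketch: compare a student's claimed knowledge against demonstrated
# performance, item by item. Score ranges from -1 (always wrong about
# oneself) to 1 (perfect self-knowledge). Illustrative formula only.

def knowledge_monitoring_score(judgments, outcomes):
    """judgments[i]: student predicted they knew item i (bool);
    outcomes[i]: student actually answered item i correctly (bool)."""
    agree = sum(j == o for j, o in zip(judgments, outcomes))
    disagree = len(judgments) - agree
    return (agree - disagree) / len(judgments)

# The student is right about their own knowledge on 4 of 5 items:
print(knowledge_monitoring_score(
    [True, True, False, False, True],
    [True, False, False, False, True]))  # -> 0.6
```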

Section 2: The Role of Educators Alongside Artificial Intelligence

If these AI-powered technologies do integrate within the education system and catalyze impactful change on a large scale, then the role of educators will need to adjust accordingly. While certain elements of AI provide the potential to make the lives of teachers easier, these adjustments will spread across nearly every aspect of the teaching profession, including methods, curriculum, class environment, and materials. This will not be a seamless transition; in the current stage of the Digital Age, many teachers struggle to keep up with the necessary trainings for new technology implementation. According to Daniel Stapp, a current high school teacher:

The level of expectation placed on teachers is ridiculous…I teach five classes of 35 kids who are always writing essays, and the expectation of my school is that I’m using TurnItIn.com. But they gave us training on [TurnItIn] on one of our Grading Days, where we have contracted time to grade…it was ‘volunteer training.’ (Stapp, 2018)

As the development of these new types of technologies advances and becomes more accessible, more time and resources will need to be dedicated to training teachers in these new platforms and programs. With that support, changes to the roles of the profession will be more readily accepted.

  • Methods

One major change in pedagogy will be a shift from “stand and deliver” instruction to more of a coaching and facilitation role for educators (Wagner, 2018). As Wagner (2018) writes, “In an information age, with content available with the click of a mouse, teachers must shift from the ‘sage on a stage’ to the ‘guides on the side.’” This will require a stronger focus on the personal needs of students, and further emphasis on contextualized learning, dynamic methods of class engagement, assisted knowledge monitoring, and extra support for students who may be less tech-savvy. This is not to say that teachers will be handing over the reins of their classrooms to these technologies and regressing into a supporting role, but they will need to rethink how these technologies can be effectively maximized while still developing positive interpersonal relationships with students, contextualizing knowledge and applying it practically, and providing mentorship along the way.

  • Curriculum

A teacher’s curriculum will become increasingly dynamic and individualized in the coming years alongside automation and adaptive learning programs. According to Wagner (2018), it will be a transition from developers of content to developers of learning experiences. This would entail multimodal presentation of materials, such as text, audio, and video, in order to connect with students of all different learning styles. This may also include small reflection groups for certain topics, and providing anecdotal examples and supporting evidence for more contextualized learning. Because syllabi are now primarily online and editable, it has become easier to make alterations throughout the course of the semester such as adding or subtracting required readings or assignments based on student (or class) progress.

  • Environment

In the past, teacher communication and student learning were often confined within the walls of the classroom or the textbook provided, but with the advent of the internet and social media, the learning environment has the potential to become more ubiquitous. Wagner (2018) describes it as a shift from siloed classrooms to virtual social networks. Collaborative platforms such as Google Drive are extremely helpful in this regard. Wagner (2018) mentions another platform called Brainly that can connect students with peers to address subject-specific questions. He gives a fun example:

They type in their question on Brainly and are connected to a short narrated video that uses modern day Marvel characters to explain the concept. If they wish to ask follow up questions, they are connected through to the student creator of the video via a chat box. (Wagner, 2018)

  • Materials

Expanding upon the “Curriculum” section, there will be a strong shift among educators from using textbooks and a set curriculum to blended courses and customized class design (Wagner, 2018). Traditional textbooks are often expensive, heavy, and underutilized by the end of the semester. Furthermore, most textbooks lose value exponentially each year when a new edition is released, making their contents outdated and thus difficult to reuse or sell back.  

Blended courses will combine elements of online learning with interpersonal instruction. Several AI-powered platforms have been developed to help teachers with the creation and implementation of these types of courses, such as Content Technologies, Teachable, CourseCraft, and Udemy (Wagner, 2018).

 

Conclusion

I do not mean to assert with this essay that technology is the be-all and end-all solution for education in the United States. One of the most important aspects of a child’s education is the socialization process that accompanies interpersonal interaction at school. Daniel Stapp (2018) elaborates on this phenomenon, and the impact that technology can have:

I think one of the biggest skills people gain from a brick-and-mortar school is interpersonal communication and relationship-building. And I think adding a layer of tech between people to do that sometimes takes away from the power of that connection. And it can add to it too, it just depends on what kind of tech you’re using and how you’re using it. (Stapp, 2018)

These AI technologies, when utilized appropriately and effectively, are intended to supplement educators and their mission, not supplant them. It should not be a process that is rushed into; these types of technological transitions take time and training to implement effectively. There have already been examples of AI-powered learning platforms that have backfired, such as the 2018 rollout of Summit Learning, a platform developed by Facebook, near Wichita, Kansas and in other cities around the US (Bowles, 2019). Teachers and students alike were unprepared for the massive changes in pedagogy and curriculum that accompanied the Summit Learning program, as well as the cognitive and physical effects of spending more time in front of screens. As a result, many students protested, and parents began pulling their children from the schools participating in the program (Bowles, 2019).

Further research will need to be conducted into the long-term cognitive and physical effects of these types of AI learning programs. Ideally, educators will be able to find a healthy balance in the classroom between traditional, seminar-based instruction and online, self-guided, screen-based learning. To that end, more research will also need to be conducted into the effects of modality switching on the learning process and comprehension abilities of students.

However, the tremendous potential of automation and adaptive learning for education is too tantalizing to resist. This essay painted with broad strokes in an attempt to cover the education system in general, but there will be certain nuances that accompany the integration of technology at the different levels of schooling, from elementary school up to higher education. With the appropriate amount of training, transition time, and consent, these technologies could be utilized to stimulate incredible change in the American school system, and help to re-establish the United States as an educational superpower in the world.

Images Used

“AV Trolley” by mikecogh is licensed under CC BY-SA 2.0

“AS3K and Neo” by nathanww is licensed under CC BY-SA 2.0

“Reflected light Smartboard” by touring_fishman is licensed under CC BY-NC-SA 2.0

DS-UNIVAULT-30 Chromebook Cart sourced from www.ipadcarts.com

“Zone of Proximal Development” sourced from (Culatta, 2011)

“SchoolHouse Rock!” sourced from CDAccess.com

“Assassin’s Creed Anthology” sourced from Reddit.com

References

Boden, M. A. (2016). AI: Its nature and future (First edition). Oxford, United Kingdom: Oxford University Press.

Bowles, N. (2019, April 21). Silicon Valley Came to Kansas Schools. That Started a Rebellion. NY Times. Retrieved from https://www.nytimes.com/2019/04/21/technology/silicon-valley-kansas-schools.html

Computer History Museum. (2014). Artificial Intelligence. Retrieved from https://www.youtube.com/watch?v=NGZx5GAUPys&list=PLQsxaNhYv8dbK3yMHXk35jtZFdu7o46gu&index=5

Culatta, R. (2011). Zone of proximal development. Retrieved from Innovative Learning website: http://www.innovativelearning.com/educational_psychology/development/zone-of-proximal-development.html

Desilver, D. (2017, February 15). U.S. students’ academic achievement still lags that of their peers in many other countries. Pew Research Center. Retrieved from https://www.pewresearch.org/fact-tank/2017/02/15/u-s-students-internationally-math-science/

Kautzmann, T. R., & Jaques, P. A. (2018). Improving the Metacognitive Ability of Knowledge Monitoring in Computer Learning Systems. In Higher Education for All. From Challenges to Novel Technology-Enhanced Solutions (Vol. 832, pp. 124–140). https://doi.org/10.1007/978-3-319-97934-2_8

Kinson, N. (2018, October 24). How automation will impact education. Retrieved from Beta News website: https://betanews.com/2018/10/24/how-automation-will-impact-education/

Levesque, E. M. (2018). The role of AI in education and the changing US workforce. Retrieved from Brookings Institution website: https://www.brookings.edu/research/the-role-of-ai-in-education-and-the-changing-u-s-workforce/

Lynch, M. (2017, August 7). 5 Things You Should Know About Adaptive Learning. Retrieved from The Tech Advocate website: https://www.thetechedvocate.org/5-things-know-adaptive-learning/

Ostdick, N. (2016, December 15). Teach Me: Automation’s Role in Education. Retrieved from UI Path website: https://www.uipath.com/blog/teach-me-automations-role-in-education

Peixoto, D. C. C., Resende, R. F., & Pádua, C. I. P. S. (2018). An Experience with Software Engineering Education Using a Software Process Improvement Game. In Higher Education for All. From Challenges to Novel Technology-Enhanced Solutions (Vol. 832, pp. 157–173). https://doi.org/10.1007/978-3-319-97934-2_10

Saint Louis University. (2018, August). SLU Installing Amazon Alexa-Enabled Devices in Every Student Living Space on Campus. Retrieved from SLU Alexa Project web page: https://www.slu.edu/news/2018/august/slu-alexa-project.php

Stapp, D. (2018, March). Technology in Secondary Education [Telephone interview].

TeachThought Staff. (2018, September 16). 10 Roles for Artificial Intelligence in Education. Retrieved from TeachThought website: https://www.teachthought.com/the-future-of-learning/10-roles-for-artificial-intelligence-in-education/

Tobias, S., & Everson, H. T. (2009). The importance of knowing what you know: A knowledge monitoring framework for studying metacognition in education. In D. J. Hacker, J. Dunlosky, & A. C. Graesser (Eds.), The educational psychology series. Handbook of metacognition in education (pp. 107-127). New York, NY, US: Routledge/Taylor & Francis Group.

Toda, A. M., Valle, P. H. D., & Isotani, S. (2018). The Dark Side of Gamification: An Overview of Negative Effects of Gamification in Education. In Higher Education for All. From Challenges to Novel Technology-Enhanced Solutions (Vol. 832, pp. 143–156). https://doi.org/10.1007/978-3-319-97934-2_9

Wagner, K. (2018, January 15). A blended environment: The future of AI and education. Getting Smart. Retrieved from http://www.gettingsmart.com/2018/01/a-blended-environment-the-future-of-ai-and-education/

The Real Siri: Past, Present, and Future.

Annaliese Blank

Professor Irvine

CCTP: 607-01

3 May 2019

The Real Siri Unpacked: Past, Present, and Future

Abstract

In today’s pro-tech world, virtual assistants are becoming highly prevalent in at-home technology. One assistant in particular, Siri, has transformed the structure and capability of cell phone services for Apple products, acting as a third party for all of your needs at the touch of a button. The rise and reputation of this technology inspired this research topic. Based on this, the research question to be investigated is: does this type of technology, Siri in particular, pose a serious threat to your personal information, privacy, and overall data ethics? The main hypotheses are that Siri is a virtual assistant that is always listening for its ‘owner’ or ‘master voice,’ and that Siri technology is invasive in terms of personal data ownership, usage, and distribution to Apple Inc.

One main approach to understanding and answering this question is a socio-historic analysis of Siri: how it evolved, its initial goal and purpose, the pros and cons of its advancements, and where the technology is headed in terms of re-modeling and further implications for the future. To make the analysis as inclusive as possible, I draw on current journal articles on this technology, websites on its history, community blogs for outside opinions, and insights from my previous blog posts for this course on virtual assistant technology in general. Having content to unpack from various platforms will be informative and educational for de-black-boxing Siri and the future Siri is creating for the world.

Introduction

In response to my research question, a common reply would be: why does this matter, and why should we care? The answer is simple: your privacy is at risk whenever virtual assistants are used on a regular basis. For example, when a person uses Siri on their Apple device, the arrangement acts as a virtual contract: the assistant is able to listen to and access information about you, which is then stored by Apple to benefit their brand, granting them ownership of your data. I hypothesize that if more research were conducted on the purpose, function, ethics, and effects of Siri, there would be an increased desire for further legal revisions and privacy regulations in the future.

In this paper, I address the origins of Siri: how it was refined, how it was projected in media commercials, the mechanics that perform its operations, its algorithmic functions, cloud connectivity, natural language processing, its connection to deep neural networks, and the wake-word controversy surrounding voice control as the technology is continually updated in newer versions. All of these features were dense to de-black-box, but more can be understood when we combine them in a socio-technical lens with the history of how it all came together. Future safeguards are necessary if we do not want virtual assistants to slowly start governing and controlling our day-to-day lives. From the literature I gathered, it is evident that other researchers would agree with my claims and would support the argument that this technology poses a serious threat to our personal data and usage, listening to us when we are not fully aware of it. This raises ethical questions about its purpose and integration in our lives. Further legal revisions would need to be implemented if we choose not to let artificial intelligence replace human discourse and interaction.

 

Literature Review

When it comes to Apple products, most people would assume that Apple masterminds are behind patenting their technology and features independently. For Siri, this was actually not the case. The idea behind Siri dates back to the early 1980s; it began to take shape near 2003, and the hands-free feature was not fully installed until the iPhone 6S model, released in 2015 (Cult of Mac, 2018, pg. 1). The 1980s were all about technology, innovation, and industrial work, and in that decade the idea was mentioned as a neat feature Apple would eventually like to acquire (Cult of Mac, 2018, pg. 1). Another little-known fact is that once this idea got rolling, Steve Jobs put much of his final effort into the Siri feature before his passing, which Apple holds in high honor. Some say this was his final gift toward really improving the way cell phones were intended to be used.

As technology was advancing, so were our military and national security agencies. Around 2003, the Defense Advanced Research Projects Agency, known as DARPA, “began working on an AI assistant project that would help military commanders deal with the overwhelming amount of data they received on a daily basis” (Cult of Mac, 2018, pg. 1). To see whether this prototype idea was possible, DARPA reached out to the Stanford Research Institute (SRI) for further input and testimonial research (Cult of Mac, 2018, pg. 1). SRI decided to jump onboard with the proposal and test further ideas to see what else this platform would be able to handle. This is where the classic ‘Siri’ name is rooted; many people do not know this is how and where the technology originated. “The SRI decided to create a spin-off called Siri, a phonetic version of their company name, and was launched in the app store in 2010” (Cult of Mac, 2018, pg. 1). The most interesting part of this history is that Siri was developed as its own separate app, unattached to anything else. It was originally designed to tackle multiple things, such as “Order taxis through Taxi Magic (which was the original form of Uber in 2010), pull concert data from StubHub, movie reviews from Rotten Tomatoes, or restaurant reviews from Yelp” (Cult of Mac, 2018, pg. 1).

As impressive as this prototype was, Apple decided to buy out the app and partner with SRI in a “200 million-dollar deal” (Cult of Mac, 2018, pg. 1) that would soon change the game for Apple iPhones forever. According to Apple executive Scott Forstall, “Apple decided to partner with the same groups SRI did, including Wolfram Alpha, Wikipedia, Yelp, Weather Apps, Yahoo, NASDAQ, Dow, and local news stations for traffic and current time zones, and many more. Scott makes it very clear that Apple wanted this feature to be as accurate, fast, and hands-free as possible, also with a virtual voice that was friendly, respectful, and reliable” (Cult of Mac, 2018, pg. 1). In doing this, Apple creates a positive user experience and makes the user feel confident in the phone’s capability. This was the initial goal, and the company prioritized making this the most innovative phone technology to date.

During this acquisition process, Apple’s first version of Siri was unable to speak back to the user. It originally provided answers through the other brands it partnered with, offering quick help or feedback on a question. To improve this, what is called Natural Language Processing, or NLP, was implemented in the newer version of Siri to allow a verbal connection to the technology and have Siri fully understand the words, connotation, voice pitch, voice cues, and pronunciation of whatever the user is saying or asking. In other words, Siri is able to understand what the user is saying and voice the correct response back in the correct style, language, and structure. This was the model for the iPhone 4S, which was able to perform speech recognition and received a large amount of media attention when it was released. Speech recognition is explained further below, where the mechanics are unpacked.

The first version of Siri in the iPhone 4S was introduced in a commercial showing real-world, hands-free scenarios, with sample questions and tasks such as:

‘Can you reschedule my meeting to 5pm?’, ‘What is the weather in New York looking like for this weekend?’, ‘Call Mom’, ‘Text back I will be there soon’, ‘What time is it in Paris?’, ‘What is the currency for 60 Euros in dollars?’, and so on. Questions like these require the applications Apple has partnered with, such as Wolfram Alpha, Yelp, and Wikipedia, to provide the fastest and most accurate responses in a matter of seconds. The reaction to this new feature was so positive that Apple then made it accessible on its other devices, such as the Mac and the iPad (Cult of Mac, 2018, pg. 1). Siri is the fast, accessible medium between the user and all of the other apps it has partnered with, the first platform called on to answer these types of questions or tasks.

As this 200 million-dollar deal has progressed, many capabilities of this component have been researched further in terms of NLP, data collection, and personalization functions in newer iPhones today. With every passing year, Apple is known for bettering each device it produces by making faster and smarter improvements to the mechanics and AI functions in its products. For the iPhone, this is a priority, since its models have dominated the cell phone industry since the first Siri feature. In recent development, Siri has become gendered, more accurate, and geographically diverse.

As Siri continued to progress in each newer model of the iPhone, the voice responding to the user became noticeably female rather than male. This is controversial for virtual assistants in general: since they are artificial, it can be difficult to create a voice that isn’t one gender or the other. The feature received some backlash over the female voice in earlier models, but Apple has since made Siri’s voice configurable: it doesn’t have to be a She; it can be a He or an It. The problem here dates back to 1950s gender roles and norms, when women were “Ready to answer serious inquiries and deflect ridiculous ones. Though they lack bodies, they embody what we think of when we picture a personal assistant: a competent, efficient, and reliable woman. She gets you to meetings on time with reminders and directions, serves up a reading material for commute, and delivers relevant information on the way, like weather and traffic, etc.” (The Real Reason, 2018, pg.1). Apple released a statement in 2013 saying “both options are available for voice preference” (The Real Reason, 2018, pg.1). Small changes like this allow Apple to present Siri as the best virtual assistant on the market: smart technology that is also customizable per user, in categories such as voice volume, voice gender, accent, and notification preferences.

As if Siri could not be any more of a personal experience, Apple’s cloud computing capability has transformed Siri even further. iCloud was originally released in 2011 (Apple Privacy, 2019, pg.1). Apple’s press release reads: “Apple today introduced iCloud as a breakthrough set of free new cloud services that work seamlessly with applications on your iPhone, iPad, iPod touch, Mac, or PC to automatically and wirelessly store your content in iCloud and automatically and wirelessly push it to all your devices. When anything changes on one of your devices, all of your devices are wirelessly updated almost instantly” (Apple Privacy, 2019, pg.1).

Some operations of the cloud include “cloud computing, cloud storage, cloud backups, or access to photos, documents, files, contacts, reminders, music, etc.” (Apple Privacy, 2019, pg.1). As for Siri, it is able to perform NLP through cloud computing and virtually store your data the more you use the device. As stated in the press release, the key here is virtual, wireless, automatic service, easily accessible via Siri. Some tasks to ask Siri on newer models of the iPhone would be: “Siri, can you save my email in the cloud?”, “Can you add my song to my playlist?”, “Can you save these documents in my work folder?”, or “Send this to the cloud.”

A common question in response to this would be: how does the architecture of NLP and cloud computing work? For Siri to be used correctly, it must first be invoked from your device, have a wireless connection in order to use other platforms, and have access to the cloud feature in order to store your data and process the information needed to get to know the user better. When this happens, your data becomes personalized and is then stored away virtually. The cloud component is needed for your user profile to be understood and processed.

The next question would be: how does Siri actually listen to the user in order to function with the cloud? What are the mechanics that make all of this possible? The answer is complex, but it can be de-black-boxed. Historically, Apple improved this technology by enabling speech recognition: speech patterns and sound waves are computed and understood through NLP and cloud computing, and a response is then sent back to the user.

There is a constant signal sent back and forth in order for Siri to hear, understand, save, and respond to you. All of this is done in seconds, sometimes milliseconds, depending on the complexity of your request. As mentioned previously, a hands-free experience is the priority, so when we de-black-box cloud computing, the required mechanics for Siri include the ability to perform text-to-speech and speech-to-text recognition, plus access to a deep neural network (DNN). All of this is done through the layers of the DNN, which is explained in the next step below.

 

From the design standpoint, there are many layers to Siri that must be understood. Once you have asked Siri a question with the button, or said “Hey Siri”, signals are sent via cloud computing and the deep neural network that record your question and determine the correct answer, which is then rendered as text and presented back to the user by voice. To make this as clear and simple as possible, Apple’s own Siri Team writes: “The ‘Hey Siri’ feature allows users to invoke Siri hands-free. A very small speech recognizer runs all the time and listens for just those two words. When it detects “Hey Siri”, the rest of Siri parses the following speech as a command or query. The “Hey Siri” detector uses a Deep Neural Network (DNN) to convert the acoustic pattern of your voice at each instant into a probability distribution over speech sounds. It then uses a temporal integration process to compute a confidence score that the phrase you uttered was “Hey Siri”. If the score is high enough, Siri wakes up” (Apple Hey Siri, 2019, pg.1). Here is a picture below to visualize the layers and further understand where the speech waves travel.
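The detector Apple describes can also be sketched schematically: a DNN emits a per-frame probability that the audio matches “Hey Siri”, temporal integration smooths those per-frame scores into one confidence value, and a threshold decides whether Siri wakes. The simple averaging and the numbers below are illustrative stand-ins for Apple’s actual, more elaborate scheme:

```python
# Schematic sketch of the wake-word pipeline quoted above. The per-frame
# scores, the plain averaging, and the threshold are made-up illustrative
# values, not Apple's real detector.

THRESHOLD = 0.85  # hypothetical wake threshold

def confidence(frame_scores):
    """Temporal integration: average per-frame 'Hey Siri' probabilities
    (a real detector integrates over time in a more elaborate way)."""
    return sum(frame_scores) / len(frame_scores)

def should_wake(frame_scores):
    return confidence(frame_scores) >= THRESHOLD

# Per-frame probabilities, as a small DNN acoustic model might emit them:
frames = [0.91, 0.88, 0.95, 0.90, 0.87]
print(should_wake(frames))  # -> True, so Siri wakes up
```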

For newer virtual assistants to fulfill the hands-free experience, a wake word is required; for Siri, as mentioned before, it is ‘Hey Siri’ in newer models. Siri must be turned on in your phone’s settings in order to be always listening, awaiting your attempt to wake it with the wake word. As for the second part of the question, the mechanics that make all of this understood and possible happen within what is called speech synthesis, a very interesting layer of Siri. Speech synthesis is the layer that voices the proper response back to the user once the initial question has been heard, understood, and processed through NLP and cloud computing (Apple Siri Voices, 2019, pg.1).

According to Apple’s Siri Team, “Starting in iOS 10 and continuing with new features in iOS 11, we base Siri voices on deep learning. The resulting voices are more natural, smoother, and allow Siri’s personality to shine through” (Apple Hey Siri, 2019, pg.1). The picture below provides a clear representation of how text-to-speech synthesis looks and operates. Starting from the left, text is used as the input and text analysis occurs, followed by the prosody model, which deals with rhythm; then signal processing begins with unit selection and waveform concatenation, which assembles the sequence of recorded units to be delivered back in speech form. This is also where the predictive feature can be explained, as the request passes through each of these units in the model (Apple Hey Siri, 2019, pg.1).
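In code form, the diagram reads left to right. Every function body below is a stub; only the order and role of the stages reflect Apple’s description of unit-selection synthesis:

```python
# Sketch of the text-to-speech pipeline described above: text analysis ->
# prosody model -> unit selection -> waveform concatenation. All bodies
# are placeholders; only the staging is meaningful.

def text_analysis(text):
    """Normalize the text and convert it to a sound-unit sequence (stub)."""
    return list(text.lower())

def prosody_model(units):
    """Predict rhythm, pitch, and duration targets for each unit (stub)."""
    return [(u, {"duration": 1.0, "pitch": 1.0}) for u in units]

def unit_selection(targets, unit_database):
    """Pick recorded speech units that best match the targets (stub)."""
    return [unit_database.get(u, b"") for u, _ in targets]

def concatenate(selected_units):
    """Join the selected waveform units into the output audio (stub)."""
    return b"".join(selected_units)

def synthesize(text, unit_database):
    return concatenate(unit_selection(prosody_model(text_analysis(text)),
                                      unit_database))

print(synthesize("hi", {"h": b"\x01", "i": b"\x02"}))  # -> b'\x01\x02'
```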

 

This is a great example of how the user is able to speak to Siri, and how Siri is able to respond and get to know them through machine learning and deep neural networks with NLP discourse. The entire virtual assistant process is made possible by NLP, the deep neural network, speech recognition, machine learning and algorithmic implementation, speech synthesis, and many more complex features.

Literature Continued: Pros and Cons

With the general socio-technical history, mechanics, and layers of Siri understood, we arrive at where my original research question began: Siri is presented as complex and positive technology, but one must ask what negatives lie within the positives. With progress in any form of technology there are always drawbacks, since nothing is perfect. The entire Siri process is non-visible, and a lot is going on that most users are not aware of. With this in mind, it is important to lay out a framework for unpacking the pros and cons of Siri. Another question to address would be: what else can Siri do?

Some pros of Siri technology: “Siri can act as a personal scribe, she can write texts for you, post to your social media accounts, solve complex math equations, finding emails, and converting measurements, even Morse code” (UK Norton, 2018, pg.1). Other capabilities include “booking your evening out for you with certain apps, like food apps, or Yelp for food reviews. Siri can also be used for Open-Table and automatically book your reservation” (UK Norton, 2018, pg.1). As mentioned before, in the newer software updates such as iOS 8 or iOS 9, the Hey Siri feature must be turned on, and you can then accomplish any task with Siri if you start your interaction with “Hey Siri” as the wake word (UK Norton, 2018, pg.1).

Some cons of Siri technology: “Siri has listening problems, is always listening to you if turned on, or if your Wi-Fi dies, Siri dies with it” (UK Norton, 2018, pg.1). Regarding listening problems: if your question is too complex, Siri might not fully understand you or the answer you need, and if the pitch, tone, or acoustics of your ‘master voice’ are off, it can be difficult for Siri to hear you properly or register that it is still your voice. Common replies Siri may give include “I’m sorry, I don’t understand you”, “I don’t know how to respond to that”, or “Can you repeat your question?” (UK Norton, 2018, pg.1). Wi-Fi connectivity is essential for Siri to operate: this connection gives Siri access to the sub-platforms already installed within her mechanics. Without Wi-Fi, Siri cannot reach the Apple servers that store and collect your data, or the networks needed to answer your question. When the connection weakens, it becomes increasingly difficult for a speedy, accurate answer to be delivered.

Over the course of the semester with Professor Irvine, some quick pros and cons I have personally gathered about this technology would be the following. Pros: quick virtual help, hands-free, audio enabled, customizable, personable, free, accurate, useful when needed, and proficient. Cons: risk to your privacy; data owned by and accessible to Apple; a microphone that is always listening to some degree in order to respond to the wake word, “Hey Siri”; and, for all people, but especially very private people, a threat to the ethics of data usage, storage, and collection. All of this leads to the next big question: is Siri’s functionality ethical or unethical, and does it put our privacy at risk?

To further answer this, Stucke and Ezrachi (2017) observe: “The digital assistant with the users’ trust and consent will likely become the key gateway to the internet. Because of personalization and customization, consumers will likely relinquish other less personal and useful interfaces and increasingly rely on digital assistants to anticipate and fulfill their needs. They transform and improve the lives of consumers yet come at a cost” (Stucke and Ezrachi, 2017, pg. 1243). They found that these types of assistants, especially Siri, follow a learning-by-doing model, and this is where the voice recognition and NLP happen that get personally stored to each user profile; the more it is used, the more it learns about you (Stucke and Ezrachi, 2017, pg. 1249). They also note that the more someone uses Siri, the better it can predict which apps it needs to answer them, and the more it can personalize their data and formulate search bias (Stucke and Ezrachi, 2017, pg. 1242). Their concluding argument suggested that it is nearly impossible to create an organic algorithm, rather than a super-personalized experience that is virtually stored and owned by the company (Stucke and Ezrachi, 2017, pg. 1296).
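The learning-by-doing dynamic Stucke and Ezrachi describe can be caricatured in a few lines of code: each interaction updates a per-user profile, and the assistant increasingly routes requests to whatever served that user before, which is the seed of the personalization and search bias they warn about. The class, names, and update rule below are illustrative assumptions, not anyone’s actual implementation:

```python
# Toy model of "learning-by-doing" personalization: usage feeds a profile,
# and the profile biases future routing toward past behavior.

from collections import defaultdict

class AssistantProfile:
    def __init__(self):
        self.app_weights = defaultdict(float)  # learned per-user preferences

    def record_use(self, app: str):
        self.app_weights[app] += 1.0  # the more you use it, the more it learns

    def route(self, candidate_apps):
        """Prefer the app this user has relied on most in the past."""
        return max(candidate_apps, key=lambda a: self.app_weights[a])

profile = AssistantProfile()
for _ in range(3):
    profile.record_use("Yelp")
profile.record_use("OpenTable")
print(profile.route(["Yelp", "OpenTable"]))  # -> 'Yelp', biased by history
```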

In a similar article, Hoy (2018) discusses what virtual assistants are, how they can pose a threat to privacy, and why they will need extensive future regulation if they are ever to be used for more than just your cell phone. He argues this because these assistants already have so much access to and ownership of your data; it is daunting to imagine the complexity and reach of Siri in a large, internet-hosted, real-life setting. Hoy writes: “Currently available voice assistant products from Apple, Amazon, Google, and Microsoft allow users to ask questions and issue commands to computers in natural language. There are many possible future uses of this technology, from home automation to translation to companionship and support for the elderly. However, there are also several problems with the currently available voice assistant products. Privacy and security controls will need to be improved before voice assistants should be used for anything that requires confidentiality” (Hoy, 2018, pg. 1).

In relation to these studies, consumers want to know: does Siri actually always listen to you, and what can be done about this? As impressive as the hands-free “Hey Siri” feature in the iOS 8 software update is, what users may not know is that Siri is, to some degree, always listening, and listens fully once woken.

As this product has improved, a recent exchange took place between Apple CEO Tim Cook and the House of Representatives and its legal team. The House wanted to know more about what is really going on in the updates to Siri, the use of user location to pinpoint data, and the listening feature: what is Siri collecting, and is this potentially harmful or against Apple’s policy (Sophos, 2018, pg.1)? Tim Cook responded, “We are not Google or Facebook. The customer is not our product, and our business model does not depend on collecting vast amounts of personally identifiable information to enrich targeted profiles marketed to advertisers” (Sophos, 2018, pg.1). To back this up further, Apple’s own director of Federal Government Affairs wrote a formal letter that says, “We believe privacy is a fundamental human right and purposely design our products and services to minimize our collection of consumer data. When we do collect data, we’re transparent about it and work to disassociate it from the user” (Apple Response Letter, 2019, pg.1).

An interesting passage in Apple’s response says, “The iPhone doesn’t listen to consumers, except to recognize the clear, unambiguous audio trigger ‘Hey Siri’. The on device speech recognizer runs in a short buffer and doesn’t record audio or send audio to the Siri app if ‘Hey Siri’ isn’t recognized” (Apple Response Letter, 2019, pg.1). This is good information to have on the official record, but the listening feature will always be a controversial pro and con of the technology. Even though the company tells the world it does not design technology to always listen to you, it concedes in other ways that the device has the ability to listen, especially when ‘woken’, and this is not entirely reassuring. Many reporters and consumers would argue that big data companies, Apple especially, have the power to do this, but that for legal and privacy-policy reasons they would never fully disclose to the general public that this technology is always listening.

Even though Apple claims that neither it nor Siri always listens to you, the answer to this question is still up for debate. In recent reports, other news sites argue the opposite. A recent USA Today article says, “With iOS 8, Apple introduced the ‘Hey Siri’ wake phrase, so you can summon Siri without even touching your iPhone. If you turn this feature on, this means your iPhone’s mic is always listening, waiting for the phrase ‘Hey Siri’ to occur” (USA Today, 2017, pg.1).

In response to reports like this one, Apple claims that the Siri microphone does not start listening to you until the wake word is used, but it does not take much reasoning to see that, for this to be true, the device must be listening to some degree in order to wake fully and then proceed with processing your information and answering your request. Whether or not the company owns up to this, the feature poses a threat to privacy and to one’s personal data with Siri.

Looking ahead, one might then ask: where is Siri going, and what safeguards are needed for the future, if any? This is the densest component of my entire research question: Siri is already able to do so much, so what more does she need to do? Thinking ahead to the next decade and beyond is a mind-blowing exercise. For the privacy controversy to subside, more regulations must be put in place: stronger listening rights, clearer protocols, and full disclosure of how Siri’s listening actually works, presented so that it can be understood by every user who decides to turn it on.

Another principle to reinforce for the future is that we cannot let virtual assistants control too much of our lives. I suggest this strongly, yet looking at the current projection of Siri on the Apple Siri website, it is evident that she will be running the world in ways beyond our phones. The site says, “Now you can control your smart appliances in your home, check their status, or even do many things at once—just using your voice. In the Home App, you can create a page ‘I’m Home’ that opens the garage, unlocks the front door, and turns on the lights” (Apple Siri, 2019, pg.1). Some common things we can ask it to do inside our homes would be: ‘Did I close the garage?’, ‘Show me the driveway camera’, or telling it to direct your smart TV to record a show for you when you’re not home (Apple Siri, 2019, pg.1).

As if at-home assistance like this were not overwhelming enough, Siri is now accessible within smart cars and newer car models across many brands. You can ask questions related to your car, such as ‘Did I close my door?’, ‘Where did I park?’, ‘What song is this?’, ‘Play 92.5 FM Radio’, or ‘Answer phone call’, all at the power of your voice, hands-free, inside your moving vehicle (Apple Siri, 2019, pg.1). This feature is gaining momentum and now enhances not just your phone experience, but at-home, on-the-go, music, and even car-related experiences.

According to Vox Media, the future for voice assistants looks extremely bright in terms of shaping the ways we use technology professionally, socially, economically, industrially, and personally. They cite statistics showing that “There are 90.1 million people who own and use smart phone technology, 77.1 million people who use it inside cars, and 45.7 million people who use it on speakers” (Vox Media, 2019, pg.1). These statistics suggest that virtual assistants, especially Siri, will become the new face of voice-automated technology across all of the categories previously mentioned.

Personal Interpretation

As a frequent user of Siri and an Apple consumer, this topic sparked my interest in my own participation with Apple’s products, and I wanted to learn more about how Siri actually works. The most controversial part of this entire work was the listening section, where Apple’s privacy statements were up for debate in recent news. From what I have gathered in further research, it is still hazy: most people would argue that Apple is, in fact, always listening, despite what the company continues to legally market and disclose in its statements.

It is difficult to know the truth in this matter, but from what I have gathered in this AI course this semester and from the research done to answer my initial question, I would argue that my hypothesis is tentatively supported, because the technology must be listening to the user to some degree in order to process the wake word. Understandably, for legal reasons, Apple would never fully disclose this behavior; but from understanding the NLP architecture and algorithmic cloud computing features, I would confidently stand on the always-listening side of this argument, given that Apple’s privacy statements never fully and clearly rule out the possibility.

Conclusion

Virtual assistants are technologies designed to enhance our lives. Siri in particular is a highly skilled AI virtual assistant that can act as a key component in handling our inquiries, questions, and requests in order to accomplish a given task. At heart, the main goal of Siri is ingenious and extremely convenient. However, as it continues to progress, we can see the leverage it is slowly gaining over our lives, not just our phones. The purpose of this progression is to keep users relying on this technology and on Apple products. In doing so, it reinforces the idea that Siri has the answers we deeply desire, but AI, and Siri in particular, is taking a route that may go too far in replacing human actions in human life.

As exciting as the future looks, this overlapping control and capability across so many areas of our lives, our phones, homes, cars, business models, software, and more, is undeniably innovative, but also unsettling and, at times, frightening. These technologies are a privilege; we should use them when necessary without letting them overpower the meaning of life. No artificial technology is better than real, authentic human choices and actions.

Works Cited

Apple Introduces iCloud. (2019, April 05). Retrieved May 5, 2019.

Apple Response to July 9 Letter. 2019.  Retrieved May 5, 2019, from SCRIBD

Daniel Jurafsky and James H. Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, 2nd ed. (Upper Saddle River, N.J: Prentice Hall, 2008).

Deep Learning for Siri’s Voice: On-device Deep Mixture Density Networks for Hybrid Unit Selection Synthesis – Apple. Retrieved May 5, 2019, from Apple Siri Voices.

Hey Siri: An On-device DNN-powered Voice Trigger for Apple’s Personal Assistant – Apple. Retrieved May 5, 2019, from Hey Siri.

Hey Siri: The Pros and Cons of Voice Commands. 2018. Retrieved May 5, 2019, from UK Norton Blog.

Komando, K. (2017, September 29). How to stop your devices from listening to (and saving) what you say. Retrieved May 5, 2019, from USA TODAY

Matthew B. Hoy (2018). Alexa, Siri, Cortana, and More: An Introduction to Voice Assistants. Medical Reference Services Quarterly, 37:1, 81-88. DOI: 10.1080/02763869.2018.1404391

Molla, R. (2019, January 15). The future of voice assistants like Alexa and Siri isn’t just in homes – it’s in cars. Retrieved May 5, 2019, from VOX media.

Siri. Retrieved May 5, 2019, from Apple Siri.

Stucke, M. E.; Ezrachi, A. (2017). How digital assistants can harm our economy, privacy, and democracy. Berkeley Technology Law Journal, 32(3), 1239-1300.

The Real Reason Voice Assistants Are Female (and Why it Matters). (2018, January 29). Retrieved May 5, 2019.

Today in Apple history: Siri debuts on iPhone 4s. Cult of Mac. (2018, October 04). Retrieved May 5, 2019.

De-blackboxing the Facebook News Feed Algorithm as a System for Attention Manipulation

Abstract

With approximately two billion monthly active users, Facebook’s News Feed has become an influential information platform in our social life. This paper briefly introduces the history of the Facebook News Feed algorithm and then links the algorithm to the attention economy. Machine learning, as a powerful tool of data analysis, is utilized by Facebook to optimize its ability to attract people’s attention. By unpacking the Facebook News Feed algorithm, this paper illustrates the ways in which the News Feed manipulates users’ attention.

Introduction

Facebook, known mainly as a social media and social networking company, is also an important information platform for many users. This feature is embodied in Facebook News Feed, a place where you can see updates, news, and videos from various publishers, including family, friends, and media agencies. News Feed, according to Facebook, shows you the stories that are most relevant to you. This raises a question: how does Facebook define “relevance”?

  • A Brief History of the Facebook News Feed Algorithm

The top secret behind “relevance” is the News Feed algorithm. Developed by Facebook in 2006, it has gone through dozens of updates over the years. In September 2006, Facebook officially launched News Feed, and the algorithm behind it has undergone several changes since. In 2007, the “Like” button was added, making Facebook the first platform to run an algorithm-based news feed. Prior to the addition of the “Like” button, the only way for users to interact with one another was to comment on a status or post.

In October 2009, Facebook took a bold step: it introduced a new sorting order to the algorithm, changing the whole picture of News Feed by replacing the original chronological default with a popularity-oriented ranking. Six years later, Facebook announced further changes to the News Feed ranking algorithm; for example, the algorithm began emphasizing each user’s 50 most recent interactions on the network in determining what they see in their News Feed.

In 2015, Facebook expanded the sources of posts pushed to users’ News Feeds, which means users would see posts from other users not on their “Friends” lists. In 2016, Facebook launched the Audience Optimization tool, allowing publishers to reach a specific audience based on interests, demographics, and geographic location. In the same year, Facebook unveiled the elements that are given more weight in computation: user interest in the creator, post performance among other users, previous content performance of the creator, the type of post the user prefers, and how recent the post is (Lua, 2019). In addition, when you click on a post, Facebook measures how much time you spend on it, even if you don’t respond to it. In March 2017, Facebook revamped the News Feed algorithm and decided to weigh “Reactions” more heavily than “Likes”. In 2019, Facebook introduced a new metric called “Click-Gap” that compares a site’s or post’s popularity on Facebook with its popularity on the internet as a whole: if a post is popular only on Facebook and nowhere else online, its reach in the News Feed will be limited.

In the beginning, the methodology behind the News Feed was the EdgeRank algorithm, which determined the order of posts. It is an adaptation of PageRank, the ranking algorithm used by Google’s search engine. What differentiates the two algorithms is their context: EdgeRank operates on a social network, while PageRank is used to order web pages.

Driven by machine learning, the Facebook algorithm evolves and changes as a result of an ever-increasing set of data (Introna, 2016). By 2011, Facebook had stopped using the EdgeRank system in favor of a machine learning algorithm. The machine learning algorithm takes more than 100,000 factors into account (McGee, 2013), making it more accurate in predicting what users want to see in their News Feed.

  • Attention Economy

The attention economy, first theorized by Herbert Simon, focuses on how people’s limited attention is allocated among content (Simon, Deutsch & Shubik, 1971). It is the result of the rise of the attention industry over the past century. An overwhelming amount of information competes for people’s finite cognitive resources, giving rise to the attention economy (Shapiro & Varian, 2007). The attention economy can be traced back to the nineteenth century, when the first newspaper fully dependent on advertisers was created in New York, but the business model that converts attention into revenue was not fully realized until the twentieth century. With the arrival of the digital era, communication technologies give everyone a loudspeaker, allowing content to be distributed worldwide with little effort.

At present, the concept of the attention economy has invaded every facet of our lives. Consumers can be affected in many respects, such as what to think and buy, the outcome of elections, and political discourse (Huberman, 2017). Tim Wu (2017), the author of The Attention Merchants, writes that the attention industry has asked for and gained more and more of our waking moments in exchange for new conveniences and diversions, creating a grand bargain that has transformed our lives.

In the era of the attention economy, big tech companies like Facebook take advantage of their technology to hold people’s attention on their platforms and then make money from advertisers. Engineers continuously adjust the weights of the algorithm to keep pace with Facebook’s business model and maximize revenue.

Machine Learning and Facebook News Feed Algorithm

Facebook has a large data set featuring 100 billion ratings, more than a billion users, and millions of items (Kabiljo & Ilic, 2015). With the rapid expansion of this data set, machine learning has become increasingly essential to the Facebook algorithm, because machine learning excels precisely at dealing with incredible amounts of data. Machine learning is a method of data analysis that aims to construct a program that fits the given data (Alpaydin, 2016); it is an important branch of artificial intelligence.

System map of FBLearner

Driven by the business model, the Facebook News Feed algorithm is designed and optimized to drive engagement. On Facebook, users’ engagement can be measured by many factors, such as views, clicks, likes, comments, and shares. By treating each factor as a metric, Facebook can treat the News Feed as a machine learning problem, where the inputs are the various pieces of content on Facebook and the output is the probability of an engagement event. According to an official document released by Facebook, general models are trained to determine the various user and environmental factors that ultimately determine the rank order of content. When a user opens Facebook, the model generates a personalized set of the most relevant posts, images, and other content to display from thousands of publishers, as well as the best ordering of the chosen content (Hazelwood et al., 2018). The ML models embedded in the prediction and ranking algorithms are illustrated in the following sections.
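To make this framing concrete, here is a minimal, hypothetical sketch of engagement prediction as supervised learning. The feature names, data, and model choice are invented for illustration; they are not Facebook’s FBLearner code, whose real models reportedly weigh more than 100,000 factors.

```python
# Hypothetical sketch: predicting the probability of an engagement event
# (here, a click) from post features. Feature names and data are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [creator_affinity, post_age_hours, creator_past_ctr, is_video]
X_train = np.array([
    [0.9,  1.0, 0.12, 1],
    [0.2, 30.0, 0.01, 0],
    [0.7,  5.0, 0.08, 1],
    [0.1, 48.0, 0.02, 0],
])
y_train = np.array([1, 0, 1, 0])  # 1 = the user engaged with the post

model = LogisticRegression().fit(X_train, y_train)

# Score a new candidate post for this user.
candidate = np.array([[0.8, 2.0, 0.10, 1]])
p_engage = model.predict_proba(candidate)[0, 1]
print(f"Predicted engagement probability: {p_engage:.3f}")
```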

  • Ranking Algorithm

The ranking algorithm is a process that ranks all available posts that could be displayed on a user’s News Feed based on how likely the user is to respond positively. Specifically, the algorithm calculates the ranking score of an event based on two factors: probability and value.

Probability represents the chance that the user will react to the story with a given event; value represents the weight assigned to that event.

Ranking Score Calculation Table. Retrieved from https://learning.oreilly.com/videos/the-artificial-intelligence/9781492025979/9781492025979-video320235

Here is an example of the ranking model. Suppose there is an 11% chance that the user will click on a post, a 2.2% probability that the user will like it, and a 0.099% chance that the user will hide it. Each event is given a weight based on its importance, and the weighted sum of the event probabilities yields the final ranking score, here 0.2277. The posts in the inventory are then ranked according to their final scores.

The probability column is calculated by machine learning models, while the value column is based on user studies and product decisions. Because interaction is valued more highly in the algorithm, the weights of “Like” and “Comment” are much higher than that of “Click”, while the weight of “Hide” is negative.
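A minimal sketch of this score computation follows. The event weights below are hypothetical stand-ins for the proprietary values in the table, so the output will not reproduce the 0.2277 figure; the structure, a weighted sum of probability times value, is the point.

```python
# Hypothetical sketch of a News Feed ranking score: for each candidate post,
# sum (predicted event probability) * (event weight). Weights are invented;
# note that "hide" carries a negative weight.
EVENT_WEIGHTS = {"click": 1.0, "like": 4.0, "comment": 6.0, "hide": -10.0}

def ranking_score(event_probs: dict) -> float:
    """Weighted sum of predicted event probabilities for one post."""
    return sum(EVENT_WEIGHTS[event] * p for event, p in event_probs.items())

inventory = {
    "post_a": {"click": 0.11, "like": 0.022, "comment": 0.005, "hide": 0.00099},
    "post_b": {"click": 0.05, "like": 0.060, "comment": 0.020, "hide": 0.00010},
}

# Rank the inventory by final score, highest first.
ranked = sorted(inventory, key=lambda name: ranking_score(inventory[name]),
                reverse=True)
for name in ranked:
    print(f"{name}: {ranking_score(inventory[name]):.4f}")
```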

In 2016, Facebook stated the “core values” it uses when determining what shows up in a user’s feed. Facebook emphasized that the posts from friends and family will be on the top of one’s News Feed, followed by the posts that “inform” and posts that “entertain.” Other core values include posts that represent all ideas and posts with “authentic communication.” Facebook also claimed that it emphasized the user’s ability to hide posts, and the user’s ability to prioritize their own feed with the “See First” function.

  • Decision Tree Models

Decision trees are commonly used as a predictive model, mapping the possible outcomes of a series of related choices. The decision tree is one of the oldest methods in machine learning, and it remains one of the most common predictive modeling approaches due to its non-linearity and fast evaluation (Ilic & Kuvshynov, 2017).

Decision trees are a type of supervised machine learning, in which inputs and their corresponding outputs are labelled in advance. A decision tree finds the most similar training instances through a sequence of tests on different input attributes. It is a flowchart-like structure in which each decision node applies a splitting test on an attribute, each branch represents an outcome of that test, and each leaf represents a class label or final decision (Alpaydin, 2016). The paths from root to leaf represent classification rules; when the flow reaches a leaf, the search stops. Through this procedure, we can find the most similar training instances and obtain the probability of each outcome.

Figure 3. An example: the probability of clicking on a notification. Retrieved from https://code.fb.com/ml-applications/evaluating-boosted-decision-trees-for-billions-of-users/

Decision tree models rest on the idea that a user’s future behavior is generally consistent with his or her past actions. The decision tree is a powerful predictive model and is currently embedded in the Facebook News Feed algorithm. The figure above shows an example of a simple decision tree that generates the probability of clicking on a notification. This tree has the following attributes: 1) the number of clicks on notifications from a specific user today; 2) the number of likes that the story from the notification has; 3) the total number of notification clicks from this specific user. The input data passes through the decision nodes, where its values are checked against the tree’s parameters, and eventually we obtain the probability of clicking on a notification. With the decision tree, we can predict the probability of the user clicking on other notifications in the future. Decision trees can also be used to predict the probability of clicking on ads in the News Feed.
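The toy function below mirrors the flowchart logic of the figure in code; the attribute tests, thresholds, and leaf probabilities are all invented for illustration, not values from Facebook’s production trees.

```python
# Hypothetical sketch of the notification-click decision tree: a sequence of
# attribute tests routes the input to a leaf holding a click probability.
def p_click_notification(clicks_today: int, story_likes: int,
                         total_clicks: int) -> float:
    if clicks_today > 2:                      # active clicker today
        return 0.9 if story_likes > 10 else 0.6
    if total_clicks > 100:                    # historically engaged user
        return 0.4
    return 0.05                               # rarely clicks notifications

# A user who clicked 3 notifications today, on a story with 25 likes:
print(p_click_notification(clicks_today=3, story_likes=25, total_clicks=300))
# -> 0.9
```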

  • Collaborative Filtering

Collaborative filtering (CF) is a recommendation technique that helps people discover the items most relevant to them. It is based on the idea that the best recommendations come from people who share similar interests (Kabiljo & Ilic, 2015). Collaborative filtering is commonly implemented in e-commerce applications and online news aggregators, and Facebook has a collaborative filtering recommender system that is used in many areas of the site.

There are three types of CF, user-based, item-based, and model-based collaborative filtering, and the differences between them are nuanced. User-based collaborative filtering first finds neighbors who share similar interests with the targeted user by comparing the posts they liked, then recommends posts to the targeted user based on the preferences of those neighbors. Item-based collaborative filtering calculates a similarity score between two posts based on all users’ reactions to them, and then recommends to the targeted user the posts that fit his or her preferences. Model-based collaborative filtering trains a model on input data extracted from the targeted user’s prior reactions and then predicts his or her future actions with the trained model. Collaborative filtering differs from content-based recommendation because it connects a post with those who liked the post, instead of focusing only on the post itself.
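As a rough illustration of the item-based variant, the sketch below computes a similarity score between posts from users’ reactions; the interaction matrix is invented for illustration and is orders of magnitude smaller than anything Facebook processes.

```python
# Hypothetical sketch of item-based collaborative filtering: two posts are
# similar when largely the same users reacted to both.
import numpy as np

# Rows = users, columns = posts A, B, C; 1 = the user liked the post.
interactions = np.array([
    [1, 1, 0],   # user 0 liked posts A and B
    [1, 1, 0],   # user 1 liked posts A and B
    [0, 0, 1],   # user 2 liked post C only
])

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

post_a, post_b, post_c = interactions.T
print(cosine_similarity(post_a, post_b))  # 1.0 -> recommend B to fans of A
print(cosine_similarity(post_a, post_c))  # 0.0 -> C is unrelated to A
```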

Figure 4. Facebook Collaborative Filtering. Retrieved from https://code.fb.com/core-data/recommending-items-to-more-than-a-billion-people/

In the example, Facebook uses Apache Giraph to analyze the social graph formed by users and their connections. Apache Giraph is an iterative graph-processing system built for big data; it can break down this complicated structure and find the posts most relevant to the targeted user based on the results generated by collaborative filtering.

Attention Manipulation

Attention manipulation is a strategic action to influence how a user allocates his or her attention. When a consumer’s attention is limited, her ultimate purchasing decisions may hinge on what she pays attention to, which in turn incentivizes firms to engage in attention manipulation (Persson, 2017). In order to convert users’ attention into revenue, News Feed changes the way people receive information, which in turn shapes what they see of the world around them. Algorithms are not just abstract computational processes; they also have the power to enact material realities by shaping social life to various degrees (Beer, 2013; Kitchin & Dodge, 2011).

As a result of the attention economy, the News Feed poses problems for users in three respects. First, the manipulation occurs without meaningful consent. Facebook never asks users whether they are willing to participate; users “consent” automatically when they sign up for Facebook and click “I agree” on the user agreement. Facebook does not offer the option of opting out of the News Feed algorithm; instead, it assumes that users are comfortable with it. This lack of consent deprives users of a rightful choice.

The second problem is the loss of agency. People are growing accustomed to being fed recommended messages, which is a serious concern: the News Feed algorithm tries to make decisions on users’ behalf. After interviewing 25 Facebook users, researchers found that several participants expressed unease and discomfort at their perception of the Facebook algorithm controlling what they see and what they do not get to say (Bucher, 2017). The feeling of being controlled goes hand in hand with the loss of agency. The algorithm, rather than users themselves, decides what information they receive on Facebook, which may erode people’s ability to identify the information most relevant to them.

Another problem resulting from the News Feed as a means of attention manipulation is emotional contagion. Big tech companies like Facebook can lead people to experience emotions without their awareness. According to a study by social scientists at Cornell, the University of California, San Francisco (UCSF), and Facebook, emotions can spread among users of online social networks (Segelken & Shackford, 2014). People with weaker judgment are more vulnerable to this kind of attention manipulation.

Conclusion

Since its inception, the Facebook News Feed algorithm has undergone great changes to stay consistent with Facebook’s business model. Although Facebook has partially opened the curtains on the algorithm, it is still difficult for ordinary users to learn about its hundreds of thousands of weights. Because of the incredible scale of Facebook’s data set, the algorithm today is built with machine learning models, which are distinguished by their ability to handle big data. Unpacking the ranking algorithm, decision tree models, and collaborative filtering helps us get deeper into how these algorithms work.

Facebook News Feed has become the world’s biggest information distribution platform. It now carries many types of content: text, photos, videos, events, and groups. This diversity demands greater complexity from the ranking algorithm and, for outsiders, poses a greater challenge to de-blackboxing it.

The real problem is that far less accessible information exists about the parameters. As the News Feed algorithm supplants traditional editorial story selection, users lack insight into its story curation system comparable to our knowledge of the principles that editors used to follow (DeVito, 2017). Education in resisting attention manipulation is not yet in place.

But it is time to act. As Cal Newport (2019) suggests, we should transform the way we think about the different flavors of one-click approval indicators that populate the social media universe. The first rule is to learn how the system works and never fall into its trap.

 

References

Alpaydin, E. (2016). Machine learning. Cambridge, Massachusetts ; London, England: The MIT Press.

Beer, D. (2013). Popular culture and new media. Basingstoke: Palgrave Macmillan.

Bucher, T. (2017). The algorithmic imaginary: Exploring the ordinary affects of Facebook algorithms. Information, Communication & Society, 20(1), 30-44. doi:10.1080/1369118X.2016.1154086

DeVito, M. A. (2017). From editors to algorithms. Digital Journalism, 5(6), 753-773. doi:10.1080/21670811.2016.1178592

Facebook newsfeed algorithm history. Wallaroo Media. Retrieved from https://wallaroomedia.com/facebook-newsfeed-algorithm-history/

Hazelwood, K., Bird, S., Brooks, D., Chintala, S., Diril, U., Dzhulgakov, D., . . . Wang, X. (2018, February). Applied machine learning at Facebook: A datacenter infrastructure perspective. Paper presented at the IEEE International Symposium on High Performance Computer Architecture (HPCA), 620-629. doi:10.1109/HPCA.2018.00059. Retrieved from https://ieeexplore.ieee.org/document/8327042

Huberman, B. (2017). Big data and the attention economy. Ubiquity, 2017(December), 1-7. doi:10.1145/3158337

Ilic, A., & Kuvshynov, O. (2017). Evaluating boosted decision trees for billions of users. Retrieved from https://code.fb.com/ml-applications/evaluating-boosted-decision-trees-for-billions-of-users/

Introna, L. D. (2016). Algorithms, governance, and governmentality. Science, Technology, & Human Values, 41(1), 17-49. doi:10.1177/0162243915587360

Kabiljo, M., & Ilic, A. (2015). Recommending items to more than a billion people. Retrieved from https://code.fb.com/core-data/recommending-items-to-more-than-a-billion-people/

Kitchin, R., & Dodge, M. (2011). Code/space: Software and everyday life. Cambridge, MA: MIT Press.

Lua, A. Decoding the Facebook algorithm: A fully up-to-date list of the algorithm factors and changes. Retrieved from https://buffer.com/library/facebook-news-feed-algorithm

McGee, M. (2013). EdgeRank is dead: Facebook’s news feed algorithm now has close to 100K weight factors. Retrieved from https://marketingland.com/edgerank-is-dead-facebooks-news-feed-algorithm-now-has-close-to-100k-weight-factors-55908

Newport, C. (2019). Digital minimalism. London: Penguin Business.

Persson, P. (2017). Attention manipulation and information overload. Cambridge, MA: National Bureau of Economic Research.

Segelken, H. R., & Shackford, S. (2014). News feed: ‘Emotional contagion’ sweeps Facebook. Cornell Chronicle. Retrieved from http://news.cornell.edu/stories/2014/06/news-feed-emotional-contagion-sweeps-facebook

Shapiro, C., & Varian, H. R. (2007). Information rules. Boston, MA: Harvard Business School Press.

Simon, H. A., Deutsch, K. W., & Shubik, M. (1971). Designing organizations for an information-rich world (pp. 37-72). Retrieved from http://www.econis.eu/PPNSET?PPN=487583434

Citton, Y. (2017). The ecology of attention (B. Norman, Trans.). Cambridge, UK: Polity.

Wu, T. (2017). The attention merchants. New York: Vintage Books.

Is your head in the Cloud?

CCTP-607 “Big Ideas” in Technology (and What They Mean): AI to the Cloud
Professor: Dr. Martin Irvine
Georgetown University, Spring 2018
By: Linda Bardha

Abstract

These days, if you google the word “cloud,” you are more likely to get articles about “cloud computing” than references to the weather. The Cloud is not even a new concept, but the media has embraced the terminology and uses it excessively. So what exactly is the Cloud? How did we start using it, and what are the design principles that led to the concept of cloud computing? Cloud computing has been a buzzword in both consumer and enterprise technology, especially over the last few years; companies seem to use the term to attract more customers and to show that they are “in” with the latest technology. But how much of it is real, and how much is just hype? The cloud is an example of a buzzword taking over concepts that have been established and used since the 1960s. We will look at the history of cloud computing, its architecture and design principles, how it is changing the ways we access, compute, and store information using the Internet, and some of the implications of using cloud-based systems.

  1. Introduction

Think about the different activities you did today that required the Internet. Did you catch up on email? Did you check Facebook or Twitter and interact with your friends? Did you watch a movie on Netflix or listen to music on Spotify? If you did any (or all) of these activities, you have used the benefits of cloud computing. When we refer to the Cloud, in a way we are referring to the Internet: in a broad sense, the internet is a worldwide network of billions of devices that communicate with each other. As Irvine suggests, the design principles for cloud computing systems extend the major principles of massively distributed, Web-deliverable computing services, databases, data analytics, and, now, AI/ML modules. The term “Cloud” began as an intentional “black box” metaphor in network engineering for the distributed network connections of the Internet and Ethernet (1960s-70s). The term was a way of removing the complexity of connections and operations (which can be any number of configured TCP/IP connections in routers and subnetworks) between end-to-end data links. Now the term applies to the many complex layers, levels, and modules designed into online data systems, mostly on the server side. The whole “server side” is “virtualized” across hundreds and thousands of fiber-optic-linked physical computers, memory components, and software modules, all of which are designed to create an end product (what is delivered and viewed on screens and heard through audio outputs) that seems like a whole, unified package to “end users.”

  2. History of the Cloud and how it all started

How did the idea of Cloud computing develop?

To start from the beginning, we have to go all the way back to the 1950s and the invention of mainframe computing. Mainframe computing is the concept of having a central computer accessed by numerous user devices (Ebbers, O’Brien, & Ogden, 2006). The central computer, which had all the compute capability, was called the mainframe; the user devices, which sent requests up to it, were called dumb terminals.

Mainframe Computer Concept

Today, in schools and companies, there is a computer at every desk, each fully independent of the ones around it. Back in the 1950s, however, mainframe computers were extremely expensive to buy and maintain. So, instead of placing one at every seat, organizations would buy one mainframe computer and let the dumb terminals share its computing resources. In the 1970s, the concept of virtual machines emerged.

Virtual machines are complete operating systems that live together on a single piece of hardware (Ebbers, O’Brien, & Ogden, 2006). For example, you can have multiple Windows virtual machines living on your single laptop. Suddenly, one mainframe computer could have multiple operating systems running at the same time, doing many different things. This was the beginning of the modern concept of cloud computing.

To make this a reality, developers created software called a hypervisor, which could be installed onto multiple pieces of hardware, such as servers. They could then link all this hardware and use the combined computational and storage power as one giant resource. Take a moment to imagine the amount of storage and computing power created by adding up the memory and hard drive space of every computer in your office.

Hypervisors in Virtualization, By T. Sridhar

As Sridhar suggests, benefits of virtualization in a cloud-computing environment are:

  • Elasticity and scalability: Firing up and shutting down VMs involves less effort as opposed to bringing servers up or down.
  • Workload migration: Through facilities such as live VM migration, you can carry out workload migration with much less effort as compared to workload migration across physical servers at different locations.
  • Resiliency: You can isolate physical-server failure from user services through migration of VMs.

Programs run faster, and you can store far more files. This is what cloud computing allows people to do, on an extremely large scale, using the Internet to connect end users to huge computational hardware.

   2.1 What was used before “the Cloud”?

We used to store files on flash drives, and I remember many occasions when I forgot the USB drive and lost access to the files I needed. Cloud computing allows you to access your data as long as you have an internet connection.

Hiroko Nishimura, a cloud computing instructor for Lynda explains the process of storing and having access to information for an organization, before the idea of the cloud:

“In the traditional way, you would have to go through the procurement process at your work to find an appropriate server with all the necessary bells and whistles. You would then have to make sure the capacity you’re purchasing isn’t too much, or too little. Then you have to get the quote from the manufacturer and then wrestle with the finance department to get the budget approved and device purchased. If the demands from the applications are much higher or lower than expected, you have to go back and go through the procurement process again to get a more appropriate server. Your department or company needs to have the funds to then purchase the equipment outright.

Cloud computing allows you to pay to use only as much server space and capacity as you need at that moment. When you need more or less, you can adjust the rented capacity and your monthly bill will adjust along with it. Instead of a large overhead bill on purchasing a piece of hardware that may or may not even match your needs, you get a monthly statement billing you only for as much as you used last month. It allows for increased flexibility and affordability because you are charged only for what you consume, when you consume. This allows what used to only be possible with big corporate IT budgets to almost anyone with internet access and a few dollars.”

  3. What exactly is Cloud computing?

As Rountree and Castrillo explain, there has been a lot of debate about what the cloud is. Many people think of the cloud as a collection of technologies, and it is true that a set of common technologies typically makes up a cloud environment, but these technologies are not the essence of the cloud. The cloud is actually a service, or group of services, which is partly why it has been so hard to define. The National Institute of Standards and Technology (NIST) offers this technical definition: “Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (for example, networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.”

Cloud computing is a paradigm that allows on-demand network access to shared computing resources; it is a way of managing, storing, and processing data online via the Internet (Rountree and Castrillo, 2014). Through cloud computing services, you have near-instantaneous access to computation, storage, and software over the Internet.

Arun Chandrasekaran, research manager at Frost & Sullivan, says: “There is a growing awareness among consumers and enterprises to access their information technology (IT) resources extensively through a ‘utility’ model, a development broadly called ‘Cloud Computing.’ Cloud represents the next wave in the computing industry, as it strives to eliminate inherent inefficiencies in the existing IT architecture and deliver ‘IT as a service’ to the end-users.”

  3.1 Design Principles of Cloud Computing

In order to understand the Cloud, we need to look at the design principles of cloud computing; that is essential in trying to “de-blackbox” the concept. Irvine explains that many of the computing systems we use every day are now integrated on platforms (a systems architecture for data, communications, services, and transactions) designed for convergence (using combinatorial principles for making different systems and technologies interoperable), able to exchange and make use of data, information, and AI/ML data analytics.

For organizations and business on the supply-side of information and commercial services, subscribing to a Cloud Service provides one bundle or suite of Web-deliverable services that can be custom-configured for any kind of software, database, or industry-standard platform (e.g., the IBM, Amazon AWS, and Google Cloud services).

Internet-based computing continues to scale and extend to many kinds of online and interactive services. Many services we use every day are now managed in Cloud systems with an extensible “stack” architecture (levels/layers) all abstracted out of the way from “end users” (customers, consumers) — email, consumer accounts and transactions (e.g., Amazon, eBay, Apple and Google Clouds for data and apps), media services (e.g., Netflix, YouTube, Spotify), and all kinds of file storage (Google app files) and platforms for Websites, blogs, and news and information.

Sridhar, in his article published in the Internet Protocol Journal, explains some of the characteristics of a cloud-computing environment. As he notes, not every characteristic may be present in a specific cloud solution.

Cloud computing characteristics include:

  • Elasticity and scalability: Cloud computing gives you the ability to expand and reduce resources according to your specific service requirement. For example, you may need a large number of server resources for the duration of a specific task. You can then release these server resources after you complete your task.
  • Pay-per-use: You pay for cloud services only when you use them, either for the short term (for example, for CPU time) or for a longer duration (for example, for cloud-based storage or vault services).
  • On demand: Because you invoke cloud services only when you need them, they are not permanent parts of your IT infrastructure—a significant advantage for cloud use as opposed to internal IT services. With cloud services there is no need to have dedicated resources waiting to be used, as is the case with internal services.
  • Resiliency: The resiliency of a cloud service offering can completely isolate the failure of server and storage resources from cloud users. Work is migrated to a different physical resource in the cloud with or without user awareness and intervention.
  • Multitenancy: Public cloud services providers often can host the cloud services for multiple users within the same infrastructure. Server and storage isolation may be physical or virtual—depending upon the specific user requirements.
  • Workload movement: This characteristic is related to resiliency and cost considerations. Here, cloud-computing providers can migrate workloads across servers—both inside the data center and across data centers (even in a different geographic area). This migration might be necessitated by cost (less expensive to run a workload in a data center in another country based on time of day or power requirements) or efficiency considerations (for example, network bandwidth). A third reason could be regulatory considerations for certain types of workloads.

Cloud Computing Context, By Sridhar

  3.2 The Cloud Architecture

In order to better understand Cloud computing, we will look at the four Cloud Deployment Models:

  • Public – All the systems and resources that provide the service are housed at an external service provider.
  • Private – The systems and resources that provide the service are located internal to the company or organization that uses them.
  • Community – Community clouds are semi-public clouds shared between members of a select group of organizations.
  • Hybrid – A hybrid cloud model is a combination of two or more of the other cloud models.

Cloud computing provides different services based on three delivery configurations. When they are arranged in a pyramid structure, they are in the order of SaaS, PaaS, and IaaS.

The Cloud Pyramid

The Cloud architecture: a model for integrating the “whole stack” of networked computing

  1. SaaS or Software-as-a-Service — This is the layer end users face, and it provides the functionality those users demand: social media communication, collaboration on documents, catching a taxi, or booking a room for a night. This layer offers a limited set of functionalities and essentially no control over the computing resources. Nevertheless, the end users get what they came for — functionality.
  2. PaaS or Platform-as-a-Service — an underlying level of APIs and engines allowing developers to run their apps. This is the layer where AWS or Azure users leverage platform functions (like the latest batch of tech AWS introduced during its re:Invent week in 2017). This level of the cloud pyramid allows developers to configure the resources needed to run their apps within the limits set by the cloud platform. It demands some understanding of the processes and structure of your cloud, at least enough to tick the appropriate boxes in the dashboard of the cloud service provider (CSP).
  3. IaaS or Infrastructure-as-a-Service — the lowest level of cloud services, where DevOps engineers work with tools like Terraform, Docker, and Kubernetes to provision servers and configure the infrastructure, processes, and environments that enable developers to deploy their software, APIs, and services (see the sketch after this list). This layer might work with hardware provided by cloud service providers like AWS or GCP, or with on-prem bare-metal Kubernetes clusters running in private or hybrid clouds. This level provides the most capabilities (like load balancing, backups, and versioning and restoration of an immutable infrastructure), yet requires the most skill to operate correctly.
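To make the IaaS layer concrete, here is a minimal sketch using AWS’s Python SDK (boto3) to rent and then release a virtual server. The AMI ID and region are placeholder values, and actually running this requires valid AWS credentials.

```python
# Minimal IaaS sketch with boto3: provision one virtual machine, then
# terminate it. The AMI ID and region below are placeholder values.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Provision a small instance; pay-per-use billing starts here.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder machine image
    InstanceType="t2.micro",
    MinCount=1,
    MaxCount=1,
)
instance_id = response["Instances"][0]["InstanceId"]
print(f"Launched instance {instance_id}")

# Release the resource when the task is done; billing stops.
ec2.terminate_instances(InstanceIds=[instance_id])
```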

Now that we have a better understanding of Cloud Computing, let’s look at some companies that offer Cloud Computing services:

iCloud – Apple’s cloud is for Apple products. You can back up and store everything from multimedia to documents online, and the content is then smoothly integrated onto your devices.

Amazon’s AWS – When you talk about companies using cloud computing, Amazon Web Services leads the pack. It offers IaaS and PaaS to all their customers.

Google Cloud – This cloud platform is universal for Google’s enormous ecosystem and for other products such as Microsoft Office. It provides storage of data and collaboration along with other services that are included in their cloud computing suite.

Microsoft Azure – Offered by Microsoft, it provides SaaS, PaaS, and IaaS for its software and developer tools. If you have used Office 365, then you have used SaaS.

IBM Smart Cloud – This offering spans private, public, and hybrid distribution platforms, providing a full range of SaaS, PaaS, and IaaS cloud computing services for businesses. The pay-as-you-go platform generates profit for IBM.

  3.3 Advantages of cloud computing

There are six major advantages of cloud computing:

  1. You can trade capital expenses for variable expenses,
  2. Benefit from massive economies of scale,
  3. Stop guessing about capacity,
  4. Increase speed and agility,
  5. Stop spending money running data centers, and
  6. You can go global in minutes.

You no longer have to worry about buying too little or too much of something; you simply pay as you go for the services you need.

 3.4 Implications of cloud-based systems

Cloud-computing technology is still evolving. Even though cloud computing has changed the way we access, compute, and store information using the Internet, there are major concerns that should be considered and kept in mind when using cloud-based services from the different platforms that offer them. Some of the major concerns Sridhar explains are:

  • Security: Security is a significant concern for enterprise IT managers when they consider using a cloud service provider. Physical security through isolation is a critical requirement for private clouds, but not all cloud users need this level of investment. For those users, the cloud provider must guarantee data isolation and application security (and availability) through isolation across multiple tenants. In addition, authentication and authorization of cloud users and encryption of the “network pipe” from the cloud user to the service provider application are other factors to be considered.
  • Privacy: If a client can log in from any location to access data and applications, it’s possible the client’s privacy could be compromised. Cloud computing companies will need to find ways to protect client privacy. One way is to use authentication techniques such as user names and passwords. Another is to employ an authorization format — each user can access only the data and applications relevant to his or her job.
  • Cloud-to-cloud and Federation concerns: Consider a case where an enterprise uses two separate cloud service providers. Compute and storage resource sharing along with common authentication (or migration of authentication information) are some of the problems with having the clouds “interoperate.” For virtualized cloud services, virtual machine migration is another factor to be considered in federation.
  • Legal and regulatory concerns: These factors become important especially in those cases involving storing data in the cloud. It could be that the laws governing the data are not the laws of the jurisdiction where the company is located. Does the user or company subscribing to the cloud computing service own the data? Does the cloud computing system, which provides the actual storage space, own it? Is it possible for a cloud computing company to deny a client access to that client’s data?

Because of the nature of these services and the implications that come with them, companies, law firms, businesses, and universities are debating these and other concerns about cloud computing and what should be done to address them.

  4. Conclusion

Cloud computing is present in many aspects of our daily lives as we use the internet for work, school, and personal life. We check email, post on social media, share documents via online file-sharing services, stream hours and hours of video, and use cloud-based car navigation apps. Even though the media uses the Cloud as a buzzword for a new technology, we should keep in mind that the concept of cloud computing dates back to the era of mainframe computers accessed by dumb terminals.

With the development of virtualization, we changed the way computers and devices communicate with each other, pooling multiple servers and using their combined compute and storage resources as if they were one extremely large server. While in the past the amount of resources you could link was limited by what was in your physical data center, with cloud computing you can access as many resources as the service provider can give you.

With the internet, there is enormous potential computing power. It is almost as though we are back in the 1960s, with our laptops serving as the dumb terminals and our cloud computing service providers hosting the mainframe computers. Instead of connecting to the mainframe in the data center on the same floor, we use the internet to connect to countless servers linked by hypervisors through big service providers like Amazon Web Services, Microsoft Azure, and Google Cloud. Cloud computing has changed the way we access, compute, and store information via the Internet. As a technology it is still evolving, and there are major concerns in terms of privacy and security, but cloud computing adds a great deal of value as a solution for many IT requirements.

References

 

de Bruin, Boudewijn & Luciano Floridi. “The Ethics of Cloud Computing.” Science and Engineering Ethics 23, no. 1 (February 2017): 21–39.

Chandrasekaran, Arun. “Cloud Computing and Market Insight.” Frost & Sullivan, 2010. Available at http://www.frost.com/prod/servlet/market-insight-print.pag?docid=207327187

Cloud Computing Services Models – IaaS PaaS SaaS Explained (EcoCourse)

Ebbers, Mike; O’Brien, W.; Ogden, B. . “Introduction to the New Mainframe: z/OS Basics” (PDF). 2006. IBM International Technical Support Organization

How Cloud buzz has conquered media. 2019. Available at https://royal.pingdom.com/how-cloud-buzz-has-conquered-media/

Irvine, Martin. What is Cloud Computing? AI/ML Applications Now Part of Cloud Services?

Nishimura, Hiroko. “Design principles of cloud computing”. Lynda Available at https://www.lynda.com/Amazon-Web-Services-tutorials/Design-principles-cloud-computing/808676/5036857-4.html

NIST Working Definition of Cloud Computing, http://csrc.nist.gov/groups/SNS/cloud-computing/index.html

Rountree, Derrick & Ileana Castrillo. The Basics of Cloud Computing: Understanding the Fundamentals of Cloud Computing in Theory and Practice. Amsterdam; Boston: Syngress / Elsevier, 2014.

Ruparelia, Nayan B. Cloud Computing. Cambridge, MA: MIT Press, 2016.

Sridhar, T. “Cloud Computing – A Primer.” The Internet Protocol Journal 12, no. 3 (2016). Available at https://www.cisco.com/c/en/us/about/press/internet-protocol-journal/back-issues/table-contents-45/123-cloud1.html

Strickland, J. How Cloud Computing Works. 2018. Available at https://computer.howstuffworks.com/cloud-computing/cloud-computing3.htm

Vladimir Fedak. The Medium, What is the Cloud Computing Pyramid: The layers of DevOps Services – https://medium.com/@FedakV/what-is-the-cloud-pyramid-the-layers-of-devops-services-730ac137e8b8

The Cloud Architecture: The Cloud Computing reference model (2017) Available at https://cloudman.fr/2017/10/31/the-cloud-computing-reference-model/

Wang. “Enterprise cloud service architectures”

2.3 Billion vs. 15,000: Content Moderation Strategies and Desires on Facebook

INTRODUCTION

Try to imagine 2 billion of anything. It’s genuinely too hard for the human brain to comprehend such a gargantuan number, and yet Facebook serves 2.27 billion monthly users (Abbruzzese). With the dizzying array of opinions and demands that 2.27 billion global users have of the platform, how does Facebook decide who to please and why to please them?  How does Facebook moderate the massive amount of user-generated content on its platform? How is artificial intelligence used to automate content moderation on Facebook?

Why does Facebook moderate content?

First and foremost, Facebook is a business that aims to make a profit. Most of Facebook’s revenue is gained from selling advertisements to third parties, as Mark Zuckerberg concisely explained during a congressional hearing in April 2018:

What we allow is for advertisers to tell us who they want to reach, and then we do the placement. So, if an advertiser comes to us and says, ‘All right, I am a ski shop and I want to sell skis to women,’ then we might have some sense, because people shared skiing-related content, or said they were interested in that, they shared whether they’re a woman, and then we can show the ads to the right people without that data ever changing hands and going to the advertiser. (Gilbery)

Within this business model, one can summarize the ultimate goals of Facebook’s relationship with its users in two steps: (1) to keep users engaged on the platform so that advertisements can be seen and engaged with, and (2) to prompt users to generate content so that Facebook can extract more detailed behavioral insights with which to target ads. In essence, Facebook operates within the model of surveillance capitalism to make a profit (Laidler).

Thus, Facebook has a bona fide economic incentive to maximize the number of its users who feel that they are safe to post what they want without fear of censorship or peer-mediated attack. Additionally, as Facebook is a network in which users create the content that other users consume,  Facebook needs users to trust that they will not be offended each time they open the site. These incentives to maximize the amount of user-generated content on its platform are reflected in the descriptions of the principles that Facebook includes within its public-facing Community Standards.

When discussing the concept of safety and why it’s important to Facebook, Facebook says “People need to feel safe in order to build community,” suggesting that the reason threats and injurious statements are not welcome on the platform is because this type of content chills the process of community formation (Community Standards). The concept of increasing the number of opinions and ideas that can exist on the platform again resurfaces in Facebook’s description of “Voice” as a defining principle of its community standards. Facebook states, “Our mission is all about embracing diverse views. We err on the side of allowing content, even when some find it objectionable, unless removing that content can prevent a specific harm.”

So, economically at least, there is a reason that Facebook is heavily invested in content moderation. Facebook wants its platform to be a pleasant place to retain users so that Facebook can sell more ads.

Issues of free speech

This section could be short. Constitutionally in the United States, Facebook has no legal mandate to remove or allow speech on its platform. On the internet, many interactive service providers, also referred to as platforms, have traditionally backed the idea that the internet should be a place open to free expression and the marketplace of ideas, but they have no legal mandate to do so (Snider).

The First Amendment states that “Congress shall make no law respecting an establishment of religion, or prohibiting the free exercise thereof; or abridging the freedom of speech.” Notably, the First Amendment does not protect citizens from the actions of other private parties. The Supreme Court confirmed this interpretation in Hurley v. Irish-American Gay Group of Boston, 515 U.S. 557, 566 (1995), in which it stated that “the guarantees of free speech . . . guard only against encroachment by the government and ‘erect[] no shield against merely private conduct.’” Even though the First Amendment does not protect users from having their speech censored by Facebook, users are still angry and invoke rhetoric suggesting their rights are being violated when Facebook censors them.

Alex Abdo, a senior staff attorney with the Knight First Amendment Institute references an idea of a “broader social free speech principle” and summarizes this frustration as the result of a societal expectation in the United States: “There is this idea in this country, since its founding, people should be free to say what they want to say” (Snider).

The lines between Facebook censorship and government censorship are understandably blurred when even the Supreme Court of the United States uses language that equates social media sites with the traditional concept of the “public forum.” In Packingham v. North Carolina, 137 S. Ct. 1730 (2017), a case in which North Carolina had made accessing many common social media sites a felony for registered sex offenders, the Supreme Court extended the idea that in many ways contemporary social media acts as a public forum: “While in the past there may have been difficulty in identifying the most important places (in a spatial sense) for the exchange of views, today the answer is clear. It is cyberspace—the ‘vast democratic forums of the Internet’ in general, Reno v. American Civil Liberties Union, 521 U.S. 844, 868 (1997), and social media in particular.” (Grimmelmann, 2018).

Section 230 immunity

Compounding this societal confusion around the legality of content moderation decisions on Facebook is the immunity Facebook receives under Section 230 of the Communications Decency Act. Under Section 230, Facebook is almost completely immune from liability for the non-illegal, non-copyright-infringing content that users upload onto its platform. Additionally, Section 230 contains a “Good Samaritan” provision that allows “interactive computer services” to take somewhat of an editorial stance by removing content they deem offensive without accruing any liability (Communications Decency Act, 47 U.S.C. § 230). This law is the reason Facebook can “develop their own community guidelines and enforce them as they see fit” (Caplan).

Mounting pressure for Facebook to do something

No legal imperative exists for Facebook to moderate its content outside of copyrighted and illegal content. The legal imperative, however, is not what many users think about, and as we’ve established that Facebook makes its money from its users, it is easy to understand that Facebook will want to listen to their demands.

There is increasing anxiety about Facebook’s massive scope, coupled with the dependence people have on Facebook and the platform’s ultimate ability to filter speech. As the Supreme Court alluded to in Packingham v. North Carolina, the internet, especially social media sites, has become the main place for people to express themselves and their ideas in contemporary society. With over 2 billion users, Facebook is larger than many sovereign countries and has ultimate power over all user content, yet users have no structural mechanism to contest decisions or advocate for themselves within this system.

Additionally, following the 2016 United States presidential election, many people were frustrated with Facebook’s apparent negligence in controlling the spread of misinformation on the platform. Users demanded that Facebook do more to stop the spread of false information because, in this role as a news delivery source, one in which Facebook’s News Feed algorithm makes editorial choices about what to show a user, rather than strictly a social networking platform, some users believe Facebook has an obligation to ensure a certain standard of news content. Following the political violence in Myanmar and the rise of white nationalism on the platform, to name a few instances, some people are also calling for Facebook to do more to moderate content that contributes to political radicalization (Mozur).

How does Facebook currently moderate content?

Facebook uses artificial intelligence tools like machine vision and natural language processing to flag content that might violate its Community Standards; from this point, the content is sent to one of the company’s more than 15,000 human moderators (Community Standards). At the root of Facebook’s content moderation execution, separate from its public-facing Community Standards, is an ad hoc system of PowerPoint slides containing rules that attempt to distill ethically and politically vague dilemmas into binary moderation decisions. This simplification of difficult moderation decisions is part of an attempt to uniformly train more than 15,000 human content moderators to deal with the avalanche of content that must be checked each day on Facebook.
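A toy sketch of this flag-then-route pipeline might look like the following; the training examples, classifier, and threshold are invented for illustration and bear no relation to Facebook’s production systems.

```python
# Hypothetical sketch of AI-assisted moderation triage: a text classifier
# scores each post, and anything above a threshold is routed to a human
# moderator's queue. Data, labels, and threshold are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_posts = [
    "have a wonderful day everyone",
    "congrats on the new job",
    "I will hurt you if you come here",
    "these people deserve violence",
]
train_labels = [0, 0, 1, 1]  # 1 = potential Community Standards violation

classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
classifier.fit(train_posts, train_labels)

REVIEW_THRESHOLD = 0.5

def triage(post: str) -> str:
    p_violation = classifier.predict_proba([post])[0, 1]
    # Flagged content joins the queue for the thousands of human moderators.
    return "send to human review" if p_violation > REVIEW_THRESHOLD else "allow"

print(triage("I will hurt you"))
```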

Some of these moderators are contract workers and may be assigned to moderate content in a language they do not understand or situated in a country they do not know. The New York Times reported that the rules moderators use to execute moderation decisions are “apparently written for English speakers relying on Google Translate, suggesting that Facebook remains short on moderators who speak local languages” (Fisher).

In addition to the PowerPoint slides of moderation rules, Facebook has an Excel-style spreadsheet of groups and individuals that have been banned from the platform as “hate figures.” Moderators are instructed to remove any content that praises, supports, or represents any of the listed figures. This blanket-coverage strategy is meant to make moderation simpler for human moderators, but drawing hard lines on content regardless of context can chill political speech or work to maintain the status quo for certain groups in power. As Max Fisher reports in the New York Times, “In Sri Lanka, Facebook removed posts commemorating members of the Tamil minority who died in the country’s civil war. Facebook bans any positive mention of Tamil rebels, though users can praise government forces who were also guilty of atrocities.”

Facebook, in taking the stance that content can be shared on its platform depending on the context of the post around it, has limited its ability to fully automate certain aspects of content moderation. On Thursday, May 2, 2019, Facebook banned Alex Jones and all InfoWars-related content from its platform, with the caveat that content from this publisher could be shared if the commentary around it is critical of the message. While AI systems have the capability to conduct sentiment analysis, human moderation is required to accurately moderate content according to a viewpoint-based policy (Martineau).
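A crude sketch shows why such viewpoint-based rules resist full automation: a naive lexicon-based sentiment gate, like the hypothetical one below, can route only the obvious cases and must escalate everything ambiguous to humans. The word lists and rules are invented for illustration.

```python
# Hypothetical sketch of viewpoint-gated moderation for a banned source:
# shares are allowed only if the user's commentary is critical of it.
# The lexicons are invented; real language is far messier, which is why
# human moderators stay in the loop.
CRITICAL_WORDS = {"wrong", "dangerous", "debunked", "false", "harmful"}
SUPPORTIVE_WORDS = {"right", "truth", "hero", "support", "agree"}

def moderate_share(commentary: str) -> str:
    words = set(commentary.lower().split())
    critical = len(words & CRITICAL_WORDS)
    supportive = len(words & SUPPORTIVE_WORDS)
    if critical > supportive:
        return "allow (critical commentary)"
    if supportive > critical:
        return "remove (praise or support of banned source)"
    return "escalate to human moderator"  # ambiguous context

print(moderate_share("this clip shows how dangerous and false these claims are"))
print(moderate_share("he speaks the truth and deserves our support"))
print(moderate_share("interesting video"))
```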

Facebook’s use of stringent PowerPoint-delivered rules, coupled with the ultimate subjectivity of a human moderator, suggests that Facebook would like to combine the best features of context sensitivity and consistency in content moderation. But, as the continued public outrage over nearly every content moderation decision Facebook makes suggests, the current model is not serving the company well.

Industrial Moderation

Content moderation experts Tarleton Gillespie and Robyn Caplan have sorted most content moderation operations into three groups according to organizational size and moderation practices. (1) The artisanal approach is one in which roughly 5 to 200 workers govern content moderation decisions on a case-by-case basis. Most social media sites begin their moderation with this approach and are then forced to adapt the process as the case-by-case scale becomes overwhelming. (2) The community-reliant approach, seen on sites like Wikipedia and Reddit, combines formal policy made at the company level with volunteer moderators from the site’s community. (3) Finally, the industrial approach is the model Facebook uses, in which “tens of thousands of workers are employed to enforce rules made by a separate policy team” (Caplan, 2018).

At all levels, content moderation must deal with the tension between context sensitivity and consistency, and accept different trade-offs between the two. Facebook’s industrial approach, as shown in the global reach of its Community Standards, is one that greatly favors consistency over context sensitivity. In her report on online content moderation, Robyn Caplan corroborated this with one of her interviewees from Facebook:

One of our respondents said the goal for these companies is to create a “decision factory,” which resembles more a “Toyota factory than it does a courtroom, in terms of the actual moderation.” Complex concepts like harassment or hate speech are operationalized to make the application of these rules more consistent across the company. He noted the approach as “trying to take a complex thing, and break it into extremely small parts, so that you can routinize doing it over, and over, and over again.”

Thus, Facebook’s eagerness to adopt AI for content moderation makes sense: one of the largest trade-offs organizations make when they automate content moderation is choosing consistency over context sensitivity.

Facebook’s eagerness to adopt AI is not hidden at all. Mark Zuckerberg has publicly expressed high hopes for the role of artificially intelligent automation in content moderation; in response to questioning about Facebook’s role in allowing harmful content in Myanmar, he stated, “Over the long term, building AI tools is going to be the scalable way to identify and root out most of this harmful content” (Simonite). Automation at this scale makes sense from a logistical standpoint: with around 2.3 billion monthly users, the idea of relying solely on human moderators is ludicrous. Additionally, Mike Schroepfer, Facebook’s chief technology officer, said he thinks “most people would feel uncomfortable with that” in reference to purely human content moderation. Schroepfer went on to say, “To me AI is the best tool to implement the policy—I actually don’t know what the alternative is” (Simonite).

Conversely, some may find fully automated content moderation just as unnerving. Facebook’s former chief security officer, Alex Stamos, warned that increasing the demand for AI content moderation is “a dangerous path,” and that in “five or ten years from now, there could be machine-learning systems that understand human languages as well as humans. We could end up with machine-speed, real-time moderation of everything we say online” (Lichfield).

Creepiness aside, Facebook has already implemented AI into its content moderation process in several ways. In all categories of moderated content, Facebook uses AI filters to flag content for review by its legion of human moderators. In most of these categories – spam, fake accounts, Adult Nudity and Sexual Activity, Violence and Graphic Content, Child Nudity and Sexual Exploitation, and terrorist propaganda – 95 to 99.7% of content was actioned before other Facebook users reported it. Most of these categories involve clear-cut definitions of what is and is not objectionable, so AI is easily trained to recognize the offending content. Violence and spam are, generally speaking, globally recognized and not prone to metamorphosing definitions; training data does not have to span myriad cultural contexts to pin down what counts as offensive. Conversely, only 14.9% of Bullying and Harassment content was found and actioned before Facebook users reported it. Cultural definitions of bullying and harassment change constantly, and different words and even emoji can suddenly morph into offensive slurs and harassment (Community Standards). Because Facebook’s AI content filters require extensive manual training and labeling, it is not yet plausible to produce a filter that can respond and adapt to the shifting cultural contexts that continually redefine bullying and harassment. The sketch below illustrates the metric behind these figures.
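For clarity, those percentages are what Facebook’s enforcement reporting calls a proactive rate: of all content actioned in a category, the share the system flagged before any user reported it. The small sketch below computes the metric with invented counts chosen only to echo the spam and bullying figures above.

```python
# Hedged sketch of the "proactive rate" behind the figures above: of all
# content actioned in a category, what share did AI flag before any user
# reported it? The counts below are invented for illustration.
def proactive_rate(flagged_by_ai_first: int, total_actioned: int) -> float:
    return flagged_by_ai_first / total_actioned

categories = {
    "spam":                    (997, 1000),  # clear-cut: AI finds nearly all
    "bullying_and_harassment": (149, 1000),  # context-heavy: users find most
}
for name, (ai_first, total) in categories.items():
    print(f"{name}: {proactive_rate(ai_first, total):.1%} actioned proactively")
# spam: 99.7% actioned proactively
# bullying_and_harassment: 14.9% actioned proactively
```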

Conclusion

Facebook is gigantic. The scale at which the platform operates, across so many countries and with so many stakeholders, requires that it moderate certain types of content to keep its users safe and placated within the platform. Using an industrial approach to content moderation, Facebook values consistency over case-by-case consideration of a post’s context. This consistency-favoring approach shows in Facebook’s enthusiastic adoption of AI-enhanced content moderation.


References

Abbruzzese, J. (2018, October 30). Facebook hits 2.27 billion monthly active users as earnings stabilize. Retrieved May 5, 2019, from NBC News website: https://www.nbcnews.com/tech/tech-news/facebook-hits-2-27-billion-monthly-active-users-earnings-stabilize-n926391

Caplan, R. (2018, November 14). Content or context moderation? Retrieved from https://datasociety.net/output/content-or-context-moderation/

Communications Decency Act, 47 U.S.C. § 230

Community Standards. (n.d.). Retrieved May 5, 2019, from https://www.facebook.com/communitystandards/

Fisher, M. (2018, December 27). Inside Facebook’s Secret Rulebook for Global Political Speech. The New York Times. Retrieved from https://www.nytimes.com/2018/12/27/world/facebook-moderators.html

Gilbert, B. (2018, April 23). Facebook says its users aren’t its product. Retrieved May 5, 2019, from Business Insider website: https://www.businessinsider.com/facebook-advertising-users-as-products-2018-4

Grimmelmann, J. (2018). Internet law: Cases and problems (Eighth edition). Lake Oswego, OR: Semaphore Press.

Laidler, J. (2019, March 4). Harvard professor says surveillance capitalism is undermining democracy. Retrieved May 5, 2019, from Harvard Gazette website: https://news.harvard.edu/gazette/story/2019/03/harvard-professor-says-surveillance-capitalism-is-undermining-democracy/

Lichfield, G. (n.d.). Facebook’s leaked moderation rules show why Big Tech can’t police hate speech. Retrieved May 5, 2019, from MIT Technology Review website: https://www.technologyreview.com/f/612690/facebooks-leaked-moderation-rules-show-why-big-tech-cant-police-hate-speech/

Martineau, P. (n.d.). Facebook bans Alex Jones, other extremists—but not as planned. Retrieved May 5, 2019, from Wired website: https://www.wired.com/story/facebook-bans-alex-jones-extremists/

Mozur, P. (2018, October 15). A genocide incited on Facebook, with posts from Myanmar’s military. The New York Times. Retrieved from https://www.nytimes.com/2018/10/15/technology/myanmar-facebook-genocide.html

Simonite, T. (n.d.). AI has started cleaning up Facebook, but can it finish? Retrieved May 5, 2019, from Wired website: https://www.wired.com/story/ai-has-started-cleaning-facebook-can-it-finish/

Snider, M. (2018, August 9). Why Facebook can censor Infowars and not break the First Amendment. Retrieved May 5, 2019, from USA Today website: https://www.usatoday.com/story/tech/news/2018/08/09/why-facebook-can-censor-infowars-and-not-break-first-amendment/922636002/