DeBlackboxing Recommendation Algorithms



Abstract

Algorithms are omnipresent in modern life, yet their construction, the motivations behind them and the information used to design them are largely unknown to the public. Spotify’s Discover Weekly playlist is one of the most popular examples of the satisfaction that a robust recommendation algorithm can deliver. By deblackboxing the recommendation algorithms of three popular sociotechnical artefacts, this paper makes clear the design principles behind the algorithms as well as their affordances and constraints. The scalability and extensibility of recommendation algorithms are also discussed in an effort to suggest ways that a current product can benefit from the integration of a recommendation system.

 

Introduction

Within the last five years, consumers have begun to acknowledge the impact that algorithms have on their everyday lives. From online shopping to traffic control, algorithms are ubiquitous, invisible and misunderstood. Algorithms have been designed by major corporations to create curated lifestyles and loyal customers. Society’s engagement with sociotechnical systems has increased the capability of corporations to target and sell to consumers. Spotify has such a refined recommendation algorithm that when users view their Discover Weekly playlists, they claim that Spotify seems to know them better than their spouses (du Sautoy). Users often accept these seemingly individualized recommendations without questioning why a particular song was suggested. The breadth of user knowledge and information that is manipulated is misunderstood and even unknown to many; understanding the design of algorithms may start to change this.

Users are integral to the operations of sociotechnical systems, just like the technologies, artefacts and processes that occur within the system. The rise of applications has led to system innovations that prioritize personalization and individual identity expression (du Sautoy). The relationship between sociotechnical systems and consumers has led to the perception that applications that are designed for sociotechnical systems prioritize individuals (du Sautoy). However, corporations are able to scale their businesses through designing systems that appear to be individually targeted but in reality, use the masses to analyze and predict user behaviors.

The first step to understanding how recommendation algorithms are designed is to understand why they are designed and integrated into technology. The primary and most obvious goal of designing a recommendation algorithm is to sell a product or service to a customer; the less obvious goal is to maintain user interest in the technology. Specifically, operational goals of recommendation algorithms such as novelty, relevancy, diversity and serendipity are integral to keeping users engaged with the technology (Aggarwal). For example, if an Amazon shopper purchases a book on astrophysics once but is then constantly inundated with astrophysics-related content and recommendations, he or she may lose interest in the other products that Amazon has to offer. Algorithms must be designed to recognize and categorize user behaviors in order to meet the operational goals of the recommendation system. These operational goals are integral to user satisfaction and can ultimately lead to consumer loyalty. If the consumer who purchased the astrophysics book began seeing recommendations for chemistry, space or physics books rather than only other astrophysics books, he or she may find a product that helps in his or her astrophysics education. The relevant and diverse recommendations may entice the consumer to purchase another book, which indicates to the algorithm that scientific content is of interest to the consumer (Source). Consistent innovation is required for recommendation systems to achieve the operational goals of the system as well as the financial goals of the corporation.

The goals of recommendation systems indicate the kinds of information needed to achieve them: user data and product data. These two forms of data can be managed through the integration of one or many models of recommendation systems. Collaborative filtering is a user-centric model that recommends items based on the behavior of similar users, whereas a content-based recommender relies on key terms and similarities between the items themselves. Hybrid systems use a blend of recommendation models to serve the needs of the users and products in a particular technology (Aggarwal). These systems are built modularly, with subsystems designed to assess information inputs. Understanding how other design principles are used in the architecture of recommendation systems can assist in the development of new and innovative recommendation systems in a number of different industries. Recommendation systems are designed into applications across industries to add another layer of user engagement. The fashion industry, however, does not have a mainstream recommendation system that introduces buyers to new brands or designers the way Spotify does with its Discover Weekly playlists. By analyzing the recommendation algorithms of YouTube, Spotify and Amazon, it will become clear how a recommendation system can be used to improve an existing product.
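The difference between the three models can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the interaction matrix, the item descriptions and the function names are all hypothetical, and the equal blending weight in the hybrid is arbitrary rather than taken from any of the systems discussed here.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

# Hypothetical interaction matrix: rows are users, columns are items,
# and 1 means the user bought, watched or listened to the item.
interactions = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
])

# Hypothetical descriptions for the same four items.
descriptions = [
    "astrophysics textbook stars",
    "physics textbook mechanics",
    "chemistry lab manual",
    "beach towel summer",
]

def collaborative_similarity(i, j):
    """Collaborative view: items are alike when the same users chose both."""
    return cosine(interactions[:, i], interactions[:, j])

def content_similarity(i, j):
    """Content-based view: items are alike when their descriptions share terms."""
    vocab = sorted({word for d in descriptions for word in d.split()})

    def bag_of_words(text):
        words = text.split()
        return np.array([words.count(w) for w in vocab], dtype=float)

    return cosine(bag_of_words(descriptions[i]), bag_of_words(descriptions[j]))

def hybrid_similarity(i, j, weight=0.5):
    """Hybrid view: a weighted blend of the two signals (weight is illustrative)."""
    return (weight * collaborative_similarity(i, j)
            + (1 - weight) * content_similarity(i, j))
```

In this toy data the two textbooks look identical to the collaborative model because the same users chose both, while the content-based model only sees one shared term; the hybrid score sits between the two.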

 

YouTube

The vast number of videos and the frequency with which videos are posted are effectively managed through modular design in YouTube’s recommendation algorithm. The system is organized to mitigate three main constraints of YouTube: scale, freshness and noise (Covington). Content and users are managed through a strategic division of information into an interconnected structure: a content abstraction layer, which manages the number and features of the videos, and a user abstraction layer, which manages the demographic and behavioral data of the users.

Figure 1: YouTube Recommendation Algorithm Architecture (Covington)

YouTube’s recommendation algorithm is designed as a two-stage system. The first stage, called the candidate generation network, analyzes a user’s viewing behavior and initiates the sorting and retrieval of hundreds of relevant videos. This stage is designed using collaborative filtering and relies on user data such as video watches, search queries and demographics. Candidate generation relies on matrix factorization, which trains the algorithm through a rank loss (Covington). The rank loss algorithm is designed to optimize large datasets through precision rankings, which ultimately allows the system to select relevant content quickly while using little memory (Weston). Using alternative methods for selecting content from YouTube’s larger video corpus limits the breadth of videos from which recommendations can be made. Prior iterations of the algorithm assessed the larger video corpus using historical viewing data about who made the video and what kind of video it was (Covington). The current algorithm uses more robust data sets that are compared to the behavior of similar types of users to narrow down the number of suggested videos from millions to hundreds.

The second stage of the recommendation is the ranking process, which analyzes features of the video, user and content creator. This process further narrows the number of videos suggested for the user to view. User profiles are determined by an embedding which is designed to classify each video view at a particular time amongst all of the videos within YouTube’s corpus based on the viewer and the context of the view (Covington). This process is integral to breaking the traditional behavior of recommending videos to users based only on past videos. The embedding provides implicit feedback for ranking, which is used to train the recommendation algorithm. There are explicit feedback systems designed into YouTube; however, they require direct user input, which can be sparse.

Figure 2: Formula for Ranking Embedding (Covington)
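The two-stage funnel can be sketched in a few lines of Python. This is a minimal illustration, not YouTube’s implementation: Covington et al. describe classifying a watch among all videos with a softmax over the dot product of a user vector and each video vector, and that is the only part taken from the source. The corpus size, the random embeddings, the invented watch-time scores and every function name below are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the real corpus holds millions of videos.
NUM_VIDEOS, DIM = 1000, 16
video_embeddings = rng.normal(size=(NUM_VIDEOS, DIM))

def watch_probabilities(user_vector, video_vectors):
    """Softmax over user-video dot products: P(watch video i | user, context)."""
    scores = video_vectors @ user_vector
    scores = scores - scores.max()      # subtract the max for numerical stability
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()

def generate_candidates(user_vector, k=100):
    """Stage 1: keep the k videos with the highest predicted watch probability."""
    probs = watch_probabilities(user_vector, video_embeddings)
    return np.argsort(probs)[::-1][:k]

def rank_candidates(candidates, expected_watch_time, n=10):
    """Stage 2: reorder the shortlist by a richer score (invented here)."""
    order = np.argsort(expected_watch_time[candidates])[::-1]
    return candidates[order][:n]

user_vector = rng.normal(size=DIM)
expected_watch_time = rng.uniform(0, 600, size=NUM_VIDEOS)  # seconds, made up
shortlist = generate_candidates(user_vector)
top_videos = rank_candidates(shortlist, expected_watch_time)
```

The point of the split is cost: the cheap first stage touches every video, while the more expensive second stage only ever sees the shortlist.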

One affordance of YouTube’s recommendation algorithm is that users with Google-connected accounts benefit from long histories of engagement with YouTube and Google content, which are integral to training the recommendation algorithm. These users are likely recommended quality content of interest at higher rates than new users. Another affordance is that the recommendation algorithm is predominantly trained with implicit feedback such as video watch times, likes, comments and channel subscriptions. This benefits users for the opposite reason that explicit feedback on YouTube is minimal: implicit feedback demands little effort from users, yet it is valuable for their recommendations.

One constraint of YouTube’s recommendation algorithm is that auto play may distort the implicit feedback gained from viewing videos. If a user doesn’t turn off auto play when watching videos on YouTube, then multiple videos not of interest may be viewed and shift the pool from which videos are pulled for that user. Another constraint is that the algorithm cannot distinguish between videos with true and false content. Several incidents over the last few years, including recommendations for Hillary Clinton conspiracy videos during the 2016 elections, have caused consumers to question the validity of YouTube’s content (Sharma). Part of YouTube’s ranking system is designed to prioritize videos with high traffic in order to keep users on the site; however, many of these videos can be sensationalist (Swearingen).

As stated by Régis Debray, media technologies are important co-dependent mediations which are integral to the spread of artefacts of culture and cultural institutions (Irvine). However, the relationship between the artefact and culture becomes unbalanced as the artefacts are designed to filter culture for consumers. This problem is relevant to YouTube’s algorithm because it narrows down millions of videos to dozens without factoring in the validity of the content that is being recommended to users. This issue is unique to sociotechnical artefacts that host media because media have the power to sway the ideologies and actions of viewers. In other industries like ecommerce and music, the reliability of the recommendation is less impactful. If a consumer orders a product that was misrepresented on an ecommerce application, then he or she can just return the item. Correcting the spread of misinformation is not difficult, but it requires a concerted effort.

Spotify

Spotify’s recommendation algorithm for the Discover Weekly playlist relies on the design of three recommendation models: collaborative filtering, natural language processing and audio analysis (Galvanize). Through the acquisition of Echo Nest, a Boston-based start-up company, Spotify advanced its algorithm with acoustic analysis, which allows music on the application to be classified based on several aural factors. Echo Nest is also designed to crawl the internet for music-related digital media in order to find actionable and quantifiable data for Spotify’s recommendation algorithm (Prey). The design of Echo Nest relies upon distributed cognition across social media posts, blogs and music reviews, as well as natural language processing to identify key words and phrases which allow for the derivation of similarities between songs. Distributed cognition in Echo Nest also enables collaborative filtering because the identification of key words and phrases indicates similarities in cognitive processes (Hollan). The design of Echo Nest enables the semantic, tempo and even danceability analysis of the songs within Spotify’s corpus. This in-depth assessment is integral to the distinction between rock and Christian rock, as well as other genres of music which may have similar tempos and structures but vastly different content and listeners (Prey).

User data gathered through Echo Nest is managed through a tool called the Taste Profile, which tracks a user’s interaction with the content on the application. The Taste Profile is a content filtering module within Spotify’s recommendation algorithm (Prey). User data within Spotify is generated through implicit feedback such as the number of times a user listened to a song and the actions a user took while listening or after listening to a song. Explicit feedback is also recorded based on users’ behaviors such as skipping songs and clicking the thumbs down (Pasick).

Figure 3: Example of a Taste Profile (Pasick)

 

Spotify’s recommendation system is designed to narrow down potential content of interest for users by ranking songs and playlists. Spotify’s ranking system prioritizes songs and playlists with high numbers of followers as well as Spotify-generated playlists (Prey). The actual calculation of the songs recommended to users is done through matrix factorization with Python libraries. This process results in two sets of vectors, one for user information (x) and one for songs (y), which are compared using collaborative filtering (Ciocca). Deep learning is also integral to the identification of patterns in user behavior across the platform. It identifies patterns across users, which helps to make the music selections more specific and feel personalized (Pasick).

Figure 4: Spotify Matrix Factorization Equation


Figure 5: Spotify Vectors
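To show how a matrix of listening data can be split into user vectors (x) and song vectors (y), the sketch below factorizes a tiny play matrix with plain gradient descent in NumPy. It is a minimal illustration under stated assumptions, not Spotify’s code: the play matrix, the number of factors, the learning rate, the regularization strength and the function names are all hypothetical, and production systems use far larger data and more specialized solvers.

```python
import numpy as np

# Hypothetical play matrix: rows are users, columns are songs,
# 1 means the user has streamed the song, 0 means no recorded play.
plays = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

def factorize(plays, factors=2, steps=2000, lr=0.05, reg=0.01, seed=0):
    """Approximate plays with X @ Y.T using regularized gradient descent."""
    rng = np.random.default_rng(seed)
    n_users, n_songs = plays.shape
    X = rng.normal(scale=0.1, size=(n_users, factors))   # one vector per user
    Y = rng.normal(scale=0.1, size=(n_songs, factors))   # one vector per song
    for _ in range(steps):
        error = plays - X @ Y.T
        X_step = error @ Y - reg * X      # descent direction for user vectors
        Y_step = error.T @ X - reg * Y    # descent direction for song vectors
        X += lr * X_step
        Y += lr * Y_step
    return X, Y

def recommend(user, plays, X, Y, n=1):
    """Return the n unplayed songs with the highest predicted score."""
    scores = X[user] @ Y.T
    scores[plays[user] > 0] = -np.inf     # hide songs the user already played
    return np.argsort(scores)[::-1][:n]

X, Y = factorize(plays)
suggestion = recommend(1, plays, X, Y)    # user 1 has never played song 2
```

In this toy data, user 1 shares two songs with user 0, so the factorization assigns user 1 a high predicted score for the one song that only user 0 has played. That is the "users like you also listened to" effect in miniature.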

 

Seventy-five million Spotify listeners benefit from the recommendation system and the playlists which are produced because of it (Pasick). This positive reception is largely due to the cultural affordances designed into the system, specifically the pattern recognition, community sourcing of content and the perceived specificity. Discover Weekly is a unique product because it mediates the social behavior of recommendations. The specificity and accuracy with which the playlist is curated is designed to feel as if the application knows the user personally. The Discover Weekly playlist benefits from participatory affordances that are embedded within the culture of listening to music. The spread of digital music files in the late 1990s turned music into a participatory activity, even if a user is listening to music alone (Murray).

However, there are some constraints to the system that are due to the interface of Spotify’s application. One major constraint is the ability of users to find the Discover Weekly playlist. Though there are over one hundred million active Spotify users, only about half of them use Discover Weekly (Aswad). As a longtime Spotify user, I was unaware of the recommendation system until very recently, despite its debut in 2015 (Ciocca). This constraint of visibility and usability belongs not to the algorithm but to the interface of the application overall. This wildly popular feature should be more obviously displayed when a user opens his or her application. Augmenting the interface of Spotify to highlight Discover Weekly would likely increase the time spent within the application and further satisfy the goal of designing the algorithm. Another usability constraint is that users are unable to easily save the playlist each week. For a product that is so efficiently and effectively designed, it is strange that the playlist is replaced each week without a simple option to save it.

Amazon

Amazon was one of the first commercial companies to pioneer a recommendation system. Amazon’s original recommendation system was designed around inputs of buying behaviors, explicit feedback from ratings and browsing behaviors (Aggarwal). Over time, the algorithm was redesigned to assess both the user’s previously purchased and rated items, as well as features of the items themselves (Martinez).

The designers at Amazon call the system item-to-item collaborative filtering, which is different from traditional collaborative filtering because the algorithm prioritizes items that are likely to be purchased in tandem. The algorithm is designed to determine likeness between products and is structured to account for the non-uniform distribution of customer purchase histories and the probability of future purchases. The designers of the item-to-item collaborative filtering system determined that treating every customer who purchased an item as equally likely to buy another would bias the recommendations because of atypical behaviors. A heavy buyer’s purchasing history is more likely to be selected as representative of other purchasing probabilities, but a heavy buyer’s engagement with Amazon products is likely not representative of all users. The designers mitigate this problem by modelling each customer, denoted by c in the equation, who has purchased product x as having multiple opportunities to buy product y. The probability of customer c purchasing product y is determined by the number of non-product x purchases the customer made and the probability that any random purchase is product y. The next layer of the formula sums these probabilities to determine the expected number of product y customers among product x customers. This enables comparison of the expected number of customers who purchased both x and y with the observed number of customers who purchased both x and y (Smith).

Figure 6: Related Items Calculation. Source: Two Decades of Recommender Systems
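The expected-versus-observed comparison described above can be worked through on a handful of invented purchase histories. The sketch below follows the calculation as summarized in this section: each customer who bought x gets one chance to buy y for every non-x purchase, and the chance on each purchase is the share of all purchases that are y. The customer names, products and the function name `relatedness` are hypothetical, and the code is an illustration rather than Amazon’s implementation.

```python
from collections import Counter

# Hypothetical purchase histories: customer -> list of purchased products.
purchases = {
    "ann":  ["camera", "memory_card", "tripod"],
    "bob":  ["camera", "memory_card"],
    "cara": ["camera", "novel", "socks", "mug", "lamp", "memory_card"],
    "dev":  ["novel", "socks"],
    "eli":  ["mug", "lamp", "novel"],
}

def relatedness(x, y, purchases):
    """Return (observed, expected) counts of customers who bought both x and y."""
    counts = Counter(p for items in purchases.values() for p in items)
    total = sum(counts.values())
    p_y = counts[y] / total       # chance that any single random purchase is y

    observed, expected = 0, 0.0
    for items in purchases.values():
        if x not in items:
            continue
        non_x = sum(1 for p in items if p != x)   # opportunities to buy y
        expected += 1 - (1 - p_y) ** non_x        # P(at least one y purchase)
        observed += y in items
    return observed, expected

observed, expected = relatedness("camera", "memory_card", purchases)
```

Here all three camera buyers also bought a memory card, while chance alone would predict a little over one such customer, so the pair looks strongly related. Running the same check for the camera and the novel gives an observed count below the expected count, which is the signal that heavy buyers like "cara" should not make unrelated items look related.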

 

The process of defining related items is completed offline, which enables the identification of similar products to be done quickly and with little memory storage required. Item-to-item collaborative filtering also avoids the user cold start problem that is evident in traditional collaborative filtering models; a user initiates item-to-item collaborative filtering simply by purchasing or browsing a product, which indicates interest (Linden).

Time is an integral factor in the quality of recommendations given by the algorithm because it impacts the relevancy of the products suggested to the user. Designing into the recommendation algorithm a feature that promotes both new products which have not yet been bought and complementary products, like a camera and memory card, keeps customers interested in the product offerings on the site. Adapting the algorithm to recommend new products is an example of a cold start problem, which essentially is a lack of information about a product or customer (Smith). This issue is not exclusive to Amazon and is present in many recommendation algorithms. The lack of information is an inherent constraint of recommendation algorithms; however, all three algorithms discussed have methods to mitigate it.

Amazon’s recommendation algorithm is representative of a change in cultural values, especially as it relates to new products and timing. Murray indicates the need for designing for core human needs especially at the start of the design process by identifying the function, context and core of the designed artefact (Murray). For Amazon, timing is integral to the context of their website and product offerings. Seasonality of items like Christmas tree decorations and beachwear must be first acknowledged by the designers of the algorithms then subverted through the creation of an effective system architecture.

The encyclopedic nature of Amazon is both an affordance and a constraint. Users are attracted to the site because of the breadth of products and goods available for purchase. However, the corpus of products includes poor quality products that may not meet the user’s expectations. This is a constraint of the recommendation algorithm because a returned product was still once purchased by a user, and the algorithm records it as a purchase. This disrupts the accuracy and precision of the recommendation because a user may read negative reviews of a suggested product and see the negative experiences of other customers, which would likely cause the user to forgo the recommended product. Another constraint of Amazon’s recommendation algorithm is that the initial operational goal of the algorithm was to mediate the temptation products at the registers of brick and mortar stores. The designers did not initially intend to design an algorithm to find related products for prospective consumers. The incongruence between the original goal of the algorithm and its current goal results in dissonance that may impact the design and advancement of the algorithm. Amazon’s algorithm does not seem to be as modular as the algorithms of Spotify or YouTube, likely because a preexisting system architecture with a different goal has been adapted to meet the new goal of the algorithm.

Scalability of Recommendation Algorithms

All three algorithms analyzed focused on two major types of data, customer data and product data. Each recommendation algorithm was designed to manage the industry specific data and behaviors which makes any one of the algorithms difficult to scale across industries without modification. Scalability and extensibility can be designed into one or a combination of the recommendation algorithms of YouTube, Spotify or Amazon to accommodate the affordances and constraints of the fashion industry. Scalability would increase the amount and types of data that could be assessed and managed through the algorithm. Extensibility impacts the system architecture to include more forms of data and layers of information (Irvine).

The women’s fashion industry does not presently have a consumer-facing sociotechnical artefact or application that recommends products or popular items. There are fashion subscription boxes like Stitchfix, Trunk Club and Fabletics which give users a simple quiz to predict their style and send a monthly box of items that the consumer may be interested in. These quizzes are very simple and gather information from consumers in fewer than twenty questions. The data analysis done for the fashion subscription boxes is far less extensive than that of the recommendation algorithms of YouTube, Spotify and Amazon. The lack of investment in data analysis for these subscription boxes may be due to the goals of the boxes. A subscription model ensures that the companies behind the programs are financially profitable. The companies also have strategic partnerships with the brands that they recommend, which makes it even more financially attractive for these companies to maintain their current strategy.

If a subscription box company wanted to integrate a deep learning recommendation algorithm into its product, it could use an algorithm that combines Amazon’s item-to-item collaborative filtering with Spotify’s mapping technique for creating taste profiles. Item-to-item collaborative filtering is an obvious choice for an ecommerce application; however, integrating the community sourcing of recommendations, as Spotify does, would be a great improvement to it. This strategy would mitigate the constraint of Amazon’s current algorithm regarding product quality. Community sourcing recommendations would help to ensure that customers with similar style aesthetics would be recommended items that similar users have bought and liked. The cold start problem would still be a constraint of the algorithm, but until there are widespread advancements in recommendation algorithms there does not seem to be a way to fully eliminate this issue.

 

Sources:

1.     Adamides, Emmanuel. (2018). Activity-based analysis of socio-technical systems innovations.
2.     Aggarwal, C. C. (2016). Recommender systems. Cham: Springer International Publishing.
3.     Aswad, J. (2018, March 26). Spotify Projects Slower Growth, 90 Million-Plus Subscribers by End of 2018. Retrieved December 15, 2018, from https://variety.com/2018/digital/news/spotify-projects-slower-growth-90-million-plus-subscribers-by-end-of-2018-1202736163/
4.     Ciocca, S. (2017, October 10). How Does Spotify Know You So Well? – Member Feature Stories. Retrieved December 15, 2018, from https://medium.com/s/story/spotifys-discover-weekly-how-machine-learning-finds-your-new-music-19a41ab76efe
5.     Covington, P., Adams, J., & Sargin, E. (2016, September). Deep neural networks for YouTube recommendations. In Proceedings of the 10th ACM Conference on Recommender Systems (pp. 191-198). ACM.
6.     Ever Wonder How Spotify Discover Weekly Works? Data Science. (2016, August 22). Retrieved December 15, 2018, from https://blog.galvanize.com/spotify-discover-weekly-data-science/
7.     Hollan, J., Hutchins, E., & Kirsh, D. (2000). Distributed cognition: toward a new foundation for human-computer interaction research. ACM Transactions on Computer-Human Interaction (TOCHI), 7(2), 174-196.
8.     How do algorithms run my life? (n.d.). Retrieved December 15, 2018, from http://www.bbc.co.uk/guides/z3sg9qt
9.     Irvine, M. “Introduction to Affordances and Interfaces.”
10.  Irvine, M., Introduction to Modularity and Abstraction layers
11.  Irvine, M., “Understanding Sociotechnical Systems with Mediology and Actor Network Theory (with a De-Blackboxing Method)”
12.  Irvine, M., Introduction to Design Thinking: Systems and Architectures
13.  Linden, G.D., Jacobi, J.A. and Benson, E.A., Collaborative Recommendations Using Item-to-Item Similarity Mappings, US Patent 6,266,649, to Amazon.com, Patent and Trademark Office, 2001 (filed 1998).
14.  Martinez, M., Amazon: Everything you wanted to know about its algorithm and innovation. (2017, September 27). Retrieved December 15, 2018, from https://publications.computer.org/internet-computing/2017/09/27/amazon-all-the-research-you-need-about-its-algorithm-and-innovation/

15.  Murray, J., Inventing the Medium: Principles of Interaction Design as a Cultural Practice. Cambridge, MA: MIT Press, 2012.

16.  Pasick, A. (n.d.). The magic that makes Spotify’s Discover Weekly playlists so damn good. Retrieved December 15, 2018, from https://qz.com/571007/the-magic-that-makes-spotifys-discover-weekly-playlists-so-damn-good/
17.  Perspective | How Silicon Valley is erasing your individuality. (n.d.). Retrieved December 15, 2018, from https://www.washingtonpost.com/outlook/how-silicon-valley-is-erasing-your-individuality/2017/09/08/a100010a-937c-11e7-aace-04b862b2b3f3_story.html
18.  Sharma, A. (2018, March 8). Is Youtube’s Recommendation Algorithm Really Working? Retrieved December 15, 2018, from https://www.analyticsindiamag.com/is-youtubes-recommendation-algorithm-really-working/
19.  Smith, B., & Linden, G. (2017). Two decades of recommender systems at Amazon.com. IEEE Internet Computing, 21(3), 12-18.
20.  Swearingen, J. (2018, February 7). YouTube’s Algorithm Wants You to Watch Conspiracy-Mongering Trash. Retrieved December 15, 2018, from http://nymag.com/intelligencer/2018/02/youtubes-recommendation-algorithm-favors-conspiracy-videos.html
21.  Weston, J., Bengio, S., and Usunier, N. Wsabie: Scaling up to large vocabulary image annotation. In Proceedings of the International Joint Conference on Artificial Intelligence, IJCAI, 2011

22.  Zhang J. Patel, V. “Distributed Cognition, Representation, and Affordance.” Pragmatics & Cognition 14, no. 2 (July 2006): 333-341.