
Machine Learning & Algorithmic Music Composition

Abstract

Machine learning and algorithmic systems are not foreign to the field of music composition. Researchers, musicians, and aspiring artists have used algorithmic music composition as a tool for music production for many years, and as the technology advances, so do our understandings of the art that algorithms output and the implications that come along with it. Who owns the art? Is it creative? This research paper explores how machine learning and algorithms are used to implement music compositional systems, as well as the discourse that exists now and will continue to grow as this technology becomes more accessible. Case studies, such as Magenta’s NSynth system and Amper’s system, are examined to illustrate both the support for and the disapproval of algorithms in music composition.

Introduction

The process and study of algorithmic music composition has been around for centuries, for algorithmic music is understood as a set of rules or procedures followed to put together a piece of music (Simoni). These algorithms can be simple or complicated; traditionally they were carried out by hand as predictive styles of composition. More recently, however, research on computational, machine-learned processes of music creation has become prevalent. How is machine learning applied to the field of music production? For the purposes of this paper, concepts from music theory and discussions of particular styles or genres are set aside, because the discourse being expanded upon is algorithmic music composition in relation to technology. Algorithmic composition is made up of methodical procedures carried out through computer processing, which has made algorithms in musical contexts more sophisticated and complex (“Algorithms in Music”).

Technical Overview

Though the functionality of algorithmic music composition systems differs based on the utility of the technology (e.g., a tool for creation vs. a system that generates a piece at random), they share the same internal structure. Machine learning is defined as a set of techniques and algorithms that carry out certain tasks within the artificial intelligence system they are designed for, and machine learning researchers are primarily interested in understanding data-driven algorithms. In relation to technological-algorithmic music composition, defined as the creation of methodological procedures (“Algorithms in Music”), data is collected from hundreds of types of music and coded into different categories so that the machine can retrieve it in an organized, automated way once an input requests output. This process can be identified as classification, a type of algorithm that sorts features of data into categories. If an artist were to use such an algorithm as a tool to create a machine-generated melody, the algorithm looks through the scanned and classified data it has on melodies and produces a melody that not only borrows sonic elements from that data, but is also sonically representative of a combination of the data it has learned.
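To make the idea of classification concrete, here is a minimal, hypothetical sketch rather than any cited system’s code: a handful of made-up numeric features stand in for audio descriptors and are fed to an off-the-shelf scikit-learn classifier.

```python
# Minimal sketch: sorting audio clips into categories from numeric features.
# The feature values and labels are invented purely for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Each row describes one clip with a few summary features
# (e.g. average pitch, spectral brightness, tempo); labels are category tags.
X_train = np.array([
    [220.0, 0.35, 95.0],
    [440.0, 0.80, 128.0],
    [110.0, 0.20, 70.0],
    [460.0, 0.75, 130.0],
])
y_train = ["mellow", "bright", "mellow", "bright"]

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)

# A new, unseen clip is assigned to the category its features most resemble.
print(clf.predict([[430.0, 0.7, 125.0]]))  # -> ['bright']
```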

A common architecture for turning collected data into new content is the generative adversarial network (GAN), a deep neural network made up of two other networks that feed off of one another (“A Beginner’s Guide to GANs”). One network, called the generator, generates new data, while the other, called the discriminator (part of the discriminative algorithm), evaluates an input for authenticity (“A Beginner’s Guide to GANs”). The generating network begins by producing the requested content (in this case, a musical component) at random, and the discriminator network feeds information back to the generating network by critiquing what is being produced (“A Beginner’s Guide to GANs”). From there, the generating network fine-tunes what it generates until the discriminator network lessens its critiques, which suggests that the generating network has produced something well-formed enough for the discriminating network to accept it as a convincing creative work (“A Beginner’s Guide to GANs”).
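A minimal sketch of that generator-versus-discriminator loop, assuming PyTorch and random stand-in “audio frames” rather than a real music dataset, might look like the following. It is illustrative only and is not the code behind any system cited here.

```python
# Hedged sketch of a GAN training loop. Shapes, layer sizes, and the random
# "real audio frames" are stand-ins chosen only for illustration.
import torch
import torch.nn as nn

frame_len, noise_dim = 64, 16

generator = nn.Sequential(
    nn.Linear(noise_dim, 128), nn.ReLU(),
    nn.Linear(128, frame_len), nn.Tanh(),      # fake audio frame in [-1, 1]
)
discriminator = nn.Sequential(
    nn.Linear(frame_len, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1), nn.Sigmoid(),           # probability the frame is "real"
)

loss_fn = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

for step in range(1000):
    real = torch.rand(32, frame_len) * 2 - 1   # stand-in for real audio frames
    noise = torch.randn(32, noise_dim)
    fake = generator(noise)

    # Discriminator step: learn to tell real frames from generated ones.
    d_loss = loss_fn(discriminator(real), torch.ones(32, 1)) + \
             loss_fn(discriminator(fake.detach()), torch.zeros(32, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator step: adjust until the discriminator stops flagging its output as fake.
    g_loss = loss_fn(discriminator(fake), torch.ones(32, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```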

 

Functionality and Utility

Visual of raw audio data and the hundreds of components that make up each sonic element within a one-second audio file (“WaveNet”)

As an application of automated music production, there has been heavy discourse around GANs and how approachable generative research is for audio specifically, compared with the more common case of digital images (“The NSynth Dataset”). Audio signals are harder to code and classify because of the properties that make up sound; previous efforts to synthesize data-driven audio were limited by subjective forms of categorization, such as textural descriptions of sonic make-up, or by training small parametric models (“The NSynth Dataset”). Researchers at Google’s Magenta project created NSynth, an open-source audio dataset that contains over 300,000 musical notes, each representing a different pitch, timbre, and frequency (“The NSynth Dataset”). The creation of NSynth was Magenta’s attempt at making large-scale audio data as approachable and accessible as possible, without the usual technical limitations (“The NSynth Dataset”). By making this technology more accessible, the developers at Magenta were also developing new ways for humans to use technology as a tool for human expression (“NSynth”). NSynth uses deep neural networks to create sounds as authentic and original as human-synthesized sounds by building on WaveNet, a deep generative model of raw audio waveforms that produces sound (speech or music) mimicking its original source. WaveNet works as a convolutional neural network in which the input passes through many hidden layers to generate an output as close as possible to the character of the input (“WaveNet”).

Demonstration of the WaveNet system inputting and outputting media (“WaveNet”)
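For readers curious what convolutions over raw audio look like in code, here is a hedged toy sketch in the spirit of WaveNet’s dilated, causal layers. The layer sizes and the random input are invented; this is not DeepMind’s implementation.

```python
# Hedged sketch of stacked dilated causal convolutions, WaveNet-style.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    """1-D convolution that only looks at past samples (left-padded)."""
    def __init__(self, channels, dilation):
        super().__init__()
        self.pad = dilation  # (kernel_size - 1) * dilation, with kernel_size = 2
        self.conv = nn.Conv1d(channels, channels, kernel_size=2, dilation=dilation)

    def forward(self, x):
        x = F.pad(x, (self.pad, 0))      # pad on the left only, never the future
        return torch.relu(self.conv(x))

# Doubling dilations let later layers "see" exponentially more of the waveform.
layers = nn.Sequential(*[CausalConv1d(channels=16, dilation=d)
                         for d in (1, 2, 4, 8, 16)])

waveform = torch.randn(1, 16, 1000)      # (batch, channels, samples), random stand-in
out = layers(waveform)
print(out.shape)                         # torch.Size([1, 16, 1000])
```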

Magenta describes its WaveNet-inspired model as a compression of the original data, whose output closely resembles the input (“NSynth”). Below are images provided by Magenta that demonstrate the process of inputted audio being coded, classified, and compressed, and then outputted as a reconstructed sound that resembles the original input:

The process of the networks at work reproducing an inputted sound (“NSynth”)

Here is a clip of the original bass audio (“NSynth”).

In this audio clip, the original bass audio is embedded, compressed, and then reconstructed as the following output (“NSynth”).
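The embed, compress, and reconstruct pattern just described can be illustrated with a plain autoencoder. The sketch below uses random stand-in frames and invented layer sizes; it is not NSynth’s actual WaveNet autoencoder, only the general idea of squeezing audio into a small embedding and rebuilding it.

```python
# Hedged illustration of the embed -> compress -> reconstruct pattern,
# using a plain autoencoder on random stand-in audio frames.
import torch
import torch.nn as nn
import torch.nn.functional as F

frame_len, code_len = 256, 16     # 256 samples squeezed into a 16-number embedding

encoder = nn.Sequential(nn.Linear(frame_len, 64), nn.ReLU(), nn.Linear(64, code_len))
decoder = nn.Sequential(nn.Linear(code_len, 64), nn.ReLU(), nn.Linear(64, frame_len))

opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
frames = torch.randn(512, frame_len)   # stand-in for short windows of audio

for epoch in range(200):
    code = encoder(frames)             # compressed embedding of the input
    rebuilt = decoder(code)            # reconstruction that should resemble the input
    loss = F.mse_loss(rebuilt, frames)
    opt.zero_grad(); loss.backward(); opt.step()
```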

Algorithmic music composition can be used in various ways, much as other forms of art can be produced by technology depending on how the technology is implemented for content creation. Some algorithmic music composition systems are used as tools to create sounds generated from a pool of data reflecting other sounds, similarly to the way Magenta’s NSynth works, while others function as stand-alone systems designed to output an entire generated track. Researchers at Sony used the Flow Machines software, which houses data from 13,000 pieces of music, to create a song that mimics the work of The Beatles (Vincent). The track “Daddy’s Car” was produced by composer Benoît Carré, who inputted the desired style of music and wrote the lyrics (Vincent). Though multi-layered and complex, Sony’s experiment sets a standard for both the better and worse sides of algorithmic music composition. The better side is the demonstration that a machine is capable of composing music. However, legal questions come into play regarding copyright and artistic autonomy. The ability of AI to generate a track from scratch, simply from a requested genre, by drawing on the many kinds of data the machine has learned threatens artists’ rights to their songs, especially since the origins of the borrowed or influential musical components of the newly composed AI track may be impossible to determine.

With regard to the generative adversarial networks that algorithmic music composition systems run on, there comes a point where the data collected in the machine learning and classification process only represents music that is similar to itself. In other words, a computer scientist can build an algorithm for an AI to produce a song but feed that AI data from only 1,000 songs, all of which are sonically similar. Even if 1,000 songs are randomly selected for classification and deep learning, if those songs are fairly similar in genre, whatever song is machine-produced will only reflect what the generator in the GAN “thinks” music is, and that notion of what music “is” will be based on what the network most commonly observes as the structure of music. More avant-garde tracks will not be represented in what algorithmic music composition systems are capable of. With this understanding, a general ‘standard’ for what music really is, based on how the generator in the GAN functions, will be set, and the resulting tracks may sound broadly similar to one another. This is evident not only in how algorithmic music composition systems can be too generic and lack substance (Deahl), but also in how more avant-garde, non-generic tracks tend to be more creative and multi-dimensional in terms of the quality of the art and the meaning behind it.

Ethical Considerations

Many ethical considerations surround the discourse of creativity and artificial intelligence. Art is, and has long been, considered a personal, human-to-human experience in which an artist creates a body of work to express an idea or emotion to an audience. From there, the art finds its place in a broader cultural context where it provides meaning and metaphorical reflection. This is creativity, defined as implementing original ideas based on one’s imagination. The term “imagination” generally denotes the human experience and the human mind. Once technology is brought into artistic creation, regardless of whether it is used as a tool for the artist or as a stand-alone machine that generates content through algorithms, cultural discussions outside the realm of computer science begin to question whether technology-influenced art is “creative” or even “art.” Much discussion around AI and machines already involves misunderstandings and misconceptions, in which a narrative of AI as autonomous beings that will take over the human race is portrayed in the media by people unaware of how AI is developed and how it functions. Though the threat of automation does exist, people fail to understand that AI and machines are not autonomous. They are not self-thinking beings. The actions performed by machines are a product of human development, of language and actions coded into algorithms as a reflection of the human experience. Perhaps people are afraid to approach the fundamental aspects of artificial intelligence, or find de-blackboxing artificial intelligence unappealing. However, in the past 20-plus years, our knowledge of natural language processing, machine learning, and artificial intelligence has expanded at such an exponential rate that we seem to have reached a point where society is now trying to catch up with the development and growth of the technology.

Computer-generated music is a product of humans, not of the machines themselves, because machines are not autonomous, self-thinking beings. Through generative adversarial networks, artificially intelligent machines learn to classify data so that, once a human provides an input, they can produce or reproduce whatever is being asked of them. However, if the machine isn’t responsible for creating the music, who is? Computer scientists are not the ones producing the data that is fed to the computer; they merely create the algorithms. Nor can anyone fully cite the music a machine generates, because what is generated is an inspired mix of data from other artists’ work, hundreds and hundreds of them. The criteria for creativity and aesthetics, especially in the art world, are subjective to the artist and audience (Simoni). Earlier system models of generative algorithmic composition are outnumbered by later, more recent models, and with that growth comes concern that these systems are dulling the creative process (Simoni). The dulling of creativity is an ethical debate within the art world, where people fear not only that machine-composed art will oversaturate the field, but also that it will lower expectations of creativity itself. Already, when reviewing AI-generated bodies of work, critics who are told a work is AI-produced criticize it harshly, calling it one-dimensional, boring, and unimaginative, as if it were a knock-off of already existing artists (GumGum Insights). Projects such as GumGum’s self-painting machine sparked controversy over whether creating an image from a collection of data in a GAN can be considered “creative,” with some suggesting it cannot, because no artist was directly involved in the creative process (“ART.IFICIAL”). Another consideration is the lack of source credibility when citing where the artistic inspiration comes from. It is hard to determine not only which works influenced the output of a machine, but also how much of whose work served as an influential source for the GAN-generated art. The same arguments can be made about algorithmic music composition. Depending on the algorithm being implemented to produce musical content, there is clearly a pool of data collected and classified for the generative adversarial network to draw from when creating music. Technically speaking, the collected data is visible and citable; however, the output of algorithmic music composition is not entirely traceable back to its original sources. Music production software like Logic is already readily accessible to consumers and allows producers to generate auto-populated drum patterns unique to each user, thanks to deep neural networks that rely on large amounts of data to output what the user requests (Deahl).

For the many arguments made against algorithmic composition, many are also made in support of it. Advocates urge that algorithmic music composition programs be used as tools both to enable artists to create more and to make music composition more accessible to non-musicians (Deahl). Amper, an AI-driven music composition tool, is easier to use than Google’s NSynth and allows music to be generated automatically in fewer than three manual commands through a generative adversarial network that creates a unique sound each time (Deahl). Artists such as Taryn Southern use tools like this to produce meaningful music rather than to harm the art industry, letting machine-powered algorithms draw musical inspiration from their data to produce a unique track (Deahl). Southern’s production practices tie into the discourse surrounding remix culture: what about a remix is original or stolen? At what point does the publisher of an art piece have to cite the original content it draws inspiration from? For algorithmic music composition, should we cite the inspiring sources, and how? Should there be a way to identify AI-generated music? With rapid developments in technology, there is a possibility that a new standard of music production will emerge around the awareness (or lack thereof) of algorithmic music composition.

Conclusion

Algorithmic music composition is a tool that has been around for quite some time. It is the use of algorithmic music composition systems, as a tool or as a stand-alone music production machine, that is rapidly evolving. These systems primarily function through generative adversarial networks, in which a gathered body of data is learned by the system. From there, once an input is requested (e.g., “produce rock music”), the system identifies previously classified sounds coded as “rock music,” compresses them, and then outputs a sound that borrows from the copious data it has learned and stored. Current research on algorithmically composed music shows that such systems have positives and negatives: they allow algorithmic music composition to become a tool for creative expansion and accessibility for aspiring artists, yet they also hinder creative development by limiting source credibility and sonic uniqueness. Machine learning and algorithmic computational systems are embedded in the process of algorithmic music composition, but the ongoing debate on whether the work they produce is creative will remain subjective until legal frameworks bring clarity to who owns AI-influenced music.

Works Cited:

“A Beginner’s Guide to Generative Adversarial Networks (GANs).” Skymind, https://skymind.ai/wiki/generative-adversarial-network-gan.

“Algorithms in Music.” NorthWest Academic Computing Consortium, http://musicalgorithms.ewu.edu/musichist.html.

“ART.IFICIAL: How Artificial Intelligence Is Paving the Way for the Future of Creativity.” GumGum, https://gumgum.com/artificial-creativity.

Deahl, Dani. “How AI-Generated Music Is Changing the Way Hits Are Made.” The Verge, 31 Aug. 2018, https://www.theverge.com/2018/8/31/17777008/artificial-intelligence-taryn-southern-amper-music.

“NSynth: Neural Audio Synthesis.” Magenta, 6 Apr. 2017, https://magenta.tensorflow.org/nsynth.

Simoni, Mary. “Chapter 2: The History and Philosophy of Algorithmic Composition.” Algorithmic Composition: A Gentle Introduction to Music Composition Using Common LISP and Common Music, MI: Michigan Publishing, 2003, https://quod.lib.umich.edu/s/spobooks/bbv9810.0001.001/1:5/–algorithmic-composition-a-gentle-introduction-to-music?rgn=div1;view=fulltext.

“The NSynth Dataset.” Magenta, 5 Apr. 2017, https://magenta.tensorflow.org/datasets/nsynth.

Vincent, James. “This AI-Written Pop Song Is Almost Certainly a Dire Warning for Humanity.” The Verge, 26 Sept. 2016, https://www.theverge.com/2016/9/26/13055938/ai-pop-song-daddys-car-sony.

“WaveNet: A Generative Model for Raw Audio.” DeepMind, https://deepmind.com/blog/wavenet-generative-model-raw-audio/.

Machine Learning in the Music Industry

Is there ever a point at which computer scientists are halted in the development of their research because of the negative cultural implications that could follow if it were completed? As research into machine learning, artificial intelligence, and ‘big data’ further broadens our understanding of these terms and of society’s capacity to encode knowledge and behavior into algorithms, more obscure and niche topics become challenging to narrow down into plain terms that machines can accurately understand. We are used to understanding these computer science terms primarily in relation to machine-to-human (or vice versa) applications; after all, we are designing these AI/ML programs to enrich the lives of humans. However, a fairly new and specific field of research concerns the arts, particularly music.

Though some factual statements can be made about what genre is what, about music theory, and about the overall composition of music itself, it is difficult for computer scientists to translate these accurately into representations that allow machines to objectively assign categorical labels to music. To date, certain genres like alternative rock and pop have evolved into such broad and avant-garde sonic stylings that new sub-categories have appeared. And, to date, many computer scientists have developed machine learning systems that can identify music at levels as broad as genre and as narrow as a specific artist. Supporters of machine learning systems that identify genres from sound argue that these deep learning systems perform better than conventional human-coded algorithms, which removes potential prejudice or bias (Fogel, Engadget). The music within each genre being encoded for the system is, objectively, a series of notes at various frequencies; this is true, and this is what makes up the genre it is in. However, what if the original genre classification is misinterpreted upon being published? Through trial and error, the algorithms coded for machines to identify a particular genre or artist’s sound can be refined, but potential human coding errors and computational errors in the algorithm can still cause inaccurate music identification.
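As a rough illustration of how such genre identification can work, the following hypothetical sketch turns audio clips into summary feature vectors and fits a nearest-neighbour classifier. The file names and labels are invented, and real systems use far richer features and far more data.

```python
# Hedged sketch of genre tagging from audio features. File paths and labels are
# hypothetical; librosa is used only to turn audio into summary feature vectors.
import numpy as np
import librosa
from sklearn.neighbors import KNeighborsClassifier

def clip_features(path):
    y, sr = librosa.load(path, duration=30)              # first 30 seconds of the track
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # timbre-like summary features
    return mfcc.mean(axis=1)                             # one 13-number vector per clip

# Hypothetical labelled training clips.
train_paths = ["rock_01.wav", "rock_02.wav", "jazz_01.wav", "jazz_02.wav"]
train_labels = ["rock", "rock", "jazz", "jazz"]

X = np.array([clip_features(p) for p in train_paths])
clf = KNeighborsClassifier(n_neighbors=1).fit(X, train_labels)

print(clf.predict([clip_features("mystery_track.wav")]))  # best-guess genre label
```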

The machine learning processes being applied to AI in music-related research have also shown benefits for fields of research outside of music. Less than a year ago, computer scientists built on research from 2014 in which deep learning algorithms analyze brain activity recorded from electrodes to digitally synthesize speech and identify a song playing in someone’s head (Dormehl, Digital Trends). A machine learning model of the neural representation of imagined sound was used to predict, in real time, what sound is being thought of, which is then translated into an identification of the note being imagined and its frequency (Dormehl, Digital Trends). From this research, it was determined that these machine learning algorithms could be applied to the future development of speech prosthetic devices to restore communication for paralyzed individuals who are unable to speak (Dormehl, Digital Trends).

https://www.engadget.com/2017/05/04/machine-learning-ai-identifies-music-genres/

https://www.digitaltrends.com/cool-tech/ai-algorithm-identifies-music-in-head/

How “Big” is Big Data?

Similarly to how the media portrays machines in humanlike terms, there is plenty of social discourse around the term “big data” when selling a hyperrealistic future fully immersed in advanced technology. “Big data” can be used to make the technologically unaware fearful of their technology. This, however, need not be the case, for big data simply refers to the complex, multi-layered bodies of information that move from one point to another.

A dynamic shift between researchers and the development of technology has taken place: researchers now find themselves racing to write up and conceptualize algorithms and methods that represent the multidimensional systems operating the technologically advanced world we live in today (Johnson, Denning, et al.). This is why data scientists are vital to the field, able to develop such formulations at greater speed. Big data can be seen as an older sibling to data itself (mainly because it is), hosting and transporting large and varied bodies of information from one end of a server to another. Its inevitable presence and use in technology is a great contribution to the fields of natural language processing and machine learning, where the data collected can be coded and formulated so that machines can learn from it.

The biggest problem with big data is that, because of its sheer size, it risks causing severe negative consequences for a technological ecosystem. Regardless of what big data encompasses (IoT devices, IT, the cloud), it is ‘big data’ because of how much faith has been placed not only in providing so much information, but also in allowing there to be one housing unit for that information.

Jeffrey Johnson, Peter Denning, et al. “Big Data, Digitization, and Social Change” (Opening Statement). Ubiquity, December 2017.

One Cloud Infrastructure To Rule Them All…?

It has become a theme throughout my graduate school experience to consider the circumstances in which technology fails to serve us. In the past 20-plus years, our knowledge of NLP, ML, AI, and computing has expanded at such an exponential speed that we seem to have reached a point where we as humans are attempting to catch up with the growth of technology. There are three models of cloud computing: Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS). Each of these cloud computing types functions for different output results, depending on the type you, as a user, are working with. Entertaining the thought of what would happen if we were to devote all of our data to one unifying architectural system design, regardless of which of the “big four” companies it belonged to, yields a fairly even number of positives and negatives.

Having a universal cloud database would solve accessibility issues, removing the dichotomy between internal and external communication across different cloud computing platforms. Specifically with SaaS, this cheaper model of cloud computing would become both accessible to everybody and instantaneous to use. This alone, however, has its drawbacks, creating a communication technology culture that assumes all cloud users are readily available at all times. Connectivity is also an issue across all cloud computing models, from performance being restricted by the strength of the internet connection to the overall security and network support of the cloud software itself. The benefit weighed against this cost would be the increase in productivity. At the SaaS level, users are provided a helpful collaboration space within the cloud ecosystem; PaaS users are able to rapidly create content at an efficient cost; and IaaS users are given more flexibility in the work they do within the infrastructure the cloud provides. But are these benefits to user productivity really worth the cost of risking data breaches and communication technology deficiencies? What if new, unknown compliance restrictions come about, or users begin to limit their thinking to the predisposed cloud infrastructure they are working in?

I feel obligated to play devil’s advocate and say that, despite all the possible negatives, they are just that: possible. One great example of a shared ecosystem, though not cloud-based, is Apple. Apple has created a communication technology environment that synchronizes user data across all Apple products, which is part of what makes Apple users so loyal to the brand. Therefore, I believe that if there were one shared cloud infrastructure, we would benefit more from it than be hurt by it.

Misunderstandings Surrounding the World of AI: Who’s To Blame?

Deciphering the difference between the creation and the creator is the most pressing concern. Once we do so, we can begin to learn about and de-blackbox what a particular type of artificial intelligence is attempting to achieve, and why it achieves it.

For those who are not tech-savvy or aware of the systems behind the technologies we use every day, it is easy to assume our technology has a mind of its own. As people do so, there is not only a disassociation from the developers of the artificial intelligence but also a lack of drive to understand how and why the AI we use has grown accustomed to the machine-learned practices it displays. In other words, those who don’t know about technology don’t care to know why machines and artificial intelligence do what they do.

This might not seem like much of an ethical concern at first. However, I believe that the lack of knowledge surrounding the development of artificial intelligence is what leads to the hysteria that surrounds the tech industry: “Robots Will Take Over the Human Race,” “Artificial Intelligence Will Take All Our Jobs,” “What Is AI Really Thinking?” That lack of knowledge makes artificial intelligence more prone to being portrayed as the “bad guy,” when in reality artificial intelligence has no autonomy. Programmed by developers, software engineers, and machine learning experts (the list of people who can contribute to AI-related projects goes on), artificial intelligence is just that: artificial. It is important to be mindful that whatever a particular artificial intelligence system is capable of doing, it was programmed by developers to do. With extensive research and carefully calculated algorithms, artificial intelligence can come to resemble the human mind more and more closely. That is the ‘intelligent’ aspect.

Can artificial intelligence take our jobs? Can it take over the world? Is all the hysteria true? The short answer is maybe, contingent on what software developers program a particular artificially intelligent system to do.

This interview between electro-pop artist Sophie and Sophia the Robot demonstrates that even the realest of interactions can be lost within the promoted idea that robots and artificial intelligence have autonomy. Throughout the interview, Sophia expresses to Sophie that she doesn’t have legs, longs to swim in the ocean, and “believes [society] should be teaching AI to be creative, just as humans do for their children.” The active choice to program such thoughtful, empathetic ideologies is extremely unethical and further feeds the misinterpretations and misunderstandings that surround the artificial intelligence world.

Hey Google… What Are You?

When de-blackboxing a speech-activated virtual assistant application like Google Home, you begin to see parallels between it and other virtual assistant applications (like Siri). Using a mix of structured and unstructured data, Google Home’s machine learning processes take note of the information we provide, and through machine learning and convolutional neural networks, Google Home begins to accommodate and adapt to the primary user of the virtual assistant.

The structured data can come from direct sources of information: Google Home has a functionality that lets users give typed input for commands and receive visual responses (Google Assistant, Wikipedia), which constitutes direct data. Additionally, the information Google Home collects through direct verbal actions is a direct form of data that is then logged both for machine learning purposes and for future predictive interactions on Google Home’s behalf. As for unstructured data, Google Home collects data from the indirect forms of communication the user conducts with any account linked to the device. This could mean your email, texts, contacts, Spotify, YouTube, essentially any device or application you link to your Google Assistant (Google Assistant, Google). Based on the patent for an intelligent automated assistant, the two inputs, user input and other events/facts, support the direct and indirect, structured and unstructured data that Google Home both listens to and records. From there, the virtual assistant breaks the requested input or command down into groups to determine what is being said, what needs to be done (in the most efficient manner, based on action patterns), how it will be done, and what will be said (Intelligent Automated Assistant, Google Patents). Once the assistant determines all of that within seconds, the initial request is outputted in the form of words and actions. The patent application also describes the “parts” of a virtual assistant (input, output, storage, and memory, the four core components), followed by the processor that decodes and re-encodes the input, and lastly the machine as a whole, the intelligent automated assistant. It is important to recognize that all parts of a virtual assistant work together in a network to achieve the common goal at hand. That is what makes it an intelligent machine learning service.
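A toy sketch of that input, interpret, act, and respond loop is shown below. The intents and handlers are invented for illustration and bear no relation to Google’s actual patented pipeline.

```python
# Hedged toy sketch of a command -> intent -> action -> response loop.
# Commands, intents, and handlers are invented; a real assistant's pipeline
# (speech recognition, ranking, account data) is far more involved.
from datetime import datetime

def handle_time(_):
    return f"It is {datetime.now().strftime('%H:%M')}."

def handle_reminder(command):
    task = command.split("remind me to", 1)[1].strip()
    return f"Okay, I'll remind you to {task}."

# Very small "intent" table: keyword patterns mapped to actions.
INTENTS = [
    ("what time", handle_time),
    ("remind me to", handle_reminder),
]

def assistant(command):
    command = command.lower()          # normalise the transcribed input
    for pattern, handler in INTENTS:
        if pattern in command:         # pick the first matching intent
            return handler(command)
    return "Sorry, I don't know how to help with that yet."

print(assistant("Hey Google, what time is it?"))
print(assistant("Remind me to water the plants"))
```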

Works Cited:

https://en.wikipedia.org/wiki/Google_Assistant

https://assistant.google.com/

https://patents.google.com/patent/US9318108B2/en

Google Translate’s Next Level Neural Machine Translation System

Google Translate’s system most likely works through extensive classification algorithms spanning all of the languages it supports. Classification is a type of algorithm that categorizes the features of data and stores them both for machine learning and for retrieval once the application is in use.

The CrashCourse Machine Learning video stressed that fully conceptualizing the process of machine learning, and how fast AI actually computes a machine-learned translation, is nearly impossible because of how sophisticated it is. This is evident in how Google Translate’s user interface responds to users in almost real time, much faster than reaching for a phrasebook. As defined in the CrashCourse videos, machine learning is a set of techniques that sits inside the even more ambiguous goal of artificial intelligence, and that set of techniques is made up of many components that contribute to both learning and intelligence. Users input a set of strings consisting of numbers, letters, or punctuation. Such a sequence can be stored as an array, held as binary so that when a command accesses a certain string, the system goes straight to that binary code. Structs are another building block: compound data structures that go beyond single numbers and simplistic data, holding several related pieces of data together before being passed into the system and outputted again. Parts of Google Translate’s pipeline presumably work as queues, in a “first-in, first-out” fashion, in contrast to stacks, which operate “last-in, first-out.” Additionally, Google Translate uses artificial neural networks to take in what the user is typing and output the desired answer. What we don’t see, or pay much mind to, is the hidden layer between the input and output that organizes the input, classifies it, and produces the proper output.
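The difference between those two orderings is easy to see in a few lines of Python (the queued items here are made up):

```python
# Queue: first-in, first-out. Stack: last-in, first-out.
from collections import deque

queue = deque()
queue.append("translate 'hola'")
queue.append("translate 'bonjour'")
print(queue.popleft())   # -> translate 'hola'   (first request served first)

stack = []
stack.append("draft A")
stack.append("draft B")
print(stack.pop())       # -> draft B            (most recent item comes off first)
```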

In Google’s article on its neural network for machine translation, it is interesting to see how the earlier concept of phrase-based machine translation developed into the Google Neural Machine Translation system, which still operates over phrases but reads more colloquially than a solely phrase-based system. The difference between phrase-based translation and the Google Neural Machine Translation system is that the Google version scans and encodes each word being translated and then matches it against a weighted distribution over the most relevant words in the target language.
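That “weighted distribution over the most relevant words” is, in spirit, an attention calculation. The following hedged sketch uses invented two-number word encodings simply to show how relevance scores become a probability distribution; a real translation system learns these representations end to end.

```python
# Hedged sketch: turning relevance scores into a weighted distribution over words.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

source_words = ["the", "cat", "sleeps"]
source_vecs = np.array([[0.1, 0.9],    # invented encodings of each source word
                        [0.8, 0.2],
                        [0.4, 0.7]])

# Invented encoding of the target word currently being produced.
target_state = np.array([0.7, 0.3])

scores = source_vecs @ target_state     # how relevant each source word is
weights = softmax(scores)               # the weighted distribution over source words

for word, w in zip(source_words, weights):
    print(f"{word}: {w:.2f}")
```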

Google Home’s Case Study on Information and Data

A solid case study can be conducted on the Google Home devices, smart home assistant hubs that work over WiFi with and for you. Like other industry-leading smart home hubs, Google Home speaks to the question raised in the Gleick reading of whether mathematics can be decidable. Given our ability today to program algorithms with which computer and AI systems can “decide” on their own, they are, in their own way, “thinking” machines. We teach computer systems information that is cultural so that when we communicate or send a message to them, they can respond with a message back, and the correct message at that.

The Gleick reading discussed the way the Bureau of the Census reported facts on communications in the U.S., representing the Bell System, newspapers, televisions, and radios. An argument was raised: how do we measure “communication”? Postal services counted communication in letters, but how was the Bell System to be measured? After George Campbell was hired, successful and unsuccessful communication through the Bell System was quantified by whether telephone transmissions fully connected to their signals. Claude Shannon’s framing, in 1948, was that the engineering problem of communication is the accurate reproduction of a message, apart from its interpretation. “Information” became a technical word that quantified communication. Just as terms like force, mass, motion, and time had redefined physics, information helped redefine our ability to interpret, quantify, and understand communication. The process of mathematizing and quantifying terms such as information and energy was revolutionary, and it is what makes our world today run on information everywhere we go. What was especially interesting was the article’s theoretical question about what creates the relationship between information and people. The suggestion was “entanglement”: an intersection between humans and technology, a flow that is both directional and shared, is what makes information technology as advanced as it is today.

Google Home may use a mix of both structured and unstructured data (Irvine, Information Communication Theory). The structured data stems from the symbolic messages and information we send to the device, whether speech or telecommunication. As we continue to use Google Home devices, Google Home learns from us: our behaviors, what we commonly ask, our voice, our interests, and so on. That data is stored by Google Home (similarly to last week’s discussion of convolutional neural networks and how we train computational systems to learn information). The unstructured data can stem from the ‘big data’ collected from accounts associated with the one linked to Google Home. The primary account holder of a Google Home device can have their account linked to different platforms and services, such as YouTube, Amazon, and Gmail. Since data from such affiliated accounts and services is pooled together into one group, it is Google Home’s job to keep that data readily accessible, while users are in charge of initiating the command that may require Google Home to pull data from the unstructured group. The data sent back to the user, whatever the message’s medium, is based on stored data that has been collected and defined in order to make intelligent responses. Connecting back to the argument in the Gleick reading about decidable mathematics, Google Home’s data is already predisposed to transmit a particular message based on the message it receives, which is possible because cultural learning is machine-learned into information.

Works Cited:

Martin Irvine, “Introduction to the Technical Theory of Information.”

James Gleick, The Information: A History, a Theory, a Flood. (New York, NY: Pantheon, 2011).
Excerpts from Introduction and Chap. 7.

AI, Language and Discourse: An Ongoing Story

Boden states that the biggest challenges AI systems face with natural language processing (NLP) involve thematic content and grammatical form. What I wonder, though, is how often AI NLP systems should be updated to reflect up-to-date thematic content. Will colloquial language be reflected in AI’s natural language processing? What gives one dialect legitimacy over another when different subcultures use their own “language” that an NLP system may not register? I think this is a huge area of computer science and linguistics that calls for attention as AI development becomes more and more intelligent. Though it may be a stretch, I see a dialogue between the Boden and Alpaydin readings: Alpaydin’s discussion of using machine learning to decide what kind of data should be given to technology can be applied to the notion above of learning a broader range of language use for AI. Research, potentially from linguists or sociocultural anthropologists, should be conducted to determine the kinds of “natural language” that AI should process.

Connecting back to last week’s discussion of the blurred line between human and AI, I see a resemblance in how individuals inside and outside of the technologically aware world define ‘autonomy.’ In regard to the Johnson and Verdicchio article, the importance of reframing AI discourse (especially in the media) is evident. It is especially significant to create a discussion of what AI truly is (as Johnson and Verdicchio do), but I wonder whether the gap between those aware and unaware of the AI industry is in any way intentional. If the industry’s and society’s sociotechnical blindness to the “human actors” and the contributions they make to the development and creation of AI systems is prevalent enough to create an inaccurate discussion of AI systems, then why has there been no successful means of conveying the truth behind it all? Is it the fault of AI researchers and their lack of acknowledgement? Is it the fault of popular Western media that enjoys profiting off of AI-aggressive narratives? Or could it merely be ignorance, because the technology is so new?

“Synthesize the Real”

One common theme among the introductory readings is the pair of concepts “natural” and “artificial” (terms used in the Simon reading, though expressed differently in the others). Work in the artificial intelligence industry has helped humans identify, articulate, learn about, and define our psychology and behavior. By understanding the contrast and relationship between what is seen as natural and what is artificial, scientists and researchers can not only gain further knowledge of the capabilities of artificial intelligence, but also further expand the actions and behaviors that artificial intelligence can achieve. Relevant to this duality between natural and artificial, human and robotic, scientific and technological (Boden), is the theme of time that ran through the readings. To understand the fast-growing artificial intelligence industry, one must recognize not only that the time frame for learning about artificial intelligence is short, but also that the rapid pace of that learning has helped us quickly develop new products and ideas. Further knowledge of the capabilities of artificial intelligence helps identify and illuminate its categories and subcategories (virtual machines, connectionism (Boden)).

To expand on these themes, I wonder how the speed of our technological development and knowledge of artificial intelligence will affect its future. Will the lines between natural and artificial become blurry? Will we, as a society, find a way to regard the algorithmically computed psychological responses, behaviors, and actions that an artificially intelligent entity carries out as normal? Similar concerns point toward artificially intelligent robots with human features (e.g., Sophia the Robot). Ethically and morally, should there be a line drawn for how many natural characteristics an artificially intelligent entity may have?