Author Archives: Jieshu Wang

The Semiotics of Music: From Peirce to AI – Jessie


Music is the art of sound. It has accompanied humans for a very long time and conveys rich meanings among people and across time. Music is often considered a “universal language of mankind” because its affective power crosses the boundaries of languages. The power of music is rooted in its symbolic systems. This paper discusses the semiotics of music within the theoretical framework developed by C. S. Peirce. I then examine the similarities and differences between language and music. There is evidence that music and language draw on shared brain areas. However, they differ considerably, especially in formal structure. I also analyze how music serves as a cognitive artifact that is distributed among members of social groups. Next, I briefly review the history of music digitalization and point to the great potential of computer-generated music. I explore the parallel between the ways humans and computers recognize patterns in music and consider this comparison a useful direction for future studies in both human music cognition and computer science.

1. Introduction

Music is an art form that uses sound as a medium to convey meanings. We all have our own experiences of music. Why does music have so much power? The reason lies in the symbolic systems of music. C. S. Peirce, a co-founder of semiotics, provided theoretical tools for analyzing signs that are also useful for examining music. In addition, as the most researched branch of semiotics, linguistics has the potential to provide music with useful methods and frameworks, though there are many differences between music and language. Ray Jackendoff and others have written detailed papers and books that examine the parallels and nonparallels between music and language from an interdisciplinary perspective. This paper also discusses how digitalization impacts the music industry and the great potential of computer composition.

2. Music as a symbolic system

Music has accompanied us since the very dawn of human civilization. The oldest known musical instrument is thought to be a bone flute about forty thousand years old[1]. Its function remains unknown, but it was probably used for religious, military, or entertainment purposes.

We all listen to some kind of music in our lives. Some people listen to music in the background while walking or driving, while others immerse themselves in music at concerts or clubs. Music is ubiquitous: churches, restaurants, yoga clubs, shopping malls, and other places all play different music to build different atmospheres. The secret lies in the rich meanings of music that humans can understand. As a symbolic species, we humans can give meaning to and take meaning from all perceptible signals, be they visual, olfactory, tactile, or acoustic. Music is an acoustic art and therefore a symbolic system.

Some music evokes emotional responses, while other music tells stories. Some music can even induce religious ecstasy. These meanings are all conveyed through musical signs.

2.1 Peircean Signs in music

C. S. Peirce is known as a co-founder of semiotics, and he developed many valuable concepts and methods for studying signs. As he put it, “we think only in signs.” Anything can be a sign as long as someone interprets it as one[2]. Peirce’s semiotic theories are useful for analyzing all kinds of symbolic systems, music included.

Peirce’s model of signs consists of three components—representamen, object, and interpretant.

  • The representamen, also called the “sign” or “sign vehicle” by some scholars, is the form of the sign. In music, the representamen can be many things: the music itself, a movement, a melody, a beat, a genre, a music score, a performance, a recording, a sound effect, the environment of the listener, the stage design, the clothes the performers are wearing, or even a mistake. As long as it is interpreted as standing for something other than itself, it can be seen as a musical sign. For example, American ethnomusicologist Thomas Turino analyzed how musical meanings were built in Jimi Hendrix’s Woodstock performance of “The Star-Spangled Banner.” He treated many things as signs, including Hendrix’s seemingly contradictory pairing of a tuxedo with tennis sneakers, his use of a loud electric guitar with “feedback and distortion,” and the sound effects of airplanes and sirens[4]. These signs were interpreted in a meaning-rich context: a specific concert (Woodstock) and a historical period with a unique international situation and ideological trend, which had a huge impact on people’s understanding of Hendrix’s music.
  • An object is something to which the sign refers, often an abstract concept. For example, the piece “A Morning of the Slag Ravine” by Joe Hisaishi from Hayao Miyazaki’s film Laputa: Castle in the Sky is a sign for the morning, so its object is the abstract concept of “morning.” People who are familiar with Hayao Miyazaki’s films think of the morning sun emerging from the eastern horizon when they hear the piece, even without visual cues.

A Morning of the Slag Ravine by Joe Hisaishi


Beginning of A Morning of the Slag Ravine

  • An interpretant is a sense in the observer’s mind where the representamen and object are brought together.

In addition, Peirce developed three modes of signs, which represent three kinds of relationship between signifiers and signified[1]:

  • Icon is a mode based on resemblance, such as a portrait. For example, at the beginning of Garth Brooks’s The Thunder Rolls, there is a sound effect imitating the sound of thunder, which is an iconic sign.

Garth Brooks – The Thunder Rolls

  • Index is a mode that is “mediated by some physical or temporal connection between sign and objects[3].” For example, ripples on the surface of the water are indexical to the wind. An example in music is the famous beginning of Beethoven’s Symphony No. 5 in C minor. It sounds like someone knocking at the door, but it does not directly imitate the knocking sound the way the beginning of The Thunder Rolls imitates thunder. The sound of knocking is an index of someone at the door, and the beginning of Symphony No. 5 has been taken to indicate fate knocking at the door.

Beginning of Beethoven’s Symphony No.5

  • Symbol is a mode in which the sign and object are connected by social convention, as in language and numbers. In this mode, the relationship between sign and object is arbitrary and must be “agreed upon and learned[2].” According to Thomas Turino, unlike language, most musical signs function as icons and indices, but there also exist many musical symbols[4]. For instance, in his book Signs of Music: A Guide to Musical Semiotics, Finnish musicologist and semiologist Eero Tarasti gave the example of the subject of J. S. Bach’s Fugue in C sharp minor from Book I of the Well-Tempered Clavier, as shown below. For listeners of the Baroque period, this subject was a symbol of “cross and thus the Christ,” and it was quoted frequently in other music with the same symbolic meaning[5]. The relationship between this melody and its object was built on historical religious convention. Most people in the twenty-first century no longer recognize it as a symbol of Christ. Another example is the beginning of Beethoven’s Symphony No. 5 mentioned above. It became so famous that it was adopted and quoted widely in other genres of music as a symbol of victory, partly because of its resemblance to the Morse code for the letter V, “dit-dit-dit-dah” (another symbolic sign). During World War II, the BBC even used these four notes to open its broadcasts[6].
Fugue-subject of Bach’s Fugue in C sharp minor represented cross and thus the Christ in the Baroque period. Credit: Eero Tarasti
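
The Morse-code parallel mentioned in the symbol example can be checked with a small sketch. The Morse pattern for V is standard; the note durations for the Beethoven motif are an illustrative assumption (three eighth notes and one held note counted as four eighths), and the helper names `spoken` and `rhythm_shape` are hypothetical.

```python
# International Morse code for the letter V: three dots and a dash.
MORSE = {"V": "...-"}


def spoken(code):
    """Spell a Morse pattern the way operators say it aloud."""
    return "-".join("dit" if symbol == "." else "dah" for symbol in code)


def rhythm_shape(durations):
    """Reduce note durations to short/long: '.' for the shortest value, '-' for anything longer."""
    shortest = min(durations)
    return "".join("." if d == shortest else "-" for d in durations)


# Assumption: the opening motif as three eighth notes followed by a held note,
# measured in eighth-note units. The exact length of the held note is illustrative.
motif = [1, 1, 1, 4]

print(spoken(MORSE["V"]))
print(rhythm_shape(motif) == MORSE["V"])
```

The comparison only looks at the short-short-short-long shape, which is the level at which the resemblance is usually described.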


Musicians themselves can become symbols, too. A perfect example is Sixto Rodriguez, the American singer depicted in the film Searching for Sugar Man, a winner at the 85th Academy Awards. He remained unknown in his home country but earned significant fame in Australia, Botswana, New Zealand, Zimbabwe, and especially South Africa, where he became a symbol of anti-apartheid activism and influenced many musicians protesting against the government[7]. Another example is Jian Cui, the first and most famous rock star in mainland China, who has been called the godfather of Chinese rock and roll. His songs and performances teem with musical signs. For instance, he was well known for covering his eyes with a piece of red cloth in his performances. He was also very good at combining Western rock music with traditional Chinese instruments such as the suona horn, building an unexpectedly conflicting but subtly harmonious atmosphere. These behaviors were interpreted as signs of rebellion against traditional values and political realities. “I covered my eyes with a red cloth to symbolize my feelings,” he said in an article for Time in 1999[8]. In China, he is considered a symbol of rock spirit and freedom.

Jian Cui was singing with a piece of red cloth covering his eyes.


Another notable symbolic system in music is the set of marks and symbols used to notate music in scores. Many musical notations rest on historical and cultural conventions that must be learned in order to be understood. The most widely used today is the five-line staff, which originated in Europe. In ancient China, however, people used a completely different notation called “Gongchepu” (工尺谱). Unlike the five-line staff, which mostly uses non-word characters, Gongchepu uses Chinese characters and punctuation marks to notate music. It was written from top to bottom and then from right to left, just like ancient Chinese writing on bamboo slips, which is another demonstration of the arbitrariness of symbols.

Two scanned pages from the second volume of a song named “阳关三叠” in the scorebook using Gongchepu notation written by He Zhang in 1864. Credit: Wikimedia


The first two lines of the same song “阳关三叠” using five-line staff notation. Credit:


However, as Daniel Chandler pointed out in his Semiotics: The Basics, the three Peircean modes are not mutually exclusive. That is to say, a sign could be any combination of the three. For example, as I mentioned before, the beginning of Beethoven’s Symphony No.5 is an index for someone knocking at the door, as well as a symbol for victory.

Another effect characterizing the interaction among different musical signs is the “semantic snowballing” proposed by Thomas Turino[4]. He suggested that music can simultaneously comprise many signs that interact with one another and accumulate over time, like snowballs. Moreover, musical semiotics has a chain effect, in which the object of one sign becomes the representamen of another, and so on[4]. I find this snowballing and chaining not only inside the music domain but also between music and other symbolic systems. For example, the Chinese Gongchepu notation has two symbols for beats: “板” (bǎn), notated as “。”, for strong beats and “眼” (yǎn), notated as “、”, for weak beats. Gradually, the two characters “板” and “眼” became symbols for beats, and many idioms developed from them. For instance, the idiom “一板一眼,” which literally means strictly following the standard beat, is used to describe scrupulousness and stiffness. Likewise, the English word “offbeat,” which originally meant not following the beat, became a synonym for “unconventional.” Thus, musical symbols snowball into linguistic symbols.

An interesting theory stressed by Thomas Turino in his Signs of Imagination, Identity, and Experience is the identity-building function of musical indexical signs in “high-context” communication, as Edward Hall called it. A music-related experience shared among members of an intimate social group can serve as a source of affective power. In this way, rich meanings are stored in and transmitted through music[4]. For example, a couple who watched Titanic on their first date might later consider the song My Heart Will Go On an index of their relationship.

Similarly, I personally find that game music has extraordinarily strong indexical power. Many people find that the background music of Super Mario Bros. and Contra evokes intense nostalgia. The reason, I think, lies in the repeated reinforcement that comes with the immersive game experience. Koji Kondo, the composer of the Super Mario Bros. soundtrack, had two goals for his music: “to convey an unambiguous sonic image of the game world,” and “to enhance the emotional and physical experience of the gamer[9].” This emotional enhancement is so strong that it is sometimes elevated to the level of a symbol. The quality of the synthesized tones in early game music (a “qualisign” in Peirce’s theoretical frame[10]) is considered the token of a music genre called “chiptune” or “8-bit music,” which has greatly influenced later electronic dance music[11].

2.2. A linguistic perspective of music

Among human symbolic systems, language is the most sophisticated and characteristic, the jewel in the crown of human cognition. According to American linguist Ray Jackendoff in his book Foundations of Language, human language is unique and much more complex than other communication systems, such as the sounds of whales and birds, because human utterances can transmit unlimited information in unlimited and arbitrary forms from a limited set of rules and a limited lexicon[12]. As another sophisticated symbolic system, music shares many features with language. In addition, music seems more competent in some respects, such as emotional arousal or affect enhancement[10]. It can even function across the boundaries of languages: there is evidence that music can induce universal emotion-related responses[13]. Therefore, many people consider music another kind of language, even a “universal language of mankind[14].”

But strictly speaking, how similar is music to language? Can linguistic methods be used to approach music? How could a linguistic perspective help us understand music? Many studies have addressed these questions.

In Music as a Language, one of my weekly essays for the course Semiotics and Cognitive Technologies, I discussed some similarities between music and language. For example, both consist of sequences of basic sound elements, such as the phoneme in language. Both have structural rules, for example, syntax in language and chord progressions in music. In addition, people from different regions tend to develop their own dialects and grammars in both language and music[15]. However, I did not examine these similarities in detail, nor did I inspect the differences.

In his Parallels and Nonparallels between Language and Music, Ray Jackendoff presented detailed observations of music and language in many respects, including general capacities, ecological functions, and formal structures[16]. He emphasized that language and music differ in their functions in human life. As he put it, language can be put to both propositional and affective use, while music can convey only affective content, though the distinction sometimes blurs, for instance, in poetry.

In particular, Jackendoff revisited the generative theory of tonal music (GTTM) that he proposed with music theorist Fred Lerdahl[17]. He considered the metrical grid a capacity shared by music and language in the rhythmic domain but saw no credible analogue in the use of pitch space, even in tone languages such as Chinese and many West African languages. He therefore concluded that the capacity for using linguistic pitch is entirely different from that for musical pitch.

However, other literature provides evidence pointing the other way. In his Musicophilia: Tales of Music and the Brain, British neurologist Oliver Sacks scrutinized the intriguing correlations between musical absolute pitch (AP) and linguistic background[18]. He specifically mentioned the research conducted by Diana Deutsch, a cognitive psychologist at the University of California, San Diego, and colleagues. Deutsch observed, “native speakers of tone languages – Mandarin and Vietnamese – were found to display a remarkably precise and stable form of absolute pitch in enunciating words[19].” In a further comparative study of AP in two populations of first-year music students (one at the Central Conservatory of Music in Beijing and the other at the Eastman School of Music in New York), Deutsch found that the percentage of students possessing AP was much higher in Beijing than at Eastman: 60% vs. 14% among students who began musical training at ages 4 and 5, 55% vs. 6% among those who began at ages 6 and 7, and 42% vs. 0% among those who began at ages 8 and 9[20]. This is evidence that the capacity for musical pitch correlates with that for linguistic pitch.

Returning to Jackendoff’s theory: despite structural rules such as bars and chord progressions, he did not think music has a counterpart as complex and strict as the syntactic structure of language. He did not even consider GTTM’s prolongational structure, which has a “recursive headed hierarchy” similar to that of language, a common ground between music and language. However, partially inspired by the hierarchical structure of actions in robotics, Jackendoff suggested that complex actions integrating “many subactions stored in long-term memory” could be a candidate for a “more general, evolutionarily older function” shared by language and music, since there is evidence that both are implemented in Broca’s area of the brain[21].

Following this action-related path, Rie Asano and Cedric Boeckx took “the grammar of action” into account and developed a more general syntactic framework in terms of “action-related components,” in which they suggested that the difference between the syntax of music and that of language boils down to their different goals in hierarchical plans for actions[22].

I conclude that linguistic methods may provide useful perspectives for understanding music because the two share many cognitive capacities. But we have to remember that music and language are two different symbolic systems with their own features. As philosopher Susanne Langer put it, symbolic systems other than language do not have a “vocabulary of units with independent meanings,” and their laws are entirely different from the syntax that governs language. Therefore, we should not blindly apply linguistic principles and methods to other media such as photography, painting, and music[2].

2.3. The signs of genres

Despite the many differences between music and language, both tend to develop many dialects, or genres, as we call them in music. Basically, a music genre is a conventional category of music that shares some recognizable features and patterns. Those features and patterns are musical signs too.

The signs that we use to determine genres vary a lot. Sometimes a genre is recognized through the musical instruments or, more precisely, the sound quality used by the musicians. For example, heavy electric guitar (sometimes distorted) may indicate rock music, while a song using acoustic guitars is probably country music.

Chord progressions sometimes serve as signs of genres. For example, the 12-bar blues chord progression is considered a sign of blues music.

An example of a 12-bar blues progression in C, chord roots in red. Credit: Wikimedia
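
As a minimal sketch of how such a progression can be written down, the following spells out the chord roots of a 12-bar blues in a given key. It assumes one common variant (I-I-I-I, IV-IV-I-I, V-IV-I-I); other variants exist, and the names `TWELVE_BAR_BLUES` and `blues_roots` are illustrative.

```python
# One common variant of the 12-bar blues, written as scale degrees.
# Assumption: the basic I-I-I-I / IV-IV-I-I / V-IV-I-I form.
TWELVE_BAR_BLUES = ["I", "I", "I", "I", "IV", "IV", "I", "I", "V", "IV", "I", "I"]

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Distance in semitones of each degree's root above the tonic.
DEGREE_OFFSETS = {"I": 0, "IV": 5, "V": 7}


def blues_roots(tonic: str) -> list[str]:
    """Return the chord root for each of the twelve bars in the given key."""
    start = NOTE_NAMES.index(tonic)
    return [NOTE_NAMES[(start + DEGREE_OFFSETS[d]) % 12] for d in TWELVE_BAR_BLUES]


print(blues_roots("C"))
```

Because the pattern is stored as degrees rather than note names, the same twelve-bar shape can be recognized in any key, which is part of what makes it work as a genre sign.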


Scale can sometimes determine a genre as well, especially for traditional music, because most modern music uses a diatonic scale. For example, we can easily recognize the Japanese style in the famous piece Sakura Sakura largely because of its distinctive pentatonic Japanese scale: major second, minor second, major third, minor second, and major third (for example, the notes A, B, C, E, F, and up to A)[23]. Similarly, Chinese music, Indian raga music, Arabic music, jazz, and blues also have their identifiable scales.
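
The interval pattern above can be checked with a short sketch that walks up from a starting note. The semitone sizes of the intervals are standard; the function name `build_scale` and the sharps-only note spelling are assumptions made for illustration.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Size of each interval in semitones.
INTERVAL_SEMITONES = {
    "minor second": 1,
    "major second": 2,
    "minor third": 3,
    "major third": 4,
}

# The step pattern described in the text for the Japanese pentatonic scale.
JAPANESE_SCALE_STEPS = [
    "major second",
    "minor second",
    "major third",
    "minor second",
    "major third",
]


def build_scale(start, steps):
    """Walk upward from `start`, applying each interval in turn."""
    index = NOTE_NAMES.index(start)
    notes = [start]
    for step in steps:
        index = (index + INTERVAL_SEMITONES[step]) % 12
        notes.append(NOTE_NAMES[index])
    return notes


print(build_scale("A", JAPANESE_SCALE_STEPS))
```

The five steps add up to twelve semitones, so the walk ends one octave above the starting note, matching the "up to A" in the example.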

Sometimes the quality of the vocals also allows us to identify the genre. For example, the deep voice of Amy Winehouse was soulful and jazzy, while the quick rapping of Eminem indicates rap music.

However, digitalization allows for a fusion of music genres. Today, we often hear more than one genre-specific sign in a single song. For example, Karen Mok, a Chinese singer from Hong Kong, released a jazz album called “Somewhere I Belong” in 2013, in which she adapted twelve songs of different genres into jazz. One of them is While My Guitar Gently Weeps, originally by the Beatles, in which Mok played the guzheng, a traditional Chinese plucked string instrument with over 2,500 years of history, in a jazz style, producing a creative style of world music.

Karen Mok / While My Guitar Gently Weeps

3. Music as a cognitive artifact

In his Cognitive Artifacts, Donald Norman defined a cognitive artifact as “an artificial device designed to maintain, display, or operate upon information in order to serve a representational function[24].” In this view, music, just like language, is a cognitive artifact.

First of all, music has many cognitive functions. When we offload cognitive effort onto music, the performance of the whole system improves. The reason lies partly in the affective power of music. For example, religious music helps gather people into a community with a transcendent purpose without much verbal persuasion. Love songs intensify lovers’ emotions, positively or negatively. Music can even serve as a political weapon, as the music of Rodriguez mentioned above shows.

Second, music can transmit information and emotion through space and time. Gloomy Sunday has transmitted a gloomy mood to many people in different parts of the world and, according to urban legend, was even blamed for several suicides. Johnny B. Goode stores a story from the 1950s and still passes it on to people in the twenty-first century.

In addition, music plays an important role in every culture. It helps build a collective identity. Even four-month-old infants can recognize and prefer the music of their own culture, according to research[25].

Therefore, music is a good example of distributed cognition, since it is distributed across the members of social groups, coordinates between internal and external structure, and is distributed through time[26], as analyzed above.

4. Digitalization of music

From the 1950s, musicians started to use electronic instruments to record, produce, transmit, and store music. Acoustic vibrations were sampled into digital signals, and the Fourier transform was used to analyze those signals in terms of their frequency components[27]. For example, in the 1970s, Alan Kay’s Smalltalk language was used to create programs that captured tones played on a keyboard and then produced editable music scores, with different colors representing different timbres[28].
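
To illustrate the role of the Fourier transform here, the following minimal sketch samples a pure tone and recovers its frequency with a naive discrete Fourier transform. The sample rate, duration, and function names are assumptions chosen for readability; real audio software uses fast Fourier transform libraries rather than this slow direct sum.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)
N_SAMPLES = 800     # 0.1 s of audio, so frequency bins are 10 Hz apart


def sample_tone(frequency_hz, n_samples=N_SAMPLES, sample_rate=SAMPLE_RATE):
    """Digitize a sine wave: one amplitude value per sampling instant."""
    return [math.sin(2 * math.pi * frequency_hz * n / sample_rate) for n in range(n_samples)]


def dft_magnitudes(samples):
    """Naive discrete Fourier transform; returns magnitudes for the lower half of the bins."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2):
        total = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        magnitudes.append(abs(total))
    return magnitudes


def dominant_frequency(samples, sample_rate=SAMPLE_RATE):
    """Return the frequency in Hz of the strongest bin."""
    magnitudes = dft_magnitudes(samples)
    peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    return peak_bin * sample_rate / len(samples)


tone = sample_tone(440.0)  # concert pitch A
print(dominant_frequency(tone))
```

Sampling produces the list of numbers; the transform then reports which frequencies those numbers contain, which is the step that lets software identify pitch and timbre.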

At first, computers could complete only limited tasks, and the range of computer-synthesized sounds was restricted. As computing power grew in step with Moore’s Law, computers became meta-media and revolutionized the production and distribution of music with new hardware and software tools.

Gradually, computers became able not only to imitate existing instruments with increasing precision but also to create brand-new sound effects and combinations that never existed before. For example, a vocoder is a device designed to record, compress, and digitalize human voices into editable formats that can be stored and manipulated in unprecedented ways. There is no doubt that digitalization opens up unlimited new possibilities for musical creation. Musicians are provided with a nearly infinite repository of materials. For example, in their song Contact from the album “Random Access Memories,” French electronic music duo Daft Punk used an audio sample from the Apollo 17 mission in which NASA astronaut Eugene Cernan talks about something strange outside the spaceship. They also used a sample from a song called “We Ride Tonight” by Australian rock band The Sherbs. Even so, the combination of existing materials creates a novel musical experience.

New genres are emerging in this digitalization process, such as electronic dance music and ambient music. An interesting demonstration of the power of digitalization is that someone used software to slow a fast-tempo Justin Bieber song down by a factor of eight, so that it sounds ethereal, like Sigur Rós, and nothing like Justin Bieber himself.

Justin Bieber’s U Smile 800% Slower

As discussed above, music is full of signs, which are, in essence, recognizable patterns. Computers are good at pattern recognition and matching. One consequence is that computers are more and more capable of identifying patterns that we previously thought only humans could recognize. For example, computers can recognize some subtle signs of genres and some musicians’ personal styles. Using machine learning algorithms, computers are even capable of “creating” music in certain styles or genres.

One example is David Cope, a composer and scientist at the University of California, Santa Cruz, who writes programs and algorithms that analyze existing music and create new music in the same style. In his patent US 7696426 B2, he described the logical framework of his software Emmy, which comprises pattern matching, a segmentation step, a hierarchical analysis step, and a non-linear recombination step, and then produces the output. His software takes many factors into account, such as pitch, duration, channel number, and dynamics. As he put it, “style is inherent in recurrent patterns of the relationships between the musical events, in more than one work.” Following this logic and based on probability principles, his software is able to capture and rank recurrent patterns as signatures of styles and create new pieces of music[29]. The following video is one of Emmy’s works, in the style of Bach.

Bach-style chorale by musical intelligence computer program created by David Cope. 
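
The idea of ranking patterns that recur in more than one work can be sketched in a few lines. This is not Cope's algorithm, only a toy illustration of the quoted principle: it treats "relationships between musical events" as pitch intervals, and the melodies, the pattern length, and the function names are all hypothetical.

```python
from collections import Counter


def intervals(pitches):
    """Relationships between successive events: pitch differences in semitones."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def ngrams(sequence, n):
    """All runs of n consecutive items, as tuples."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]


def find_signatures(works, n=3, min_works=2):
    """Rank interval patterns that occur in at least `min_works` different works."""
    total = Counter()
    works_containing = Counter()
    for pitches in works:
        grams = ngrams(intervals(pitches), n)
        total.update(grams)                   # how often the pattern occurs overall
        works_containing.update(set(grams))   # in how many works it occurs
    shared = [(g, total[g]) for g in total if works_containing[g] >= min_works]
    return sorted(shared, key=lambda item: (-item[1], item[0]))


# Hypothetical toy melodies given as MIDI note numbers.
works = [
    [60, 62, 64, 65, 67, 65, 64, 62, 60],
    [67, 69, 71, 72, 74, 72, 71],
    [60, 64, 67, 72, 67, 64, 60],
]

for pattern, count in find_signatures(works):
    print(pattern, count)
```

Using intervals rather than absolute pitches means a pattern is still found when it appears transposed, as it does between the first two toy melodies.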

Another example is a Beatles-style song called “Daddy’s Car,” composed by artificial intelligence software at the SONY CSL Research Lab[30]. Today, some software can produce jazz music too.

Daddy’s Car: a song composed by Artificial Intelligence – in the style of the Beatles

Computers cannot feel music the way we do. They cannot feel the affection in music, sway to it, or develop their own preferences. They can only break music into 0s and 1s and look for patterns in it. However, I think the ways they learn and create music are not necessarily different from ours. Both involve a process of storing and retrieving information and matching patterns. Both need to draw patterns from perceivable signals first, store them in long-term memory, rank them by how often they recur, and match new patterns against those stored memories, although they differ a lot in the details. This potential parallel might be a future research direction for understanding music cognition and developing music-related computer algorithms.
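
The four steps named above (draw patterns, store them, rank them by recurrence, match new input) can be made concrete in a deliberately simple sketch. It is an illustration of the loop, not a model of human cognition or of any particular system; the class `PatternMemory`, the style labels, and the melodies are hypothetical.

```python
from collections import Counter


def interval_ngrams(pitches, n=2):
    """Draw patterns from the signal: runs of n successive pitch intervals."""
    steps = [b - a for a, b in zip(pitches, pitches[1:])]
    return [tuple(steps[i:i + n]) for i in range(len(steps) - n + 1)]


class PatternMemory:
    """Stores pattern counts per style and matches new melodies against them."""

    def __init__(self):
        self.styles = {}  # style name -> Counter of patterns

    def learn(self, style, pitches):
        """Store the patterns of one melody under a style label."""
        self.styles.setdefault(style, Counter()).update(interval_ngrams(pitches))

    def top_patterns(self, style, k=3):
        """Rank stored patterns by how often they recur."""
        return self.styles[style].most_common(k)

    def match(self, pitches):
        """Score a new melody against every stored style; return the best style and all scores."""
        new = Counter(interval_ngrams(pitches))
        scores = {}
        for style, stored in self.styles.items():
            total = sum(stored.values())
            scores[style] = sum(count * stored[p] / total for p, count in new.items())
        return max(scores, key=scores.get), scores


# Hypothetical training melodies as MIDI note numbers.
memory = PatternMemory()
memory.learn("stepwise", [60, 62, 64, 65, 67, 69, 71, 72])  # scale run
memory.learn("leaping", [60, 64, 67, 72, 76, 79, 84])        # arpeggio

best, scores = memory.match([62, 64, 66, 67, 69])
print(best, scores)
```

The scoring rule is a plain frequency-weighted overlap, chosen only because it is easy to read; any similarity measure could take its place.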

5. Conclusion

Music is a symbolic system that can be approached through C. S. Peirce’s semiotic framework. Despite its similarities with language, music has a unique structure with no counterpart in linguistics, so it needs its own theoretical framework. Music also serves as a cognitive artifact that is distributed through time and among people. The digitalization of music creates unlimited possibilities for the music industry, among which the most noticeable achievement today is computer-generated music produced by algorithms. I conclude that the computer and the human brain share similar models for musical sign recognition.


[1] Massey, Reginald, and Jamila Massey. The Music of India. Abhinav Publications, 1996.

[2] Chandler, Daniel. Semiotics: The Basics. 2nd ed. Basics (Routledge (Firm)). London ; New York: Routledge, 2007.

[3] Deacon, Terrence William. The Symbolic Species: The Co-Evolution of Language and the Brain. 1st ed. New York: W.W. Norton, 1997.

[4] Turino, Thomas. “Signs of Imagination, Identity, and Experience: A Peircian Semiotic Theory for Music.” Ethnomusicology 43, no. 2 (Spring 1999): 221–55.

[5] Tarasti, Eero. Approaches to Applied Semiotics [AAS] : Signs of Music : A Guide to Musical Semiotics. Berlin/Boston, DE: De Gruyter Mouton, 2002.

[6] MacDonald, James. “British Open ‘V’ Nerve War; Churchill Spurs Resistance.” The New York Times, July 20, 1941.

[7] Bartholomew-Strydom, Craig, and Stephen Segerman. Sugar Man: The Life, Death, and Resurrection of Sixto Rodriguez. London: Bantam Press, an imprint of Transworld Publishers, 2015.

[8] Cui, Jian. “Rock ‘N’ Roll.” Time, September 27, 1999.

[9] Schartmann, Andrew. Koji Kondo’s Super Mario Bros. Soundtrack. New York: Bloomsbury Academic, 2015.

[10] Parmentier, Richard J. Signs in Society: Studies in Semiotic Anthropology. Advances in Semiotics. Bloomington: Indiana University Press, 1994.

[11] Driscoll, Kevin, and Joshua Diaz. “Endless Loop: A Brief History of Chiptunes.” Transformative Works and Cultures 2, no. 0 (February 17, 2009).

[12] Jackendoff, Ray. Foundations of Language: Brain, Meaning, Grammar, Evolution. OUP Oxford, 2002.

[13] Egermann, Hauke, Nathalie Fernando, Lorraine Chuen, and Stephen McAdams. “Music Induces Universal Emotion-Related Psychophysiological Responses: Comparing Canadian Listeners to Congolese Pygmies.” Frontiers in Psychology 5 (2015). doi:10.3389/fpsyg.2014.01341.

[14] Colin, Newton. “In the Words of American Poet Henry Wadsworth Longfellow, ‘Music Is the Universal Language of Mankind.’” Sunday Mail (Adelaide), May 16, 2004.

[15] Wang, Jieshu. “Music as a Language – Jieshu Wang | CCTP711: Semiotics and Cognitive Technology.” Accessed December 13, 2016.

[16] Jackendoff, Ray. “Parallels and Nonparallels Between Language and Music.” Music Perception 26, no. 3 (February 2009): 195–204.

[17] Jackendoff, Ray. A Generative Theory of Tonal Music. MIT Press, 1985.

[18] Sacks, Oliver W. Musicophilia: Tales of Music and the Brain. 1st ed. New York: Alfred A. Knopf, 2007.

[19] Henthorn, Trevor, Mark Dolson, and Diana Deutsch. “Absolute Pitch, Speech, and Tone Language: Some Experiments and a Proposed Framework.” Music Perception 21, no. 3 (Spring 2004): 339–56.

[20] Deutsch, Diana, Trevor Henthorn, Elizabeth Marvin, and HongShuai Xu. “Absolute Pitch among American and Chinese Conservatory Students: Prevalence Differences, and Evidence for a Speech-Related Critical Period.” The Journal of the Acoustical Society of America 119, no. 2 (January 31, 2006): 719–22. doi:10.1121/1.2151799.

[21] Patel, Aniruddh D. “Language, Music, Syntax and the Brain.” Nature Neuroscience 6, no. 7 (July 2003): 674–81. doi:10.1038/nn1082.

[22] Asano, Rie, and Cedric Boeckx. “Syntax in Language and Music: What Is the Right Level of Comparison?” Frontiers in Psychology 6 (2015). doi:10.3389/fpsyg.2015.00942.

[23] Harich-Schneider, Eta. A History of Japanese Music. London: Oxford University Press, 1973.

[24] Norman, Donald A. “Cognitive Artifacts.” In Designing Interaction, 17–23. New York: Cambridge University Press, 1991.

[25] Soley, Gaye, and Erin E. Hannon. “Infants Prefer the Musical Meter of Their Own Culture: A Cross-Cultural Comparison.” Developmental Psychology 46, no. 1 (n.d.): 286–92.

[26] Hollan, James, Edwin Hutchins, and David Kirsh. “Distributed Cognition: Toward a New Foundation for Human-Computer Interaction Research.” ACM Trans. Comput.-Hum. Interact. 7, no. 2 (June 2000): 174–196. doi:10.1145/353485.353487.

[27] Chagas, Paulo C. Unsayable Music: Six Reflections on Musical Semiotics, Electroacoustic and Digital Music. Leuven University Press, 2014.

[28] Kay, Alan, and Adele Goldberg. “Personal Dynamic Media.” Edited by Noah Wardrip-Fruin and Nick Montfort. Computer 10, no. 3 (March 1977): 31–41.

[29] Recombinant music composition algorithm and method of using the same. Accessed December 14, 2016.

[30] “AI Makes Pop Music in Different Music Styles.” Flow Machines, September 19, 2016.

Google Arts & Culture: A Meta-museum (Jieshu & Roxy)

This week we use the Sanxingdui Museum on Google Arts & Culture as an example.

Background of Sanxingdui

Sanxingdui, literally “three stars mound,” is the name of a Chinese archaeological site and of the previously unknown Bronze Age culture for which it is the type site. It is located in Sichuan, China. The museum there is the best place to see and learn about Sanxingdui culture, the heritage of a lost civilization. Google Art Project gives people a chance to enjoy the fantastic artifacts of Sanxingdui without traveling there.

Home Page of Sanxingdui Museum

On the home page of the Sanxingdui Museum on Google Arts & Culture, we found that Google displays the site in the language set in your account preferences. The site itself offers no option to choose a language.


Google will present different languages according to your Google account

  • Daily selected Image with a hyperlink
  • Museum name and a brief introduction


  • Two exhibits: the faces of Sanxingdui and the animals in Sanxingdui. These exhibits were created by the gallery’s expert curators. Here we can clearly see how digital design affects museums: instead of the single arrangement found in the physical museum (link to the official website), curators can arrange the collections in multiple ways.
  • In this Collection. Here we can see albums, each with a cover image showing one token of its type. The album tags are the same tags that appear at the bottom of each item’s page. We think this function could be improved, because the existing tags are not precise.


  • 98 Items. The 98 items, each with a photo and a hyperlink, are arranged in some pattern. We tried to work out the rule behind this pattern but failed. The page offers three ways to organize the items:
    • Organize by popularity
    • Organize by color
    • Organize by time


Specific Item

After you click on a specific item, you can view it in high resolution and zoom in to see the details.


  • High resolution. The photos are captured by the Art Camera system. A gigapixel image is made of over one billion pixels and can bring out details invisible to the naked eye. Digitalization preserves the meaning of the artifacts.
  • Private/public collections. After logging in, you can use the ♥️ button to save your favorite items and build your own exhibits, which you can even share with friends and family. This practice could democratize art and culture.

Virtual Tour

Google uses a system called Trolley to shoot photos inside museums. A Trolley is a push-cart mounted with a camera system.


  • Collect images. The Trolley is equipped with several cameras and sensors.
  • Align images. After shooting the images, Google uses GPS, speed, and direction information to align the imagery. This helps reconstruct the Trolley’s exact route and tilt and realign images as needed.
  • Turning photos into 360 photos. To avoid gaps in the 360 photos, adjacent cameras take slightly overlapping pictures, which are then ‘stitched’ together into a single 360-degree image. Special image processing algorithms are then applied to lessen ‘seams’ and create smooth transitions.
  • Showing the right images. How quickly the Trolley’s lasers reflect off surfaces tells Google how far away a wall or object is and enables it to construct 3D models, which are used to determine the best panorama to show.
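The last step relies on turning a laser's round-trip time into a distance. The following is only a minimal sketch of that arithmetic, assuming a simple time-of-flight measurement; the function name and the 20-nanosecond example value are our own illustrations, not details of Google's system.

```python
# Time-of-flight sketch: a laser pulse travels to a surface and back,
# so the one-way distance is half of (speed of light x round-trip time).
SPEED_OF_LIGHT_M_PER_S = 299_792_458


def distance_from_round_trip(round_trip_s):
    """Return the distance in meters to the reflecting surface."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2


# Hypothetical reading: the pulse returns after 20 nanoseconds.
print(distance_from_round_trip(20e-9))  # roughly 3 meters
```

Repeating this measurement in many directions gives a cloud of distances from which a rough 3D model of the room can be built.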

We found, however, that unlike paintings, all items in Sanxingdui are three-dimensional objects, so displaying them on a two-dimensional screen is not enough to give us a sense of reality. Many feelings evoked by three-dimensional structures are lost. In addition, the user experience of the virtual tour is poor and offers no feeling of immersion.


  • Google Art Project is a meta-museum. Digitalization of conventional media enables computing devices to become a metamedium. The items in Google Art Project can be rearranged with unprecedented ease. So Google Art Project has become a museum of museums, a meta-museum, just as gallery paintings are meta-paintings.
  • Google Art Project democratizes artworks, just like Samuel Morse’s gallery painting.
  • So far, Google Art Project is not capable of mediating three-dimensional objects well.
  • The user experience of the virtual tour needs improvement.
  • Deep Remix. We are also fascinated by Google Art Experiment, where items are used to create beautiful patterns like waves according to their shapes or colors, although in the timeline of the Free Fall experiment we couldn’t find items from the Sanxingdui Museum. It is a demonstration of the concept of deep remix that Lev Manovich describes in his Software Takes Command.

P.S. We would like to share this video with you; it is another example of meta-media.

The Nature of Representation: Some Primitive Discussion – Jieshu

Taking this course has let me observe technologies from a brand new perspective. Many technologies are designed as cognitive-symbolic artefacts, especially the ubiquitous computing devices that are rooted in human meaning systems. Here I will try to synthesize what I have learned in a preliminary discussion of computation.

The design of computers is based on human meaning systems, which use signs to represent physical objects as well as abstract ones. Since Morse code, it has been common to use electricity to represent information. Shannon made it possible to break down analog signals into discrete ones for better and faster representation. Alan Turing defined computation and proposed the Turing machine as a universal model for it. All of these correspond to C. S. Peirce’s observations about human meaning systems. For Peirce, “meanings are chains of inferences from things physically given to cognitive responses[i]”. Another demonstration is computer languages, in which syntactic forms are assigned meanings to accomplish complex tasks.

It is said that the most fundamental difference between humans and animals is the ability to make cognitive artefacts, which are designed to “maintain, display, or operate upon information in order to serve a representational function[ii]”. This assertion again emphasizes the importance of the human meaning system, which is epitomized by today’s computing devices. In order to offload human cognitive effort and augment human intelligence, pioneers produced many beautiful visionary concepts, including the memex by Vannevar Bush and the Dynabook by Alan Kay, as we mentioned in past weeks. As the Internet has come of age, we have distributed countless cognitive efforts onto computers, whose encyclopedic, spatial, procedural, and participatory affordances allow for uncountable achievements[iii].

However, computation is not necessarily all about computers, as Peter Denning argued in his What is Computation. He suggested a computational model of representation-transformation, which “refocuses the definition of computation from computers to information processes[iv]”. In this sense, computation even goes beyond the scope of human cognition, stepping into natural processes such as DNA transcription. This made me ponder the nature of representation. If DNA transcription is seen as computation, is representation a natural or an artificial process?

According to Denning, “a representation is a pattern of symbols that stands for something[iv]”. In this sense, representation definitely can be natural. Every pattern that stands for another thing can be seen as a representation, be it a piece of DNA sequence governing the production of a specific amino acid, an ultraviolet pattern indicating the center of a flower to a bee, an electron running around an atomic nucleus, or a neural firing caused by a potential difference and inducing a cascade of firings that lead to a specific thought in my brain. All these behaviors that translate external patterns into corresponding responses can, I think, be seen as information processes and instances of the representation-transformation model, and therefore as computations. They also meet the requirement of the interactive machine because, unlike Turing machines, they interact dynamically with the external environment[v]. At the risk of infinite regress, I come to the conclusion that there is nothing in the universe that is not a computation. So the whole universe is in a constant computational process.

Zooming in and getting back to the human scale, I suddenly realized that the human meaning system might also be a natural representation, translating natural patterns into meaningful signs in the form of neural firing patterns and unceasingly creating new signs from other signs, in order to fully comprehend and represent the external world.

I know this thought is unconventional. Representation is mostly seen as a human cognitive activity. But if we broaden our definitions of representation and computation to include the non-human, we might gain some insight into how to improve existing computational paradigms or design new ones.


[i] Irvine, Martin. n.d. “The Grammar of Meaning Making: Signs, Symbolic Cognition, and Semiotics.”

[ii] Norman, Donald A. 1991. “Cognitive Artifacts.” In Designing Interaction, 17–23. New York: Cambridge University Press.

[iii] Murray, Janet H. 2011. Inventing the Medium : Principles of Interaction Design as a Cultural Practice. Cambridge, US: The MIT Press.

[iv] Denning, Peter J. 2012. “Opening Statement: What Is Computation?” Computer Journal 55 (7): 805–10.

[v] Wegner, Peter. 1997. “Why Interaction Is More Powerful Than Algorithms.” Commun. ACM 40 (5): 80–91. doi:10.1145/253769.253801.

On the Road to Metamedia – Jieshu & Roxy

According to the reading and our group discussion, we identified some concepts and technologies that enabled modern computing devices to become mediating, mediated, and metamedia platforms.


GUI: The concept of GUI developed by Engelbart, Butler Lampson[i], Kay, and others allows everyone to easily navigate computing systems, thereby mediating other media.

OOP: Object-oriented programming (OOP) is a programming language model organized around objects rather than “actions” and data rather than logic. This concept lets a program grow in size and complexity while each of its parts stays short and simple.
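To make the idea concrete, here is a minimal sketch in Python. The class names `MediaItem` and `Photo` and their fields are hypothetical illustrations of ours; they do not come from any of the systems discussed in the readings.

```python
class MediaItem:
    """A hypothetical media object that bundles data with its behavior."""

    def __init__(self, title, size_kb):
        self.title = title
        self.size_kb = size_kb

    def describe(self):
        return "%s (%d KB)" % (self.title, self.size_kb)


class Photo(MediaItem):
    """A specialized object that reuses MediaItem and adds its own data."""

    def __init__(self, title, size_kb, width, height):
        super().__init__(title, size_kb)
        self.width = width
        self.height = height

    def describe(self):
        # Extend the parent's behavior instead of rewriting it.
        return "%s, %dx%d px" % (super().describe(), self.width, self.height)


items = [MediaItem("notes.txt", 4), Photo("mask.jpg", 820, 1920, 1080)]
for item in items:
    # The same call works on every object; each one knows how to describe itself.
    print(item.describe())
```

A new media type can be added as another small class without touching the loop at the bottom, which is one way a program can grow while its parts stay short.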

Simulation: Another concept that makes computing devices a platform of metamedia is the concept of simulation. As Manovich put it, “Alan Turing theoretically defined a computer as a machine that can simulate a very large class of other machines, and it is this simulation ability that is largely responsible for the proliferation of computers in modern society[ii].”

Supporting Technologies

As a metamedia platform, a computer handles the whole procedure of input, editing, and output. This procedure requires supporting technologies, such as transistors, the Internet, and the technologies of digitization, sampling, compression, software, and display.

Sampling and Digitization: Sampling and quantization, supported by mathematical tools such as the Fourier transform, which “decomposes a function of time into the frequencies that make it up[iii]”, enable us to convert between analog and digital signals. This ability allows for easy sampling, digitization, manipulation, storage, and transfer of media information with high fidelity. For example, a scanner can capture an image from a paper magazine digitally and discretely by assigning three numbers, the RGB values, to each pixel, so that the image can be stored on a hard disk, shown on a display screen, and transferred to another computer. The sampling process is not perfect, for the scanner has a limited resolution, and information beyond the highest resolution is lost. But it preserves meanings that human beings can understand. This kind of sampling and digitization enables a computer to become a platform for all kinds of media, transforming it from a “universal Turing machine” into a “universal media machine[ii]”.
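The two steps, sampling at discrete moments and rounding each sample to one of a fixed set of numbers, can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the 5 Hz tone, the 40 samples per second, and the 256 levels are arbitrary example values, and the function names are ours.

```python
import math


def sample_and_quantize(signal, duration_s, sample_rate_hz, levels=256):
    """Sample a signal with values in [-1, 1] and map each sample to an integer code."""
    count = int(duration_s * sample_rate_hz)
    codes = []
    for i in range(count):
        t = i / sample_rate_hz           # sampling: look only at discrete moments
        value = signal(t)
        code = round((value + 1) / 2 * (levels - 1))  # quantization: round to a level
        codes.append(code)
    return codes


def tone(t):
    """A stand-in for an analog signal: a 5 Hz sine wave."""
    return math.sin(2 * math.pi * 5 * t)


codes = sample_and_quantize(tone, duration_s=1.0, sample_rate_hz=40)
print(len(codes), min(codes), max(codes))
```

Everything that happens between two samples, and every difference smaller than one level, is lost, which is the same kind of loss as the scanner's limited resolution.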

Compression: File compression can reduce storage space and transmission time. One way compression works is by taking advantage of redundancies: “most computers represent text with fixed-length codes. These files can often be shortened by half by finding repeating patterns and replacing them with shorter codes[iv]”. If you store the same photo in both BMP and JPG format, the BMP file can be around 16 times the size of the JPG, although the exact ratio depends on the image and the quality setting. At that ratio, you can store 16 times as many JPG photos in the same space, and uploading one to the internet takes only 1/16 of the time. Compression thus makes it easier to store, transfer, and edit files on computing devices, helping to turn a computing device into a metamedium.
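The effect of redundancy is easy to observe with Python's standard `zlib` module. This is a small sketch with a made-up, highly repetitive text; note that `zlib` is lossless, whereas JPG is a lossy format, so it illustrates the redundancy idea from the quotation rather than how JPG itself works.

```python
import zlib

# A deliberately repetitive text: the same sentence 200 times.
text = ("the quick brown fox jumps over the lazy dog. " * 200).encode("utf-8")

compressed = zlib.compress(text)        # repeating patterns get shorter codes
restored = zlib.decompress(compressed)  # lossless: the original comes back exactly

print(len(text), "bytes before,", len(compressed), "bytes after")
print(restored == text)
```

Text with fewer repeated patterns shrinks much less, so the saving always depends on the content.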

Storage: From tapes to disk, to flash memory, and to cloud storage, these storage technologies help computing devices to store more files and help files in different formats to be displayed on one device at one time.

Software and Algorithms: In his Software Takes Command, Manovich stressed the importance of software, which in his opinion is where the “newness” of new media lies[ii]. With software, we can easily manipulate existing media and add new properties to them. iMovie, Word, Photoshop, Audition, CAD, 3D Max… These programs enable average people to create media content in ways that were once accessible only to professional users. In addition, new software and new tools are constantly being created. For example, with C++ and other programming languages, game designers have produced many computer games, a new genre of software. Reddit, a social news aggregation website used to share media, was programmed in Python. Thus, computers become what Kay and Goldberg called “metamedia”[ii].

Transistors: The constant miniaturization of transistors over the past decades, in line with Moore’s Law, has exponentially increased computing power and with it the capacity to handle media content. Ten years ago, exporting a twenty-minute 720p video file took my desktop computer two hours. Now my MacBook can easily edit 1080p video in real time. This is largely because of the increase in the computing power of computer chips.

Internet: Since Engelbart’s oNLine System (NLS), we have made huge progress in linking computing devices together. With the rise of the mobile internet, we are exposed to an increasingly ubiquitous computing environment. We constantly edit and share media content on the internet. We send pictures to friends and share texts and music on social media every day. Recently in China, online live video broadcasting has become very popular. People are so fascinated by sharing their own everyday lives and watching other people’s that some popular live hosts are valued at as much as five million dollars.

Display: Right now we have many display technologies that fulfill different demands. Display resolution has increased greatly, while monitors are getting smaller and thinner. Along with increasingly powerful graphics processing units (GPUs), this trend enables computing devices to represent media content with higher and higher fidelity, allowing for more and more sophisticated media manipulation. Here we’d like to emphasize two display technologies.

  1. The first one is electronic ink (E Ink), used in the Amazon Kindle. We think E Ink technology meets the requirements that Alan Kay envisioned for his Dynabook[v]. He suggested that a CRT was not the preferred display if the Dynabook was to be used anywhere. He envisaged a display “technology that requires power only for state changing, not for viewing,” that is, one that “can be read in ambient light.” E Ink technology definitely meets this requirement, saving power and tremendously extending the time between battery charges. The use of E Ink on the Kindle remediates the functions of books.
  2. The second one is touch screen technology, which traces back to Ivan Sutherland’s Sketchpad[vi] and which can easily “engage the users in a two-way conversation” as envisioned by Kay[vii]. Touch screen technology allows us to interact directly with computing systems and manipulate media files with ease.

Unimplemented Dream

One of Alan Kay’s design concepts for access to software that has not been implemented is that everyone should learn how to program.

According to Alan Kay, a programming environment, including programs and already-written general tools, can help people make their own creative tools. In his prediction, different people could mold and channel the power of programming to their own needs (Software Takes Command). In addition, it can help people develop computational thinking, also known as complex problem-solving skill, since a programming language is a procedural language. This mission has not been achieved yet. Nowadays, people still see programming as a task that only experts can handle.

What We Want

In our discussion, we imagined a lot of interfaces that go beyond the commercial products we use today, including Augmented Reality (AR) like Magic Leap and Microsoft’s HoloLens, Virtual Reality like Oculus Rift, MIT’s Reality Editor, eye-tracking interfaces that could be used by ALS patients, maps projected onto windscreens, Ray Kurzweil’s mind-uploading nanobots, and virtual assistants that understand natural language such as Siri and Cortana. While we were busy taking notes on our ideas, we couldn’t find a perfect note application with which we could not only type words but also draw sketches, build 3D models, and record and edit audio and video clips. In other words, there’s no application that can deal with all media formats. So here we describe an interface for a note application. It’s much like the system that Engelbart presented in the “mother of all demos” in 1968[viii].

We understand that modern software is developed by different companies with financial interests, which therefore close their systems in order to lock users in. For example, a Photoshop PSD file cannot be read and edited in CAD software. The interface we envision could melt the boundaries between different software, enabling us to manipulate any category of media easily and to combine flow charts, pictures, texts, sound, and other media without switching software.

For example, when we are taking notes in the CCTP-711 class, we can type Professor Irvine’s words into the interface. We can also record his voice and have it transcribed by a built-in speech recognition module. When he talks about the “mother of all demos” video, we don’t need to minimize the app window and watch it on YouTube in a web browser. Instead, we can insert the video into our notes without switching to a browser. When he talks about some ancient semiotic artifact, we can easily insert its 3D model into the edit area without installing any cumbersome 3D software like CAD. These media objects can be edited and rearranged at any time afterward. In a nutshell, it’s a knowledge navigation system.

To achieve this interface, we think several things need to be done. First, media software should open its source code, or at least provide more open APIs to developers. Second, more computing power is needed to process so many media at the same time. Third, we need cloud computing and fast networks to store and retrieve so much information quickly. Fourth, machine intelligence is needed to process natural language and respond to you in a more natural way.


[i] Lampson, Butler. 1972. “GUI Designer at Xerox PARC.”

[ii] Manovich, Lev. 2013. Software Takes Command. International Texts in Critical Media Aesthetics, volume#5. New York ; London: Bloomsbury.

[iii] “Fourier Transform.” 2016. Wikipedia.

[iv] Denning, Peter J., and Tim Bell. 2012. “The Information Paradox.” American Scientist 100 (6): 470–77.

[v] Kay, Alan. 1972. “A Personal Computer for Children of All Ages.” Palo Alto, Xerox PARC.

[vi] Sutherland, Ivan. 1963. “Sketchpad: A Man-Machine Graphical Communication System.”

[vii] Kay, Alan. 1977. “Microelectronics and the Personal Computer.” Scientific American 237 (3): 230–44.

[viii] “CHM Fellow Douglas C. Engelbart | Computer History Museum.” 2016. Accessed October 31.

Unfulfilled Intelligence Augmentation Visions – Jieshu

In Wikipedia’s disambiguation page for Interface, you can see many interfaces—user interface, hardware interface, biological interface, chemical interface, and social interface. There even exists a place in Northern Ireland called Interface Area where “segregated nationalist and unionist residential areas meet”[i], which intuitively reveals the implication of interface—boundaries. I’d like to quote words from Professor Irvine’s introduction essay—“an interface is anything that connects two (or more) different systems across the boundaries of those systems[ii].”

In our symbol systems, basically, anything can serve as an interface as long as it mediates between “social-cultural” systems. Ancient cave paintings are interfaces connecting people and their long lost myths. The Bible is an interface to a huge system of culture and value. An abacus is an interface to an ancient calculation system. In computing devices, a user interface is a boundary between the user and the computing system.

Interaction is the function of interfaces. Information flows through interfaces. However, many interfaces are like one-way streets, allowing information to flow in only one direction. For example, an artwork is an interface to a meaning system, but many artworks only allow information to flow toward the audience. We have to stand behind a line when we are appreciating Van Gogh’s Starry Night. But artists are creating more and more interactive works, e.g. the pile of candies that people were free to take away, mentioned by Amanda in her post several weeks ago[iii].

So when I first read about how the pioneers like Kay, Sutherland, Engelbart, and Bush explored improving man-machine interaction using computational interfaces, I was truly amazed. Bush suggested the model of memex to store, record, and retrieve books[iv]. Engelbart envisaged a computer network that could augment human intelligence. He also invented the mouse that could point to anywhere on computer screens[v]. Licklider envisioned a man-computer symbiosis system for more effective man-computer interaction[vi], which was realized partially by Sutherland with his Sketchpad that allowed people to use a light pen to draw on computer screens and that could modify your drawings to perfect circles or rectangles, realizing true two-way interaction[vii].

Many of those pioneers’ visions have been explored as computing power has grown stronger and stronger. For example, CAD software and devices with touchscreens are all rooted in the concepts proved by Sketchpad. Mobile devices like cell phones, Kindles, and iPads closely resemble Kay’s Dynabook. Online file systems envisioned by Engelbart, which allow multiple users to read and edit at the same time, have recently been implemented in tools such as Quip and Google Docs.

However, many paths remain unexplored. Here I will discuss three of them.

Knowledge Navigation System

The memex suggested by Vannevar Bush put forward a personal information management and “knowledge navigation” system. I was surprised by how much cognitive workload could be offloaded onto this system, although it was designed seventy years ago in the absence of digitalization. Even today, when everyone has their own computer(s) and multiple external hard disks, we haven’t built a highly efficient knowledge navigation system. Maybe Wikipedia is close, but it can’t present your personal knowledge structure. In my view, a true knowledge navigation system should have the following properties:

  • Portability. Cloud storage might be a good choice.
  • Searchability. You can search any word, image, soundtrack, even video and get everything relevant quickly.
  • Present your knowledge structure easily. It could use methods like data visualization or library classification to present your knowledge structure and allow you to navigate your knowledge landscape both horizontally and vertically. In other words, you can zoom in and zoom out on your “knowledge map” and see your knowledge on different scales, like a Google Map for knowledge.
  • Knowledge discovery. You can use it to discover new knowledge. This reminds me of Google Earth, which is a good example of knowledge discovery. For example, when you zoom in on the Pacific Ocean, you can see many islands. Depending on the layers you choose, by clicking icons distributed on the map you can discover various kinds of knowledge, such as the Wikipedia entry about an island, three-dimensional topographies both on land and undersea, and documentary videos of marine animals shot by the BBC or the Discovery Channel. If you zoom in on Mars in Google Earth, you can learn a great deal, such as which chemical and physical factors shaped some strange geographic feature. Hyperlinks in Wikipedia are also a good means of knowledge discovery. This property is realized through hyperlinks and networks connected to huge online databases.
  • Connect to other people’s knowledge system. You can share knowledge with other people, navigate knowledge on your social network, and at the same time navigate your social network on the entire human knowledge landscape.

Ray Kurzweil, a futurist at Google, once predicted that in thirty years human beings will be able to use nanobots in our brains to connect to the Internet and perform many crazy functions[viii], such as downloading function modules according to our needs, as in The Matrix. It sounds like hype, but it may be a good way to realize “Knowledge Navigation.”

Virtual Personal Assistant

In The Computer as a Communication Device (1968), Licklider mentioned OLIVER (on-line interactive vicarious expediter and responder), proposed by Oliver Selfridge. OLIVER was “a very important part of each man’s interaction with his online community.” According to Licklider, OLIVER was “a complex of computer programs and data that resides within the network” that could take care of many of your matters without your personal attention. It could even learn through experience. This path is a very typical “Intelligence Augmentation” method and has been explored in apps like Siri and Microsoft Cortana. Another example is Amy, a virtual assistant that is able to arrange your schedule using information drawn from your emails with natural language processing[ix].


CCing Amy on an email allows it to arrange your schedule according to your time and location.

However, because the algorithms are still in their primitive stages, this path still has a very long way to go.


Alan Kay coined the term metamedium for computers that serve as media for other media[ii]. He also envisioned a system with software that allowed everyone, including children, to program their own software as “creative tools[x].” This path was exemplified in his Smalltalk project. It is under-exploited today. Most computer owners, including me, don’t know how to program. Computing devices are mainly devices for consumption. As we discussed in this week’s Leading by Design course, open source software that would free us from lock-in systems like Microsoft Windows and OS X may be a way to realize Kay’s vision. Another way I can think of is to teach children to program with interesting tools such as LEGO and Minecraft, which might be a commercially plausible approach.


[i] “Interface Area.” 2016. Wikipedia.

[ii] Irvine, Martin. n.d. “Introduction to Affordances and Interfaces: Semiotic Foundations.”

[iii] Amanda Morris. 2016. “Using the Piercian Model to Decode Artwork – Amanda | CCTP711: Semiotics and Cognitive Technology.” Accessed November 3.

[iv] Bush, Vannevar. 1945. “As We May Think.” Atlantic, July.

[v] Engelbart, D. C., and Michael Friedewald. n.d. Augmenting Human Intellect: A Conceptual Framework. [Fremont, CA: Bootstrap Alliance], 1997.

[vi] Licklider, J. C. R. 1960. “Man-Computer Symbiosis.” IRE Transactions on Human Factors in Electronics HFE-1 (1): 4–11. doi:10.1109/THFE2.1960.4503259.

[vii] Sutherland, Ivan. 1963. “Sketchpad: A Man-Machine Graphical Communication System.”

[viii] Kurzweil, Ray, and Kathleen Miles. 2015. “Nanobots In Our Brains Will Make Us Godlike.” New Perspectives Quarterly 32 (4): 24–29. doi:10.1111/npqu.12005.

[ix] “Testing Amy: What It’s like to Have Appointments Scheduled by an AI Assistant.” 2015. GeekWire. December 15.

[x] Manovich, Lev. 2013. Software Takes Command. International Texts in Critical Media Aesthetics, volume#5. New York ; London: Bloomsbury.

Python Has Much in Common with Natural Languages – Jieshu Wang

I have completed approximately 40% of the Python course on Codecademy. It’s not much, but enough to prompt me to reflect on the ideas we have learned so far, including those about linguistics, distributed cognition, information theory, and the computing principles mentioned in this week’s reading.

Like the natural languages we discussed in earlier weeks, programming languages have the tripartite parallel architecture described in Ray Jackendoff’s Foundations of Language[i]. Programming languages are made of basic elements of meaning called primitives[ii]. Primitives are predesigned symbols that mean things or do things. For example, strings and variables are symbols that mean things. We can assign meanings to them or change their meanings later. “True” and “False” are Booleans, a kind of primitive that represents a truth value[ii]. “print” is a primitive procedure[ii] that displays what follows it on the screen. “import” pulls modules or individual functions into the current editing context. “%” in arithmetic computes the modulus, while in a string it is a placeholder whose value is supplied immediately after the string.
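The primitives mentioned above fit into a few lines. This sketch uses Python 3 syntax, where “print” is written as a function call; the variable names and values are made up for illustration.

```python
import math  # "import" pulls a module into the current context

spam = True           # a Boolean primitive assigned to a variable
greeting = "Hello"    # a string assigned to a variable
greeting = "Hi"       # the variable's meaning can be changed later

remainder = 17 % 5    # "%" between numbers computes the modulus

# "%" inside a string marks placeholders; the values follow the string.
message = "%s, you have %d new notes" % (greeting, remainder)

print(spam)           # "print" displays a value on the screen
print(message)
print(math.sqrt(16))  # a function that came from the imported module
```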

Likewise, primitives are organized by syntax. For example, an equals sign is used to assign a value, as in “spam = True”. Triple quotation marks enclose multi-line strings, which are often used as multi-line comments. An “else” must come after an “if”. The first line of a function definition must end with a colon. Parentheses have to come in pairs. However, the syntax of programming languages is much stricter than that of natural languages. When you speak a natural language, you don’t have to be precisely grammatical in order to be understood. But if you leave out just one colon after an “if” statement, Python cannot interpret that entire section of code. Thanks to the programs running behind online Python testers, we can easily identify where the errors are.


A Python online tester could identify mistakes.
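The missing-colon case can be reproduced with Python's built-in `compile` function, which checks syntax without running the code. This is a small sketch; the helper name `check_syntax` is ours, and the exact wording of the error message varies between Python versions.

```python
def check_syntax(source):
    """Return None if the source is valid Python, else a short error description."""
    try:
        compile(source, "<example>", "exec")  # parse only; nothing is executed
    except SyntaxError as err:
        return "line %d: %s" % (err.lineno, err.msg)
    return None


good = "if spam:\n    print('eggs')\n"
bad = "if spam\n    print('eggs')\n"   # the colon after "if spam" is missing

print(check_syntax(good))
print(check_syntax(bad))
```

One missing character is enough for the whole snippet to be rejected, while the error report still points to the line where the problem was found.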

Even so, programming languages share the property of arbitrariness with natural languages, as Prof. Irvine mentioned in this week’s Leading by Design course. That’s because you can write many different versions of code to achieve the same goal.
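As a small illustration of this point, here are three ways of adding up the same list of numbers in Python. The list is made up, and these are only three of many possible versions.

```python
numbers = [3, 1, 4, 1, 5]

# Version 1: an explicit loop.
total_loop = 0
for n in numbers:
    total_loop += n

# Version 2: the built-in sum function.
total_builtin = sum(numbers)


# Version 3: a recursive function.
def total_recursive(values):
    if not values:
        return 0
    return values[0] + total_recursive(values[1:])


print(total_loop, total_builtin, total_recursive(numbers))
```

All three produce the same result, much as different sentences can express the same meaning.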

In this week’s reading, one thing that surprised me is that there are so many problems that computers can’t solve in practice. For example, “the only algorithms guaranteed to solve exponentially hard problems are enumeration methods[iii]”, as Peter Denning writes in his Great Principles of Computing. Because enumerating all the possibilities of an exponentially hard problem takes too long, we have to use heuristic methods to approximate the best solutions. In other words, better solutions may exist than the ones computers return. Maybe quantum computers will solve some of these problems in the future.
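The trade-off can be seen in a toy packing problem: choose items to carry without exceeding a weight limit so that the total value is as high as possible. The sketch below is our own illustration, not an example from Denning's book; the item list, the capacity, and the greedy rule are all made-up assumptions.

```python
from itertools import combinations


def best_by_enumeration(items, capacity):
    """Try every subset (2**n of them) and keep the most valuable one that fits."""
    best_value, best_subset = 0, ()
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            weight = sum(w for w, v in subset)
            value = sum(v for w, v in subset)
            if weight <= capacity and value > best_value:
                best_value, best_subset = value, subset
    return best_value, best_subset


def greedy_heuristic(items, capacity):
    """Take items by value per unit of weight: fast, but not guaranteed to be best."""
    total_value, remaining = 0, capacity
    for w, v in sorted(items, key=lambda item: item[1] / item[0], reverse=True):
        if w <= remaining:
            total_value += v
            remaining -= w
    return total_value


items = [(10, 60), (20, 100), (30, 120)]  # made-up (weight, value) pairs
capacity = 50

print(best_by_enumeration(items, capacity)[0])  # → 220
print(greedy_heuristic(items, capacity))        # → 160
```

With three items, enumeration checks only 8 subsets, but every extra item doubles that number, so the guaranteed method quickly becomes impractical while the heuristic stays fast and settles for a weaker answer.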


  1. Should a programmer memorize all the syntax in order to write a program?
  2. Jeannette Wing emphasized the importance of computational thinking. She said we should add computational thinking to every child’s analytical ability[iv]. She also explained what computational thinking is. I wonder how one can build up computational thinking.


[i] Jackendoff, Ray. 2002. Foundations of Language: Brain, Meaning, Grammar, Evolution. OUP Oxford.

[ii] Evans, David. 2011. Introduction to Computing: Explorations in Language, Logic, and Machines. United States: CreateSpace Independent Publishing Platform.

[iii] Denning, Peter J., and Craig H. Martell. 2015. Great Principles of Computing. Cambridge, Massachusetts: The MIT Press.

[iv] Wing, Jeannette. 2006. “Computational Thinking.” Communications of the ACM 49 (3): 33–35.

Distributed Cognition of Luopan, a Feng Shui Compass (Jieshu & Roxy)

Luopan, also called a Feng Shui compass, is a traditional Chinese compass used to find the best facing direction in architectural design, for both the living and the dead. Feng and Shui, in Chinese, mean wind and water respectively. Feng Shui is “a Chinese philosophical system of harmonizing people with the surrounding environment”, and also a symbol system that assigns meanings to directions. The practice of Feng Shui involves architectural design, fortune-telling, and even weather forecasting, in metaphoric terms of “Qi”, the invisible forces that bind the universe, earth, and humanity together. If you have watched Kung Fu Panda 3, you may have heard of this term. In determining the best places and directions for buildings and tombs, Luopan is the indispensable tool because, in traditional Chinese culture, different directions have different meanings. If the tombstone of your ancestor faces a good direction, it is believed that you will have a lucky life.

Luopan looks like a compass, but its concentric circles carry many other marks besides the four basic directions. A Feng Shui practitioner uses it to determine precise directions.



Three levels of distributed cognition of Luopan

Here, the representational practice we’d like to talk about is using Luopan to determine the best direction for buildings. In this case, the cognitive processes are distributed in three ways.

First of all, the cognitive process is distributed across the members of a social group.

  • An individual’s idea can be implemented across a group of individuals. In the beginning, some philosophers proposed that Qi is the only thing left after a person’s death. The best state of Qi is dynamic, recycling motion. Wind may disperse Qi, and water may stop it. The arrangement of wind and water is the key to sustaining the best state of Qi. This idea was popular in an era when people could not explain the mysteries of life and nature. The most respected people, like ancient emperors, hoped that they could still bless their people and descendants after their death, so they relied on the positions of their tombs, which they chose while they were still alive. This idea became popular in ancient China and formed a special culture, although it is not scientific.
  • This group cognitive task also influences every individual in the community. Nearly all Chinese people know about Feng Shui. Nowadays, although only a few people claim to believe in Feng Shui, many of us are familiar with this technique and may check the direction of doors and beds before we move into a new house.

Second, cognitive processes may involve coordination between internal and external structure. In this case, the minds of Feng Shui practitioners are not “passive representational engines” that replicate the external world. On the contrary, they use the information gathered from the environment to perform complicated cognitive tasks. For example, in designing tombs, they use the direction indicated by Luopan, the landform of the potential tomb sites, the birthday of the dead, and some complicated rules and equations to calculate the best tomb site and the best facing direction of the tombstone. With Luopan, they try to find the best way to coordinate the behavior of Qi, keeping it in the best state. The interaction with Luopan coordinates the relationship between the environment and people’s inner state. It is like a blind person’s walking stick, a biologist’s microscope, or an astronomer’s telescope.

Third, the cognitive processes of Luopan are also distributed through time. Over time, the practice of Feng Shui and the use of Luopan have become a part of Chinese culture. According to Hutchins, on the one hand, culture emerges from activity in history. Luopan has a very long history, originating from the earliest magnetic compass thousands of years ago. The culture of Luopan has been formed and strengthened by countless acts of using Luopan to determine lucky directions from ancient times to the present. On the other hand, the culture of Luopan also serves as a historical context for the future practice of Luopan. Chinese people see the culture of Luopan as a “reservoir of resources” for problem-solving and reasoning that in return shapes the cognitive processes whose practice “transcends the boundaries of individuals”.

For example, I (Jieshu) have a friend who was eager to find a boyfriend. She asked a Feng Shui practitioner to use a Luopan to calculate where her future husband was, and she was told that the lucky direction was the southeast. She literally saw Luopan as a way to solve her problem, and deliberately began to date boys from the southeast. Last week, she told me she established a relationship with a man from Taiwan. This practice of Luopan finally transcends the boundary of her individual self.

Throughout the history of Feng Shui, transformation has been constantly happening as people’s knowledge about nature develops. More and more people recognize that nature and people’s destiny are not governed by some mysterious energy called Qi. However, nowadays, some people claim that Feng Shui can be explained by modern science. For example, according to Feng Shui, the door of a house should not open to the north; otherwise, your family will easily get sick. According to proponents of Feng Shui, this principle could be explained by the fact that in winter, cold wind from Siberia blows across most of China. If your door faces north, you will easily catch a cold. This is an example of the transformation of the culture of Luopan over time, as well as a demonstration of distributed cognition across time.

We now also have Luopan apps that run on a smartphone. But the compass in a smartphone does not depend on a magnetic needle. Instead, it uses thin films. Thanks to the Hall effect and the magnetoresistance effect, these thin films can sense the direction of the geomagnetic field and translate it into an electrical signal that the smartphone can read. This information could be shown in numbers and words, but the apps still draw the Luopan dial and a pointer just like the one in a real Luopan. Why? The reason is the same as for the airspeed indicator on a plane mentioned in the Distributed Cognition paper. Once people are used to one form of indicator, a different form of display may disturb the practitioners’ cognition, which is embodied in the history of Luopan.

Luopan has intersubjectively accessible meanings

As an individual in this community, I (Roxy) share the same values about Feng Shui as other people under its influence. There are two levels.

  • From the royal level, at the beginning, the first emperor decided that he needed Feng Shui practitioners to determine the best place for him to be buried. The emperors after him wanted to get better and fancier places for them to be buried. This idea transformed from a personal idea to an intersubjective common ground.
  • From the folk level, take Xiao Ming, an average person who believes in Feng Shui. He will find a Feng Shui practitioner to help him determine the position and direction of the tomb of his deceased father. His idea is transmitted to and interpreted by the Feng Shui practitioner. The Feng Shui practitioner determines the position and tells Xiao Ming the result. This idea is then shared by these two people. After Xiao Ming puts this idea into practice, more and more people learn about it and start to believe in it. Gradually, the use of Luopan becomes commonly accepted in his community.

In this way, the symbol system of Feng Shui is distributed, and the cognitive task is offloaded onto Luopan.


[1] Luopan. (2016, October 14). In Wikipedia, The Free Encyclopedia. Retrieved 18:27, October 14, 2016, from

[2] Feng shui. (2016, October 16). In Wikipedia, The Free Encyclopedia. Retrieved 02:44, October 16, 2016, from

[3] James Hollan, Edwin Hutchins, and David Kirsh. “Distributed Cognition: Toward a New Foundation for Human-computer Interaction Research.” ACM Transactions, Computer-Human Interaction 7, no. 2 (June 2000): 174-196.

[4] Jiajie Zhang and Vimla L. Patel. “Distributed Cognition, Representation, and Affordance.” Pragmatics & Cognition 14, no. 2 (July 2006): 333-341.

Meaning Preserving in Communication System – Jieshu Wang

Why can’t we extrapolate from the “information theory” model to explain transmission of meanings?

As Professor Irvine mentioned in yesterday’s Leading by Design class, Samuel Morse was the first person to give meanings to pulses of electric current. But it was not until Claude Shannon founded information theory that this signal-code-transmission model was formally established as a discipline.

However, Shannon ignored meaning, so it is ambiguous where new information comes from[i]. The information theory he established and its predecessor mathematical theory of communication (MTC) both are not interested in meaning, reference, or interpretation of information. Instead, they mainly “deal with messages comprising uninterpreted symbols[ii]” that are at the syntactic level, not semantic information.

Let’s look at Shannon’s illustration of communication system, the simplest information system.

Claude Shannon’s original diagram for the transmission model, 1948-49. Source: Irvine, Martin. “Introduction to the Technical Theory of Information.”[iii]

All information, whether it is an email, a phone call, or a song, is transformed and transmitted from its sender to its receiver following the pattern shown in the diagram above. For example, Morse code consists of dots, dashes, and spaces that are meaningless until they are decoded. That’s why we can’t extrapolate from the information theory model to explain the transmission of meanings.
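The Morse code example can be sketched in a few lines of Python. This is only an illustration of the point above: the table covers four letters, and the names MORSE, encode, and decode are my own. The transmitted string is just dots, dashes, and spaces; the letters appear only when the receiver shares the same code table.

```python
# A small part of the Morse code table (four letters only, for brevity).
MORSE = {"S": "...", "O": "---", "E": ".", "T": "-"}

# The receiver needs the reverse table to turn signals back into letters.
DECODE = {code: letter for letter, code in MORSE.items()}


def encode(text):
    """Turn letters into dot-dash groups separated by single spaces."""
    return " ".join(MORSE[letter] for letter in text)


def decode(signal):
    """Turn space-separated dot-dash groups back into letters."""
    return "".join(DECODE[code] for code in signal.split(" "))


signal = encode("SOS")
print(signal)           # ... --- ...
print(decode(signal))   # SOS
```

The channel only carries the value of signal. What “SOS” means to the people at either end is not in the string at all.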

Where are the meanings?

During the process of information transmission in communication systems, the meaning is not lost but exists in the sign referent model proposed by Paolo Rocchi[i]. According to Rocchi, information has two parts–sign and referent. “Meaning is the association between the two.” It is learned and stored in our brains and can be transformed and transmitted by machines. Once it is decoded into recognizable signs, the association—meaning—is ready for us to discover.

Those associations can also be conducted by computers and become more and more important for scientific discovery because the association capacity of the human brain is biologically limited. Computers could serve as a good cognitive artifact for us to offload this cognitive effort.

Using simple observation and intuitive inductive reasoning while bathing, Archimedes associated the behavior of water with physical forces, and ultimately discovered the law of buoyancy. But modern physics does not work that way. For example, the discovery of gravitational waves earlier this year was largely attributable to sophisticated machine learning algorithms whose job, in a nutshell, was to filter out all kinds of noise and pick out the most promising signals detected by the supersensitive sensors. Basically, we offload the effort of associating signal patterns (sign) with astronomical events (referent) to computers. Computers are making, storing, and looking for meanings on our behalf.

What is needed to complete the information-communication-meaning model to account for the contexts, uses, and human environments of presupposed meaning not explicitly stated in any specific string of symbols used to represent the “information” of a “transmitted message”?

In order to complete the information-communication-meaning model, first of all, we need a sign system shared by the members of the community. According to C.S. Peirce, the sign system consists of an object, an interpretant, and a representamen[iv]. The object is what the sign refers to, i.e. the referent. The meaning-making process is hidden in the relationship among the three parts.

Second, during the design process of a communication machine, our idea of the meaning is built into the machine, so that when the machine is used to transform and transmit information, the meaning of the information is preserved in the association of sign and referent implanted in the machine[i]. For example, when people design a computer language, they also design a dictionary in which every code corresponds to a specific logical action.

After the information is decoded, the receiver uses the sign system that he/she shares with other community members to interpret the message and finally arrive at its meaning.


[i] Denning, Peter J., and Tim Bell. 2012. “The Information Paradox.” American Scientist 100 (6): 470–77.

[ii] Floridi, Luciano. 2010. Very Short Introductions : Information : A Very Short Introduction. Oxford, GB: Oxford University Press.

[iii] Irvine, Martin. “Introduction to the Technical Theory of Information.”

[iv] Chandler, Daniel. 2007. Semiotics: The Basics. 2nd ed. Basics (Routledge (Firm)). London ; New York: Routledge.

Symbolic System in “Bloodline” – Jieshu Wang

One thing I learned from this week’s reading is the relationship between semiotics and linguistics. In his Semiotics: The Basics, Daniel Chandler briefly introduced the history behind these two interconnected disciplines, including the impact of Saussurean linguistics and C.S. Peirce’s triadic innovation on semiotics[i]. Semiotics borrows many concepts and methods from linguistics, and extends them to much broader sign systems. This week I’d like to use the works of Xiaogang Zhang, a Chinese surrealist painter best known for his Bloodline series, as examples to illustrate how we understand meanings from paintings.

Bloodline: Big Family No.1, by Xiaogang Zhang (1996)

Bloodline in Peirce’s model

As you can see, Xiaogang’s paintings have very distinct characteristics, reminiscent of Chinese family portraits from decades ago. The faces in the painting are nearly identical, expressionless, and sad, though they have different genders, clothes and hair styles. They look like one another partly because they are family. But there are other meanings as well.

Decades ago, China went through a very hard time, both materially and psychologically. During that time, individualism was opposed while collectivism was favored by the government. People, regardless of age and gender, tended to wear similar drab clothes (mostly dark green, reminiscent of military uniforms) to avoid other people’s attention. In addition, pop culture was highly restricted, and only a very limited number of songs and movies were allowed to be released. I don’t want to talk about politics, but honestly, this period was really difficult for average people, including my family. Taking a family portrait was a big event for most families, so everyone would put on their best clothes and the same grave, almost identical expressions. These portraits truly epitomize that period.

What makes Xiaogang’s works so special is the sign system subtly hidden in his use of color, shape, and shade. Consider the painting below. I will use Peirce’s triadic system (representamen, interpretant, and object) to discuss it. For simplicity, the interpretant will be left out.

Bloodline: Big Family, by Xiaogang Zhang (1999)

  1. Sign System 1
    1. Representamen 1: a zigzag red line connecting the three people
    2. Object 1: a symbol for family, where many Chinese traditional values reside, including collectivism.
  2. Sign System 2
    1. Representamen 2: identical faces
    2. Object 2: lack of self-identity, excessive collectivism.
  3. Sign System 3
    1. Representamen 3: gloomy colors
    2. Object 3: depression.
  4. Sign System 4
    1. Representamen 4: red scarf
    2. Object 4: an icon for the Young Pioneers
  5. Sign System 5
    1. Representamen 5: the boy’s face being retouched to brown
    2. Object 5: an oppressed desire to be free.
  6. Sign System 6
    1. Representamen 6: the boy with a unique brown face wearing a red scarf around his neck, which is the only colorful thing in the whole painting.
    2. Object 6: the desire of young people to be different, which nevertheless ends up in the same institution: the red scarf is the index of the Young Pioneers and, in turn, a symbol of institutionalization. Here, the interpretants of the last two signs (the red scarf and the retouched brownness) become the representamen of this sign, demonstrating the difference between Saussure’s signified and Peirce’s interpretant, which is itself a “sign in the mind of the interpreter[i]”.

Peirce’s successive interpretants. Source: Semiotics: The Basics.

  7. Sign System 7
    1. Representamen 7: stains on the faces
    2. Object 7: psychological scars. In China, people always put family portraits under a glass pane on the table. Sometimes, tea would somehow get under the pane and stain the photo. Notice the shape of the stains is sharp like a blade.

Strictly speaking, however, I don’t think the “Parallel Architecture” paradigm can always be extended to sign systems other than language, in which phonological structures, syntactic structures, and conceptual formation structures are interconnected with interface rules[ii].

In paintings, lines and shapes are counterparts of linguistic phonological structures, which have no meanings of their own. In Xiaogang’s paintings, the “syntactic structures” are meaningful and interpretable shapes made up of lines, such as the people, the red scarf, and the red line. But the “syntactic structures” of paintings are not universally necessary. Abstractionism has abandoned this intermediate layer between “phonological structures” and “conceptual structures”. For example, Convergence by Jackson Pollock looks like a total mess at first glance. No recognizable shapes can be found in the painting. It uses simple “phonological structure” to achieve a conceptual meaning of freedom. The same goes for Mark Rothko’s works, which contain no specific objects but are able to tranquilize a disturbed soul.

Convergence by Jackson Pollock. Image property of the Albright-Knox Art Gallery, Buffalo, NY.


Mark Rothko’s works in Rothko Chapel, Houston. Source:


As the philosopher Susanne Langer said, the laws that govern their articulation “are altogether different from the laws of syntax that govern language[i]”. Nevertheless, I believe understanding the rules behind sign systems will give us new insights into the rules governing our cognition.

Are We Bayesian Learners?

After reading Prof. Irvine’s in-class group exercise on first steps in semiotic analysis, I had some thoughts and questions. Prof. Irvine mentions that we are all pattern recognizers, and we all have the ability to generalize from individual patterns to genres. I think this pattern-recognizing ability is exactly where our powerful learning abilities reside. We don’t need a lot of examples to learn the common patterns of a genre. For example, we can recognize a watermelon in a supermarket after seeing just a few pictures of watermelons, even if we have never seen a real one before and they all differ somewhat in size, color and pattern. We know it is a watermelon at first glance. But it is really hard for computers to learn a new genre of things. Computer scientists have to label hundreds of thousands of pictures as input data for an algorithm to learn what a house or a dog looks like. Even after a huge number of hours of training, the algorithm may still fail to distinguish a dog from a cat. This so-called “supervised learning” is surely not what our brains use to form concepts.

However, a study published in Science last year used “Bayesian Program Learning” (BPL) to teach a computer program to learn new written characters. The program captured the features of new characters and learned how to write them from only a few examples, achieving “human-level performance while outperforming recent deep learning approaches[iii].” I wonder whether this “Bayesian Learning” method is the secret of our brains in terms of pattern recognition and learning.


[i] Chandler, Daniel. 2007. Semiotics: The Basics. 2nd ed. Basics (Routledge (Firm)). London ; New York: Routledge.

[ii] Jackendoff, Ray. 2002. Foundations of Language: Brain, Meaning, Grammar, Evolution. OUP Oxford.

[iii]Lake, Brenden M., Ruslan Salakhutdinov, and Joshua B. Tenenbaum. 2015. “Human-Level Concept Learning through Probabilistic Program Induction.” Science 350 (6266): 1332–38. doi:10.1126/science.aab3050.

Music as a Language – Jieshu Wang

“Music is the language of the spirit. It opens the secret of life bringing peace, abolishing strife.”

― Kahlil Gibran

This week’s reading materials gave me a preliminary impression of the field of linguistics and of how linguists analyze our languages, which is different from last week’s neuroscience and archaeological perspectives.

In his Foundations of Language, Ray Jackendoff proposed that human language is different from and more complex than other communication systems such as the sound of whales and birds because human utterance can pass on unlimited information with unlimited and arbitrary forms but from limited rules and mental lexicon[i]. This productivity reminds me of music, which I think in some sense is analogous to language.

Phoneme of music

First of all, music is the art of sound, so every piece of music consists of sequences of basic elements of sound, corresponding to the phonemes of linguistics. But the range of musical instruments is much broader than the range of speech sounds. Basically, any sound within the range of human hearing, even the sound of rain, can be woven into music. A piece of iron and a wooden box can be used to make music.

Music has structural rules

Like syntax and phonology of language, these sounds are integrated together following structural principles or rules, to form larger components, such as a beat, a bar, a section, and then a movement, according to their rhythm, tempo and time signature. For example, in a piece with 3/4 time signature and 120 bpm, the rule is that a bar is composed of three quarter notes, each of which represents a beat and lasts 0.5 seconds. Within a bar, the three beats generally follow a STRONG-weak-weak pattern.
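The arithmetic behind this example is simple enough to write down as a small Python sketch. The function names are my own, and the sketch assumes, as in the example above, that the tempo counts quarter-note beats.

```python
def beat_seconds(bpm):
    """Length of one beat in seconds: 60 seconds divided by beats per minute."""
    return 60.0 / bpm


def bar_seconds(bpm, beats_per_bar):
    """Length of one bar: the number of beats times the length of a beat."""
    return beats_per_bar * beat_seconds(bpm)


def accent_pattern(beats_per_bar):
    """One strong beat followed by weak beats, as in the 3/4 example."""
    return ["STRONG"] + ["weak"] * (beats_per_bar - 1)


print(beat_seconds(120))     # 0.5
print(bar_seconds(120, 3))   # 1.5
print(accent_pattern(3))     # ['STRONG', 'weak', 'weak']
```

So at 120 bpm, a bar in 3/4 time lasts 1.5 seconds, and a piece of 40 bars would last one minute.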

There are many other rules or patterns. Specific rules are used for constructing a C major, a B minor or a fugue. In addition, many pop songs follow several chord progressions, of which the most common one is 1-6-4-5 chord progression. You can hear this chord progression over and over again in pop music. Sometimes you can even match the lyrics of a pop song to the accompaniment of another pop song using the same progression without any disharmony.
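As a rough sketch of how rule-like the 1-6-4-5 progression is, the Python below builds the four chords in the key of C major by stacking every other note of the scale. The key of C and the helper name triad are my own assumptions for illustration; songs use the same pattern in many keys.

```python
# The seven notes of the C major scale, in order.
C_MAJOR = ["C", "D", "E", "F", "G", "A", "B"]


def triad(scale, degree):
    """Build a three-note chord on a scale degree (1 = first note).

    The chord takes every other note of the scale, wrapping around at the end.
    """
    root = degree - 1
    return [scale[(root + step) % len(scale)] for step in (0, 2, 4)]


progression = [triad(C_MAJOR, degree) for degree in (1, 6, 4, 5)]
for chord in progression:
    print(chord)
# ['C', 'E', 'G']   C major
# ['A', 'C', 'E']   A minor
# ['F', 'A', 'C']   F major
# ['G', 'B', 'D']   G major
```

Because two songs built on this progression cycle through the same four chords, the melody of one tends to fit over the accompaniment of the other.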


Dialect in music

People from different places may speak different dialects, or even different languages. The same is true of music. There are a lot of genres in music, each of which has its own unique rules or fingerprints, such as the highly recognizable blues chords and progressions. To my surprise, I found that if I play on a blues scale, which intentionally alters some pitches of a conventional scale, even when I’m just messing around, the noise I make sounds distinctly bluesy.
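To show what “altering some pitches” looks like, here is a minimal Python sketch that builds a major scale and a blues scale from the same twelve notes. It assumes the common six-note form of the blues scale, spells every black key as a flat, and uses names (CHROMATIC, build_scale) that are my own.

```python
# The twelve notes of an octave, with black keys spelled as flats.
CHROMATIC = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]

# Distances from the root, counted in semitones.
MAJOR_STEPS = (0, 2, 4, 5, 7, 9, 11)
BLUES_STEPS = (0, 3, 5, 6, 7, 10)   # six-note blues scale (an assumption)


def build_scale(root, steps):
    """Pick the notes that lie the given number of semitones above the root."""
    start = CHROMATIC.index(root)
    return [CHROMATIC[(start + step) % 12] for step in steps]


print(build_scale("C", MAJOR_STEPS))  # ['C', 'D', 'E', 'F', 'G', 'A', 'B']
print(build_scale("C", BLUES_STEPS))  # ['C', 'Eb', 'F', 'Gb', 'G', 'Bb']
```

Compared with C major, the blues scale lowers the third and the seventh (Eb and Bb) and adds Gb, which is a large part of the recognizable sound.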

Semantics of music

Language has meanings. So does music, although its meanings are not as explicit and specific as those of language. One unique property of music is that it can convey intelligible emotion. Therefore, people who speak different languages can share a similar understanding of a piece of music. For example, Also Sprach Zarathustra is one of my favorite orchestral works, but I know no German, the mother tongue of Richard Strauss. Likewise, you don’t need to learn the Māori language to feel the fearlessness in their battle songs.

The list of similarities between music and language could go on. I think the reason is that music and language are both human symbolic systems used to represent abstract meanings. As Jackendoff put it, we can create and understand unlimited utterances. The same is true of music. Music is a universal language of humankind.


[i] Jackendoff, Ray. 2002. Foundations of Language: Brain, Meaning, Grammar, Evolution. OUP Oxford.