
WeChat, in a system design perspective

Chen Shen

Abstract Ever since its debut in 2011, the Chinese messenger app WeChat has quickly grown into one of the largest social networks worldwide. Along the way, WeChat integrated many key functions, largely eliminating its users’ need to switch to other apps. While its counterparts in other parts of the world specialized in their respective fields, WeChat evolved toward generality, supporting all kinds of plug-in-like apps. Especially after it added mobile payment and the Chinese government began to use it as a portal to public services, WeChat came to play a core role in Chinese mobile life. This highly integrated mobile environment has had a profound impact on Chinese users and on society as a whole. By integrating social networking, media, business, advertising, and public services, the platform is creating possibilities that are unimaginable for other apps and even for other societies. This paper analyzes WeChat from a system design perspective and discusses the dependence, affordance, and emergence of WeChat, providing a non-determinist way to understand WeChat’s prevalence in China.

What is WeChat? While Chinese young people build their online life upon it, people from other parts of the world have hardly heard of it. In short, WeChat is a Chinese social media app that integrates lots of the core functions of popular everyday apps. But after we examine the system design and underlying structures of WeChat with the concepts and paradigms gained in this course, we may find WeChat is much more than that.

 Introduction to WeChat

WeChat started off as an IM (Instant Messaging) app in 2011. Currently, it is the dominant social media and IM app in China. By MAU (monthly active users), WeChat ranks No. 4 worldwide, behind only the Facebook family of apps.


Figure 1. Monthly active users of selected social networks and messaging services. Image from

WeChat is 6 years younger than Facebook (counting from 2005, when Facebook opened to public registration), and at first it targeted only the Chinese market. Since WeChat is a purely mobile app with no corresponding website, we can compare the mobile MAU of Facebook and WeChat, and we find that WeChat’s MAU grew at an even faster rate than Facebook’s.


Figure 2. Number of mobile monthly active Facebook users worldwide from 1st quarter 2009 to 3rd quarter 2016 (in millions). Image from


Figure 3. Number of monthly active WeChat users from 2nd quarter 2010 to 3rd quarter 2016 (in millions). Image from

Next, we take a quick look at the WeChat interface (we will talk about the functions later in the paper). It is fairly easy to register a new WeChat account, and the app stores of all major smartphone operating systems offer a free download (we encourage you to do that right away to get a better understanding of it). One can create a WeChat account with a QQ account (QQ is a Chinese PC-based IM program by the same company, shown in Fig. 1, with 650 million MAU) or with a mobile phone number. Once logged in, the first step is to add contacts. It is easy to import QQ contacts and mobile contacts in batch, keeping existing contacts alive on the new platform. There are also ways to add new contacts, and one of the easiest is scanning a QR code (we will discuss QR codes in detail later). Every account has a unique QR code; one can press the “+” in the top right corner and then press “Scan QR Code” (e.g. this one, the author’s account) to send a friend request. If the other user confirms it, the two can start chatting. The whole interface is clearly designed for mobile use, with big icons, no dense text, and all buttons gathered in the right and bottom parts of the screen for one-handed navigation.


Figure 4,5. The download, scan QR Code of WeChat

The chatting part is not that different from WhatsApp, or Messenger. The majority of the screen is dedicated to messages, with four icons listed at the bottom, which are Text/Voice Switch, Input Window, Emoji, and Attach.


Figure 6,7. The chat interface, and layout of WeChat

Back on the main page, right next to Chats are Contacts, Discover, and Me. Contacts is mainly for managing contacts. Discover has the Moments function, which lets users share photos and browse friends’ Moments. Me serves as the settings page of WeChat. These four clear-cut pages separate the different interaction scenarios, with a maximum menu depth of three, meaning that users can navigate to any function within three clicks.

The interface of WeChat so far appears straightforward. Before we proceed, here are some data about WeChat:

  • Daily active users grew 64% in 2015
  • 25% of WeChat users open the app more than 30 times a day (2015)
  • In the first quarter of 2016, WeChat generated 1.8 billion in online revenue
  • During the Spring Festival of 2016, WeChat users sent and received “Red Packets” (celebration messages with digital cash) a total of 32.1 billion times
  • Adult users read articles on WeChat for 40 minutes a day on average (2015)
  • WeChat has portals to 85 thousand mobile apps

Why is such a plain looking app so powerful? What is the underlying power that made it the fourth biggest social media network in the world? In the following sections, we will discuss three aspects of WeChat: dependence, affordance, and emergence.

Dependence. What made WeChat possible

I will call this mechanism evolution by combination, or more succinctly, combinatorial evolution.                                                                                                                                             —W. Brian Arthur

As Arthur put it in The Nature of Technology, “Novel technologies must somehow arise by combination of existing technologies.” We can see the same mechanism in both the birth and the growth of WeChat, because WeChat itself came with no novel functions, only a new way of combining the technologies existing at the time, one that enabled new synergies between the elements.

Before WeChat’s launch, BlackBerry users enjoyed an IM app called BlackBerry Messenger (BBM). It had all the functions that WeChat 1.0 had, with one constraint: BBM could only be used on BlackBerry phones, which accounted for only 16% of the global market. The platform limited BBM’s potential to spread.


Figure 8. Global market share held by smartphone operating systems. Image from

Then, in the second half of 2010, an app named Kik launched. Kik supported all the basic messenger functions, and it supported adding friends directly from mobile contacts. Unlike BBM, Kik fully afforded cross-platform communication. Its downloads skyrocketed to over a million within two weeks of release. Because of this splendid performance and competitive edge, Kik was banned from the RIM platform (for BlackBerry).

Three months later, Talkbox launched with a “Push to Talk” feature, but it was not multi-platform at the start. As a result, BBM, Kik, and Talkbox, which were themselves combinations of existing technologies, each had their own advantages and disadvantages.

At the beginning of 2011, WeChat 1.0 launched. Compared with the current version, 1.x can only be described as minimalist, but it “inherited” both Kik’s cross-platform compatibility and Talkbox’s Push-to-Talk versatility. Compared with its foreign competitors, WeChat had the unique advantage of the gargantuan user base of QQ, which was produced by the same company. Because of that, WeChat could seamlessly inherit a user base of more than half a billion and enjoy a huge head start over its domestic counterparts. Once WeChat gained the ability to import mobile contacts in batch, everyone the user actually knew in person was within its reach.

These are the technological dependences of WeChat. It is fair to acknowledge that WeChat did not bring novel technologies per se, yet uniting existing elements to create a new environment is also a kind of innovation.

Compared with traditional social networks like Facebook and Pinterest, WeChat is different in its hardware dependence. WeChat is “mobile native” rather than “mobile migrated”. From the beginning, WeChat was a smartphone app, meaning that everyone who uses it meets a list of hardware requirements: speaker, microphone, GPS, camera, etc. As a result, WeChat can simply assume that every single user has full access to voice messages and similar features, which is a great edge over website-based messengers. For example, if a Messenger user sends a voice message and the receiver gets the notification on the Facebook website while using a public computer with no speaker, then the data is transmitted but the information is not delivered.

WeChat’s rapid rise also has its historical dependence. Unlike in most developed countries, voicemail was never effectively popularized in China in the age of landlines. There were many reasons for this gap, for example the relatively short interval between the popularization of landlines and that of mobile phones, and the fact that landline service was largely monopolized at the time by a few state-owned companies that lacked the motivation to popularize new services. As a result, Chinese society gradually built up a huge appetite for voice messaging. WeChat then became the outlet for this pent-up demand, which helped it spread.

We can also talk about the social and cultural dependence of WeChat. The design of successful software cannot be truly universal; it must adapt correctly to the cultural context of its target market. Yet people tend to compare the nuances of apps or services in a spirit of absolutism. Sometimes it is wrong to regard such a difference as an advantage or a disadvantage, because it is rather an active choice. For example, many messenger apps use specific symbols to indicate the current status of a message the user has sent. In Facebook Messenger, one knows whether the recipient has read the message. WhatsApp even uses three different indicators to communicate more reliably and effectively.


Figure 8. Status indicators of WhatsApp

From a software development perspective, this function is very easy to add, but WeChat never adopted it because of the social and cultural context of its main target market. In China, messages created by such apps are filtered and censored. If WeChat had a sent check, a sender could tell that a message was blocked rather than lost. A sender who tries to resend multiple times and still cannot get the message through may realize that it has been filtered, which is desired by neither WeChat nor the government. So the sent indicator is incompatible with the Chinese social context. Instead, WeChat indicates a sending failure only when it is due to the network connection, and prompts the user to resend.


Figure 9,10. How certain words are filtered in WeChat


Figure 11. Resend indicator in WeChat

As for the read check, it is more a matter of cultural difference. China has long been regarded as a society built on personal connections, in which openly ignoring someone is very aggressive and shameful for both parties to the conversation. If instead one reads a message and chooses not to reply without the sender knowing it, both sides avoid a loss of face, which can be a much greater issue than the message itself. This psychology is deeply rooted in China. Compared with the Sun-and-Apollo-worshiping Western culture, Eastern cultures are more Moon-oriented: they emphasize the value of vagueness and ambiguity, to the extent that these are regarded as aesthetic objects. Many studies have been done on this cultural feature, and we do not have to discuss it in depth. But one explanation by Hayao Kawai in Japanese Psyche: Major Motifs in the Fairy Tales of Japan is particularly helpful for understanding the Eastern spirit: “nothing has happened”, wherein nothing is interpreted as a subject in its own right rather than as null. When A ignores a message B sent, he actually replies with the message “nothing”, leaving B in the ambiguity (which is a good thing) of interpreting the situation as a superposition of being blocked and being ignored. In conclusion, the read check is extremely unsuitable for Eastern cultures, and WeChat’s lack of status indicators is actually by design.

In this part, we examined the dependences of WeChat from technical, historical, social, and cultural perspectives. There are other dependences as well, but we can already see clearly that there is no room for a deterministic explanation of WeChat’s success, which is in practice a combination of functions, constraints, compromises, and contexts.

Affordance. What made WeChat magical

Human brains and computers will be coupled together very tightly; the resulting partnership will think as no brain has ever thought and process in a way not approached by information handling machines today.                                                                                                 —J.C.R. Licklider

When talking about WeChat’s affordance, we can divide the subject into two sections, affordance for developers, and for users.

The greatest thing about WeChat may be its integration. From the last section, we can see that WeChat started out rather simple, with no extraordinary function or service. But from version 1.0 to the current 6.5.1, WeChat kept integrating useful functions into the platform. As a result of this steady evolution, WeChat is now called “an App to rule them all” in China. While in the U.S. one may need a dozen apps for daily life, in China WeChat alone is sufficient.



Figure 12,13,14. The function integration and comparison of WeChat

The reason and logic behind this is the open API structure. No novel app in China, whatever its functions, can compete with the dominant user base of WeChat. If users can access a service via WeChat, it means an influx of millions of users. We can liken the platform effect of WeChat to that of web portals in the early stages of the WWW, when major portals like AOL provided access to other content, which made them popular. When WeChat opened its API to other apps, many third-party developers began to add features to the already magnificent complex, and many successful app developers came to believe in a better future if their products had an instance running on the WeChat platform. Thus the integration of WeChat began. For example, Group Buy, the Chinese version of Yelp and number one in this market in China, had long had its own website and mobile app. But in later versions, WeChat and Group Buy cooperated in depth and added a Group Buy portal to WeChat. Through the portal, WeChat users can access Group Buy functions without leaving the WeChat platform, and without even needing to install the Group Buy app in the first place. It means Group Buy can potentially share the vast user base of WeChat, which can be a win-win situation for both parties.


Figure 15. WeChat official API web page

Group Buy was already influential and famous before the grafting, an overlord of its own market. Yet it could not resist the prospect of cooperating with WeChat. A similar “immigration” happened to other leading apps as well: Didi (the Chinese Uber), 58 (the leading housekeeping app in China), and Meituan (a leading take-out service in China) successively joined the league and kept expanding the platform.

For less famous apps, the motivation can be even stronger: once the portal is established, they can make the transition from start-up app to industry leader almost immediately. This has happened many times in WeChat Games, the game platform of WeChat. The majority of games in WeChat Games are not developed by WeChat. But once a game is integrated into the WeChat platform, it immediately has users, a payment method, a promotion platform, a platform for multiplayer cooperation and competition, etc. From a game purist’s perspective, many of the popular games in WeChat Games are rather dull, with poor graphics and monotonous game mechanics. But WeChat turned them into social network games, which play a totally different role for the users. It is fair to say that, through integration, WeChat is reshaping the landscape in many app fields.

Of all the capabilities WeChat has integrated over the years, payment is a particular game-changer. Payment also arrived in version 5.0. Once users bind a bank account to WeChat, the app turns into an online transaction platform. Once again, supporting online transactions was no novel function for a mobile app; in China, Alibaba had offered Alipay for this purpose long before WeChat. But by combining the user base with the portal to other apps, WeChatPay created a new payment environment.

By the time WeChatPay launched, Ali already held more than half of the Chinese online payment market. The secret weapon of Alipay is Taobao and Tmall, the dominant e-commerce platforms of China. As WeChat is to QQ, Alipay is the natural extension of Taobao and Tmall onto mobile terminals. But this was also Alipay’s limit: it is more an extension of traditional online payment for online shopping than a new model of paying. Both Taobao and Tmall are platforms based on physical commodities, so WeChatPay seized the market for service-based transactions, which no unified payment platform had monopolized before. It also came with the innovation of sending “red packets” containing digital cash to other individuals or groups, creating a whole new model of transactions. The amounts involved in both the service-based market and red packets were small compared with physical commodities, but in this way WeChatPay effectively fostered the users’ habit of paying within WeChat, and then extended that habit to other areas.


Figure 16. WeChat Pay functions usage percentage. Image from 

Taobao achieved total sales of 15 billion dollars in a single day on 11 Nov. 2016, Bachelor’s Day, and more than 80% of that was done on mobile phones. How can WeChat compete with that? One thing WeChat is trying is integrating JD, one of the biggest online supermarkets in China, into the platform. As a result, users can buy everyday things directly in WeChat. Compared with Taobao and Tmall, JD has far fewer types of merchandise to choose from; we can liken JD to Target, while Taobao is more like eBay. The Chinese name of Taobao means “treasure hunting”, with a vast range of choices available if you are good at hunting. But in the mobile context, that can be harmful as well. A typical treasure hunt on the desktop involves a longer time, comparison between commodities (on different pages), and bargaining with the seller. The mobile context affords none of these. So mobile buyers tend to buy well-known commodities; they care about quality over variety, and they choose familiarity over novelty. Here JD fits perfectly into the slot. As a unified supermarket, JD has better quality management over its commodities than Taobao, but far fewer choices. Together with WeChat, it provides an easy model of purchasing in which customers buy daily consumables without much choosing.


Figure 17. WeChat Pay users and main purchase categories. Image from 

By fostering new mobile payment models like red packets and new payment environments like the service-based market, by integrating supermarkets like JD, and by blocking the portal to Taobao, WeChatPay doubled its usage rate from 2015 to 2016. And given WeChat’s high penetration and payment ability, the government is using it as an interface to the smart city. In many cities in China, like Beijing, Guangzhou, Shanghai, and Wuhan, users can access mobile public services via the WeChat portal. For example, Beijing WeChat users can pay utility bills, pay traffic fines, make hospital appointments, check out a book, use visa services, and access many other public services within the WeChat platform. The list is rapidly expanding, as is the number of supported cities in China. And because this kind of service is mainly built with webpage-based technology, which is very easy to develop and maintain, these services act like optional plug-ins for WeChat, making it even more flexible and extensible. It is no longer science fiction that one can get access to all services, both public and commercial, through portals enabled by WeChat.

When talking about WeChat’s affordance, we cannot ignore the QR code. QR stands for Quick Response, a technology that originated in Japan in the 1990s. A QR code is a label generated by an algorithm to be optically read and decoded. It can easily encode complicated text information (1850 characters) into a small label attached to other things. Because of the high redundancy in the QR encoding algorithm, a code is still readable when its surface suffers no more than 30% damage, making it extremely suitable for printed outdoor use. Nowadays, China is among the countries that have best integrated this technology into society. And WeChat has been compatible with it from the start.
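The 30% figure corresponds to the highest of the four error-correction levels defined in the QR standard (L, M, Q, and H, which can restore roughly 7%, 15%, 25%, and 30% of a code's data). The following is a minimal sketch of that trade-off, not an encoder. The function names are illustrative, and treating damage as a single fraction is a simplification, since in practice a code also becomes unreadable if its corner finder patterns are destroyed.

```python
# Approximate share of a QR code's codewords that each error-correction
# level can restore, as commonly quoted for the QR standard.
RECOVERY = {"L": 0.07, "M": 0.15, "Q": 0.25, "H": 0.30}


def is_readable(level, damaged_fraction):
    """Return True if the damaged share is within what `level` can restore.

    Simplification: damage is modelled as one fraction of the data area.
    """
    return damaged_fraction <= RECOVERY[level]


def lowest_sufficient_level(damaged_fraction):
    """Pick the lowest level that tolerates the expected damage.

    Lower levels spend less space on redundancy, so they leave more
    room for the actual payload. Returns None if even H is not enough.
    """
    for level in ("L", "M", "Q", "H"):
        if is_readable(level, damaged_fraction):
            return level
    return None


if __name__ == "__main__":
    # A weathered outdoor poster with about a fifth of the code scratched off.
    print(lowest_sufficient_level(0.20))
```

A vendor printing a code for outdoor use would lean toward the higher levels, accepting a smaller payload in exchange for robustness.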

Every WeChat user has their own unique QR code, so in a typical scenario where two people meet and want to exchange contact information, one provides the QR code while the other scans it. In an instant, a friend request is sent and the connection is established. But QR in China is much more than that. Of the following pictures, the first is from the British Embassy in China and the second is from a sweet potato vendor. These two poles give a quick demonstration of the QR craze in China.


Figure 18,19. The QR code used by British Embassy, and by sweet potato vendor

The U.S. also used QR codes for a little while, but they did not prevail. There are some reasons behind that.

  1. Lack of technical dependence. When QR was introduced to the U.S., no standard reader was available. Neither Android nor iPhone had a built-in reader. One had to install an additional app that could do nothing but read QR codes to retrieve the information.
  2. Lack of a universal portal. Even if users read a QR code with the special reader, the most they could do was access a URL, which is not a big leap from traditional text-based information.
  3. Lack of regulation. QR was introduced to the U.S. in its early stages, when the standard protocol was not yet fully implemented. Many custom-made QR codes failed to produce universally readable information.

By comparison, when we analyze why QR is so prevalent in China, we find other historical and social reasons. The first is that a traditional QR code leads to a URL, which is a string of Latin letters. To the English-speaking world, the URL itself is symbolically meaningful, but not to Chinese speakers, especially the vast population with no English literacy. As a result, any method that can automatically translate information into a URL is crucial and easily popularized. Another key reason is timing. QR entered China when the O2O (online-to-offline) model was on the rise, and individual retailers had the greatest motivation to spread their product or promotion information this way. So in China, the popularity of QR was driven not by WeChat or any other tech giant, but by numerous retailers trying to use new technology to boost their sales.

From a technical perspective, the QR code is simple and outdated, but from a sociotechnical perspective, a simple technology taken up by self-motivated individuals can achieve much more than it seems to afford.

And QR profoundly expands the possibilities of WeChat. With a built-in reader, WeChat can decode all kinds of information embedded in a code: a URL leading to a ticket sale, a transaction indicator leading to a purchase, contact information for an account that users can follow, a verification code for logging into a system, or just some text for the user to read. WeChat can process them all in the blink of an eye. In this part, we analyzed some major affordances enabled by WeChat. With the application portal, the payment method, and the QR reading ability, WeChat users can process both online and offline information and services, penetrating the traditional barriers between different modes of the economy. They also get the ability to share all formats of media through the “share to” function of WeChat. In the next part, we will examine the emergent features of the WeChat society.
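To make the QR handling described in this part concrete, the sketch below shows how a built-in reader might route a decoded payload to one of the actions listed above. It is an illustration under stated assumptions, not WeChat's actual implementation: the `pay:`, `contact:`, and `login:` prefixes, the action names, and the function name are all hypothetical.

```python
from urllib.parse import urlparse


def classify_payload(payload):
    """Map a decoded QR payload to the action a reader could take.

    The non-web schemes below are hypothetical placeholders; the source
    does not describe how WeChat actually tags its payloads.
    """
    scheme = urlparse(payload.strip()).scheme.lower()
    if scheme in ("http", "https"):
        return "open_url"        # e.g. a page selling tickets
    if scheme == "pay":          # hypothetical transaction indicator
        return "start_payment"
    if scheme == "contact":      # hypothetical contact card
        return "add_contact"
    if scheme == "login":        # hypothetical verification code
        return "confirm_login"
    return "show_text"           # anything else is shown as plain text


if __name__ == "__main__":
    for sample in ("https://example.com/tickets",
                   "pay:order-123",
                   "Sweet potatoes, 5 CNY each"):
        print(sample, "->", classify_payload(sample))
```

The point of the design is the single entry point: one scan gesture covers every case, and new payload types can be added without teaching users anything new.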

Emergence. What made WeChat phenomenal

We shape our tools and thereafter our tools shape us.                                     — Marshall McLuhan

By emergence, we mean collective behavior or properties of a system that cannot be deduced by analyzing the constituent parts of the system. Emergence is the result of synergies rather than a bundling up of the elements within a system. Though in this paper we focus on the system design of WeChat, emergence is rarely the result of direct design. As we saw in the example of the QR code in China, it is not WeChat that invents and promotes all the innovative and pervasive uses of the QR code; it is the decentralized, self-motivated agents, equipped with QR generating and reading abilities enabled by WeChat, who co-create a QR-omnipresent China. By this logic, in the complex sociotechnical system of users, companies, apps, and government sectors, WeChat gradually begins to play the role of an enabler.

This can be a new stage for a mobile app, at which it really begins to impact society, not by providing functions for individuals to use, but by providing a sociotechnical context for individuals to exploit and co-evolve with the platform.

In this way, WeChat reshapes China in many ways, visible and invisible. This is not the focus of the paper, so we will not delve into details. But from a system design perspective, it is necessary and important to see the potential that arises when a designed system evolves into a decentralized adaptive system.

WeChat reshapes customs.

In China, the Spring Festival custom is one of the most enduring. People go back to their hometowns at this time of year for celebration and family reunion. This is the time every year when big cities seem evacuated. For the past thirty years, the CCTV Spring Festival Gala has been the core of the traditional family activity. Family members gather to watch TV until midnight comes and the new year officially begins. This tradition has lasted long and remained stable even in the age of new media, when TV is constantly losing its appeal. But even this massively collective tradition has changed during the past three years.

With the Red Packet function of WeChat, it is easy to send digital cash to friends with words of greeting, and one can also send red packets to groups. For example, you can seal 200 CNY into a red packet with ten parts and send it to a group of 20. Then it is more like a game of chance: each member of the group who sees the message can open the packet and get a random amount of money, starting from 1 cent, and the first 10 people to open the red packet divide up the 200 CNY. Traditionally, elders give the younger generation cash at this time of year in the hope of a blissful coming year, so the red packet blends smoothly into the custom by providing an alternative way for Chinese people to give blessing cash to others. And since this is a time of year one is supposed to spend with family, it is a good way to stay connected with friends. But the gambling nature of the group red packet has slowly turned it into a game, in which group members send a packet with a small amount and whoever gets the highest or lowest share continues the procedure. Some companies are also using this opportunity to send out large bonuses. As a result, everyone keeps a close watch on the smartphone, for fear of missing a large packet.
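The group red packet in the example above can be sketched in a few lines. This is only one simple way to split a sum into random shares of at least 1 cent each; the allocation rule WeChat really uses is not described here, and the function names are made up for illustration.

```python
import random
from typing import List, Optional


def split_red_packet(total_cents: int, parts: int,
                     rng: Optional[random.Random] = None) -> List[int]:
    """Split `total_cents` into `parts` random shares of at least 1 cent.

    Illustrative method: pick parts-1 distinct cut points between 1 and
    total_cents-1; the gaps between neighbouring cuts are the shares.
    """
    if parts < 1 or total_cents < parts:
        raise ValueError("need at least 1 cent per share")
    rng = rng or random.Random()
    cuts = sorted(rng.sample(range(1, total_cents), parts - 1))
    bounds = [0] + cuts + [total_cents]
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]


def open_packet(shares, openers):
    """Hand out shares in the order members open; latecomers get nothing."""
    return dict(zip(openers, shares))


if __name__ == "__main__":
    # 200 CNY sealed into ten parts and sent to a group of 20.
    shares = split_red_packet(200 * 100, 10)
    group = ["member%d" % i for i in range(1, 21)]
    random.shuffle(group)  # whoever taps first gets a share first
    for name, cents in open_packet(shares, group).items():
        print(name, "gets", cents / 100, "CNY")
```

Because distinct cut points guarantee every share is at least 1 cent and the shares always add up to the sealed amount, only the first ten members to open the packet receive anything, which is what gives the game its urgency.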

In 2014, the first Spring Festival in which this new tradition emerged, 5 million people took part in the game. In 2016, it was 516 million, nearly half of the Chinese population. People sent and received red packets 32.1 billion times, ten times the 2015 figure.

To further the trend, technology giants cooperate with CCTV or other TV stations to add QR codes to the gala. At certain points in the show, anyone who shakes the phone has a chance of getting all kinds of bonuses, be it cash, tickets, coupons, or collectibles. At the peak of the Spring Festival Gala, there were 810 million phones shaking in one minute. Even if each user was shaking two phones at once (which is very common in that context), there were still about 400 million people shaking phones at the same time. Simply magnificent.

One can imagine the figure continuing to grow in the coming Spring Festival. But whatever the figure turns out to be, a whole new popular custom broke out within 2 years, along with an emerging business worth billions.

WeChat reshapes the economy.

Online commerce platforms like Taobao are the dominant form in China. But over the years a new form of e-commerce, called WeChat Business, has been rising. Unlike on Taobao, where the owner has to maintain a web page for the online store, a WeChat Business owner only has to take pictures of the products, write some short introductions, and send them to the groups he is in.

WeChat Business has an even lower threshold and focuses more on user experience than on promotion. It also exploits individual credit, which was seriously missing in the field before. And cash flows more freely than on Taobao, because with Alipay a third-party escrow holds on to the money until a purchase is successfully completed, in order to protect consumers.

This trend is encouraging more and more young people to start their own business rather than look for a job. There are certainly shortcomings in the trend, but its stimulus to innovation is undeniable. With an easier way to connect and interact with customers and to make transactions, the Chinese young generation avidly embraces this entrepreneurial wave.

The trend is not limited to business. Creating and maintaining an official account is more and more common nowadays. Thanks to the ability to push one article every day, many journalists, bloggers, designers, photographers, and writers take WeChat as their staging area and try to build up their own fan community. A young woman I know in China, 21 years old, has been writing on her official account for one year. Recently she published a book, which already ranks fifth in Chinese sales for youth literature. This was unimaginable before WeChat came along.

More and more ambitious individuals, like her and like the old farmer selling potatoes with a QR code, are seizing the opportunity for self-fulfillment, leading to an effervescent social environment. WeChat cannot take full credit for that, but it is the infrastructure upon which these dreams are built.

WeChat reshapes society.

WeChat has reshaped society in many different ways; here we only take a quick look at Moments. In Moments, a user can share pictures, text, or links to articles with friends, or forward an article he likes. As a result, good articles quickly circulate among users. China is a society with relatively low social engagement, partly due to the pragmatic ideology of the time and partly due to the low credibility of official media. So many people, especially the young and liberal ones, prefer WeChat articles to editorials. By forwarding such articles, users circulate different opinions and insights quickly and widely. There is still censorship, but the post-censorship mechanism still allows the articles to be read by many. Thus it is a solid step towards a more open society.


The best way to predict the future is to invent it.                                                                  —Alan Kay

Though Chinese technology companies have long been accused of being copycats, WeChat is something without precedent in other parts of the world. By integrating popular functions, providing a portal to other plug-in apps, and merging the online and offline networks of the individual user, WeChat is in effect pioneering a new kind of application, or rather, a mobile platform. It may represent the future trend of social networks and mobile apps, and even a possible form of future digital presence. With the profound social influence of WeChat, China is experiencing a kind of integrated mobile life not yet experienced here in the U.S. And the young generation is using it as a platform for further innovation, thanks to a flatter social structure enabled and connected by WeChat.

The rise and success of WeChat has its own social and historical dependencies, which may not be repeated, but the rest of the world should pay attention to this emerging app, as well as to the social momentum it empowers. It may well be the next Facebook, or something bigger.


Abelson, H., Ledeen, K., & Lewis, H. (2008). Blown to Bits: Your Life, Liberty, and Happiness After the Digital Explosion (1 edition). Upper Saddle River, NJ: Addison-Wesley Professional.

Arthur, W. B. (2011). The Nature of Technology: What It Is and How It Evolves (Reprint edition). New York: Free Press.

Baldwin, C. Y., & Clark, K. B. (2000). Design Rules, Vol. 1: The Power of Modularity (4th Printing edition). Cambridge, Mass: The MIT Press.

Berners-Lee, T. (2000). Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web (1 edition). San Francisco: HarperBusiness.

Chen, S., & He, W. (2014). Study on Knowledge Propagation in Complex Networks Based on Preferences, Taking Wechat as Example. Abstract and Applied Analysis, 2014, e543734.

Collins, H., & Kusch, M. (1999). The Shape of Actions: What Humans and Machines Can Do. MIT Press.

Davis, M. (2001). Engines of Logic: Mathematicians and the Origin of the Computer (Reprint edition). New York: W. W. Norton & Company.

Deacon, T. W. (1998). The Symbolic Species: The Co-evolution of Language and the Brain. New York: W. W. Norton & Company.

Denning, P. J., Martell, C. H., & Cerf, V. (2015). Great Principles of Computing. Cambridge, Massachusetts: The MIT Press.


Gleick, J. (2012). The Information: A History, A Theory, A Flood (2.5.2012 edition). New York: Vintage.

Kawai, H. (1998). Japanese Psyche: Major Motifs in the Fairy Tales of Japan. Woodstock, Conn: Spring Publications.

Latour, B. (1999). Pandora’s Hope: Essays on the Reality of Science Studies (1 edition). Cambridge, Mass: Harvard University Press.

Lidwell, W., Holden, K., & Butler, J. (2010). Universal Principles of Design, Revised and Updated: 125 Ways to Enhance Usability, Influence Perception, Increase Appeal, Make Better Design Decisions, and Teach through Design (Second Edition, Revised and Updated edition). Beverly, Mass.: Rockport Publishers.

Lien, C. H., & Cao, Y. (2014). Examining WeChat users’ motivations, trust, attitudes, and positive word-of-mouth: Evidence from China. Computers in Human Behavior, 41, 104–111.

Manovich, L. (2013). Software Takes Command (INT edition). New York ; London: Bloomsbury Academic.

McLuhan, M., & Gordon, W. T. (2003). Understanding Media: The Extensions of Man: Critical Edition (Critical edition). Corte Madera, CA: Gingko Press.

Murray, J. H. (2011). Inventing the Medium: Principles of Interaction Design as a Cultural Practice (1st edition). Cambridge, Mass: The MIT Press.

Norman, D. (2013). The Design of Everyday Things: Revised and Expanded Edition (Rev Exp edition). New York, New York: Basic Books.

Norman, D. A. (2010). Living with Complexity. Cambridge, Mass: The MIT Press.

Norman-Cognitive-Artifacts.pdf. (n.d.). Retrieved September 27, 2016, from

Peng, X., Zhao, Y. (Chris), & Zhu, Q. (2016). Investigating user switching intention for mobile instant messaging application: Taking WeChat as an example. Computers in Human Behavior, 64, 206–216.

Rammert, W. (2008). Where the action is: distributed agency between humans, machines, and programs. Berlin. Retrieved from

Russell, J. (n.d.). WeChat, China’s top messaging app, no longer tells users when it censors their messages. Retrieved from

Vermaas, P., Kroes, P., Franssen, M., Poel, I. van de, & Houkes, W. (2011). A Philosophy of Technology: From Technical Artefacts to Sociotechnical Systems. San Rafael, Calif. (1537 Fourth Street, San Rafael, CA 94901 USA): Morgan & Claypool Publishers.

Wang, X., & Gu, B. (2016). The Communication Design of WeChat: Ideological As Well As Technical Aspects of Social Media. Commun. Des. Q. Rev, 4(1), 23–35.

Wang, Y., Fang, W.-C., Han, J., & Chen, N.-S. (2016). Exploring the affordances of WeChat for facilitating teaching, social and cognitive presence in semi-synchronous language exchange. Australasian Journal of Educational Technology.

Wardrip-Fruin, N., & Montfort, N. (Eds.). (2003). The new media reader. Cambridge, Mass: MIT Press.

WeChat: China’s Integrated Internet User Experience. (n.d.). Retrieved December 15, 2016, from

Wen, Z., Geng, X., & Ye, Y. (2016). Does the Use of WeChat Lead to Subjective Well-Being?: The Effect of Use Intensity and Motivations. Cyberpsychology, Behavior, and Social Networking, 19(10), 587–592.

Xu, J., Kang, Q., Song, Z., & Clarke, C. P. (2015). Applications of Mobile Social Media: WeChat Among Academic Libraries in China. The Journal of Academic Librarianship, 41(1), 21–30.

Zhongwei, L., Hao, J., & Yangfan, X. (2015). Tencent WeChat’s Micro-Innovation of Integration and Iteration under Technical Paradigm Transformation. China Economist, 10(5), 106–122.

Zittrain, J. (2009). The Future of the Internet–And How to Stop It. New Haven, Conn.: Yale University Press.


Outline of the Final Project (Chen)

Final Paper Outline

WeChat (working title)

1 What is WeChat

A brief history and current statistics

Introduction to interface and functions

De-blackboxing the design and analyzing how the design provides affordances for the functions

Analyze key design principles

  • Modularity
  • Combinatoriality
  • Hierarchy
  • Abstraction


2 How people use WeChat

Introduction to typical use of WeChat

Interviews as qualitative method, focusing on how different age/culture groups use WeChat

Analyze key usage

  • Messenger / voice message / online phone / video chat / voice-text converter
  • Group
  • Social network / Contacts management / Friends hunting
  • File sharing & transmission / QR reader / Bar code reader
  • Official Accounts / Mobile reader/ Marketing
  • eWallet / Transaction platform / Red Envelope / tickets and coupons
  • Search engine / API / Shopping, gaming


3 When WeChat

Introduction to the mediation of WeChat on user

Cognitive artifact

Influence on people’s expression, case study WeChat emojis


Influence on how media are made to suit the spread model of WeChat

A new agent emerges

Influence on people’s behavior

4 How it does these

Analyze how WeChat integrates and coevolves with existing systems or nets

  • Internet
  • Telecommunication net
  • Other app functions enabled by open system
  • Nets of things


5 Social impact

As a sociotechnical system, how WeChat impacts Chinese society

  • Social inclusion
  • Social justice
  • Social stratification
  • Social convention
  • Emerging jobs and economy


6 Reasons behind its rise

Brief introduction to the reasons why WeChat thrives

  • Technical reasons
  • Social reasons
  • Political reasons
  • Global context


7 What is WeChat

Integrated platform to combine people’s everyday needs into one solution and systematically expand the possibilities of online life

WeChat, therefore we are

Chinese software companies have long borne the stigma of being copycats: no innovation, little technical content, thriving only because of the relatively closed software and Internet environment. That is one of the reasons people in China know Google, Amazon, and Facebook, but not the other way around. Nevertheless, U.S. media have reported intensively in recent years on one app born in China: WeChat. This may be the first time in the software world that China is in the lead.

One may doubt the claim. After all, WeChat ranks only fifth worldwide in MAU (monthly active users). But we need to remember two things. First, compared with the 12-year-old Facebook, WeChat launched only 5 years ago; second, WeChat currently focuses on China only, and once it reaches out to the world, rapid growth is to be expected.


So what on earth is WeChat? We can view it in many different ways. There are many more than those covered here, and each could be developed into a whole paper.

WeChat as social media

WeChat started off as a social network and messenger app. But unlike its competitors, WeChat had an inborn advantage: it was developed by the same company as QQ, the dominant social network software in China. You could link your WeChat account to your QQ account, so you started your WeChat journey with lots of QQ friends. And as a mobile app, it was also an interface to your contact information: once you linked your phone number to your WeChat account, you could befriend those in your phonebook. As a result, WeChat combined the two biggest existing networks in China: the telephone network and the social media network. Basically anyone you know, whether in real life (then you have their number) or online (then you have their QQ), is one click away from becoming your WeChat friend.

And WeChat was born to chat. Unlike the mainstream text-based social networks of the time, WeChat, as a mobile app, had access to the microphone and speaker of the cell phone. By simply holding SEND, you can record a spoken message and send it to your friends, a much easier, more intimate, and even safer way to communicate. Facebook took years to incorporate Messenger to do the same, but WeChat correctly exploited the demand for instant voice messages from its initial launch. Even the recent trend of “say it with a Sticker” was developed and commercialized on WeChat long ago.

Besides your personal contacts, WeChat also has a type of account called “media accounts” to which you can subscribe. A media account is the equivalent of a blog delivered directly to the user’s cell phone. Compatible with every media type a phone supports, “media accounts” have great potential and freedom to develop their own content. In fact, the two main activities of WeChat users are chatting with friends and browsing the articles published by “media accounts”. Traditional companies have even had to create new departments to handle their “media accounts” on WeChat, because these enable much more frequent and direct interaction with customers.


WeChat as transaction processor


So WeChat is the Chinese edition of Facebook, one may think. But it is more. Besides your QQ account and cell phone number, you can also connect WeChat to your bank account, making WeChat an e-wallet and online transaction interface. And because bank digitalization in China has been relatively slow, most users use WeChat as their primary online banking interface rather than the app developed by their own bank. On WeChat you can also create a separate balance account, which draws from your bank account but is far more convenient for cell phone transactions. With QR codes, you can make instant transactions from this balance account without befriending the other party on WeChat, which makes it a practical and popular means of mobile payment. And you can draw from your balance without a password to send money to your friends or groups. There is a custom in China of giving friends money sealed in a red envelope on special occasions. WeChat strategically exploited this tradition and ran campaigns encouraging people to send red envelopes to each other during the Spring Festival. In 2016 alone, during the first 5 days of the Spring Festival, 516 million users (60% of the total) sent and received red envelopes 32.1 billion times. WeChat also ties in with TV shows: shaking your cell phone at pre-designed points in a show can earn you actual money, and companies distributed billions of RMB this way during the 2016 Spring Festival to attract consumers.

It also gives rise to an economic format called “micro sellers”. Combining a “media account” with the ability to make transactions, anyone can establish an online store and sell things they own or have access to. Given the gigantic number of users, “micro seller” is one of the most popular occupations for young people in China.

And because of WeChat’s dominant position in everyday life, government sectors and companies have all chosen to cooperate with it. You can pay your credit card bill, your electric bill, your train tickets, your party fee, and all sorts of other payments on WeChat.


WeChat as aggregation of apps

In the U.S., a modern smartphone user would have a lot of “must have” apps on the phone: Facebook for social networking, Amazon for online shopping, Uber to get a ride, Instagram to share photos, Skype to make phone calls, Yelp to check customer ratings of restaurants, Game Center to play games with friends, and so on. But in China, all you need is WeChat, for it integrates all of their functions.


But it was not born with all these functions. What WeChat provides is a relatively open structure and a port for loading plug-ins developed by third-party developers. Take WeChat Game (WCG), a built-in game platform, which fully demonstrates the potential of an integrated gaming platform. You can show your scores or achievements in your status to show off your prowess or promote the game (in fact, many game operators give in-game rewards to users who post game screenshots in their circle, making users a cheap and reliable promotion medium). You can interact with the game community on the phone. You can buy virtual assets with WeChat’s payment function. You can enjoy all sorts of promotions brought to you by the “media account” of the game operator. You can make new friends based on game choices or shared game experiences. All these together make gaming a social and commercial activity that can yield much greater profit for the game developer than a stand-alone game. WCG has a different gaming ecosystem, in which game developers play a relatively insignificant role. Given the limited computational power, interaction methods, and display area of a phone, WCG can hardly compete with console or PC games. But in WCG a player is constantly connected with, or stimulated by, his WeChat friends, and with the ability to pay instantly, the in-game avatar, the actual player, and the paying potential behind him merge into a new agent on the WCG platform, playing another “game” without him noticing.

There is no limit to the functions WeChat can carry. For example, the government in China is using WeChat as a terminal for the Smart City. Local traffic information, hospital reservations, online education, policy consulting, certificate handling: more and more government services have virtual windows on the user end carried by WeChat.


WeChat as interface to mobile phone

Another notable characteristic is that WeChat is not a cell phone adaptation of desktop software or a web page. It was born mobile. This seeming disadvantage actually freed WeChat by guaranteeing a set of basic functions that every user has access to. Imagine that WeChat had started as a website: it could never have fully embraced voice, because many web users have no microphone or speaker. A voice message with no speaker at the other end is literally less than useless: it creates the false impression that your message has been received.

As a result, WeChat can be viewed as an interface to the mobile phone that employs all of its hardware functions and characteristics. For example, you can easily send your current position to your friends: WeChat reads the geographical information provided by the phone’s built-in GPS. You can send a packaged web page that relies on the HTML, PHP, and Flash decoders installed in the OS. You can shake your phone to befriend others who are shaking theirs at the same time, which uses the phone’s gyroscope. When paying, you can enter your password by pressing your finger on the home button, which relies on the phone’s fingerprint recognition. The phone’s primary OS also has access to all these hardware functions, but it fails to present them in an integrated and symbolically meaningful environment. As a result, common users can use WeChat as a surrogate operating system to gain access to all the hardware functions that matter to them. And the open structure of WeChat makes it even easier to hide the lower layers of the phone. If all essential application functions are fulfilled by WeChat, it is entirely possible for users to treat it as the smartphone OS.


When I say WeChat is in the lead, I mean that it represents the future of mobile apps and a possible next-generation cellphone OS. It enjoys much less international recognition and reputation than Facebook or even Instagram, but it is changing people’s mobile life in fields Facebook has yet to penetrate. By connecting the main pillars of mobile life (users, applications, hardware, and payment), WeChat creates a more integrated mobile life in China than here in the U.S. It has actually defined online life for many of its users. WeChat, therefore we are.



On the Internet

Chen Shen

To answer the question “what does it mean to be ‘on the Internet’”, I have to address a more fundamental question: what is the Internet? The term may be hard to define because the answer depends on which layer of the Internet we are talking about. But just as the name suggests, the Internet is a network of networks. The interconnection and interaction of these networks give rise to the Internet. This means the Internet is a system.

One peculiar characteristic of systems is that a system itself is not a thing but an emergent phenomenon. The Internet is not the sum of the computers in it, but what happens when these computers are connected. Since the Internet is not a thing, not an object, it is clearly intangible. In an urban traffic system, we can see only the nodes (traffic hubs, central stations, etc.) and edges (roads, bridges, ramps, viaducts, etc.), and these are merely the infrastructure layer of the system. For the Internet, even the edges are intangible (in the sense of bit transmission, not in the sense of communication infrastructure).

A universal model for systems is shown below. The graph illustrates that a system is a process that turns input into output, with a fraction lost along the way, and that both input and output lie outside the system.


From this perspective, we can define the Internet by defining its input and output. As Hal Abelson et al. put it in Blown to Bits: Your Life, Liberty, and Happiness After the Digital Explosion, “The Internet is a system, a delivery service for bits, whatever the bits represent and however they get from one place to another”. On this view, the terminal computers are excluded from the Internet. The Internet is the system of all the connections between computers, be they wired or wireless, LAN or WAN. The websites we see are then literally on the Internet: they are data stored on another terminal connected to the Internet, interpreted by and interacted with through our computers. In this sense, “on the Internet” means the data is accessible through some combination of routes in the computer networks. For me to be “on the Internet” means that another user can send data to and receive data from my devices via the Internet. That is to say, when I went to Shenandoah National Park, where there is no Wi-Fi signal for my smartphone, I was “off the Internet”.


Image by The Opte Project, originally from the English Wikipedia, CC BY 2.5

But there are other ways to interpret the Internet. Abelson’s definition focuses on the IP layer of the Internet; if we add more layers, things change. For example, if we include protocols like VoIP, programs like Skype, and infrastructure like base stations, the telecommunication network can be integrated into the Internet. Bits traveling on the Internet can be transformed into electromagnetic signals traveling over LTE. As a result, even in the depths of Shenandoah, I am still “on the Internet”. To be “off the Internet”, I have to escape to somewhere no cellphone signal can find me, like a plane. If the Internet is the combined system of computer-based data networks and phone-based voice telecommunication networks, a passenger on an international flight from D.C. to Beijing is “off the Internet”.

But the Internet can be even more. If we incorporate the application layer into the networks, the flight management system (FMS), which relies on Internet connections and distributed cooperation, is also part of the Internet. So even when I am thirty thousand feet above the ground and no cell phone or Wi-Fi signal can find me, my presence is still recorded and displayed in real time in the FMS, which can be regarded as a subsystem of the Internet. I am “on the Internet” even in an airplane.

We can push this even further. Some sociotechnical systems can put a person “on the Internet” without his knowing it. The ubiquitous surveillance camera system, for example, can reconstruct the presence of a targeted individual in more detail than he himself can recall. And it is not only presence: because data is recordable and storage is practically infinite, our past activities are also “on the Internet” and may always be there. This includes not only overt information, such as when one went to sleep last night, but also covert information that one may never consciously notice: a shopping routine, a mood cycle, a color preference, or a sleep pattern. And because our society is completely entwined with the Internet, from the day an infant is born, her digital footprint begins to accumulate there, so much so that others can reconstruct a great deal of her life from the data they glean. In this sense, we are constantly “on the Internet”.

By now we can see that the Internet can be many things. The only shared property is that the Internet is always a system. With the model above, we can even interpret the Internet as an extremely complicated circuit that consumes about 3.3 million MWh every day, or a communication method shared by more than 3 billion people, or a mass postal service generating over 200 billion emails every day. The interpretations go on.

So, the answer to “what does it mean to be ‘on the Internet’” depends on how we define the Internet. Like its content, its breadth, and its connectivity, the definition of the Internet is also expanding, incorporating new systems every day.


Ron White, How Computers Work. 9th ed. Excerpts from chapter, “How the Internet Works.” Que Publishing, 2007

Martin Campbell-Kelly and William Aspray. Computer: A History Of The Information Machine. 3rd ed. Boulder, CO: Westview Press, 2014

Barbara van Schewick, Internet Architecture and Innovation. Cambridge, MA: The MIT Press, 2012

Hal Abelson, Ken Ledeen, and Harry Lewis. Blown to Bits: Your Life, Liberty, and Happiness After the Digital Explosion. Upper Saddle River, NJ: Addison-Wesley, 2008


Chen Shen

This week’s readings focus on digitization, so I’ll use what I learned from them to walk through the concept and application of digitization.

First of all, what is digitization? It is the process of representing objects or signals, be they text, images, audio, or other forms of information, with discrete numbers that digital devices can handle. As the suffix of “digitize” suggests, it names a worldwide trend of transforming all existing information and media into digital form.

So why do we do this? There are many reasons, rooted in the innate defects of analog signals. For example, they are hard to transmit and manipulate, impossible to copy without information loss, they degrade over time, and they are usually more expensive to store than their digital versions.

Then how do we digitize a signal? If the object is not time-based, we assign a number to each possible variation in the format and then arrange those numbers spatially just as the original signal is arranged. If the object is time-based, like audio, we divide the signal into fixed intervals, measure the properties of the signal within each time segment, and use numbers to represent the measurements. By doing this, the whole time sequence is transformed into a linear series of segments, with numbers representing the analog properties.
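The time-based case can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function `digitize`, the 1 Hz test tone, the 8 samples per second, and the 16 levels are all made up for the example and do not come from the readings.

```python
import math

def digitize(signal, duration_s, sample_rate_hz, levels):
    """Sample `signal` at fixed intervals and round each measurement
    to one of `levels` integer codes.

    `signal` is assumed to be a function of time returning a value in [-1, 1].
    """
    n_samples = int(duration_s * sample_rate_hz)
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                        # start of the n-th time segment
        value = signal(t)                             # the "analog" measurement
        code = round((value + 1) / 2 * (levels - 1))  # map [-1, 1] onto 0..levels-1
        codes.append(code)
    return codes

def tone(t):
    """A hypothetical 1 Hz sine wave standing in for an analog signal."""
    return math.sin(2 * math.pi * 1.0 * t)

codes = digitize(tone, duration_s=1.0, sample_rate_hz=8, levels=16)
print(codes)  # eight integers between 0 and 15
```

Raising `sample_rate_hz` gives finer time segments, and raising `levels` gives finer measurements; both make the list of numbers a closer description of the original wave.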

The following sections look closely at some examples of digitization: text, image, audio, and video.


Text may be the easiest medium to digitize. It is not time-based, and its building blocks are very limited in number. Consider a typical typewriter: it is fair to say that a typewriter can produce all the texts of modern English.


So, if we assign a number to every possible key (each state of the possible variation) on a typewriter, the text is digitized. There are fewer than 50 keys, doubled by the SHIFT function, making fewer than 100 visible outputs, which means a 7-digit binary code is adequate to represent any key on a typewriter. Though the string 0100 0001 seems much more complicated than a simple, elegant “A”, binary strings are far easier for computers to store and transmit than their analog counterparts. With the exponential growth of digital storage, we have enough space to store the digital form of all texts ever made, and of all texts that will be made in the foreseeable future. To represent all these characters in a standard way, ASCII was introduced, with some redundancy. In ASCII, codes 32-126, 95 codes in total, are assigned to printable characters. And printable characters are all that humans perceive. As a result, ASCII can digitize all English text without any information loss. To extend coverage to other languages, we use Unicode. Early versions of Unicode already included more than twenty thousand Chinese characters within a 16-bit code. Unicode has now evolved to version 9.0, with a code space that needs 21 binary digits, enough to cover almost every known character in all languages, and even some newly born emojis. Another important reason digitizing text works so well is that text conveys its meaning independently of its physical appearance: unlike visual or audio signals, text is already encoded by our language system. No matter which typeface one uses, the word “text” evokes almost the same response in the recipient’s mind. So the process of digitizing text is almost flawless.
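The ASCII part of this can be checked directly in Python. The sketch below is mine, not something from the readings; the helper names `text_to_bits` and `bits_to_text` are hypothetical, and each character is written as 8 binary digits (the 7-digit code with a leading 0) to match the string 0100 0001 above.

```python
def text_to_bits(text):
    """Encode ASCII text as a list of 8-digit binary strings, one per character."""
    return [format(byte, "08b") for byte in text.encode("ascii")]

def bits_to_text(bits):
    """Decode a list of binary strings back into the original text."""
    return bytes(int(b, 2) for b in bits).decode("ascii")

# Codes 32-126 are the printable ASCII characters.
printable = [chr(code) for code in range(32, 127)]

bits = text_to_bits("A")
round_trip = bits_to_text(text_to_bits("text"))

print(bits)                # the binary form of "A"
print(len(printable))      # how many printable codes there are
print(round_trip)          # decoding gives back exactly what was encoded
```

The round trip is the point: encoding and then decoding returns the same string, which is what “without any information loss” means in practice.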


Visual digitizing illustrates an important principle: no signal channel has unlimited bandwidth, neither the source of the signal nor the human sensory system. Humans perceive an image by reconstructing a mental image in the brain from stimuli to the visual system. Though visible light has a spectrum with infinitely many possible variations (putting aside for now the discrete nature of light imposed by the photon), the human visual system can distinguish only a limited range of values and hues, defined by the color space.


Every color perceivable in the color space is, in fact, a mix of the stimuli to three kinds of cone cells in our eyes. They sense short, middle, and long wavelengths respectively, and our mind combines their levels of stimulation to form a color sensation. Because each kind of cone cell has limited resolution and cannot distinguish wavelengths closer than a minimal difference, we can map the level of stimulation of a single kind of cone cell to 0-255 and use a triple of such numbers to represent a color. This is the familiar RGB color system. A mental image is a vast array of points, each of a certain color, so to digitize an image we first establish an array of points, whose dimensions are called the resolution of the picture, like 1920 x 1080. Then we divide the original image into this array, measure the color in each cell, transform it into an RGB triple, and store the whole array of numbers. It is both our limited ability to distinguish wavelengths of light and the limited density of our visual cells that make digitization possible. But digitizing images is much more complicated and troublesome than digitizing text. Because different display devices (the decoding end) have different color spaces, the same color, (127,127,127) for example, may look slightly different on different displays. The same holds for capturing (the encoding end). For example, Nikon and Canon DSLR cameras tend to capture the same object in different hues because of differences in their CMOS/CCD sensors and image processing systems. The difference is not obvious, but unlike with text, a slight change in overall hue can make a great difference to what the viewer feels, so every industry involved in representing and reproducing images maintains rather high standards of color management.
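The divide, measure, and store steps can be sketched as follows. Everything here is an assumption made for illustration: `scene` stands in for the continuous original image as a function returning red, green, and blue intensities between 0.0 and 1.0, and `gradient` is a made-up test scene. A real camera measures light with a sensor rather than calling a function.

```python
def digitize_image(scene, width, height):
    """Divide a continuous scene into a width x height grid and store
    one (R, G, B) triple of 0-255 integers per cell.

    `scene(x, y)` is a hypothetical function returning (r, g, b)
    intensities in [0.0, 1.0] for coordinates x, y in [0.0, 1.0).
    """
    pixels = []
    for row in range(height):
        line = []
        for col in range(width):
            # Measure the color at the center of this cell.
            x = (col + 0.5) / width
            y = (row + 0.5) / height
            r, g, b = scene(x, y)
            # Quantize each channel to one of 256 levels.
            line.append(tuple(min(255, int(c * 256)) for c in (r, g, b)))
        pixels.append(line)
    return pixels

def gradient(x, y):
    """Made-up scene: red grows left to right, green top to bottom."""
    return (x, y, 0.5)

image = digitize_image(gradient, width=4, height=2)
print(image[0][0])  # the RGB triple stored for the top-left cell

# Storage needed for an uncompressed 1920 x 1080 picture, 3 bytes per pixel.
raw_bytes_full_hd = 1920 * 1080 * 3
print(raw_bytes_full_hd)
```

The last two lines show why image files are so much larger than text: at three bytes per point, a single uncompressed 1920 x 1080 picture takes about 6.2 million bytes.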



Audio is a time-based medium. Like our visual system, our hearing has limited bandwidth and resolution. From the readings of earlier weeks, we know that a signal can be faithfully reconstructed if the sampling rate is at least twice the highest frequency perceivable. Human hearing reaches roughly 20 kHz, so typical .mp3 files use a 44.1 kHz sampling rate.
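What goes wrong below that limit can be shown with a small experiment. In this sketch, which uses my own hypothetical helper `sample_cosine` and an arbitrarily chosen 30 kHz tone, a frequency above half the sampling rate produces exactly the same samples as a lower one, so the two cannot be told apart after sampling.

```python
import math

SAMPLE_RATE_HZ = 44_100

def sample_cosine(freq_hz, n_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return n_samples measurements of a cosine wave of the given frequency."""
    return [
        math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]

# The highest frequency this sampling rate can capture.
nyquist_hz = SAMPLE_RATE_HZ / 2

# A 30 kHz tone lies above that limit...
too_high = sample_cosine(30_000, 50)
# ...and its samples match those of a tone at 44,100 - 30,000 = 14,100 Hz.
alias = sample_cosine(SAMPLE_RATE_HZ - 30_000, 50)

indistinguishable = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(too_high, alias)
)
print(nyquist_hz, indistinguishable)
```

Since the limit for 44.1 kHz sampling sits just above the range of human hearing, nothing audible is lost to this effect.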


A sound wave can be described by its pitch, duration, loudness, and timbre. Within a time segment, we can assign numbers to represent the pitch, loudness, and timbre. The more digits we use to represent the sound wave in each time slice, the more information from the original sound is captured and stored. In early .mp3 files, a typical bitrate was 64 kbps. At 128 kbps the sound is about as good as most listeners can perceive, and a high-quality .mp3 file can reach 320 kbps, which is still well below the roughly 1,411 kbps of uncompressed compact disc audio. Audio digitizing shares a problem with analog audio: reproducing sound from signals, whether analog or digital, is more complicated than displaying visual signals. The conversion circuitry, the speaker, and even the listening environment can all affect the final listening experience. But this is a problem not of audio digitizing but of the whole playback system.
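For scale, the .mp3 bitrates above can be set against uncompressed audio with a little arithmetic. The sketch assumes CD-style audio at 44,100 samples per second, 16 binary digits per sample, and two channels; these parameters and the helper `pcm_bitrate_kbps` are my assumptions, not figures from the readings.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000

# Assumed CD-style parameters: 44.1 kHz, 16 bits per sample, stereo.
cd_kbps = pcm_bitrate_kbps(44_100, 16, 2)
print(f"uncompressed: {cd_kbps} kbps")

# How much smaller each .mp3 bitrate is than the uncompressed stream.
for mp3_kbps in (64, 128, 320):
    ratio = cd_kbps / mp3_kbps
    print(f"{mp3_kbps} kbps is about 1/{ratio:.1f} of the uncompressed rate")
```

The gap is closed by compression, which discards the parts of the sound that listeners are least likely to notice.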


Video is easy to deal with once we know how to digitize both image and sound, since it is no more than the aggregation of the two. The film industry has already demonstrated that at 24 frames per second, a human perceives a sequence of pictures as a continuum. The mind tricks itself into adding the time dimension to discrete pictures.

Shared Properties of Digital Media

Once signals are digitized, they share some common properties no matter what their original form was. The first is the ability to be perfectly copied and transmitted. Binary signals with redundancy can almost eliminate the possibility of a copying error. With this, the terms “original” and “copy” lose some of their meaning: once a file is copied (or sent out on the net), it is impossible and pointless to claim which instance is the original. The second is that they become searchable. This is one of the most important reasons for digitizing. It takes milliseconds to locate a single word in the entire Encyclopedia Britannica. And beyond active search, in which a human provides keywords, the computing power of computers can also search for patterns in data, finding links and information that are entirely new to humans. The third is that they become operable. It is much easier to manipulate a text file than letters printed on paper, and a digital file allows operations that are impossible for its analog counterpart. The filters in Photoshop, for example, can easily change the tone, style, blurriness, or contrast of an image, and can do so repeatedly while the unaltered file remains retrievable. Another interesting point: for a digital file, the function “blur” has nothing to do with the physical process of blurring. It is an algorithm that changes a set of digits in the file in a certain way so that the digital image is perceived as blurred. This is an important property of digital operations: they are by nature algorithms that operate on a file by changing certain digits.
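To make the point about “blur” concrete, here is a minimal sketch of one simple blurring algorithm, a 3 x 3 box blur on a tiny grayscale image. It is my own illustration and surely not how Photoshop implements its filters; the function `box_blur` and the 3 x 3 test image are hypothetical.

```python
def box_blur(image):
    """Return a new grayscale image in which each pixel is the integer
    mean of its 3 x 3 neighborhood (smaller at the edges)."""
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the pixel and whichever neighbors exist around it.
            neighbors = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(sum(neighbors) // len(neighbors))
        blurred.append(row)
    return blurred

# A single bright point on a black background.
original = [
    [0, 0, 0],
    [0, 255, 0],
    [0, 0, 0],
]
blurred = box_blur(original)

for row in blurred:
    print(row)      # the brightness is now spread over the neighbors
print(original[1])  # the unaltered data is still available
```

Nothing optical happens here: the bright point is “smeared” purely by averaging numbers, and because the function builds a new array, the original stays untouched and retrievable.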


Humans perceive analog signals, which means that even when signals are digitized, they still have to go through digital-to-analog conversion before we can perceive them. Even so, digitizing is important because it gives us the ability to better copy, store, manipulate, and transmit information. In a sense, a media file in the computer is an illusion. For the computer, an image file is not that different from a song, a text, or a video clip. They are all long strings of 0s and 1s with additional digits to label the file format. Without decoders and output devices, an image file has nothing in common with a picture at all.

The difference between analog and digital media, to me, is quantitative rather than qualitative. Even an analog signal, bound by the mechanisms of nature, is arguably still discrete, only in an unobservable way. If time cannot be broken into units smaller than the Planck time, and light into nothing smaller than a photon, then the analog signal is also a discrete signal, only with far more possible values than binary.

Visions Unfulfilled

Chen Shen

This week’s reading is quite new to me even though it covers an amazing part of computer history. I feel these ideas should take a more prominent position in computer education. It is no exaggeration to say that the pioneering work done by Bush, Kay, Licklider, Sutherland, Engelbart, et al. directly paved the road to how we interact with computers and networks. What strikes me most is that although their work was done at a primitive stage of computing, their visions were profound, and many of them are still unfulfilled after more than 40 years. All this makes you wonder: if the history of computer development had taken another route, what would the computer I’m typing these words on be like, if I were still typing at all?

All of the pioneers’ work aimed at a similar goal: augmenting human intellect. Alan Kay wanted computers to be used for learning, discovery, and artistic creation; Licklider wanted computers to facilitate formulative thinking and help people control complex situations; Engelbart wanted computers to “increase the capability of a man to approach a complex problem situation, to gain comprehension to suit particular needs, and to derive solutions to problem”, and so on.

After half a century of rapid development, computers’ speed and storage capacity exceed the pioneers’ imaginations: in his paper Man-Computer Symbiosis, Licklider even suggested “we shall not store all the technical and scientific papers in computer memory” to save space and money. Today storage is no longer an issue, at least for personal information and knowledge.

But storage alone does not indicate better intellect, and may even worsen it. With the option to easily offload knowledge and information to external cognitive devices, people tend to remember less, and justify the tenet “knowledge is useless” on the grounds that all knowledge is just one click away. This is an illusion, of course: no matter how brilliant Google is, it can only search based on the keywords one provides, limiting the possible outcomes to the scope defined by concepts one already knows.

Storage and speed are the material part of a human-computer system. The material part of the system seems to have outgrown the visions, but the conceptual leaps the pioneers called for are still beyond the horizon.

In Manovich’s Software Takes Command, we see Kay’s ambition. As a metamedium, the computer is nothing like the media inventions that preceded it: it is not a genre, style, or format, but a new level of symbol abstraction and information processing. Along with language and writing, the computer plays the part of the third symbolic leap in human history. It has the ability to simulate any existing medium, which means it can assimilate them, putting their paradigms under its umbrella. Through this ability, the computer prevails over all other media. But Kay’s aspiration for the computer was not just a “universal media player”; it should be used to produce not-yet-invented media, as he intended with the Dynabook. This vision is clearly not fulfilled.

But why? There are some reasons I can think of.

The first is the changing role of the computer as it became more and more common. In the visionaries’ time, computers were so expensive that they would “connect to one another by wide-band communication lines and to individual users by leased-wire service”, so that “the cost of gigantic memories and the sophisticated programs would be divided by the number of users”. Given this extreme scarcity, people valued their share of time with computers highly and tried their best to get the most out of it. I still remember when our school installed its first computer back in the 80s: people waited in line to try this “magic box” and explore all the possibilities. But since the turn of the century, computers have become so cheap and common that their ubiquity has undermined their role. Few people now regard computers as incredible tools to boost personal experience; they see them as a mundane technology, like a car. People use computers to accomplish certain goals, just as they drive cars to get to destinations; few would forgo traveling and just drive around to explore what else they can do with a car. Computers have been trivialized.

Another reason, similar to the first, is the over-commercialization of computers and the corresponding media market. Being a metamedium, a computer can fulfill all of one’s needs to enjoy existing media. With the trend toward digitizing all kinds of media, the resources one can enjoy with a computer are practically limitless, rendering the need to invent “new media” moot.

The relatively high threshold of programming is another reason. To fully realize Kay’s and Engelbart’s dream, common computer users must have basic knowledge and experience of coding, because coding is the only way to get a computer truly customized and tailored to your personal needs. But programming languages started off hardly understandable and scared off many users. Programming has a steeper learning curve than most other skills: with a motor skill or an art skill, you can perform with a tolerable compromise before you truly master it. But in coding, not a single error is tolerable. What makes this worse is the complex nature of algorithmic procedures. In a long series of instructions, when errors occur, the ultimate outcome (if there is an outcome at all) is hard to predict. That is to say, it is hard to trace your error and correct your code from a “wrong” output. Programming has become a lot easier and more natural in the past two decades, but all kinds of easy applications have already been made and optimized; as a common computer user, for almost any whim you have, there is off-the-shelf software or an application you can use.

Another reason hindering the conceptual revolution, especially in our time, is the rise of computer substitutes such as the iPad and smartphones. They can perform almost the same function as a computer in terms of being a “media player”, and they are cheap and fast enough to replace computers as one’s personal digital assistant. But by nature, the iPad and smartphones are mainly tools to simulate existing media; their systems are highly closed and secluded, making them awful tools to create with. Just as the label “consumer electronics” suggests, they are meant to consume, not to compute. In recent surveys, some countries reported the lowest percentage of young people using a computer or laptop in 30 years, due to the rising trend of using consumer electronics in lieu of computers.

All the pioneers’ visions rely on the computer literacy and programming literacy of common people, and by that standard people in our time are no better than those of 50 years ago. One cannot help feeling disappointed when looking at the visionaries’ outstanding conception of the computer half a century ago. But sometimes I wonder whether it is even possible. After all, “intellectual improvement” is not human nature. After the discovery of steam and of electricity, only a small part of humankind tried to employ the new power to extend human possibility, while others enjoyed and consumed their innovations and inventions. History repeats.

BTW, the amazing route planning software in Kay’s video was finally realized not very long ago by the app OmniFocus. If you add activities with a location context, it shows you in a map view what you can do around your current location.


Fourth time’s the charm

Chen Shen

This is the fourth time I learned to code.

The first time was about 23 years ago and the language was Logo. It was the best language to start with at that time because it was graphics-oriented and specially made for kids. I hardly went into any depth with Logo; all I learned then was to draw certain geometric shapes. The concept was rather straightforward: give instructions to a “turtle” (moving, turning, repeating), and your turtle would leave its trace on the screen. For example, one easy program was

REPEAT 4 [FD 100 RIGHT 90]

It can even be interpreted in a single sentence: go forward 100 pixels, then turn right 90 degrees, and repeat all this another 3 times. What do we get? Of course, it’s a square:
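For readers without a Logo interpreter, the same instruction can be traced in Python. This is only a sketch of the turtle’s bookkeeping, assuming the turtle starts at the origin facing up; the helper `run_turtle` is a made-up name, not a Logo or Python library function.

```python
import math


def run_turtle(repeats, forward, right_turn_degrees):
    """Simulate REPEAT n [FD d RIGHT a] and return the points the turtle visits."""
    x, y, heading = 0.0, 0.0, 90.0  # assumed start: origin, facing up
    points = [(0, 0)]
    for _ in range(repeats):
        # FD: move forward along the current heading.
        x += forward * math.cos(math.radians(heading))
        y += forward * math.sin(math.radians(heading))
        # RIGHT: a right turn is clockwise, so the heading angle decreases.
        heading -= right_turn_degrees
        points.append((round(x), round(y)))
    return points


square = run_turtle(4, 100, 90)
# square == [(0, 0), (0, 100), (100, 100), (100, 0), (0, 0)]
```

The turtle ends exactly where it started, which is what makes the trace a closed square.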


But it contains the fundamental concepts of programming, which Ian Bogost calls Procedural Literacy. In his paper Procedural Literacy: Problem Solving with Programming, Systems, & Play, Bogost argued: “more generally, I want to suggest that procedural literacy entails the ability to reconfigure basic concepts and rules to understand and solve problems, not just on computer, but in general”. I surely didn’t understand Procedural Literacy then, but Logo gave a kid a sense of steps and protocols: almost every kid at that age can draw a square at least, but not all can describe how to do it to something that has no idea what a line or a right angle is. When you start to describe things that are simple for you to the computer, you start programming. But there was more. Though the computer is stupid enough to fail every geometry test, it can do a magical thing: repeating, just as shown in that one-line program. Computers can repeat something as many times as you want, or as time allows. And this is true repetition, unlike a kid drawing a square on a piece of paper, who in fact draws two horizontal lines and two vertical lines in opposite directions (you seldom see a kid draw a square by drawing a line, turning the paper through a right angle, and repeating). With this ability, the stupid turtle can do things impossible for most kids, such as this:


And finally, these:

Sacred Geometry, right? Here we can see another key concept: computational emergence. Whether it is the single square or all the breathtaking pictures, their core is nothing but computational principles. And if you recursively repeat a principle, patterns emerge. A little principle can generate massive things. More things in our lives can be computationally expressed than we care to admit; after all, the forces that keep the earth spinning around are but a set of simple equations, and we ourselves are a long string of base pairs.
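As a small illustration of a little rule generating something large, here is a sketch using the Koch curve, a standard example of my own choosing rather than one of the pictures above. Assume “F” means “move forward” and “+” and “-” mean “turn”; the function name `koch` is illustrative.

```python
def koch(depth):
    """Rewrite every forward move 'F' by the rule F -> F+F--F+F, depth times."""
    commands = "F"
    for _ in range(depth):
        commands = commands.replace("F", "F+F--F+F")
    return commands


first_step = koch(1)                   # "F+F--F+F"
moves_after_five = koch(5).count("F")  # the number of moves quadruples at each step
```

One rewriting rule, applied five times, already yields over a thousand forward moves; fed to a turtle, they would trace an intricate snowflake-like edge.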


The second time was 16 years ago and the language was C. I didn’t go far that time because I only learned it as a time killer in boring classes. But still I couldn’t help noticing: unlike a natural language, an artificial language is much easier to learn as you age. In our recent readings, we met the saying that language is the mold of your thoughts. But it seems to me this holds only for your native language; other acquired languages are like pointers in C, redirecting to functions that already exist in my mind. If the first time was mainly geometric, the second was algebraic: I taught myself simple algorithms such as sorting and traversal. In learning C, I touched another core of programming: functions. Though we already knew functions pretty well from math, functions in code can do even more. They can operate not only on numbers, but also on strings, interrupt codes, addresses, other functions, and even themselves.
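The last point, functions that operate on other functions and on themselves, can be sketched briefly. The example is in Python rather than C for brevity, and all the names (`twice`, `add_three`, `shout`, `factorial`) are made up for illustration.

```python
def twice(func):
    """Take a function and return a new function that applies it two times."""
    return lambda value: func(func(value))


def add_three(number):
    return number + 3


def shout(text):
    return text.upper() + "!"


def factorial(n):
    """A function that operates on itself: it calls itself on a smaller input."""
    return 1 if n <= 1 else n * factorial(n - 1)


on_numbers = twice(add_three)(10)            # 16
on_strings = twice(shout)("hi")              # "HI!!"
on_itself = twice(twice)(add_three)(10)      # add_three applied four times: 22
recursive = factorial(5)                     # 120
```

The same `twice` works on a number function, a string function, and even on `twice` itself, which has no counterpart in school algebra.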


The third time was when I majored in Computer Science, and it was finally formal and systematic. The languages were mainly C and assembly. Both Logo and C are somewhat intuitive, but assembly definitely is not. Assembly code looks something like this:


What does it do? It simply calculates and outputs the basic + – * / of your input. So it occurred to me that, though they both run on the same computer, there are things I can do with assembly language that are inaccessible from C (in the sense of regular coding, not in the sense of the Universal Turing Machine, under which assembly is capable of anything other languages are capable of), and vice versa. In this week’s reading, we met the concept of morphemes, and it seems to me that different languages have different morphemes, and there are meanings you can only assemble from a certain morpheme set in a certain language. Last week Jessie and I had a talk about cross-cultural translation, for example the word Litost (in Milan Kundera’s novel The Book of Laughter and Forgetting), which means “a state of torment created by the sudden sight of one’s own misery”. But unlike the way you can code in different languages to output the same square, no matter what English morphemes the lexicographer uses to define Litost, the result is merely an approximation. It means every language covers only a subset of all possible meanings. Then I wonder: are there meanings only accessible by combining morphemes from different languages?

Now, with Python, is my fourth time. Learning from an online platform is far more intuitive and visual than learning from books. Now we can work in a simulated IDE and start off with real code. And current IDEs are kind enough to use different colors to signal which words are valid instructions (how many times I checked through all my assembly code just to find a typo).



In the screenshots, we can see symbols that mean things, like the “welcome to Python!” in yellow and “Raw Sensor Values” in dark green. These are symbols interpretable by humans but not by the machine. And we can see symbols that do things, like the “print” in purple and “analogRead” in red. These instructions lead to machine actions that are either perceptible (printing a string on the screen) or imperceptible (reading the voltage on a certain pin and saving it to a register) to humans.

Without doubt, this is the easiest time, partly due to previous learning and partly due to advances in how languages are taught. I have to admit I enjoy learning programming languages, not in anticipation of creating some code or even software by myself, but for a deeper understanding of the computable world and our minds.

Martin Campbell-Kelly, “Origin of Computing.” Scientific American 301, no. 3 (September 2009)

Jeannette Wing, “Computational Thinking.” Communications of the ACM 49, no. 3 (March 2006)

“Denning-Great-Principles-of-Computing.” Google Docs. Accessed October 26, 2016.

David Evans, Introduction to Computing: Explorations in Language, Logic, and Machines. Oct. 2011 edition.

“Litost [Lee’ – Toast].” Urban Dictionary. Accessed October 26, 2016.

Readings on Information Theory, a joke, a metaphor, and an ouroboros

Chen Shen

This week’s reading unfolds from Shannon’s Information Theory and the information paradox. It reminded me of three things, so I wrote this post in three parts.

A Joke

Shannon used entropy to define the minimum length of a code, and “any shorter code would be ambiguous and could not be uniquely decoded”. It reminds me of a joke I read years ago:

In a bar, three programmers were drinking and chatting. A said, “B7F340Q”, B chuckled and replied “TTX4352”, and A guffawed. “What are you guys talking about?” asked C. “We developed a system that assigns a code to every possible joke; the one A said was about a clumsy thief”, replied B, “and the one I said is about a drinking pope”.

“Interesting!” C exclaimed, “I will give it a try: MT9293CK”

A and B laughed so hard and fell on the floor.

“Which joke did I tell?”

“You fool,” A puffed, “no joke is designated to that code!”

A joke is a joke. According to Shannon, it’s clearly impossible to name every possible joke with such a short string. The codes they told are 7-character strings mixing Roman letters and Arabic numerals, giving a maximum of 36^7 (about 8*10^10) possible strings. But the interesting part of the joke is the possibility of representing a joke with a much shorter code that has the same hilarious effect on those who can decode it. If we are not going to map all possible jokes, but only a selected 1,000, the conversation between A and B could totally make sense. To cite Shannon, “the fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point”. A code has a much lower H than the joke it represents, which seems to violate Shannon’s law, but telling the joke-code is, in fact, a collective action. The sender and receiver have to spend quality time agreeing on the encoding before they can establish this kind of connection. The code itself has no relation whatsoever to the joke before encoding, which means the meaningful part of this communication is not the transmission of the code but the encoding and decoding that happen in the sender’s and receiver’s minds. Here the code plays the same role as the animal language we learned about weeks ago, where certain signifiers represent certain concepts or things, while the signifiers have no syntactic structure.
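The arithmetic behind these numbers can be checked in a few lines. This is a rough sketch that assumes every code is equally likely to be used (the simplest case for Shannon’s measure); the helper `min_code_length` is a hypothetical name.

```python
import math

ALPHABET_SIZE = 26 + 10  # Roman letters plus Arabic digits

# Number of distinct 7-character codes: 36^7.
seven_char_codes = ALPHABET_SIZE ** 7


def min_code_length(message_count, alphabet_size=ALPHABET_SIZE):
    """Smallest fixed code length that gives every message its own code."""
    length, capacity = 0, 1
    while capacity < message_count:
        capacity *= alphabet_size
        length += 1
    return length


# Picking one joke out of 1,000 equally likely ones carries about 10 bits.
bits_for_1000_jokes = math.log2(1000)
chars_for_1000_jokes = min_code_length(1000)
```

With only 1,000 pre-agreed jokes, two characters would be enough, since 36^2 = 1296. The shared codebook, not the string itself, carries most of the meaning.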

This kind of correlation can be established via any kind of signifier and is not restricted to one’s native language. And the same signifier can play different roles when interpreted by minds from different cultures. In fact, I can think of an expression that combines an Arabic numeral with an English letter, yet neither an Indian nor a British reader would understand it, while a Chinese reader does.

The expression is 3Q. It means nothing to an American ear (except for those who interpret it as a collective term for IQ, EQ, and AQ), but every young Chinese person understands it even without having heard it before. In Chinese, the Arabic numeral 3 is pronounced /san/, which followed by /kju/ makes a near-homophone of “thank you”. The semiosis behind this expression intrigues me: a Chinese speaker doesn’t have to know it ex ante and can still deduce the meaning, which makes it different from the one-to-one mapping in animal language. Is this deduction, then, a kind of syntactic language behavior?

A Metaphor

Another thing that got me thinking about the reading is the comparison between reading The Information: A History, a Theory, a Flood in the English and the Chinese editions. Truth be told, I’ve always preferred original works, assuming such readings establish a direct link between me and the author and grant me more. For The Information: A History, a Theory, a Flood, I spent three hours on the English edition and did not completely understand it. Then I read the Chinese edition; it cost me only 12 minutes and cleared up my earlier confusion. It’s not a totally fair comparison, because a second reading is probably going to be easier regardless of language. But the 15:1 time ratio cannot be simply explained away. So I thought about the two readings using Shannon’s information model. For the English edition, the cognitive process is like this:


Reading, by nature, is stimulating my mind to form thoughts that mirror those in the author’s mind. The writing/reading model is not the only way to do this; many different signals can lead to a similar feeling. A beautiful piece of prose, a faded picture, the melody of one’s childhood lullaby, the flavor of homemade cuisine: all lead to a feeling of nostalgia. But language is, without doubt, the most delicate and nuanced medium. In this example, my final “gain” from reading this book is

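$$\text{Gain}_{\text{My English}} = \text{Thought}_{\text{Author}} \times \text{Encoding}_{\text{Author English}} \times \frac{\text{Signal}}{\text{Signal} + \text{Noise}} \times \text{Decoding}_{\text{My English}}$$

Every factor after $\text{Thought}_{\text{Author}}$ is at most 1, making the conversion rate $\text{Gain}_{\text{My English}} / \text{Thought}_{\text{Author}}$ definitely smaller than 100%.

For this instance, we can safely suppose English is the author’s native language, so $\text{Encoding}_{\text{Author English}}$ is almost 100%.

And since the text I got is nearly identical to what he wrote (in the sense of text), noise plays an insignificant part and $\text{Signal} / (\text{Signal} + \text{Noise})$ is also close to 100%.

Then the conversion rate is simply

$$\frac{\text{Gain}_{\text{My English}}}{\text{Thought}_{\text{Author}}} \approx \text{Decoding}_{\text{My English}}$$

For the Chinese edition, my cognitive process is like this:


My final gain can be represented as:

$$\text{Gain}_{\text{My Chinese}} = \text{Thought}_{\text{Author}} \times \text{Encoding}_{\text{Author English}} \times \frac{\text{Signal}}{\text{Signal} + \text{Noise}} \times \text{Decoding}_{\text{Translator English}} \times \text{Encoding}_{\text{Translator Chinese}} \times \frac{\text{Signal}'}{\text{Signal}' + \text{Noise}'} \times \text{Decoding}_{\text{My Chinese}}$$

$\text{Signal}' / (\text{Signal}' + \text{Noise}')$ for printed texts also approximates 100%. As a result, the conversion rate is

$$\frac{\text{Gain}_{\text{My Chinese}}}{\text{Thought}_{\text{Author}}} \approx \text{Decoding}_{\text{Translator English}} \times \text{Encoding}_{\text{Translator Chinese}} \times \text{Decoding}_{\text{My Chinese}}$$

To compare my results from the two editions, we can simply divide them:

$$\frac{\text{Gain}_{\text{My English}}}{\text{Gain}_{\text{My Chinese}}} = \frac{\text{Decoding}_{\text{My English}}}{\text{Decoding}_{\text{Translator English}} \times \text{Encoding}_{\text{Translator Chinese}} \times \text{Decoding}_{\text{My Chinese}}}$$

From this week’s experience, I read at least 10 times faster, with no less comprehension, in Chinese than in English, so $\text{Decoding}_{\text{My Chinese}} \ge 10 \times \text{Decoding}_{\text{My English}}$ and

$$\frac{\text{Gain}_{\text{My English}}}{\text{Gain}_{\text{My Chinese}}} \le \frac{1}{10 \times \text{Decoding}_{\text{Translator English}} \times \text{Encoding}_{\text{Translator Chinese}}}$$

$\text{Decoding}_{\text{Translator English}} \times \text{Encoding}_{\text{Translator Chinese}}$ is the translator’s translation rate, meaning that so long as the translator achieves a translation rate greater than 10%, which is really a low threshold, I get more from reading in Chinese.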

So my conclusion here is that if the translator is proficient in both English and the field of the article, I’d better read the translated edition. But this conclusion relies on the hypothesis that noise matters little in the transcription of books. For other media, this may not always be the case. In a conversation, for example, if I choose to listen to an interpreter’s version, I suffer noise twice, which might impair my relative gain from listening in Chinese.
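The comparison can be played with numerically. The sketch below treats every stage as a rate between 0 and 1 and multiplies them, as in the model above; the author’s own encoding is taken as 100% and left out. All the numbers are assumptions made up for illustration, not measurements, and `conversion_rate` is a hypothetical helper.

```python
import math


def conversion_rate(*stage_rates):
    """Multiply the rates of every stage between the author's thought and the reader."""
    for rate in stage_rates:
        if not 0.0 <= rate <= 1.0:
            raise ValueError("each stage rate must lie between 0 and 1")
    return math.prod(stage_rates)


# Assumed, purely illustrative rates.
DECODING_MY_ENGLISH = 0.08
DECODING_MY_CHINESE = 0.80  # ten times my English rate
TRANSLATOR_RATE = 0.50      # translator's decoding times encoding, as one factor
PRINT_CHANNEL = 1.0         # printed text: noise is negligible
SPEECH_CHANNEL = 0.7        # spoken conversation: some signal is lost

english_book = conversion_rate(PRINT_CHANNEL, DECODING_MY_ENGLISH)
chinese_book = conversion_rate(
    PRINT_CHANNEL, TRANSLATOR_RATE, PRINT_CHANNEL, DECODING_MY_CHINESE
)
# Translator rate at which both printed editions give the same gain.
break_even_translator_rate = DECODING_MY_ENGLISH / DECODING_MY_CHINESE

english_talk = conversion_rate(SPEECH_CHANNEL, DECODING_MY_ENGLISH)
chinese_talk = conversion_rate(
    SPEECH_CHANNEL, TRANSLATOR_RATE, SPEECH_CHANNEL, DECODING_MY_CHINESE
)
```

With these assumed numbers the translated book still wins comfortably. For speech, the second noisy channel raises the break-even translator rate from 10% to about 14%, which is the “double noise” penalty in miniature.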

So for the time being, maybe the best choice for me is to find the corresponding Chinese edition if possible and try reading both editions. This makes the process diagram much like a parallel circuit: to use it as a metaphor, the combined resistance is smaller than that of either branch.

In fact, I wrote this part in response to Ronald E. Day’s article The ‘Conduit Metaphor’ and the Nature and Politics of Information Studies, which is no doubt the hardest reading for me this week. I got totally confused, especially about the Cold War context. I roughly sense that the author is against Wiener’s claims about the conduit metaphor. In the paper, Day argued that

“The irony of this formulation is, of course, that both The Republic and Wiener’s The Human Use of Human Beings make their arguments using a rhetoric that is rich in metaphors and other literary tropes. Thus, both the epistemological and the social claims of Wiener (and as we have seen, Weaver’s) texts are simultaneously established and made problematic by the very rhetorical devices that operate in their texts.”

But both Plato’s cave and Wiener’s metaphor serve as ways to express a concept, rather than as the base on which the concept is built, just like my “computation” that ends with a metaphor. Both the computation and the circuit metaphor point (with different levels of clarity) to the same fact: I can get relatively more from reading if I can find both editions. This demonstrates that we can approach the same fact or concept in different ways, be it rational deduction or rhetorical metaphor. There’s a very profound tale in the Chinese Buddhist sutras I’d like to share; for convenience I skip the detailed names.

A nun went to a master seeking his interpretations on the Canon. The master said, “I can’t read Sanskrit, you read to me”. “If you can’t even read”, laughed the nun, “how can you claim to understand the Canon”.  The master pointed to the moon in the sky, explained, “The truth has nothing to do with text. The truth is like the moon above, and text is like my finger. My finger can point to where the truth is, but it doesn’t mean fingers are the truth. And it doesn’t mean one has to use fingers to see the moon”.

So, to me, the article The ‘Conduit Metaphor’ and the Nature and Politics of Information Studies is arguing that Wiener was using the wrong finger, which has little to do with the moon. I’m not defending the conduit metaphor here; in fact, I don’t quite understand the metaphor. The above are just my thoughts on the reading.

An Ouroboros

The first reading of this week is P. Denning’s The Information Paradox. I happened to be reading a book about famous paradoxes in history these days, and it occurred to me that a great number of paradoxes are caused by self-reference: the Liar’s paradox, the Socratic paradox, Russell’s paradox. I used to think this ouroboros was a problem for philosophers and linguists, but now I know that one of the most rigorous disciplines, mathematics, also suffers from this eternal ghost. This week’s reading pointed to another interesting thought experiment in history: Turing and his Universal Turing Machine. The Gleick book doesn’t explain in detail how Turing settled the halting problem, so I looked it up in some other books and articles, e.g. Engines of Logic by Martin Davis and Cantor, Gödel, and Turing –  An Eternal Golden Diagonal by Weipeng Liu. The simplicity and universality of the Turing machine amazed me, and the way he showed the Halting Problem to be unsolvable relies on self-reference once again. In demonstrating that, Turing also showed us that a program is just another kind of data, with no clear-cut demarcation. How heroic was Hilbert’s manifesto “Wir müssen wissen. Wir werden wissen”, but it seems that there are things we may not know.
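The self-referential core of the argument can be sketched symbolically. This is not Turing’s formal proof, only its skeleton: suppose some decider claims to tell whether a program halts, and build a “contrarian” program that asks the decider about itself and then does the opposite. The function names are made up, and the program’s behaviour is represented by a label instead of an actual infinite loop.

```python
def contrarian_behaviour(decider_says_halts):
    """Behaviour of a program built to do the opposite of the decider's verdict on it."""
    return "loops forever" if decider_says_halts else "halts"


def decider_is_correct(decider_says_halts):
    """Check the decider's verdict against what the contrarian program then does."""
    actually_halts = contrarian_behaviour(decider_says_halts) == "halts"
    return actually_halts == decider_says_halts


# Try both possible verdicts the decider could give about the contrarian program.
outcomes = {verdict: decider_is_correct(verdict) for verdict in (True, False)}
# outcomes == {True: False, False: False}
```

Whichever answer the decider gives, it is wrong about this one program, so no decider that is always right can exist. The trick only works because the program can be handed to the decider as data.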



Since the coding in Morse Code is based on frequency, why didn’t its designers define codes for certain letter combinations, e.g. th, er, in, con, tion, which have a higher frequency in the English corpus than the least used single letters?


Luciano Floridi, Information: A Very Short Introduction. Oxford, UK: Oxford University Press, 2010.

James Gleick, Excerpts from The Information: A History, a Theory, a Flood. (New York, NY: Pantheon, 2011).

Peter Denning and Tim Bell, “The Information Paradox.” American Scientist, 100, Nov-Dec. 2012.

Ronald E. Day, “The ‘Conduit Metaphor’ and the Nature and Politics of Information Studies.” Journal of the American Society for Information Science 51, no. 9 (2000): 805-811.

Martin Davis, Engines of Logic: Mathematicians and the Origin of the Computer. Reprint edition. New York: W. W. Norton & Company, 2001.

Weipeng Liu, “Cantor, Gödel, and Turing –  An Eternal Golden Diagonal.” Accessed October 19, 2016.

Book under a magnifying glass: affordances of a book

Chen Shen

Through the readings of these weeks, we have become fairly familiar with the concept of affordance and well prepared to apply it to daily experiences. Today I’ll try to analyze the affordances of a book.

Affordance as a relation

In Norman’s The Design of Everyday Things, he defined affordance as “a relationship between the properties of an object and the capabilities of the agent that determine just how the object could possibly be used”. So affordance is a relationship between an object and the agent using it. That is to say, it is meaningless to discuss the affordances of a book without addressing its user: the human. Many of the form factors obviously reflect the needs of a human reader: a typical 5*8 inch book measures roughly 13*20 cm.


The height is nearly the span from one’s thumb to little finger when the palm is spread (which is also a Chinese unit of measurement, 一揸), giving readers a proper length to hold with either one or two hands. It is not as big as a newspaper, which requires either holding with two hands or folding the pages, nor too small to hold with the palm naturally stretched, which would cause muscle strain if held for long. The width and height are close to the golden ratio, which meets the aesthetic taste of the reader without conscious recognition. The width of a 5*8 book enables the reader’s eyes to scan a whole line without moving the head, so reading can be a fairly stable and relaxing activity. As for the thickness, due to both the psychology of readers and the constraints of bookbinding, few books exceed 600 pages, beyond which multiple volumes are the ideal alternative.
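The figures here are easy to check. A minimal sketch, using the exact definition of 2.54 cm per inch; the helper name `inches_to_cm` is just for illustration.

```python
CM_PER_INCH = 2.54
GOLDEN_RATIO = (1 + 5 ** 0.5) / 2  # about 1.618


def inches_to_cm(inches):
    """Convert inches to centimetres, rounded to one decimal place."""
    return round(inches * CM_PER_INCH, 1)


width_cm = inches_to_cm(5)   # 12.7
height_cm = inches_to_cm(8)  # 20.3
aspect_ratio = 8 / 5         # 1.6
gap_to_golden = abs(aspect_ratio - GOLDEN_RATIO) / GOLDEN_RATIO
```

A 5*8 page comes out at about 12.7 by 20.3 cm, and its 1.6 aspect ratio is within roughly 1% of the golden ratio, so “close to” is a fair description.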


The three dimensions also determine a book’s weight (if we don’t consider the paper type for the moment). A typical 5*8 book usually weighs less than 1000 grams, which also contributes to comfortable reading over a long period. Different types of books vary a lot in physical measurements, and this variation is connected to how people use them. A reference book, for instance, is typically used for formal work where a stable surface is provided, so weight is not an important matter, and we have reference books far thicker and heavier than ordinary ones. A novel, by contrast, should fit a more casual reading scenario and seldom outweighs what a person can comfortably hold in one hand for long. If the work is voluminous, In Search of Lost Time by Proust for example, publishers have uniformly chosen to break it into volumes.


Another example of the relation between measurements and reading style is the magazine. With its noticeably larger size, a magazine beats a 5*8 book in the amount of information a page contains, but suffers in portability as well as durability (due to the different binding method). These characteristics make the magazine as a physical format highly suitable for magazines and periodicals as a type of reading material: ideal for skimming, with better visual impact (which is almost everything advertisements want), a shorter production cycle, and less need for long-term preservation or repeated reading.


Affordance in anatomy of a book

A book’s affordances are much more than its physical measurements. If we break down a book’s physical parts, the result should be something like this:


Take the spine as an example. The spine is not typically an affordance for reading, during which it is imperceptible. The affordance of the spine is not dyadic but triadic: it involves the spine of the book, the reader, and how people stack, preserve, and exhibit books (as in the upper figure). Imagine a culture in which people don’t use bookshelves and stack books in the way shown in the lower figure: the spine would then serve purely to strengthen the book and would have little symbolic value.


Now we know that human readers shaped books into what they are today; in fact, this relation is mutual. In visual design, visual direction is studied and exploited, as in the Gutenberg diagram, which suggests that users from western cultures (and many others who read books in the Gutenberg format) typically move their sight from top-left to bottom-right, even when they are not reading a book. Designers can lay out content accordingly so that the most important information is found in the psychologically prominent positions.


Other affordances of physical books

Books also have other affordances: the table of contents and page numbers afford searching, the paper and ink afford highlighting with a marker, the margins and line spacing afford note taking, the fore-edge affords dog-earing and super-fast skimming, the endpaper affords inscription, the barcode on the back affords scanning, the index affords locating, and the references afford citation and further reading.

Books and eBooks

So much for the affordance of physical books, now we may examine the digital ones which are currently doing their best to emulate physical book reading experience. If we break down a physical book into two parts: physical part and symbolic part, digital books seem to have the ability to inherit everything symbolic. But the fact is a little more troublesome. Due to the advantage of adjustable font sizes of many digital books, there’s no unified form of how letters are presented. Usually, it is not a big problem, but when it comes to IT books which formatting is an essential part of coding, eBooks become utterly baffling. eBook is also trying to inherit physical books’ physical part. Kindle may be a typical example, by embracing the technology of electronic paper, Kindle reifies the physical part of books’ symbolic contents-the letters and pictures as cognitive artifacts. The e-ink technology enables Kindle to overcome two major problems of other eBooks: battery life and eye exhaustion by LCD. Though it has its own problem,  like the response speed of electric paper, the relatively small display, this quasi-paper solution is a promising way to transfer the mental model of books into a new experience of reading. eBooks sacrifice the physical characteristics of a book that a bibliophile cherishes but compensate with tricks no physical books can do. Some of the most appealing functions are text-based find, infinite portability, and collective reading. Though formal books have indices by which you can locate some of the keywords, eBook’s find-function outclasses books by orders of magnitude. So is the portability issue. Both these advantages are addressed fairly redundantly for me to explicate.  But the collective reading is worth mentioning. Kindle can show notes taken by other users reading the same book, which fundamentally change the nature of reading. 
Given time for the technology to improve, reading is going to become a real-time, multi-user cooperative task, further strengthening books’ purpose as cognitive artifacts that inform and inspire.

Though eBooks emulate the way we read physical books, this does not necessarily mean that traditional book-reading habits are optimal and leave no room for improvement. The emulation is more like a transition for people already familiar with physical books. For a generation raised in the Digital Age, the firm mental model is that of digital books. Physical books are not innately prior to digital books; anyone who has watched the video “A Magazine Is an iPad That Does Not Work” may feel the same.

Digital books’ backward compatibility implies that eBooks sit at a higher level on the evolutionary track of books, but there is still one problem eBooks fail to overcome at the moment. Studies have compared reading speed and retention across media, and the results show that paper still has advantages over LCD screens. One contributing reason is that readers perceive each word both symbolically and physically: the physical position of a word on the page of a paper book helps to form a mental map that aids retention of the information.

It’s hard to compare different book forms in general. The Gutenberg printed book responded to the urgent need for mass publication. As a result, though it lost some of the aesthetic touches of manuscripts and came with very few fonts, it gradually replaced manuscripts because it met the needs of a new era. Paper books do have advantages, but when these advantages become incompatible with what a new era needs from books, paper books will become obsolete, and eBooks will probably replace them eventually.


It took hundreds of years to shape physical books into what they are, with affordances both obvious and hidden. All these affordances aim for a better reading experience in general, but some constraints of paper-based books are insurmountable. The eBook has its own problems but is more compatible with the information revolution, and thus may come to replace paper books.

Norman, Don. The Design of Everyday Things: Revised and Expanded Edition. Rev Exp edition. New York, New York: Basic Books, 2013.

Kump, Peter. Breakthrough Rapid Reading. Revised edition. Paramus, NJ: Prentice Hall Press, 1998.

“Design Principles: Compositional Flow And Rhythm.” Smashing Magazine, April 29, 2015.

Jabr, Ferris. “Why the Brain Prefers Paper.” Scientific American 309 (2013): 48.

Murray, Janet. Inventing the Medium: Principles of Interaction Design as a Cultural Practice. Cambridge, MA: MIT Press, 2012.

Kaptelinin, Victor. “Affordances.” The Encyclopedia of Human-Computer Interaction, 2nd ed., 2013.

Norman, Donald A. “Affordance, Conventions, and Design.” Interactions 6, no. 3 (May 1999): 38-43.

Zhang, Jiajie, and Vimla L. Patel. “Distributed Cognition, Representation, and Affordance.” Pragmatics & Cognition 14, no. 2 (July 2006): 333-341.






Alibaba’s Dimensions of Mediation

Chen Shen

Alibaba (Taobao) is China’s biggest online commerce company. Founded in 2003, it accounts for 80% of China’s online shopping market, which is estimated to be 713 billion in 2017. There are three typical ways to shop on Taobao: via a web browser on a PC, or using the Taobao app on a smartphone or a tablet. The main differences among these three ways lie in three aspects: layout, commodity detail, and means of payment. Though one can buy basically the same things through each interface, these differences cater to different users and lead to a major dichotomy. Here I mainly use the tablet interface as the subject of a case study and try to work out its different dimensions of mediation.

[Images: PC interface, tablet interface, smartphone interface]

In Latour’s framework of Technical Mediation

  • The First Meaning: Interference

To rephrase the core concept of Latour’s first meaning: I am no longer the same when I use this app, nor is the interface. “A third agent emerges from a fusion of the other two.” More importantly, this new agent’s goal is different from the one I used to have. Personally, I can testify to this change, because many a time I was looking for a single item but ended up with a whole basket of goods. This rarely happens without the interface: when one shops in a physical store or a non-integrated online store, there is much less interference during the transaction and much more to overcome before a whimsical desire can be satisfied. By simultaneously presenting pictures, links, and sales, the interface changes my mental state from “I need something” to “do I need these other things?” The interface changes along with me. The original goal of the app is to let users browse commodities, but when it meets a user with the ability to pay, a new agent emerges whose goal is to make relatively optional and affordable purchases.

  • The Second Meaning: Composition

To illustrate the composition level of Taobao, I need to regard the interface as a system and its subsystems as different agents. For example, suppose my task is to seek opinions about a certain nib for which no buyer comments are available. First, I can browse calligraphy supply stores using the subsystem of store-level searching; then pick out a particular store using the subsystem of store scoring; then find some nibs in that store with the top-seller recommendation function. The next step is to trace users who chose goods similar to mine by looking through the transaction history. After finally locating a user whose opinion I value, I can contact him or her using Taobao’s Aliwangwang module, an IM app for buyers and sellers. In this chain of actions and subprograms, the actant at each step is a composition of the ones mobilized in the preceding step.

  • The Third Meaning: The Folding of Time and Space

This aspect of mediation is rather easy to recognize in Taobao as an interface. It serves as a platform for a cornucopia of commodities, each one enmeshed in an internet of things and carrying its own history. Even the interface itself, as described in the previous part, is an aggregation of functional modules. At the code layer, the algorithms and data structures it employs can also be traced back to the dawn of modern programming. Like a telescope searching among millions of stars, in our Taobao case both the “telescope” and the “stars” have different dimensions of time and space folded into them.

  • The Fourth Meaning: Crossing the Boundary between Signs and Things

In his A Collective of Humans and Nonhumans: Following Daedalus’s Labyrinth, Latour used the concept of delegation to explain how humans are folded into nonhumans and vice versa. Using the interface of Taobao, the user, as enunciatee, also interacts with different kinds of delegation designed and deployed by enunciators who are now absent from the process. For example, other buyers also play the part of the sleeping policeman. To return to the nib example, each buyer comment on the commodity detail page is an active actant in my purchasing action. But the buyer is obviously absent here; I interact with him or her via the interface. In this way, the human is folded into the nonhuman. And by buying this nib, I myself am also folded into the nonhuman: my preferences, my comments, and my transaction records will remain there to interact with future users until eternity (or at least for as long as the server keeps running). By this means, even by purchasing a single nib, I interact with craftsmen, calligraphers, designers, manufacturers, programmers, salesmen, web designers, sales reps, deliverymen, and so on. Their work and labor, even if performed long ago, are transferred, abstracted, and encoded in the form of either material or information, and swirled together into this sociotechnological maelstrom.

  • From another point of view, and my concerns

Not entirely agreeing with Latour’s approach of establishing a symmetry and embracing a flattened concept of agency, Rammert adopted another model of agency in his work Where the Action Is: Distributed Agency Between Humans, Machines, and Programs. Actions are categorized by causality, contingency, and intentionality. Rammert’s model relies on the nature of the action: whether it is mechanical and predetermined, interactive and self-regulatory, or reflexive and intentional. Using this model we can also analyze which parts of the Taobao interface operate at which level. But my concern here is a noticeable trend: the human part in this interaction is continuously being demoted from intentionality to contingency to causality. Taobao’s Chinese name, 淘宝, can be roughly translated as “seek treasure”, which clearly emphasizes the human’s role as the seeker, who intentionally seeks what is valuable to him. But as technology advances and the interface grows more intelligent, the routine process involves (even tolerates) fewer and fewer intentional actions. Objectively, the ever-expanding number of entries (800 million at the moment) requires users to choose without thorough thought, usually with the help of some rather opaque sorting algorithm. Subjectively, the recommendations, whether based on purchasing history, peer choices, local trends, seasonal trends, or whatever else, deprive users of discretion even further. A great part of online purchasing is now responsive rather than intentional. And in the near future, with sociotechnologies like Instant-Ink by HP or Dash Button by Amazon, some human actions will be demoted to mere reflexes. Meanwhile, the nonhuman part is taking more and more control in this hierarchical model of actions. One may say, isn’t it nice to be freed from meaningless labor?
But just as humans concede predetermined, interactive, and intentional actions one after another in trivial tasks, isn’t it possible that we will concede discretion in trivial matters, then in less trivial matters, and then in key matters? Latour’s fourth meaning indeed connects the individual with more and more human and nonhuman agencies, but the more one is connected, the less urgent one’s discretion becomes. Time after time, science fiction shows us an apocalyptic world in which humans are finally enslaved. That is still far out of reach, but from small traces we can see that humans are losing control, and not in a good way.

If you log in to, you can see a meticulously woven net which ensnares treasures within my scope of interests. My question is: right now we may be the Arachne collecting goodies on the mesh, but who’s to say there won’t come a day when we become the victims in this Daedalus’s maze and inevitably follow Ariadne’s thread to wherever she means us to go?


A Philosophy of Technology: From Technical Artefacts to Sociotechnical Systems. San Rafael, CA: Morgan & Claypool Publishers, 2011

What is Mediology? Regis Debray, Le Monde Diplomatique, Aug., 1999.

A Collective of Humans and Nonhumans — Following Daedalus’s Labyrinth. Bruno Latour. Cambridge, MA: Harvard University Press, 1999

Where the Action Is: Distributed Agency Between Humans, Machines, and Programs. Werner Rammert, 2008

Working with Mediology and Actor Network Theory: How to De-Blackbox an iPhone. Martin Irvine