“Man the food-gatherer reappears incongruously as information gatherer” (McLuhan, 1967) (Gleick, pg 7).
The quote above struck me as especially poignant in today’s world of ubiquitous computing. Information is everywhere, enabling communication that bridges perceptions, distances and even foreign languages.
One relevant example, which I think acts as a microcosm of some of the theories we explored in the readings, is Google Translate, an app that translates one language into another in real time. It both reflects and depends on Shannon’s theory, as it requires the following:
Info Source (language input) > Transmitter > SIGNAL > Receiver > Destination (new language)
It is not, as it may seem, “a room full of bilingual elves” working behind the scenes to convert one language into meaning for the receiver, but rather an illustration of how Shannon’s model works. Much like the diagram Kevin drew for us last week to explain how FaceID works on an Apple iPhone, there is a seemingly obscured process that goes on as the message is transmitted from point A to point B: in this case, from a native speaker of English, for example, to a native speaker of French, in their own language, so that the two can communicate. The input (in the source language) is encoded, transmitted, and then decoded for display in the chosen target language. I found this video explaining how Google Translate works to be quite illuminating:
Therefore, my understanding is that whether the E-information transmitted is received successfully depends largely on the receiver, which circles back to the concept of entropy and the freedom of choice one has in constructing a message.
• Irvine, Martin. “Introduction to the Technical Theory of Information” Feb. 4, 2019.
• James Gleick, The Information: A History, a Theory, a Flood. (New York, NY: Pantheon, 2011).
• Claude E. Shannon and Warren Weaver, The Mathematical Theory of Communication (Champaign, IL: University of Illinois, 1949).
Since the 20th century, our society has evolved from the Industrial Age into the Information Age, which means not only that people can access information and knowledge very easily, but also that the whole world can be regarded as a cosmic information-processing machine. According to Gleick, the bit is an irreducible kernel, and information forms the very core of existence. Nowadays, almost every discipline is associated with computers and information. For instance, finance is recognizing itself as an information science because money itself is completing a developmental step from matter to bits, stored in computer memory. Likewise, our online blogs, pictures, and videos are all kinds of information stored in computers in the form of bits (0s and 1s).
The transmission of information among systems depends on E-information. Systems process information without regard to its meaning and simultaneously generate meaning in the experience of their users (Great Principles of Computing, 57).
This week’s reading mainly focuses on Shannon’s information theory. Based on the reading, we know that the main questions motivating the theory are how to design a telephone system to carry the maximum amount of information and how to correct for distortion on the lines.
To answer these questions, Shannon converted all messages into binary digits, known as bits. When we send messages over long distances, binary digits arrive more intact than analog signals because they can be read and repeated exactly.
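The point about binary digits being "read and repeated exactly" can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the voltage levels (0.0 and 1.0), the noise amplitude, the 0.5 threshold, the ten hops, and the function names are all illustrative and do not come from the readings. A digital repeater snaps each noisy value back to a clean 0 or 1 at every hop, while an analog signal simply accumulates the noise.

```python
import random


def add_line_noise(levels, rng, amplitude=0.2):
    """Disturb each signal level by a small random amount (assumed noise model)."""
    return [v + rng.uniform(-amplitude, amplitude) for v in levels]


def regenerate(levels, threshold=0.5):
    """Digital repeater: read each level as 0 or 1 and re-send it cleanly."""
    return [1.0 if v > threshold else 0.0 for v in levels]


rng = random.Random(42)            # fixed seed so the run is repeatable
bits = [1, 0, 1, 1, 0, 0, 1, 0]    # the message as binary digits

digital = [float(b) for b in bits]
analog = [float(b) for b in bits]  # same starting levels, never regenerated

for _ in range(10):                # ten hops along the line (illustrative)
    digital = regenerate(add_line_noise(digital, rng))
    analog = add_line_noise(analog, rng)

received_bits = [int(v) for v in digital]
print("sent:    ", bits)
print("received:", received_bits)
print("analog levels after 10 hops:", [round(v, 2) for v in analog])
```

Because the assumed noise at each hop is smaller than the distance to the threshold, the regenerated bits match the original, whereas the analog levels drift further from their starting values with every hop.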
Shannon also found that information strongly depends on context. The rarer a message is within its context, the more information it carries, which means the first fraction of a message is far more important than the rest. This reminds me of the theory of time perception. As we get older, we often feel that time passes much faster than before. The younger we are, the fresher our experiences are and the more information is stored in our brains, which means we feel time passing more slowly.
Nowadays, we need to transmit information over long distances, faster and more accurately. Shannon’s information theory plays a very important role in real life, helping us communicate better, for example by avoiding entropy and disorder.
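The link between rarity and information mentioned above has a standard quantitative form: an event with probability p carries log2(1/p) bits. The sketch below is a minimal illustration of that formula; the function name and the example probabilities are my own choices, not taken from the reading.

```python
import math


def self_information(probability):
    """Bits of information carried by an event with the given probability."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in the interval (0, 1]")
    return math.log2(1 / probability)


# Illustrative probabilities: the rarer the event, the more bits it carries.
for p in (1.0, 0.5, 0.125, 0.01):
    print(f"p = {p:<6} -> {self_information(p):.2f} bits")
```

A certain event (p = 1) carries no information at all, and halving the probability adds exactly one bit.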
There is so much information around us. As Floridi puts it, information is notorious for coming in many forms and having many meanings. Over the past decades, it has been common to adopt a General Definition of Information (GDI), in terms of data and meaning. That means that we can manipulate it, encode it, and decode it as long as the data comply with the meanings (semantics) of a chosen system, code or language. There has been a transition from analogue data to digital data. The most obvious difference is that analogue data can only record information (think of vinyl records), whereas digital data can encode information rather than just recording it.
But how is the information measured?
Claude Shannon, in his publication “A Mathematical Theory of Communication,” used the word “bit” to measure information; as he said, a bit is the smallest unit of information. A bit has a single binary value, either 0 or 1.
When I think of information, I almost never associate it with data, but rather with meaning. In a way, information to me serves the function of communicating a message.
As Irvine explains, Information (in the electrical engineering context) is what can be measured and engineered for structuring physical, material signals or substrates (e.g., radio waves, binary digital electronic states in a physical medium like memory cells or Internet packets), which are then said to be the medium or “carrier” of data representations at the physical level.
In today’s society we share and receive information through different mediums. We live in a world where new technologies and devices are changing the way we communicate and interact with each other. But how did we get here? What were the main principles and technologies that led to the invention of the internet and the other technologies that we use today to communicate and share messages?
For my 506 project, I did research on the telegraph, and I was fascinated by how this groundbreaking device changed the way people communicated with each other. Developed in the 1830s and 1840s by Samuel Morse (1791-1872) and other inventors, the telegraph revolutionized long-distance communication. The telegraph eliminated dependence on time and distance by connecting people through electricity and code. Although the telegraph had fallen out of widespread use by the start of the 21st century, replaced by the telephone, fax machine and internet, it laid the groundwork for the communications revolution that led to those later innovations.
Now we rely on the internet’s architecture to share and receive messages. The message that we want to send via email or any other application relies on the data packet switching principle across a network. First, the TCP protocol breaks data into packets or blocks. Then, the packets travel from router to router over the Internet using different paths, according to the IP protocol. Lastly, the TCP protocol reassembles the packets into the original whole, and that’s how the message is delivered.
In this video, Spotify engineer Lynn Root and Internet pioneer Vint Cerf explain what keeps the internet running, how information is broken down into packets, and how messages are transmitted from one point to another.
The internet’s architecture and design principles make the exchange of messages possible and keep us connected to each other. Information theory gives a different perspective on what information is and how it is measured.
As Gleick suggests, Shannon’s theory made a bridge between information and uncertainty; between information and entropy; and between information and chaos. It led to compact discs and fax machines, computers and cyberspace.
Floridi, Luciano. Information: A Very Short Introduction. (Oxford University Press, 2010)
Gleick, James. The Information: A History, a Theory, a Flood. (New York, NY: Pantheon, 2011)
Irvine, Martin. “Introduction to the Technical Theory of Information” (Information Theory + Semiotics)
Shannon, Claude E. and Weaver, Warren. The Mathematical Theory of Communication (Champaign, IL: University of Illinois, 1949).
The smooth, clean interface of our modern communicative technology rarely shows it, but there is a lot that goes on behind the scenes when we share information and interact with each other online. As Dr. Irvine (2019) writes in his introduction to understanding information, data, and meaning, “We only notice some of the complex layers for everything that needs to work together when something doesn’t work (e.g., web request fails, data doesn’t come in, Internet connectivity is lost)” (p. 4).
Back in 1838, Samuel Morse sent the first telegraph in the United States. This was important because it started an evolution in public discourse that hasn’t stopped since. According to White & Downs (2015), “[Morse] used a binary system– dots and dashes– to represent letters in the alphabet. Before Morse, smoke signals did much the same thing, using small and large puffs of smoke from fires” (p. 258). Morse’s binary system has transformed (relatively quickly!) into the data-driven communication of today, where binary code (1s and 0s) is grouped and delivered in the form of bytes, which are assembled in packets and sent across the internet. The packets are sequenced and reorganized after being received by the computer(s) on the other end of the transmission (White & Downs, p. 259). All our digital information and media, in their myriad forms, begin with simple binary, but can be transformed into text, emojis, images, videos, audio, and other forms of dynamic communication.
Neil Postman (1986), a cultural theorist and author of Amusing Ourselves to Death, takes a page out of Marshall McLuhan’s book (“The medium is the message”) and makes a compelling argument about how the form of our communication directly impacts the content that is conveyed. He also references smoke signals, as White & Downs did, which is what made me think of this quote (emphasis added):
It is an argument that fixes its attention on the forms of human conversation, and postulates that how we are obliged to conduct such conversations will have the strongest possible influence on what ideas we can conveniently express. And what ideas are convenient to express inevitably become the important content of a culture. I use the word “conversation” metaphorically to refer not only to speech but to all techniques and technologies that permit people of a particular culture to exchange messages. In this sense, all culture is a conversation or, more precisely, a corporation of conversations, conducted in a variety of symbolic modes. Our attention here is on how forms of public discourse regulate and even dictate what kind of content can issue from such forms. To take a simple example of what this means, consider the primitive technology of smoke signals. While I do not know exactly what content was once carried in the smoke signals of American Indians, I can safely guess that it did not include philosophical argument. Puffs of smoke are insufficiently complex to express ideas on the nature of existence, and even if they were not, a Cherokee philosopher would run short of either wood or blankets long before he reached his second axiom. You cannot use smoke to do philosophy. Its form excludes the content.
Why then, with the boundless potential of digital interaction, do we still struggle with miscommunication and a lack of civil public discourse? Part of the reason may be the sheer amount of information that is loaded into the web each day, hour, and even minute. It’s already almost impossible to keep up with the Niagara of data flooding across our digital screens, and the quantity is increasing at an exponential rate.
A 2017 blog post by Jeff Schultz of Micro Focus revealed that “90% of the data on the internet has been created since 2016, according to an IBM Marketing Cloud study.” The post also references the graphic below, which outlines some staggering statistics about internet usage and data transmission/consumption (by the minute!):
How can we better harness this grand and growing information system to better meet our communicative needs? Does the burden of responsibility fall to us as the producers and consumers of communication data, rather than to simply blame the messenger(s)? What can we expect in the next 5 years in the fields of information, data, and meaning?
This week’s reading introduced Shannon’s information theory. What’s fascinating in his argument is that information is independent of meaning. He holds that information can be measured and standardized. Information theory gives us a deeper, more fundamental understanding of information and data.
In his paper, Shannon argued that “the fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point.” (Shannon, 1948) For example, music can be thought of as the transmission of information from one point to another. In terms of a communication system, the sound of the music is the message, and an encoder generates a distinct signal for the message. The signal goes through a channel that connects transmitter and receiver. A decoder on the receiver end converts the signal back into sound waves that we can perceive.
According to Shannon, “information is entropy.” Entropy is a measure of disorder or uncertainty about the state of a system. The more disordered a set of states is, the higher the entropy. Shannon considered entropy to be the measure of the inherent information in a source (Gleick, 2011). Denning also pointed out that information exists as physically observable patterns. Building on that idea, Febres and Jaffé found a way to classify different musical genres automatically.
Febres and Jaffé approached music classification by using the entropy of MIDI files. A MIDI file is a digital representation of a piece of music that can be read by a wide variety of computers, music players and electronic instruments. Each file contains information about a piece of music’s pitch, velocity, volume, vibrato, and so on. This enables music to be reproduced accurately from one point to another. In fact, a MIDI file is composed of an ordered series of 0s and 1s, which allowed them to compress each set of symbols into the minimum number necessary to generate the original music. After that, they measured the entropy associated with each piece of music based on the fundamental set. They eventually found that music from the same genre shared similar values for second-order entropy. This case is an application of information theory, and it is really inspiring that information theory has the potential to be applied in many other fields.
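To make the idea of entropy as a measure of disorder more concrete, here is a minimal sketch of Shannon entropy computed from symbol frequencies. It is not Febres and Jaffé's method (they worked with MIDI files and second-order entropy); the function name and the short note sequences are invented purely for illustration.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """First-order Shannon entropy, in bits per symbol, of a sequence."""
    counts = Counter(symbols)
    total = len(symbols)
    # Each symbol with relative frequency n/total contributes (n/total) * log2(total/n).
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical note sequences, from most ordered to most varied.
drone = "CCCCCCCC"     # one note repeated
two_notes = "CGCGCGCG"  # two notes, equally frequent
scale = "CDEFGABc"     # eight different notes

for name, seq in [("drone", drone), ("two_notes", two_notes), ("scale", scale)]:
    print(f"{name:<10} {shannon_entropy(seq):.2f} bits per symbol")
```

The perfectly predictable sequence has zero entropy, and the entropy rises as the notes become more varied, which matches the description of entropy as uncertainty about the state of a system.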
In this week’s readings we explored fundamental concepts around information and data, along with the nuances of the terminology used in different fields and how terms such as meaning, value, symbol, and information change between them. Let’s attempt to de-Blackbox the process of sending a picture from one computer to another computer through the internet.
When making the argument of information theory, Shannon makes a distinction:
“The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is, they refer to or are correlated according to some system with certain physical or conceptual entities.” (Shannon, 1949)
Applying this to the case of sending a picture, Shannon differentiates between the picture as we see it and understand it as a concept and the information being sent and received. He is saying that communication’s only concern is how to move the bits from one computer to the other. In this particular case those bits represent an image, but they could be other types of data. In that sense, he says:
“These semantic aspects of communication are irrelevant to the engineering problem. The significant aspect is that the actual message is one selected from a set of possible messages. The system must be designed to operate for each possible selection, not just the one which will actually be chosen since this is unknown at the time of design.” (Shannon, 1949)
Therefore, the meaning of those bits does not change or affect the act of selecting a package of information, sending it, and receiving it on the other side. The way I envision it is as a series of translations between different types of D-information so that they can be exchanged and reproduced through E-information. In that sense, E-information is the bits, and D-information is the type of data that those bits represent.
The picture is encoded into a certain type of structured data that is stored and processed by the computer:
“Computational devices (large or small) are designed to store and process bits (binary units) — millions of on/off switches — which are encoded, at a different design level in the system, as units of the data type that they represent.” (Irvine, Intro, pp.3)
Let’s say the size of our picture is 24 MB. Those 24 MB need to be received at the destination in order to be decoded again into the actual picture. In order to do that:
The file (picture) is divided into packages of encoded bits that will be sent through the network. The router, as the name suggests, sends the packages through different routes. Every package carries encoded information that states its origin, its destination, and its number within the total set of packages for the file, along with the actual payload, which is one part of the file.
“Programming “code,” when translated into these mathematical-electronic mappings, is designed to “encode” our symbolic “data” structures together with the logical and mathematical principles required for transforming them into new structures” (Irvine, Intro, pp.3)
Let’s say our 24 MB picture is divided into 4 packages of 6 MB each:
The packages arrive at their final destination after bouncing around through other servers in the network. They do not necessarily arrive in numerical order. The destination router receives the packages as they arrive.
The computer arranges the packages in the right order until it has the complete file.
Our computer’s software decodes the 24 MB file into the visual representation of our picture.
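The steps above can be sketched in a few lines of code. This is a toy model, not how TCP/IP is actually implemented: the `Packet` fields mirror the header information described in the post (origin, destination, number within the total), the names `computer-A` and `computer-B` are hypothetical, and 24 bytes stand in for the 24 MB picture so the example stays small.

```python
import random
from dataclasses import dataclass


@dataclass
class Packet:
    origin: str       # where the package comes from
    destination: str  # where it is going
    index: int        # its number within the file
    total: int        # how many packages make up the file
    payload: bytes    # the part of the file it carries


def split_into_packets(data, n_packets, origin, destination):
    """Divide the file into n_packets pieces of (nearly) equal size."""
    size = -(-len(data) // n_packets)  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    return [Packet(origin, destination, i, len(chunks), chunk)
            for i, chunk in enumerate(chunks)]


def reassemble(packets):
    """Put the packages back in order and join their payloads."""
    ordered = sorted(packets, key=lambda p: p.index)
    if not ordered or len(ordered) != ordered[0].total:
        raise ValueError("some packages are missing")
    return b"".join(p.payload for p in ordered)


picture = bytes(range(24))  # stand-in for the 24 MB picture
packets = split_into_packets(picture, 4, "computer-A", "computer-B")

random.Random(0).shuffle(packets)  # packages may arrive in any order
restored = reassemble(packets)

print("arrival order:", [p.index for p in packets])
print("file restored intact:", restored == picture)
```

Note that nothing in this code knows or cares that the bytes represent a picture, which is exactly Shannon's point about meaning being irrelevant to the engineering problem.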
To address one part of the prompt question for this week: How do we recognize the difference between E-information transmitted and received (successfully or unsuccessfully) and what a text message, an email message, social media post, or digital image means?
The E-information transmitted and received is the packages of bits sent through the network. Meanwhile, our picture or digital image is D-information, the data type in which these bits are structured to have the specific meaning of a digital image. Therefore, there is a process of encoding and decoding from the sender to the receiver. However, at the core of what Shannon proposed is the idea that the process of measuring and encoding information is independent of the meaning of that information.
A solid case study is the Google Home device, a smart home hub assistant that works over WiFi to work with and for you. Like other industry-leading smart home hubs, Google Home speaks to the question raised in the Gleick reading: whether mathematics is decidable. Because we can now program algorithms that let computer systems and AI systems “decide” on their own, they are, in their own way, “thinking” machines. We teach computer systems information that is cultural so that when we communicate with or send a message to one, it can give a response or send a message back, and the correct message at that.
The Gleick reading discussed how the Bureau of the Census reported facts about communications in the U.S., covering the Bell System, newspapers, television, and radio. This raised the question of how we measure “communication.” Postal services counted communication in letters, but how was the Bell System to be measured? After George Campbell was hired, communication through the Bell System was quantified by the success or failure of telephone transmissions in carrying signals. Claude Shannon’s initial vision was that communication problems stem from a lack of accurate interpretation (1948). “Information” became a technical term that could be quantified. Terms like force, mass, motion, and time had redefined physics; in the same way, information helped redefine our ability to interpret, quantify, and understand communication. The process of mathematizing and quantifying terms such as information and energy was revolutionary, and it is what made today’s world run on information everywhere we go. What was extremely interesting was the article’s theoretical question: what creates the relationship between information and people? The suggestion was “entanglement.” An intersectionality between humans and technology, a flow that is both directional and shared, is what makes information technology as advanced as it is today.
Google Home may use a mix of both structured and unstructured data (Irvine, Information Communication Theory). The structured data stems from the symbolic messages and information that we send to the device, whether through speech or telecommunication. As we continue to use Google Home devices, Google Home begins to learn from us: our behaviors, what we commonly ask, our voice, our interests, and so on. That data is stored within Google Home (similar to last week’s discussion of Convolutional Neural Networks and how we train computational systems to learn information). The unstructured data can stem from the ‘big data’ collected from accounts associated with the one linked to Google Home. The primary account holder of a Google Home device can have their account linked to different platforms and services, such as YouTube, Amazon, and Gmail. Since data from these affiliated accounts and services is pooled together, it is Google Home’s job to keep that data readily accessible, while users are in charge of initiating the commands that may require Google Home to pull data from the unstructured pool. The data returned to the user, regardless of the message’s medium, is based on stored data that is collected and defined in order to produce intelligent responses. Connecting back to the argument in the Gleick reading about decidable mathematics, Google Home is already predisposed to transmit a particular message based on the message it receives, which is possible because cultural knowledge is machine-learned to create the information.
For this week, my goal was to understand information from multiple angles. In the reading, Gleick focused on the theory of information and how it comprises multiple layers. Some questions he asked were: Can machines think, and what tasks are mechanical? This got me thinking not only about what is actually being produced, such as a word, image, or website, but also about what is automatic or pre-determined, such as algorithm improvements. He says, “the justification lies in the fact that the human memory is necessarily limited…Humans solve problems with intuition, imagination, flashes of insight – arguably non mechanical calculation…” (pg. 15). This got me thinking about how absolute certainty plays a required role in machine computing, which is able to include all preceding decimals for information. This makes me question: is information in a computer a tool, or is it a machine within its own mechanics? Is information just based on gatherings and collections of signals?
Transitioning to the Irvine piece, I really enjoyed learning about the designs of information interfaces and how important the “signal” role is. I am still falling short on the signal transmission theory and the verbiage associated with it. It was difficult to unpack the question at hand, but I’m hoping I achieved a general sense of how to de-black box this.
The main features of the signal transmission theory of information would be the digital design of “information” structured as “units” of “preserved structures,” which use electricity via bits and bytes to extract certain patterns that signal an internal message that gets completed. The signal-code transmission model is not a description of meaning because it is not meant to describe “meanings.” It is designed around single units in “point-to-point” models that show how “information” passes through a channel. Data types, signs, tokens, and so on are involved in this encoding and decoding process (Irvine, pg. 13-20).
The information theory model is essential for everything electronic and digital because digitized data, information, or tasks do not get performed without being instructed in a certain way. This type of model helps ensure certainty with numbers, codes, or data that are transmitted and received electronically, but it lacks the vocabulary to explain what each subset performs in terms of semantic meanings, or the specific uses and meanings of other systems. This could also be because the information theory model is designed to DISPLAY how something is achieved electronically in the simplest way.
It doesn’t necessarily serve as a model that would fully explain sign and symbol systems, because it is laid out to explain how something gets done rather than what a “symbol” is, and that typically would not change within the model. Explaining a sign or a symbol in this type of language wouldn’t fully work, because the information model is based on E-signals that are transmitted and received, whereas sign and symbol systems are not.
To expand on this with easier-to-understand vocabulary, I wanted to pull in an outside video source to help me understand this better. When in doubt, YouTube it out! I found a great video that describes how speakers create sound waves and how signals are transmitted. The presenter uses physical drawings to reproduce real-life signals and explains how “information” travels wirelessly. I recommend this video to the rest of the class to help us pick up simpler terminology for this process.
This week’s readings mainly unveiled the information transmission model proposed by Claude Shannon, which lays out a simple, unidirectional path showing how signs and symbols are encoded, transmitted, and decoded. There are six basic elements: the information source, which produces information; the transmitter, which encodes it into signals; the channel, which is adapted to the signal for transmission; the receiver, which decodes the message from the signals; the destination, where the message arrives; and noise, which interferes with the signals as they travel through the channel. For example, in a conversation, the speaker’s mouth is the transmitter; the listener’s ears are the receivers; the signals are the sound waves; and the noise could be distractions from the environment. The brains are the information source and destination, where ideas are encoded into words and the words heard are decoded.
Instagram is a popular social media application based mainly on photo sharing. When we upload our photos, since photos are made of pixel patterns, the transmission process sends the pixel data through online channels. The pixels are then reconstructed and decoded by the software and displayed on our mobile devices as a recognizable photo instead of random pixels. When your friends comment on the posted photos, their input words are encoded as bytes, transmitted in packets, and decoded on the device.
Professor Irvine claims that the meaning is not “in” the system; it is the system. Put another way, a message does not have meaning until people attach signs to referents. Furthermore, whether information is transmitted successfully depends on how the receivers interpret the message. The communicating groups should share and understand common knowledge, which means exchanging and interpreting messages in an “assumed context” (Irvine, 12). For example, suppose a Korean friend comments on my photo in Korean. As a receiver who can only speak Chinese and English, I cannot successfully interpret the Korean characters. They are meaningless to me. So the transmission process fails because my Korean friend and I do not share a common language. It is also clear that all information depends on people’s minds to attach referents and interpret.
All in all, there are two levels of communication transmission. Technically, information is encoded and transmitted as bytes, and then decoded for display on the device. At the level of meaning, senders attach signs to referents with a specific meaning, while receivers decode the signs to get the meaning. The success of transmission relies on both information communication theory and semiotics.
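As a rough illustration of how a comment's words become bytes and are decoded again, the sketch below uses UTF-8, a widely used text encoding. Whether Instagram itself uses UTF-8 at this step is an assumption on my part, and the sample comment is made up.

```python
comment = "Nice photo! 멋져요"  # hypothetical comment mixing English and Korean

encoded = comment.encode("utf-8")  # sender side: text -> bytes
bits = " ".join(format(byte, "08b") for byte in encoded)
decoded = encoded.decode("utf-8")  # receiver side: bytes -> text

# How many bytes each distinct character needs in UTF-8.
bytes_per_char = {ch: len(ch.encode("utf-8")) for ch in comment}

print("characters:", len(comment))
print("bytes sent:", len(encoded))
print("first three bytes as bits:", bits[:26])
print("decoded matches original:", decoded == comment)
print("bytes per character:", bytes_per_char)
```

The device decodes the Korean characters just as faithfully as the English ones. The transmission succeeds at this technical level even if the reader cannot interpret the result, which is a separate question of meaning.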
Irvine, Martin. “Introduction to the Technical Theory of Information.” Feb. 4, 2019.
Denning, Peter J. and Martell, Craig H. Great Principles of Computing. MIT Press, 2015.
The simplest kind of information system is a communication system, which encodes, decodes, transports, stores, and retrieves data or signals. All formats of data, including numbers, signals, logic formulas, and text, can be represented as patterns of bits. According to Shannon’s information transmission model, the starting point of the transmission process is the source, and the destination is the end of the journey. During transmission, the transmitter encodes the message into a signal; the signal is then altered by noise in the environment, so the receiver receives the signal together with the noise. After decoding, the information finally reaches the destination.
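The transmission model described above can be sketched as a small pipeline. This is a simplified illustration under my own assumptions: the noise is modeled as independent bit flips with a chosen probability, the message is text encoded as UTF-8 bits, and the function names simply mirror the stages of Shannon's diagram.

```python
import random


def transmitter(message):
    """Encode the source message into a signal: a list of bits."""
    return [int(bit) for byte in message.encode("utf-8")
            for bit in format(byte, "08b")]


def channel(signal, flip_probability, rng):
    """Carry the signal; noise flips each bit with the given probability."""
    return [bit ^ 1 if rng.random() < flip_probability else bit
            for bit in signal]


def receiver(signal):
    """Decode the received bits back into a message for the destination."""
    data = bytes(int("".join(str(bit) for bit in signal[i:i + 8]), 2)
                 for i in range(0, len(signal), 8))
    return data.decode("utf-8", errors="replace")


rng = random.Random(7)  # fixed seed so the run is repeatable
sent = transmitter("hello")

print("no noise:  ", receiver(channel(sent, 0.0, rng)))
print("with noise:", receiver(channel(sent, 0.05, rng)))
```

With no noise the destination gets the message exactly. With noise, some characters may come out altered, which is why practical systems add redundancy to detect and correct errors.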
The basic flow of image compression coding
One of the most important applications of Shannon’s information theory is image compression. The main goal of image compression is to store and transmit data in an efficient form. Depending on how much efficiency is required, compression can be lossy or lossless. JPEG is a commonly used method of lossy compression for digital images. It stands for Joint Photographic Experts Group. It was the first international standard in image compression and is still widely used today.
At the start of this process, the representation of the colors in the image is converted from RGB to YUV. Y represents the brightness of a pixel, while U and V, the two chroma components, represent color, including tints, shades, and saturation. Since human eyes are more sensitive to differences in brightness than to differences in color, the resolution of the chroma components is reduced; this step is called downsampling. Then each channel is split into 8×8 blocks, in which pixel intensities range from 0 to 255. We subtract 128 from each pixel value so that the range becomes -128 to 127. The next step is to apply the DCT (discrete cosine transform) and round the results to the nearest integer. Because human eyes are not good at distinguishing high-frequency brightness variation, we can reduce the amount of information in the high-frequency components. Also, since many higher-frequency components are rounded to zero and many others become small positive or negative numbers, the data take fewer bits to represent.
the original image (left) and the compressed image (right)
DCT-based image compression such as JPEG performs very well at moderate bit rates; however, image quality decreases through the encoding and decoding process, so the transmitted image might not be as sharp as the original. Thanks to its important advantages, JPEG remains the most common format for storing and transmitting photographic images on the World Wide Web.
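The level shift and DCT steps described above can be sketched for a single 8×8 block. This is a simplified illustration in plain Python, not a JPEG encoder: it follows the steps as stated here (subtract 128, transform, round) and leaves out the quantization table that a real encoder applies before rounding. The function names and sample blocks are my own.

```python
import math

N = 8  # JPEG works on 8x8 blocks


def level_shift(block):
    """Move pixel values from the range 0..255 to -128..127."""
    return [[pixel - 128 for pixel in row] for row in block]


def dct_2d(block):
    """Two-dimensional DCT-II of an 8x8 block, rounded to integers."""
    def scale(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coefficients = [[0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coefficients[u][v] = round((2 / N) * scale(u) * scale(v) * total)
    return coefficients


# A perfectly flat white block: all the information ends up in one coefficient.
flat_block = [[255] * N for _ in range(N)]
flat_coefficients = dct_2d(level_shift(flat_block))

# A smooth left-to-right gradient: only low-frequency coefficients are large.
gradient_block = [[32 * column for column in range(N)] for _ in range(N)]
gradient_coefficients = dct_2d(level_shift(gradient_block))

print("flat block, top-left coefficient:", flat_coefficients[0][0])
print("gradient block, first row:", gradient_coefficients[0])
```

For smooth image regions most of the 64 coefficients come out as zero or near zero, and those are the values that can be stored in very few bits.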
Candès, E. J., & Wakin, M. B. (2008). An introduction to compressive sampling. IEEE signal processing magazine, 25(2), 21-30.
Denning, P. J., & Martell, C. H. (2015). Great principles of computing. MIT Press.
Floridi, L. (2010). Information: A very short introduction. OUP Oxford.
Wei, W. Y. (2008). An introduction to image compression. National Taiwan University, Taipei, Taiwan, ROC.