Abstract
The main topic of this paper is Augmented Reality (AR): the technology behind it, its affordances and its limitations. In order to do so, I will borrow concepts and ideas from three main frameworks. First, we will approach the technology behind Augmented Reality from the 1990s to the present day, using as a base Azuma's (1997) review of Augmented Reality systems and the Augmented Reality Markup Language (ARML 2.0) specification proposed by the Open Geospatial Consortium (OGC). The second topic discussed here is how AR, as a new digital medium, is remediating our real-world experience. Finally, we will move on to how AR systems enable us to distribute our cognitive processes, allowing the user to work better.
Keywords: Augmented Reality, Remediation, Distributed Cognition
“When parts of the environment are coupled to the brain in the right way, they become parts of the mind” – Andy Clark.
Introduction
In the short film Sight we are presented with a future where people use the Sight system, which consists of eye implants that allow individuals to see different layers of digital data on top of the physical world. This future is not so far from where we are right now, considering advancements in mobile devices and wearable technology, such as Google Glass. Although none of the devices we have today have the capabilities shown in Sight, many individuals and institutions are working on making mixed and augmented reality possible.
As shown in Sight, this type of technology raises many questions about privacy, technology dependence, the gamification of social life and so forth. It is because of this that I believe it is important to analyze it and de-blackbox it in order to understand how it works and the possibilities that mixed and augmented reality (MAR) poses for people. The development of and advances in mobile devices and smartphones have made it possible for more people, institutions and companies to explore and create augmented reality (AR) apps.
A search on Apple's App Store in November of 2014 would reveal more than 500 AR apps, most of them focused on games and advertising. One of the most interesting ones is Wikitude, which has several features, including augmenting advertisements and billboards. The feature I found most compelling is the one that gives the user a layer of geographical information about nearby places (here is a video of how it works, although it does not show the most updated version of the app). The user is presented with different layer options that range from TripAdvisor to Wikipedia. Say we choose the Wikipedia layer: this creates a layer on the smartphone with "balloons" that pinpoint the location of places that have been geotagged in Wikipedia entries and articles. This allows users to move around their physical space with more information about it.
- Balloons and Radar
- Layer for TripAdvisor
- Layer for Wikipedia
This app is one example of what can be done with AR. In the course of the next few pages I will explore this and other applications, the affordances they have and their implications in different areas, such as art, education, medicine, the military and manufacturing.
In order to flesh out the implications and possibilities that AR presents, I will use different approaches to gain a better understanding of this new medium. The paper has several sections, each one focusing on a specific characteristic of AR. I will begin the discussion with a review of the development of MAR applications; following Brian Arthur's (2009) discussion, I will explore the different technologies that come into play to make AR possible. Here we will see the different ways AR devices work, the software they use, how they can be used in different fields and so on.
After reviewing the development of AR, I will move on to explore how AR, as a new medium, is remediating our physical world. In order to do so, Bolter's (2000) ideas on remediation, immediacy and hypermediacy will come in handy. In this section I will discuss how AR is remediating our physical plane and which other media are combined and used to make AR possible. As Bolter (2000: 23) discusses digital technologies, he argues that they try to create an "interfaceless" interface, which is what we see in the short film Sight. The main focus of this section will be to understand how AR is changing our perception of the "real" and the "digital" world.
Following the discussion on remediation, I will argue that AR applications are enabling individuals to offload parts of their cognitive processes onto them. Here I will base my arguments on ideas and concepts presented by Andy Clark (2008) and Edwin Hutchins (2014). As both authors discuss, cognition does not happen only within our brains; it is distributed between our internal processes and our various devices. As we will see, we create scaffolding that enables us to move some processes out of our minds and free up capacity to achieve better performance. In the case of AR applications we can see various examples of this in fields such as medicine, manufacturing and repair.
Finally, I will conclude the paper by tying together some of the ideas discussed through the course of this text. The main goal I seek to achieve here is a better understanding of what Augmented Reality is, how it works and what can be done with it.
Augmented Reality
Before starting the discussion on AR, how it remediates our world and how it helps us offload parts of our cognitive processes, I believe it is necessary to have a definition of what Augmented Reality really is. As Azuma (1997) notes, AR is part of what is known as virtual environments (VE). AR is situated within a continuum between the virtual environment and the real environment (see figure). At one end of the spectrum we have virtual reality (VR), which is characterized by immersing the user in a digital world, usually by means of a headset (e.g., the Oculus Rift). In contrast to VR, AR mixes the real environment with digital graphics and data (Azuma, 1997: 2), although AR applications can work with headsets or other types of devices (smartphones). For Azuma (ibid.) there are three essential characteristics that define an application as AR: it combines the real and the virtual, it is interactive in real time and it is registered in 3D. This definition gives us a framework for working with AR applications and will keep us on track while discussing how it applies in different situations.
Combination of Technologies
So far we have defined AR systems, but we have not really gotten into what is inside them, what it is that makes AR possible. In 1997 Azuma wrote an article reviewing the different projects and technologies being explored around AR systems during the last decade of the 20th century. He discussed the different devices being used, their limitations and some of the applications AR could have in fields such as medicine and manufacturing.
This section will focus on tracing the evolution of AR systems, going through some of the ideas presented by Azuma and then continuing to other developments in the field in the last few years. This is an important step towards understanding how AR works and what users can do with it. Arthur (2009) notes that new and innovative technologies are "sired" by the combination of old technologies. In Azuma's review, it is possible to see that during the 90s researchers were experimenting with different hardware in order to create AR systems. This is a great example of Arthur's idea of combinatorial evolution: Arthur (2009: 18) argues that new technologies come into being by inheriting old ones. In the case of AR there were two different methods of creating an augmented reality experience, optical or video head-up displays (HUDs) (what Azuma calls optical and video see-through head-mounted displays).
As we can see in A1, with an optical HUD the person wears optical combiners in front of their eyes, onto which the system projects the 3D rendering of information. This setup allows the user to see both the real world and the digital information in real time. In the case of the video HUD, seen in A2, the user wears a kind of helmet that does not allow her/him to see the real world directly; instead, the HUD transmits images of the real world from cameras. The monitors in front of the user's eyes render an image, almost in real time, of the real world combined with the 3D visualization (Azuma, 1997: 12).
At the moment Azuma was carrying out his review, there were a few problems with both of these methods. The optical HUD allowed better mobility, simplicity and resolution; however, one of the problems Azuma saw was that the 3D visualization would sometimes not match the real world. By this I mean that the monitors and combiners would fail to place the digital content in exactly the right location.
The video HUD had another set of problems. Because of the way the video capture works and how the device renders both the real-world stream and the digital data stream, the user would get a view where the real world and the 3D visualization matched perfectly; the device would even be able to erase objects from the real world. However, it also impaired the mobility of the person using the AR system.
Aside from these devices, we also have to think of other objects outside of the hardware that enable the AR interaction. At the time Azuma was writing his review, the Global Positioning System (GPS) was not reliable enough to position content at very specific locations; however, GPS improved over time and, as we will see further down the road, is being used as a standard in the new language being developed for augmented reality web browsers (such as Wikitude and Argon). Another way that digital content can be positioned in the AR experience is by using fiducial markers, such as QR codes and target images (see B1). The use of this type of object assures that the content will be displayed where it is supposed to be, although it has its limitations, because the camera needs a clear view of the fiducial markers.
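Although the devices keep changing, the underlying technique for anchoring content to a fiducial marker has stayed remarkably stable. The sketch below is a minimal, self-contained illustration (in Python, with hypothetical pixel coordinates, not any particular AR SDK's code) of the core step: once the camera has detected the marker's four corners on screen, a homography is estimated that maps the marker's reference frame onto screen space, telling the renderer where to draw the digital overlay.

```python
def solve_linear(rows, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rows)
    m = [row[:] + [b] for row, b in zip(rows, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def homography_from_corners(marker_pts, screen_pts):
    """Direct linear transform: the 3x3 homography (with h33 fixed to 1) that
    maps the marker's four reference corners onto the detected screen corners."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(marker_pts, screen_pts):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y]); rhs.append(v)
    h = solve_linear(rows, rhs)
    return [[h[0], h[1], h[2]], [h[3], h[4], h[5]], [h[6], h[7], 1.0]]


def project(H, x, y):
    """Map a point from the marker's reference frame onto the screen."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


# Marker reference frame: the unit square.
marker = [(0, 0), (1, 0), (1, 1), (0, 1)]
# Where the camera saw those corners on screen (hypothetical pixel coordinates).
detected = [(210, 120), (340, 135), (330, 260), (205, 250)]

H = homography_from_corners(marker, detected)
# The marker's centre maps to where the centre of the overlay should be drawn.
overlay_x, overlay_y = project(H, 0.5, 0.5)
```

This also makes the limitation mentioned above concrete: if the camera loses sight of any of the four corners, the homography cannot be estimated and the overlay disappears.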
It is important to review the devices, software and objects that enabled early AR experiences, because it is from here that new techniques were developed that enable us to have better AR devices. If we move forward to today, we are able to use mobile devices such as smartphones and tablets as AR systems; they have integrated location sensors (GPS), direction sensors (magnetometer), orientation sensors (accelerometer or gyroscope) and cameras. Our smartphones can now serve as video HUD systems for augmented reality, working both with fiducial markers and with GPS locations. The app Wikitude, discussed at the beginning of this paper, is a great example of how AR apps use GPS to locate places and to help individuals travel through their physical world. AR apps that use fiducial markers are more common, such as Anatomy 4D, ColAR and Lunch Rush, all of which use target images and QR codes to render 3D models on the screen of your mobile device.
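To illustrate how such an app might combine these sensors, here is a simplified sketch (with hypothetical coordinates and parameters, not Wikitude's actual code): it computes the compass bearing from the user's GPS fix to a geotagged place, compares that bearing with the magnetometer heading, and decides where across the camera's field of view the "balloon" should be drawn.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from the user to a point of interest, in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360


def balloon_x(bearing, heading, fov_deg=60, width_px=1080):
    """Horizontal pixel position of a balloon on screen, or None when the place
    lies outside the camera's field of view given the device heading."""
    delta = (bearing - heading + 180) % 360 - 180  # signed angle in [-180, 180)
    if abs(delta) > fov_deg / 2:
        return None
    return round(width_px / 2 + delta / fov_deg * width_px)


# Hypothetical fix: a user in central Paris, with the Eiffel Tower geotagged
# in a Wikipedia entry to their west.
user_lat, user_lon = 48.8570, 2.3508
poi_lat, poi_lon = 48.8584, 2.2945

b = bearing_deg(user_lat, user_lon, poi_lat, poi_lon)  # roughly west
x = balloon_x(b, heading=280)  # device pointing WNW: balloon drawn left of centre
```

In a real browser the accelerometer or gyroscope would additionally supply pitch and roll for the vertical placement, and the distance to the place would scale the balloon; the sketch keeps only the horizontal case.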
Standards
In the last section we spent some time discussing the different technologies that come into play to make AR possible; however, simply having the different technologies will not result in augmented reality. In order to make augmented reality feasible, different institutions (in this case Wikitude GmbH, Georgia Tech, the University of Alabama in Huntsville and CACI International) have to come to an agreement on the best ways to make all these technologies interact with each other.
In 2013, the Open Geospatial Consortium (OGC) developed and presented the Augmented Reality Markup Language 2.0 (ARML 2.0) in order to create a common platform on which to build augmented reality applications. In their specification, the OGC also uses Azuma's (1997) characteristics to define what can be understood as an Augmented Reality system.
ARML 2.0 is based on XML and uses different extensions to work with locations, fiducials and visual assets (digital data). One of the most interesting extensions explained in ARML and in the Argon AR web browser (MacIntyre, 2011) is the one created for KML, the markup for geographical information used in Google Earth and Google Maps. The extension is named KARML, and it lets the author of a new AR application specify where digital assets live in the real world. Additionally, it enables the author to position elements relative to the position of other digital assets.
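To give a flavor of what such markup looks like, here is a schematic fragment in the spirit of ARML 2.0 — the element names are simplified and illustrative, not the full OGC/GML schema — parsed with Python's standard library to show how a geolocated feature and its visual asset are described:

```python
import xml.etree.ElementTree as ET

# A simplified, illustrative ARML-style document: a Feature anchored to a
# geographic point, with a visual asset to render at that location.
ARML_SNIPPET = """
<arml>
  <ARElements>
    <Feature id="eiffel-tower">
      <name>Eiffel Tower</name>
      <Anchor type="geometry">
        <Point>
          <pos>48.8584 2.2945</pos>
        </Point>
      </Anchor>
      <VisualAsset href="balloon.png"/>
    </Feature>
  </ARElements>
</arml>
"""

root = ET.fromstring(ARML_SNIPPET)
feature = root.find("./ARElements/Feature")
lat, lon = map(float, feature.find("./Anchor/Point/pos").text.split())
asset = feature.find("VisualAsset").get("href")
```

In the actual specification the geometry anchors reuse GML elements (such as `gml:Point` and `gml:pos`), and KARML adds the relative positioning described above; the fragment here only sketches the general shape of such a document.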
So far we have focused on the technology and software that come together to make AR systems possible. All this discussion is of great importance because it makes visible all the interactions that need to happen in order to get an AR app to work well. By de-blackboxing augmented reality systems and the technologies behind them, we gain a better understanding of this technology and will be able to use it in more meaningful ways. It is important to note that this discussion does not delve into the deepest interactions of the technologies within augmented reality; however, it is a first step and a solid foundation for understanding how AR remediates our physical world and also serves as a tool for offloading our cognitive processes.
Remediation of the Real World
As we saw in the previous sections, AR systems are made by combining several different technologies, such as 3D visualization, monitors, combiners, cameras and the Global Positioning System (GPS). However, AR systems are not only a combination of these technologies, but also a combination of different media. As a new digital medium, AR has not yet been pinned down as to what it really is as a medium, and this opens up a door to experimentation. Engberg explains that once we establish that the "medium possesses essential characteristics, then it follows that the designer has little choice but to foster and develop them" (2014: 5). Several artists (Holloway-Attaway, 2014; Papagiannis, 2014; Tinnell, 2014; Samanci, 2014) have created various exhibitions where they play with the different affordances that AR provides them. One such example is the collective Manifest.AR, which has participated in different acts of protest (the Occupy Wall Street movement) and created the guerrilla-style digital art exhibition We AR in MoMA. Manifest.AR shows that AR art is capable of changing the way we experience physical space by putting up an uninvited exhibition that people could choose to see alongside other pieces of art.
Using augmented reality apps changes the way we perceive our physical space, as we use our smartphones to look at and move through not just the streets, but information. In the interactive installation [inbox] (Barba, 2014: 47) there were several containers that participants would get into, and through their handheld devices they would be able to interact with the installation. One of the interesting ways people could interact with the digital information was by moving forward or backward: by getting closer to the objects, the digital objects would zoom in and people would gain another perspective on them. These types of interaction with our physical world change how information is transmitted to us.
It is important to remember that Augmented Reality is part of a continuum between the virtual environment and the real environment, and, like virtual reality, it is meant to immerse the user in the medium. As Bolter (2000) explains, VR is a medium that is supposed to become transparent, in which people forget that they are in a mediated space. In the case of AR, we might not be immersed in a different world; however, both the digital information and the physical plane are combined in a way that gives us a collage of information and senses.
This new digital medium is also a great example of how old media such as text, TV and even the Internet are being remediated. As Bolter (2000) suggests, new media appropriate older media and adapt them. Let's take as an example the app Aisle411, an AR app used by Walgreens to help people navigate their stores. In this case, the AR app is remediating our shopping experience by showing us where the items we need are located around the store. On our smartphone's screen we get several cues on how to move around the store, represented by images of the items we are looking for, arrows pointing in the direction we need to go, and a 2D map showing the location of the user and other items. In the next section we will discuss these ideas more carefully.
Hypermediacy, Impressionism and Interfaces
As we have been discussing, new media often borrow some characteristics from old media and change others. Tinnell's (2014) argument on the emergence of the impressionist movement is genuinely interesting. He makes the point that impressionist painters created their new style as a response to the introduction of photography: photography became the peak of the representation of reality, while painting became more subjective. Tinnell (2014) continues his argument by stating that the impressionist style sought to represent the movement of real life. The point Tinnell is trying to make is that AR art is a new recreation of this style, in the sense that the digital objects that are created mesh with reality and are always interacting with the user of the AR system.
Now, moving away from Tinnell's ideas about impressionism and AR, let us go back to hypermediacy. Bolter explains that hypermediacy "offers a heterogeneous space, in which representation is conceived of not as a window on to the world, but rather as windowed itself" (2000: 74). Here I believe it is useful to bring impressionism and hypermediacy together. As Bolter explains, hypermediacy seeks to blur the lines between what we perceive as a mediated space and one that is not. Just as the impressionist style, as Tinnell pointed out, was trying to capture the movement of real life, the hypermediated aesthetic, particularly within AR apps, tries to capture the mixture of digital and real objects and blur them into one continuous experience. It is important to mention that "hypermediacy can operate even in a single and apparently unified medium, particularly when the illusion of realistic representation is somehow stretched or altogether ruptured" (Bolter, 2014: 34). In other words, even if the user can sense the difference between the real and the digital, hypermediacy is able to maintain the illusion of the immersive experience.
In the end, AR applications are remediating our environment, "augmenting" our experience with more information while we read magazines and books or simply go grocery shopping.
Distributed Cognition
After discussing how AR applications are remediating our real environment, we will move forward and discuss how this new technology is a great example of how cognitive processes can be distributed beyond our brains into different devices. The classic example from Clark (1998) is that of Otto, an Alzheimer's patient who uses a notebook to remember things he knew before or has just learned. As Clark explains, Otto creates a cognitive loop between himself and the notebook, and in doing so the notebook becomes part of his memory.
In Azuma's (1997) review of AR systems he presents several applications that show how augmented reality serves as an extension of the mind. One of the fields he discusses is manufacturing and repair. As he explains, an augmented reality application would help repair technicians in their job by overlaying a layer of information on the machine they are fixing. Normally mechanics would need either to learn everything by memory or to carry a book with all the instructions. Having the AR system would save time and space in fixing the machine.
Another such example is the AR app that Volkswagen has been testing in the last few years, which would give the user, whether a mechanic or not, the ability to fix the engine of his or her car. The AR app would recognize each part of the engine, let the user know all the information about the parts and show how to fix them. By using this type of application, the user creates a cognitive loop that allows individuals to have more information and to carry out everyday activities with more ease.
Conclusion and Questions
In the course of this paper I have discussed several characteristics of Augmented Reality applications. I started by going through the various technologies that come into play to make AR applications possible. This first part served as a starting point because it gave us a precise notion of what AR is and how it works. Here we also discussed the advances made in the last few years, such as the Augmented Reality Markup Language 2.0 (ARML 2.0), which has been developed by the OGC to provide a common platform for building AR applications in the same standard language.
Afterwards I discussed how AR applications, as new digital media, are remediating our real environment. As Bolter (2000) explains, new digital media usually remediate old media in a new space. In the case of AR, we see that several older media coexist within this new medium. The interesting thing is that AR blurs the line between the real and the digital, making the objects we see every day interactive in new ways.
Finally, AR applications also allow individuals to create cognitive loops with them, and in doing so people offload information from their cognitive processes, freeing up space in the mind. At the beginning of this paper we talked about the short film Sight. In that future (maybe not so far away), as discussed earlier, individuals have eye implants that serve as AR systems with which they play, cook and even get help on dates. As we can see in the film, everyday life is hypermediated by games, advertisements and other types of information. At the same time, the AR system lets users free up capacity to carry out other processes more easily, such as ordering food, measuring and taking care of simple tasks. I hope that by this point the importance of this new digital medium and its capabilities is clear. It is truly important to learn how these new technologies work and function in order to be able to shape them in ways that make them an extension of us, instead of us being tied to them.
Bibliography
Azuma, R. (1997). “A Survey of Augmented Reality” in Presence: Teleoperators and Virtual Environments 6 (4). Pp. 355-385.
Barba, E. (2014). "Toward a language of mixed reality in the continuity style" in Convergence 20(1). Pp. 41-54.
Bolter, J. (2000). “Remediation: Understanding New Media” MIT Press.
Clark, A. (2008). "Supersizing the Mind: Embodiment, Action, and Cognitive Extension". Oxford University Press, New York.
Clark, A.; Chalmers, D. (1998). "The Extended Mind" in Analysis 58(1). Pp. 7-19.
Engberg, M.; Bolter, J. (2014). "Cultural expression in augmented and mixed reality" in Convergence 20(1). Pp. 3-9.
Gurevitch, L. (2014). "Google Warming: Google Earth as eco-machinima" in Convergence 20(1). Pp. 85-107.
Holloway-Attaway, L. (2014). “Performing materialities: Exploring mixed media reality and Moby-Dick” in Convergence 20(1). Pp. 55-68.
Hutchins, E. (2014). "The cultural ecosystem of human cognition" in Philosophical Psychology 27(1). Pp. 34-49.
MacIntyre, B.; et al. (2001). "Augmented Reality as a New Media Experience" in ISAR '01: Proceedings of the IEEE and ACM International Symposium on Augmented Reality.
MacIntyre, B; et al. (2011). “The Argon AR Web Browser and Standards-based AR Application Environment” In ISMAR.
Papagiannis, H. (2014). “Working towards defining an aesthetics of augmented reality: A medium in transition” in Convergence 20(1). Pp. 33-40.
Tinnell, J. (2014). “Augmented reality and impressionist aesthetics” in Convergence 20(1). Pp. 69-84.
Samanci, O. (2014). “Embodied site-specific animation” in Convergence 20(1). Pp. 14-24.
Starling, A. (2014). “Invisible visualities: Augmented reality art and the contemporary media ecology” in Convergence 20(1). Pp. 25-32.