Google Assistant

Her (Jonze, 2013) installing OS1/Samantha

This week’s focus is on A.I., and specifically on virtual assistants. As a fan of cinema, sci-fi, and representations of technology in the moving image, I can’t help but think of a few examples such as Her (Jonze, 2013), A.I. Artificial Intelligence (Spielberg, 2001), Ex Machina (Garland, 2015), Blade Runner (Scott, 1982), Minority Report (Spielberg, 2002), 2001: A Space Odyssey (Kubrick, 1968), and the list goes on and on.

I must confess that I’m not a big fan of voice recognition virtual assistants. I don’t have an Amazon Echo or a Google Home, and I’ve deactivated the “listen for Hey Siri” option on my iPhone. Digging deeper into the reasons for my dislike, I’ve come to the conclusion that it has to be because I was exposed to all these dystopian films before being given the tools to actually understand how the technology works. These fictional representations often present the technologies in an exaggerated or distorted way, with some ‘truth’ at their core. Watching these films doesn’t necessarily prevent me from de-blackboxing AI or voice recognition virtual assistants, but it definitely provides a filter through which we understand not only how they work but also how users perceive and interact with them.

While reading through the Google Assistant patent, I was surprised to find that, although most of the specifications are too technical for my understanding, the main description of its use and purpose was very accessible, and even clearer than most articles that attempt to ‘unveil’ the mystery to the reader.

The patent reads:

“According to various embodiments of the present invention, intelligent automated assistant systems may be configured, designed, and/or operable to provide various different types of operations, functionalities, and/or features, and/or to combine a plurality of features, operations, and applications of an electronic device on which it is installed.”

Based on this excerpt, the patent describes the system as an intermediary between the user and the many possible outcomes and actions that are already available on the device, accessible through different modes of interaction.

If we look at the different steps in how Google Assistant works, the patent describes:

  • “…actively eliciting input from a user,
  • interpreting user intent,
  • disambiguating among competing interpretations,
  • requesting and receiving clarifying information as needed,
  • and performing (or initiating) actions based on the discerned intent.”
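
To make those steps more concrete, here is a toy sketch in Python. It is only an illustration of the sequence the patent lists (interpret, disambiguate, clarify, act), not how Google Assistant is actually built: the keyword table, the intent names, and the `ask` callback are all hypothetical, and a real assistant would use speech recognition and far richer language models instead of keyword matching.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set


@dataclass
class Interpretation:
    intent: str  # hypothetical intent label
    score: int   # number of keywords matched


# Hypothetical keyword table standing in for real language understanding.
KEYWORDS: Dict[str, Set[str]] = {
    "play_music": {"play", "song", "music"},
    "navigate": {"directions", "navigate", "route"},
    "web_search": {"search", "find", "who", "what"},
}

# Hypothetical actions standing in for apps and web services.
ACTIONS: Dict[str, Callable[[str], str]] = {
    "play_music": lambda text: f"[music app] playing for: {text}",
    "navigate": lambda text: f"[maps app] routing for: {text}",
    "web_search": lambda text: f"[search] results for: {text}",
}


def interpret(utterance: str) -> List[Interpretation]:
    """Interpret user intent: score each intent by keyword overlap."""
    words = {w.strip(".,!?") for w in utterance.lower().split()}
    candidates = [
        Interpretation(intent, len(words & keywords))
        for intent, keywords in KEYWORDS.items()
    ]
    matched = [c for c in candidates if c.score > 0]
    return sorted(matched, key=lambda c: c.score, reverse=True)


def disambiguate(
    candidates: List[Interpretation],
    ask: Callable[[List[str]], str],
) -> Optional[str]:
    """Pick one intent; when the top scores tie, ask the user to clarify."""
    if not candidates:
        return None
    best = candidates[0].score
    options = [c.intent for c in candidates if c.score == best]
    if len(options) == 1:
        return options[0]
    answer = ask(options)  # requesting clarifying information
    return answer if answer in options else None


def handle(utterance: str, ask: Callable[[List[str]], str]) -> str:
    """Run the whole sequence and perform the action for the chosen intent."""
    intent = disambiguate(interpret(utterance), ask)
    if intent is None:
        return "Sorry, I did not understand that."
    return ACTIONS[intent](utterance)
```

For example, `handle("play a song", ask)` matches only the music intent and goes straight to the action, while `handle("find music", ask)` ties between playing music and searching the web, so the sketch calls `ask` with both options before doing anything.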

Those actions can range from activating or interacting with other applications and services already on the device to using ones accessible through the Internet: it can perform a Google search on your question and provide answers, it can open Google Maps or Spotify, and it can carry out e-commerce transactions such as buying things on Amazon, among others.

Some of the language used throughout the patent's description was interesting to me. At one point it says, “[thanks to the assistant] The user can thereby be relieved of the burden of learning what functionality may be available on the device and on web-connected services, how to interface with such services to get what he or she wants, and how to interpret the output received from such services; rather, the assistant… can act as a go-between between the user and such diverse services.”

Oh, to be relieved of the burden of learning how something works. This [insert any technology here] makes life so much easier that we shouldn’t concern ourselves with the technicalities of how it works.

I will admit that the benefits of voice recognition virtual assistants are massive for different communities and fields of work. The patent describes in detail how this serves people with disabilities, as well as users who operate machinery and cannot interact with a device at the same time without shifting their attention, which could be dangerous. Beyond work, a great example is making a call or searching for something while driving.

Although all of this is true and valid, it must be acknowledged that it also opens the door to many vulnerabilities and security issues for users, as many technologies do: identity theft, e-commerce fraud, home security breaches, risks to children, scams, etc. Last year, the New York Times published an article on research from various US and Chinese universities into malicious uses of these technologies. Specifically: “Berkeley researchers published a research paper that went further, saying they could embed commands directly into recordings of music or spoken text. So while a human listener hears someone talking or an orchestra playing, Amazon’s Echo speaker might hear an instruction to add something to your shopping list.”

Therefore, there should be concern. Not the dystopian sci-fi fear of technology taking over, but concern about humans using these technologies to take advantage of users. As much as I love/hate the greatest villain in film (in my humble opinion), HAL 9000, I admit that the threat of an embedded hidden command that I cannot hear but Echo can seems exponentially more terrifying.

2001: A Space Odyssey (Kubrick, 1968). HAL 9000