Author Archives: Martin Irvine

About Martin Irvine

Martin Irvine is a professor at Georgetown University and the Founding Director of Georgetown's graduate program in Communication, Culture & Technology. He is interested in a wide range of interdisciplinary topics, including media theory, semiotics, cognitive science approaches to language and symbolic culture, computation and the Internet/Web, philosophy and intellectual history, art theory, contemporary music, vintage guitars, and all things post-postmodern.

Group Project Instructions

Group Short Research Project

The purpose of the group short research project is to get some practice in thinking collaboratively, with the methods and key concepts of the course, about real-world questions, cases, or examples involving the technologies we are studying. Why? In your careers and professional lives after CCT, you will all be working collaboratively with colleagues and teams for whom the technologies we are studying will be important and about which decisions will need to be made. You can be in a position to help “de-blackbox” (de-mystify, open up for understanding) these technologies by explaining how they are designed artefacts (not closed, “techies-only” products), and by critiquing false or misconceived views. Any steps you take in learning how to do this will enable you to take on a leadership role, rather than simply continuing to be a consumer or user who defers to the “techie” people to make decisions.

Framework and Context for Your Report

For your group project report, imagine taking on the role of a member of a committee in an organization that needs to implement a major AI/ML, data analytics, or Cloud application in its sector or domain. Explain the significance of a problem, question, or aspect of the technologies in a way that a non-technical person can understand and appreciate for making better-informed decisions. One approach would be explaining a de-blackboxing method that makes the design principles accessible, showing what the consequences of certain kinds of designs are, and/or exposing some ways that your topic has been misunderstood or misrepresented in the news, or unethically promoted as a closed corporate-branded “product” or “solution” rather than truthfully explained and described.

Post Your Report

Plan on a 15-minute report. Post your report in outline format as “talking points” or bullet points that you will discuss in class. Use any graphics or supporting content for your presentation: diagrams, images, short videos.

Cloud Architecture: extensions of design principles

Extensions of Design Principles and Directions in Network Computing

  1. Convergence of technologies: computing, digital media/data, infrastructure, software, systems
  2. Complex system design: combinatorial modules, abstraction, hierarchical levels/layers
  3. Client/Server architecture: scaling to complex levels and effectively unlimited capacity
  4. Extensible principles of the Internet and Web
  5. Distributed systems
  6. PCs and mobile devices: design logic of “local” and “remote” computation (see the sketch after this list)
  7. Mobile and smart phone design: software apps and hardware. “Native” (local) functions and remote (server, Cloud) functions
  8. Extensibility and Scalability
  9. The Cloud and implementing the “Whole Stack” of client/server local/remote computing
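To make item 6 concrete, here is a minimal sketch of the client/server pattern that the Cloud extends. It is a hypothetical, illustrative example using only Python’s standard library: the “remote” function runs in a server process (in the Cloud, in a data center), and the “local” client sends a request over the network and receives the computed result.

    import json
    import threading
    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.request import urlopen

    class ComputeHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # "Remote" computation: the server does the work the client asked for.
            result = {"answer": sum(range(100))}
            body = json.dumps(result).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the demo output quiet

    # The "server" side: in the Cloud, this runs in a remote data center.
    server = HTTPServer(("127.0.0.1", 8000), ComputeHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # The "client" side: a PC or phone app delegating computation to the server.
    with urlopen("http://127.0.0.1:8000/") as response:
        print(json.loads(response.read()))  # {'answer': 4950}

    server.shutdown()

Cloud platforms scale this same request/response logic across thousands of servers; the design principle (local request, remote computation) is unchanged.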

Goals:

Truth in explanation!

Some points on logic, philosophy, and theory for data, ML, & NLP

Data and representation: Types and Tokens

Digital representation, and the formatting of data into types, is based on the whole computing/information architecture’s design for tokenization (instances of representation) and re-tokenization (“copies,” further instances, interpretations of tokens output as additional tokens).

“Data types” are techniques for bundling, labeling, or encoding (i.e., with meta-information) bit and byte units as tokens of defined types for computability. (We must assign different kinds of computable processes to text strings in defined languages, to types of number representations, and to matrices (value arrays) used in digital images.) Each type can have unlimited token instances across the computing architecture at several levels: all forms of memory, the representations in processing units (CPUs, GPUs, and codecs + transducers), data units as Internet packets, etc.
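As a minimal sketch of this point (illustrative only, using Python’s standard library): the very same byte sequence can be tokenized under different data types, and the type, not the bits, determines which computable processes apply.

    import struct

    raw = bytes([72, 105, 33, 0])  # four bytes: just bit patterns

    as_text  = raw[:3].decode("ascii")        # typed as a text string
    as_int   = int.from_bytes(raw, "little")  # typed as a 32-bit integer
    as_float = struct.unpack("<f", raw)[0]    # typed as a 32-bit float

    print(as_text)   # 'Hi!'
    print(as_int)    # 2189640
    print(as_float)  # a small float: same bits, different interpretation

Each decode or unpack call is a re-tokenization: the stored bit pattern is interpreted and output as a new token of a defined type.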

The discourses of “Neural Nets” and “Deep Learning”

Neural nets are mathematical graphs designed to do fast statistical calculations through many layers of sorting and weighting toward a designed goal. The whole point is pattern recognition and pattern matching, based on the human perceptual inferences and pattern-recognition capacities that are part of human symbolic cognition.

We can design algorithms as pattern recognizers over data because the patterns are human-generated patterns represented in the computable tokens. Pattern-recognizing AI/ML and NLP models are thus projections from human symbolic capabilities, which give us the abilities of multi-leveled abstraction, generalization, and combining types of representations (sign and symbol systems).

“Deep Learning” is not “deep” as in ordinary language “deep/depth of knowledge,” or cumulative history of knowledge and learning. “Deep Learning” means adding many more graph layers with recursive (recurrent) pattern analysis.
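A minimal sketch can make “adding more graph layers” concrete (hypothetical, with random untrained weights and NumPy only): each layer is a weighted sum of its inputs followed by a simple nonlinearity, and “deep” just means composing more of these layers.

    import numpy as np

    rng = np.random.default_rng(0)

    def layer(x, w, b):
        # One graph layer: weighted sum of inputs, then a simple nonlinearity (ReLU).
        return np.maximum(0.0, w @ x + b)

    x = rng.normal(size=4)  # an input represented as a vector of 4 values
    layers = [(rng.normal(size=(4, 4)), rng.normal(size=4)) for _ in range(3)]

    # "Deep" = composing more layers; each pass re-weights the pattern.
    for w, b in layers:
        x = layer(x, w, b)

    print(x)  # the output after three layers of weighting and sorting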

Confused and confusing philosophy in ML

Poibeau, Conclusion

/197/ To conclude this journey, we would like to say a few words about cognitive issues. The most active researchers in the field of machine translation generally avoid addressing cognitive issues and make few parallels with the way humans perform a translation. The artificial intelligence domain has suffered from spectacular and inflated claims too much in the past, and in relation to systems that had nothing to do with the way humans think or reason. It may thus seem reasonable to focus on technological issues and leave any parallel with human behavior aside, especially because we do not, in fact, know much about the way the human brain works.

However, it may be interesting in this conclusion to have a look at cognitive issues despite what has just been said, because the evolution of the field of machine translation is arguably highly relevant from this point of view. The first systems were based on dictionaries and rules and on the assumption that it was necessary to encode all kinds of knowledge in the source and target languages in order to produce a relevant translation. This approach largely failed because information is often partial and sometimes contradictory, and knowledge is contextual and fuzzy. Moreover, no one really knows what knowledge is, or /198/ where it begins and where it ends. In other words, developing an efficient system of rules for machine translation cannot be carried out efficiently by humans, since the task is potentially infinite and it is not clear what should be encoded in practice.

Statistical systems then seemed like a good solution, since these systems are able to efficiently calculate complex contextual representations for thousands of words and expressions. This is something the brain probably does in a very different way, but nevertheless very efficiently: we have seen in chapter 2 that any language is full of ambiguities (cf. “the bank of a river” vs. “the bank that lends money”). Humans are not bothered at all by these ambiguities: most of the time we choose the right meaning in context without even considering the other meanings. In “I went to the bank to negotiate a mortgage,” it is clear that the word “bank” refers to the lending institution, and the fact that there is another meaning for “bank” is simply ignored by most humans. A computer still has to consider all options, but at least statistical systems offer interesting and efficient ways to model word senses based on the context of use.
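As a minimal illustration of the statistical idea Poibeau describes, a toy Python sketch (hand-made sense profiles, not a real NLP model) can choose the sense of an ambiguous word like “bank” by scoring its context of use:

    sense_profiles = {
        "bank/finance": {"money", "mortgage", "loan", "negotiate", "account"},
        "bank/river":   {"river", "water", "shore", "fishing", "boat"},
    }

    def disambiguate(sentence):
        words = set(sentence.lower().replace(",", "").split())
        # Score each sense by how many of its context words appear.
        scores = {sense: len(words & profile)
                  for sense, profile in sense_profiles.items()}
        return max(scores, key=scores.get)

    print(disambiguate("I went to the bank to negotiate a mortgage"))
    # -> 'bank/finance'
    print(disambiguate("We walked along the bank of the river"))
    # -> 'bank/river'

Real statistical and neural systems replace these hand-made profiles with contextual representations learned from large corpora, but the design principle is the same: the sense is selected by the context of use.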