Undoubtedly, the world is undergoing a technological revolution. This revolution is changing every aspect of modern daily life and is evident in areas such as “finance, transport, housing, food, environment, industry, health, welfare, defense, education, science, and more.” According to the readings for this class, this revolution stems from the convergence of Big Data, Cloud platforms, and AI/ML. In week six, we learned how AI/ML’s “hungry neural nets” consume massive amounts of data for pattern recognition and then apply the patterns they have learned to make predictions about new data. Last week, we dug deep into the definition and architecture of Cloud computing and identified the importance of Cloud platforms to AI/ML and data systems. This week, I will delve into the world of Big Data by explaining the key concepts of this revolutionary technology and elucidating how Big Data depends on Cloud computing for its existence.
Big Data is a relatively young term, first used in the 1990s. Similar to Cloud computing, there is no agreed academic definition of the term. The most common definition of Big Data, mentioned in Rob Kitchin’s book, “refers to handling and analysis of massive datasets” and “makes reference to the 3Vs: volume, velocity and variety.” According to these 3Vs, Big Data is “huge in volume,” “high in velocity,” and “diverse in variety in type.” For Johnson and Denning, the Big Data revolution occurred due to the “convergence of two trends: the expansion of the internet into billions of computing devices, and the digitization of almost everything,” in the sense that the internet provides access to massive amounts of data while digitization renders almost everything digital. There is a strong relationship between Big Data, AI/ML, and Cloud computing; without Cloud computing, it would be impossible for Big Data to exist. In the real world, the main Cloud service providers supply the infrastructure and services that allow AI/ML and Big Data to thrive, using the concept of convergence to combine the three in one system. Through this system, unstructured Big Data is classified, sorted, and analyzed by the hungry neural net algorithms of AI/ML, and the outputs are saved in the cheap storage offered by ubiquitous Cloud servers. From this quick analysis, we can infer that without trained AI/ML algorithms, unstructured Big Data cannot be classified and sorted, and that without the infrastructure provided by Cloud computing, AI/ML processing of Big Data cannot be carried out.
The readings for this week ranged from optimistic to pessimistic in how they view Big Data developments: socially, technologically, educationally, and in terms of applications. What really resonates with me is Cathy O’Neil’s chapter “Civilian Casualties: Justice in the Age of Big Data.” O’Neil puts forward the notion that the flawed outputs of Big Data systems trained by AI/ML algorithms can lead to inequalities in our societies, drawing examples from politics, education, and business to support her argument. The most important conclusion I drew from this chapter is that Big Data processes “codify the past” but do not invent the future; according to O’Neil, only human moral imagination can do that. She advocates for instilling human moral values into neural net algorithms so that they can produce ethical Big Data. The question here is: are the “big four” Cloud service providers able to put equality ahead of their profits?
References
Bernardo A. Huberman, “Big Data and the Attention Economy,” Ubiquity 2017 (December 2017).
Cathy O’Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy (New York: Crown, 2016).
Jeffrey Johnson, Peter Denning, et al., “Big Data, Digitization, and Social Change (Opening Statement),” Ubiquity 2017 (December 2017).
Rob Kitchin, The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences (London and Thousand Oaks, CA: SAGE Publications, 2014).