Many of the readings this week focus on the shifting definition of big data. To begin with the more literal definition of “big,” digital data that is huge in volume (consisting of terabytes or petabytes of data), high in velocity (created in or near real time), and diverse in variety (structured or unstructured in nature, and often temporally or spatially referenced) can be defined as big data. In addition, big data strives to capture entire populations or systems, making it exhaustive in scope; it aims to be as detailed as possible; it is relational in nature, allowing different datasets to be conjoined; and it is scalable, meaning it can expand in size rapidly (Big Brother). Essentially, we try to record relevant data that we can combine with other potentially relevant data, all in the hope of answering questions about populations and systems. This leads to the next definition of big data: one that is tasked with giving deep and new insights into human behavior. Under this definition, data is “big” not in its volume, velocity, or variety, but in that, in theory, huge amounts of data are available to anyone in the world over the internet (in reality, some data remains private) (Digitization).
With the goal of generating relevant insights in mind, data science steps in to produce them. Driven by practical problems, data science is required to transform big data into useful, valuable information, and it involves finding relevant data, preparing it, analyzing it, and visualizing it (Huberman). The applications-driven nature of data science means that visualization is extremely important, both for understanding the output of the applications stage and for communicating the results to clients and stakeholders. Given the sheer volume and complexity of the data, data scientists also tackle the scientific challenge of formulating methods to represent complex and entangled systems. Data scientists use big data every day to generate insights. In one example, it was discovered that people tend to tell lies on Facebook, while their Google searches reflect deep personal truths (Huberman).
In our everyday lives, we use technology in ways that add to big data and to data scientists’ work. For example, almost every time we use social media, shop online, or surf the web, data about us is being collected. Our locations, spending histories, interests, and sometimes even information about our body types are recorded. This data is sold to many different private companies in the hope of generating more insights about human behavior (and, in most cases, more money). Today, big data is used by governments, the education system, the media, the healthcare industry, and a variety of other institutions.