Big Data Has Big Problems and Even Bigger Solutions

Big Data is something I know a lot about. When the term started to become popular in psychology I was enamored, like many others, with the potential of such large data sets and the promise of finding truths that would normally be out of reach. Years later, after many studies and attempts to use such data, I find myself realizing that, hype aside, Big Data is just like any other technique we use: it gives us lots of information but not a ton of knowledge.

There are a few things about Big Data that are problematic. First is its ability to generate information and connections between two seemingly unrelated things. That ordinarily sounds amazing, until you realize that the best and most effective application of these tools to date is to market and sell ads and products to you more effectively. Amazon and YouTube are great examples of this. They know what you want to buy before you want to buy it, or the video you want to watch before you know you want to watch it. It also made it possible for Facebook to tune its algorithms to improve or depress the mood of the people who use its site. What the companies that use Big Data care about is the bottom line, which leads to the next issue.

Those who use Big Data sometimes don’t understand the ramifications of the work that they do. A study I saw a number of years ago used a data set of faces to see whether the faces of gay men could be identified. This is intriguing but also highly invasive and controversial. As a proof of concept it isn’t much of an issue, but in countries like Iran, where being gay is illegal, using a system like this to determine the likelihood that someone is gay (regardless of its true accuracy) would be terrible. Data needs to come with the responsibility to use it well, or else we will end up in scenarios where we do something we can’t easily undo and harm a large group of people.

This brings me to my last point: Big Data used responsibly takes a lot of effort. There is a new project called the Human Screenome Project, which takes a picture of what is shown on your phone every 5 seconds. It is an amazing, large data set that will reveal a lot about how people use their phones, but just parsing the millions of pictures to derive the information for analysis will take years and thousands of hours. Big Data is fantastic, but it is not some easy shortcut just because it’s there. To use it responsibly, a lot of time and effort needs to go into understanding what exactly it tells you and how to interpret what you’ve found.

Human Screenome Project