I think the most interesting thing for me this week, highlighted by the Karpathy article, is how much we need to adjust the data we feed into a system before building a machine learning model. The inputs need to be uniform, with the noise reduced, so the model can accurately capture the true answer to the problem we are trying to solve.
This reminds me most of a Roomba, or any robotic vacuum. People, myself included, will sing the praises of owning one of these devices: it keeps the floor clean on a regular schedule without external monitoring, fulfilling a role I would normally take on now that I have two quite furry companions. The catch with these vacuums, though, is that I have to make sure everything I don't want it to run into is off the floor, that there is no water around, and that there isn't any string to get caught up in its gears or motors. Essentially, I have turned my apartment into a system in which the robot vacuum can work in peace, without obstruction.
The same goes for the learning model Karpathy uses. The photos are taken and cleaned in some way to create a uniformity the system can understand and process, and anything that deviates from that uniformity is simply not captured.
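To make that "uniformity" idea concrete: I don't know the exact pipeline Karpathy used, but a toy version of this kind of cleaning step might look like the sketch below, where every image is forced to the same dimensions and pixel scale before the model ever sees it. Here an image is just a nested list of grayscale values from 0 to 255; real pipelines would use a library like PIL for this.

```python
def preprocess(image, size=4):
    """Force an image to a fixed size x size shape, pixels scaled to [0, 1].

    A toy illustration of pre-model cleaning: crop or zero-pad each
    dimension to `size`, then normalize 0-255 grayscale values.
    """
    # Crop to at most `size` rows, then pad with blank rows if short.
    width = len(image[0]) if image else 0
    rows = image[:size] + [[0] * width for _ in range(max(0, size - len(image)))]

    out = []
    for row in rows:
        # Crop/pad each row to `size` columns, then scale to [0, 1].
        row = row[:size] + [0] * max(0, size - len(row))
        out.append([px / 255 for px in row])
    return out


# A 2x2 input comes out as a uniform 4x4 grid of floats.
tiny = [[255, 0], [0, 255]]
cleaned = preprocess(tiny, size=4)
```

The point of the sketch is just that anything outside the expected shape is cropped away or padded over, which is exactly the "deviation is not captured" behavior described above.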
What does this tell us about the pictures? That the system knows how to pick out certain kinds of well-made, popular photos that people have uploaded to the internet. This is amazing! But does it teach us anything about the photos themselves, or about how to take photos? No, as these are techniques that could be learned by studying photography and design.
What worries me is not the clean practice of learning but the messiness of the world. Something has to give: will the world become more orderly to accommodate the model, or will the model eventually be strong enough to handle the messiness of the world? I am sure the answer is somewhere in between, but it will be interesting to see where on that spectrum we end up.