Is Supervised Machine Learning Standardizing Our Selfies?

Group Name: The Georgetown AI 3

Group Members: Zach Omer, Beiyuan Gu, Annaliese Blank

Alpaydin – Machine Learning Notes – Pattern Recognition and Neural Networks

Chapter 3:

  • Captcha: a corrupted image of words or numbers that must be typed to prove that the user is a human and not a computer (pg. 58).
  • Semi-parametric estimation: a model that maps the input to the output but is valid only locally; different types of inputs use different models (pg. 59).
  • Localizing data in order to increase complexity
  • It's best to use a simple method: similar inputs have similar outputs

Generative Models

  • A generative model represents our beliefs about how the data is generated (pg. 60).
  • Character recognition: identity and appearance
  • Invariance: size does not affect the identity
  • The generative model is CAUSAL and explains how the data is generated by hidden factors that cause it (pg. 62).
  • Facial recognition
  • Affective computing: adapts to the mood of the user
  • Biometrics: recognition of people by their physiological and behavioral characteristics


  • Inputs are used for decision making (pg. 73)
  • Dimensionality reduction: in learning algorithms, both the complexity of the model and the complexity of the training algorithm depend on the number of input attributes.
  • Time complexity – how much calculation we need to do
  • Space complexity – how much memory we need
  • Decreasing the number of inputs always decreases the time and space, but how much they decrease depends on the particular model and learning algorithm
  • Smaller models can be trained with less data
  • Dimensionality reduction can be achieved in two ways: feature selection and feature extraction
    • Feature selection: a process of subset selection where we want to choose the smallest subset of the input attributes that leads to maximum performance
    • Feature extraction: new features that are calculated from the original features
  • Decision trees
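
To make the difference between feature selection and feature extraction concrete, here is a minimal sketch in plain Python. The toy data, the column meanings, the correlation-based scoring, and the helper names (`pearson`, `select_features`, `extract_feature`) are all our own assumptions for illustration; Alpaydin does not prescribe a particular method.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)

def select_features(rows, labels, k):
    """Feature selection: keep the k ORIGINAL columns most correlated with the label."""
    n_cols = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], labels)) for j in range(n_cols)]
    best = sorted(range(n_cols), key=lambda j: scores[j], reverse=True)[:k]
    keep = sorted(best)
    return keep, [[r[j] for j in keep] for r in rows]

def extract_feature(rows, weights):
    """Feature extraction: compute one NEW feature as a weighted sum of the originals."""
    return [[sum(w * x for w, x in zip(weights, r))] for r in rows]

# Made-up toy data: columns are [height, width, noise]
rows = [[10.0, 5.0, 3.0], [20.0, 11.0, 1.0], [30.0, 14.0, 4.0], [40.0, 21.0, 2.0]]
labels = [0.0, 0.0, 1.0, 1.0]

kept, selected = select_features(rows, labels, k=2)          # drops the noise column
extracted = extract_feature(rows, weights=[0.5, 0.5, 0.0])   # average of height and width
```

Both routes give the model fewer inputs than the original three columns. Selection keeps a subset of the columns unchanged, while extraction replaces them with a newly calculated one.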

Chapter 4:

  • Neural Networks and Deep Learning:
    • Perceptron model
    • Learning algorithms adjust the connection weights between neurons (pg. 88).
    • Hebbian learning rule: the weight between two neurons gets reinforced if the two are active at the same time – the synaptic weight effectively learns the correlation between the two neurons
    • Error function: the sum of the differences between the actual outputs the network estimates for an input and their required values specified by the supervisor
    • If we define the state of a network as the collection of the values of all the neurons at a certain time, recurrent connections allow the current state to depend not only on the current input but also on the previous time steps, calculated from the previous inputs.
    • SIMD, MIMD
    • Simple cells vs. complex cells
  • Deep Learning
    • Deep neural networks: each hidden layer combines the values in its preceding layer and learns more complicated functions of the input.
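
The Hebbian rule, the error function, and recurrent connections from the notes above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the learning rate, the fixed recurrent weights, and the use of squared differences in the error are choices we made, not details taken from the chapter.

```python
def hebbian_update(weight, pre, post, rate=0.1):
    """Hebbian rule: the weight grows only when both neurons are active together."""
    return weight + rate * pre * post

def sum_squared_error(predicted, required):
    """Error function: sum of squared differences between outputs and required values.
    (Squaring is our assumption; it keeps positive and negative errors from cancelling.)"""
    return sum((p - r) ** 2 for p, r in zip(predicted, required))

def recurrent_step(state, x, w_in=0.5, w_rec=0.5):
    """Recurrent connection: the new state depends on the current input AND the old state."""
    return w_in * x + w_rec * state

# Hebbian learning: only the two steps where both neurons fire change the weight.
w = 0.0
for pre, post in [(1, 1), (1, 0), (0, 1), (1, 1)]:
    w = hebbian_update(w, pre, post)

# Error between what the network outputs and what the supervisor requires.
error = sum_squared_error([0.9, 0.2], [1.0, 0.0])

# Recurrence: a single input at the first time step still echoes in later states.
states = []
s = 0.0
for x in [1.0, 0.0, 0.0]:
    s = recurrent_step(s, x)
    states.append(s)
```

After the loop, `states` is `[0.5, 0.25, 0.125]`: the input from the first time step fades but keeps influencing later states, which is what the notes mean by the state depending on previous time steps.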

In relation to the Karpathy article, this piece helped us understand how the data we produce and generate through a “selfie” can be unpacked and mechanically understood from an IT standpoint. 


Case Study – Karpathy – What a Deep Neural Network Thinks of Your #Selfie (notes)

Convolutional Neural Networks

  • The house numbers and street signs in the graphic remind me of those “prove you’re not a robot” activities that you have to do when logging in or creating an account on certain sites. Are those just collecting human input to enhance their ConvNet algorithms?!
  • ConvNets are
    • Simple (one operation, repeated a lot)
    • Fast (image processing in tens of milliseconds)
    • Effective (and function similarly to the visual cortex in our own brains!)
    • A large collection of filters that are applied on top of each other
      • Initialized randomly, and trained over time
        • “Resembles showing a child many images of things, and him/her having to gradually figure out what to look for in the images to tell those things apart.”
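
The "one operation, repeated a lot" is sliding a small filter over the image. A rough sketch in plain Python follows, with the caveat that the filter here is set by hand for illustration, whereas a real ConvNet starts with random filters and learns them during training. The tiny image and the function names are our own.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products.
    Strictly this is cross-correlation, which is what ConvNet layers compute."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(feature_map):
    """Keep positive responses and zero out the rest."""
    return [[max(0, v) for v in row] for row in feature_map]

# A 4x4 "image": dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-made filter that responds to a dark-to-bright vertical edge.
vertical_edge = [[-1, 1], [-1, 1]]

feature_map = relu(convolve2d(image, vertical_edge))
```

The resulting `feature_map` is `[[0, 2, 0], [0, 2, 0], [0, 2, 0]]`: the filter only "lights up" in the middle column, where the edge is. Stacking many such filters on top of each other is what lets later layers respond to more complicated patterns.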

The Selfie Experiment

  • Interesting experiment, but are all of these really selfies?
    • Full body portraits, couples photographed by a third party, mirror selfies (are they the same as front-facing camera selfies?), soccer players (?), and King Joffrey from Game of Thrones
    • Was this human or computer error? Also, why was Joffrey ranked lower than the soccer players? His forehead is even cut off in the shot (which was one of the tips)!
  • Didn’t give a working definition of selfie in the article
  • Didn’t need an algorithm for most of the selfie advice (be female, put a filter/border on it, don’t take in low lighting, don’t frame your head too large, etc.)
    • Is this natural human bias showing through in the implementation of an algorithm that ranks selfies (a very human idea)?
  • Reflections on supervised learning
    • Supervised learning is learning in which we train the machine using data that is already labeled. By this definition, the selfie experiment is an application of supervised learning, as the researcher fed the machine 2 million selfies that were pre-labeled as “good” or “bad” by his criterion.
    • In our opinion, supervised learning is not well suited to this experiment because “good” or “bad” is an ambiguous concept, which makes it difficult to sort 2 million selfies into these two categories. We believe the classification of the training data for supervised learning should be uncontroversial. For example, if you are planning to train a machine to recognize a toad, you need to feed it a great number of pictures, each of which shows either a toad or a non-toad. But in the selfie experiment, the classification of selfies as “good” or “bad” is not convincing. It is based on individual judgement, which is highly uncertain. Also, as pointed out above, there are some errors in the data. This makes us question the reliability of supervised learning in cases where the training data is not well classified. And if it is true that “the more training data, the better,” does the quality of the data matter?
    • As we see it, unsupervised learning would be better for this selfie experiment. Unsupervised learning is a type of machine learning used to draw inferences from datasets consisting of input data without labeled responses.

In this case, unsupervised learning can help to find interesting patterns in selfies so that we can see the distinctions between different clusters. It would be more inspiring.
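
As a rough idea of what finding clusters without labels could look like, here is a minimal k-means sketch in plain Python. It assumes each selfie has already been boiled down to two made-up numbers (say, brightness and how much of the frame the face fills). Both those features and the choice of k-means are ours for illustration, and nothing like this appears in Karpathy's article.

```python
def nearest(point, centers):
    """Index of the center closest to the point (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centers]
    return dists.index(min(dists))

def kmeans(points, centers, iterations=10):
    """Plain k-means: assign points to the nearest center, then move each
    center to the mean of its points. No labels are used anywhere."""
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        centers = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    labels = [nearest(p, centers) for p in points]
    return centers, labels

# Hypothetical (brightness, face_fraction) values for six selfies.
selfies = [(0.9, 0.2), (0.8, 0.3), (0.85, 0.25), (0.2, 0.7), (0.3, 0.8), (0.25, 0.75)]

# Start from two of the points as initial centers (a simple, deterministic choice).
centers, labels = kmeans(selfies, centers=[selfies[0], selfies[3]])
```

Here the algorithm separates the bright, small-face selfies from the dark, close-up ones without anyone deciding in advance which group is "good" or "bad"; interpreting the clusters is left to us.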


  • “takes some number of things (e.g. images in our case) and lays them out in such way that nearby things are similar”
  • Visualized pattern recognition
  • Reminded of artwork that is made up of smaller images (mosaic art)

  • t-SNE is a kind of unsupervised learning.
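
For readers who want to see how "nearby things are similar" is made precise, the standard t-SNE formulation can be summarized as below. This is our own summary, not something taken from Karpathy's post. Here x_i are the original high-dimensional items (for example, image features), y_i are their positions in the 2D layout, N is the number of items, and sigma_i is a per-point bandwidth set through the perplexity parameter.

```latex
\begin{align*}
% Similarity of item j to item i in the original space (Gaussian kernel)
p_{j|i} &= \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
               {\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)} \\
% Symmetrized similarity
p_{ij} &= \frac{p_{j|i} + p_{i|j}}{2N} \\
% Similarity of the same pair in the 2D layout (Student-t kernel)
q_{ij} &= \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}} \\
% Cost minimized by moving the layout points y_i
C &= \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
\end{align*}
```

The layout is found by moving the points y_i to make C small, so pairs that are similar in the original data (large p_ij) end up close together on the map. No labels appear anywhere in these formulas, which is why t-SNE counts as unsupervised.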

Concluding Thought(s)

  • Everyone has their unique way of taking selfies. It’s a manifestation of our personality, our digital presence, our insecurities, our “brand.” While it’s fun to run algorithmic tests for pattern recognition and even to collect information on different ways of taking selfies, if a computer starts dictating what makes a selfie ‘good’ (a subjective term to begin with) we’re taking steps toward standardizing a semi-unique form of expression in the Digital Age. If everyone’s selfies start looking the same in an effort to ‘look better’ or get more likes, the world will lose some of its charm.
  • Can facial recognition security really be trusted if there are tens, hundreds, or thousands (for some) of our selfies out there on the web being data-mined for their facial properties? Maybe so, but that seems more accessible to hackers or identity thieves than fingerprints or passwords at this point in the Digital Age. 



Alpaydin, E. (2016). Machine learning: the new AI. Cambridge, MA: MIT Press.
Karpathy, A. (2015, October 25). What a Deep Neural Network thinks about your #selfie [Personal Blog]. Retrieved from