Analysis of Karpathy’s Article Key Points – Heba Khashogji

Machine Learning (ML) and Deep Learning (DL) can be used to analyze a tremendous number of images, extract useful information, and make decisions about them (Machine_Learning&Artificial_Intelligence, 2017). Typical applications include classifying emails, recommending videos, predicting diseases, and recognizing handwriting ((LAB):CrashCourseAi#5, 2019). ML gives computers the ability to extract high-level understanding from digital images (CrashCourseComputerScience#35, 2017).

Convolutional neural networks (ConvNets) were demonstrated as early as 1993, but they only came into widespread use in 2012, thanks to the development of GPUs and the massive increase in data sizes (ImageNet, for example) (Karpathy, 2015).

A ConvNet takes a 256x256x3 image as input and produces a probability for each output class; the class with the highest probability is chosen. At each layer, the ConvNet convolves its input with filters, picking up information such as edges and colors (CrashCourseComputerScience#35, 2017). Deeper layers extract increasingly complex features. During training, the filters are initialized randomly and adjusted until the network learns to match each image with its correct class (Karpathy, 2015). Training a deep network is complicated and takes much more time than training traditional models. Still, the accuracy is much better, thanks to deep networks’ ability to handle massive data (Alpaydin, 2016).
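To make the two ideas above concrete, here is a minimal sketch in plain Python. It is not Karpathy's code: the tiny 4x4 image, the hand-written edge filter, and the two raw class scores are all made up for illustration, and a real ConvNet would learn its filters from data rather than have them written by hand.

```python
import math

def conv2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products.

    This is the sliding-window operation a ConvNet layer applies,
    written for nested Python lists.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy grayscale image: dark left half, bright right half.
image = [[0, 0, 1, 1] for _ in range(4)]

# Hand-written vertical-edge filter (an assumption for illustration).
edge_filter = [[-1, 1],
               [-1, 1]]

# The feature map responds only where dark meets bright.
feature_map = conv2d(image, edge_filter)

# Hypothetical raw scores for two classes, e.g. "good" and "bad".
probs = softmax([2.0, 0.5])
predicted = max(range(len(probs)), key=probs.__getitem__)
```

The filter produces a strong response only in the column where the image changes from dark to bright, which is the sense in which early layers "detect edges". The last two lines show the final step: scores become probabilities, and the class with the highest probability wins.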

Karpathy’s ConvNet for Classifying Selfie Images

Karpathy applied the following key steps to classify selfies as good or bad:

  1. Gathering images: About 5 million images tagged with #selfie were collected.
  2. Organizing the dataset: Karpathy built a dataset of 1 million good and 1 million bad selfies, taking into account factors such as how many people had seen the selfie, the number of likes, the number of followers, and the number of tags. Selfies were ranked in groups of 100; within each group, the top 50 by likes were labeled good and the bottom 50 bad.
  3. Training: Karpathy took a pre-trained VGGNet model and used Caffe to fine-tune it on the collected selfie dataset. The ConvNet tuned its filters to best separate good selfies from bad ones, following the well-known approach of supervised learning (Dougherty, 2013).
  4. Results: From 50,000 test selfies, the author showed the 100 that the ConvNet ranked highest. Based on these results, he offered advice for taking a good selfie, such as being female, letting the face occupy about 30% of the image, cutting off the forehead, and showing long hair. He concluded that the style of the image was the key to a good selfie.
  5. Extensions: The author also carried out three further experiments. The first ranked celebrities’ selfies. Although the advice above generally held, counterexamples, such as selfies of men and selfies with illumination problems, appeared among the best ones. The second applied the t-SNE algorithm, which lays images out so that similar ones, here measured by the L2 distance between their ConvNet representations, end up close together. The results showed clusters such as selfies with sunglasses, full-body shots, and mirror selfies. The third searched for the best crop of a selfie: Karpathy took random crops of each image and fed them to the ConvNet, which picked the best one. He found that the ConvNet prefers crops in which the head takes up about 30% of the image and the forehead is chopped off.

In some cases, the ConvNet selected “rude” crops that cut another person out of the picture. Karpathy also tried inserting a spatial transformer layer before the ConvNet and backpropagating into the six parameters that define an arbitrary crop. This extension did not work well: the optimization sometimes got stuck. Constraining the transform did not help either. The good news is that when the transform has only three bounded parameters, a global search for the best crop is affordable (Karpathy, 2015).
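A crop described by three bounded parameters (left edge, top edge, and side length of a square) can be searched exhaustively on a coarse grid. The sketch below shows one way to do that in plain Python. It is an illustration under stated assumptions, not Karpathy's implementation: `best_crop`, its grid settings, and `toy_score` are hypothetical, and `toy_score` merely stands in for the ConvNet's "goodness" score.

```python
from itertools import product

def best_crop(width, height, score_fn, sizes=(0.5, 0.75, 1.0), steps=5):
    """Score every square crop on a coarse grid and keep the best one.

    The three searched parameters are left, top, and side.
    Returns (score, (left, top, side)). Requires steps >= 2.
    """
    best = None
    short = min(width, height)
    for frac in sizes:
        side = int(short * frac)
        # Evenly spaced offsets that keep the crop inside the image.
        lefts = sorted({k * (width - side) // (steps - 1) for k in range(steps)})
        tops = sorted({k * (height - side) // (steps - 1) for k in range(steps)})
        for left, top in product(lefts, tops):
            score = score_fn(left, top, side)
            if best is None or score > best[0]:
                best = (score, (left, top, side))
    return best

def toy_score(left, top, side):
    """Stand-in for the ConvNet: favors crops of side 50 centered near (60, 40)."""
    cx, cy = left + side / 2, top + side / 2
    return -(abs(cx - 60) + abs(cy - 40)) - abs(side - 50)

score, crop = best_crop(100, 100, toy_score)
```

Because every candidate is scored, the search cannot get stuck the way gradient-based optimization can; the price is one scoring call per grid point, which stays manageable only while the number of parameters is small.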

  6. Availability: Anyone on Twitter can send a selfie to the “deepselfie” bot designed by Karpathy to have it analyzed and get a score for how good it is.
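The labeling in step 2 can be sketched in a few lines of plain Python. This is a rough illustration only: the field names `followers` and `likes` are hypothetical, and the rule used here (sort by follower count so each group of 100 has a comparable audience, rank each group by likes, call the top half good) is an assumption about how the grouping works rather than a transcription of Karpathy's code.

```python
def label_selfies(selfies, group_size=100):
    """Label selfies as good (1) or bad (0) relative to their group.

    Selfies are sorted by follower count, split into consecutive groups,
    and each group is ranked by likes; the top half is labeled good.
    """
    ordered = sorted(selfies, key=lambda s: s["followers"])
    labeled = []
    for start in range(0, len(ordered), group_size):
        group = sorted(ordered[start:start + group_size],
                       key=lambda s: s["likes"], reverse=True)
        half = len(group) // 2
        for rank, selfie in enumerate(group):
            labeled.append({**selfie, "label": 1 if rank < half else 0})
    return labeled

# Made-up data: 200 selfies with synthetic follower and like counts.
selfies = [{"id": i, "followers": i, "likes": (i * 7) % 100} for i in range(200)]
labeled = label_selfies(selfies)
good = [s for s in labeled if s["label"] == 1]
```

Ranking within groups of similar audience size, instead of using raw like counts, keeps accounts with many followers from dominating the "good" class simply because more people saw their pictures.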

References:

(LAB):CrashCourseAi#5. (2019). Retrieved from YouTube: https://www.youtube.com/watch?list=PL8dPuuaLjXtO65LeD2p4_Sb5XQ51par_b&t=67&v=6nGCGYWMObE&feature=youtu.be

Alpaydin, E. (2016). Machine Learning: The New AI. Cambridge: Massachusetts Institute of Technology.

CrashCourseComputerScience#35. (2017). Retrieved from YouTube: https://www.youtube.com/watch?v=-4E2-0sxVUM

Dougherty, G. (2013). Pattern Recognition and Classification. New York: Springer Science+Business Media.

Karpathy, A. (2015). What a Deep Neural Network thinks about your #selfie. Retrieved 2020, from https://karpathy.github.io/2015/10/25/selfie/

Machine_Learning&Artificial_Intelligence. (2017). Machine Learning & Artificial Intelligence. Retrieved from YouTube: https://www.youtube.com/watch?v=z-EtmaFJieY&t=2s

