Deep Learning and Image Classification
One of the areas of machine intelligence most dramatically disrupted by the deep learning revolution is computer vision. For decades, the field relied on carefully handcrafted features to improve the accuracy of algorithms, developing a rich theory and thousands of highly domain-specific algorithms. With deep learning this has changed: given the right conditions, many computer vision tasks no longer require such careful feature crafting. One such task is image classification: teaching a machine to recognize the category of an image from a given taxonomy.
How hard is image classification, really? In 2013, Kaggle launched a competition to classify pictures of cats and dogs, providing 12,500 images of each. According to this paper, state-of-the-art algorithms were expected to reach an accuracy of around 80%. It turned out that the accuracy achieved with deep learning was over 98%. How is that even possible?
ImageNet: where it all started
One of the earliest successes of deep learning came in the ImageNet challenge. The ImageNet dataset is a huge image library with over 1000 classes, curated at the initiative of Fei-Fei Li, from the University of Illinois at Urbana-Champaign. Launched in 2010, the ImageNet challenge is a competition in which researchers use this dataset to evaluate the quality of their algorithms. Around 2011, the error rate was 25%. In 2012, a deep learning architecture known as AlexNet reduced the error rate to 16%. The architecture of this network has proven very successful and has been reused over and over in different domains. It is also possible to fine-tune the trained network to adapt it to your application, an approach known as transfer learning, so that you don’t need to train a network from scratch every time!
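To make the notion of "error rate" concrete: ImageNet challenge results are usually quoted as top-5 error, the fraction of images whose true class is not among the model's five highest-scoring guesses. The following is a minimal sketch of that metric, assuming the figures above are top-5 errors (the text does not say so explicitly). The function name `top_k_error` and the scores are made up for illustration.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of examples whose true label is NOT among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("need at least one example")
    misses = 0
    for class_scores, true_label in zip(scores, labels):
        # Class indices sorted from highest to lowest score for this image.
        ranked = sorted(range(len(class_scores)),
                        key=lambda c: class_scores[c], reverse=True)
        if true_label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


# Hypothetical scores for 4 images over 6 classes (illustrative numbers only).
scores = [
    [0.10, 0.50, 0.20, 0.05, 0.12, 0.03],
    [0.30, 0.04, 0.25, 0.20, 0.15, 0.06],
    [0.02, 0.08, 0.10, 0.40, 0.25, 0.15],
    [0.05, 0.10, 0.15, 0.20, 0.22, 0.28],
]
labels = [1, 2, 0, 5]

print("top-1 error:", top_k_error(scores, labels, k=1))
print("top-5 error:", top_k_error(scores, labels, k=5))
```

Top-5 error is always at most top-1 error, which is one reason it is the more forgiving number to quote for a task with many visually similar classes.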
Of course, transfer learning is still in active development, and cloud providers have a conflict of interest of sorts: they would like you to believe that you absolutely need their infrastructure. Nonetheless, we expect transfer learning to prevail in the long term and become more commonplace.
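The core idea of reusing a trained network can be sketched without any deep learning library: keep the feature extractor frozen and train only a small new "head" for your own task. The toy example below is an illustration under strong assumptions, not a real fine-tuning recipe. The function `frozen_features` is a hypothetical stand-in for the layers of a pretrained network, and the dark-versus-bright "images" are synthetic.

```python
import math
import random


def frozen_features(pixels):
    """Hypothetical stand-in for a pretrained feature extractor; it is never trained."""
    half = len(pixels) // 2
    mean = sum(pixels) / len(pixels)
    contrast = sum(pixels[:half]) / half - sum(pixels[half:]) / (len(pixels) - half)
    return [mean, contrast]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(features, labels, epochs=500, lr=1.0):
    """Fit a new logistic-regression head by batch gradient descent on the log loss."""
    weights, bias = [0.0] * len(features[0]), 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(features, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            grad_w = [g + error * xi for g, xi in zip(grad_w, x)]
            grad_b += error
        # Only the head's parameters are updated; the extractor stays fixed.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, pixels):
    score = sum(w * f for w, f in zip(weights, frozen_features(pixels))) + bias
    return 1 if score > 0 else 0


# Synthetic "new task": 16-pixel images, dark (label 0) versus bright (label 1).
random.seed(0)
images = [[random.uniform(0.0, 0.4) for _ in range(16)] for _ in range(20)]
images += [[random.uniform(0.6, 1.0) for _ in range(16)] for _ in range(20)]
labels = [0] * 20 + [1] * 20

weights, bias = train_head([frozen_features(img) for img in images], labels)
accuracy = sum(predict(weights, bias, img) == y
               for img, y in zip(images, labels)) / len(labels)
print(f"training accuracy of the new head: {accuracy:.2f}")
```

In a real setting, the frozen part would be the convolutional layers of a network trained on a large dataset such as ImageNet. Because only the small head is trained, adapting the network needs far less data and compute than training from scratch.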
One of the most fascinating applications of computer vision and deep learning is autonomous driving. In a recent article published on arXiv, NVIDIA researchers describe an end-to-end autonomous driving system. The resulting network architecture, a convolutional neural network (CNN) called PilotNet, is fed data collected on a real vehicle driven by a human. The data consists of steering angles and video images of the road. The motivation was to eliminate the need for hand-coded rules in the driving system, as the system is able to derive the necessary domain knowledge from the raw data. One striking feature is that the car is able to stay in the correct lane even when there are no lane markings. The development was done on an NVIDIA DevBox, using Torch 7 for training, and an NVIDIA DRIVE PX self-driving car computer for driving. Once the network is trained, the car computer captures an image from the video feed and returns the corresponding steering angle.
Of course, NVIDIA is not alone. A startup called drive.ai, founded by deep learning experts from Stanford University’s Artificial Intelligence Laboratory, is also working on the development of a completely autonomous vehicle, integrating deep learning into the design from the beginning.
In our book, R Deep Learning Projects, we show an example of image recognition applied to traffic sign detection. This is of course very important for autonomous driving, alongside many other capabilities: counting the number of pedestrians and, most importantly, correctly estimating not only the current speed and position of an object but also its most likely speed and position in the near future.
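To see how such a network turns a camera frame into a single steering value, it helps to trace the image size through the convolutional layers. The sketch below assumes the layer configuration commonly cited from NVIDIA's published description of its end-to-end network: a 66x200 input, three 5x5 convolutions with stride 2, and two 3x3 convolutions with stride 1, all unpadded. These numbers are not given in the text above and should be treated as an assumption.

```python
def conv_output(size, kernel, stride):
    """Output length of an unpadded ('valid') convolution along one dimension."""
    return (size - kernel) // stride + 1


# (filters, kernel size, stride) per convolutional layer.
# Assumed from NVIDIA's published architecture description, not from this article.
CONV_LAYERS = [(24, 5, 2), (36, 5, 2), (48, 5, 2), (64, 3, 1), (64, 3, 1)]


def trace_shapes(height=66, width=200, channels=3):
    """Return the (channels, height, width) shape after each convolutional layer."""
    shapes = [(channels, height, width)]
    for filters, kernel, stride in CONV_LAYERS:
        height = conv_output(height, kernel, stride)
        width = conv_output(width, kernel, stride)
        shapes.append((filters, height, width))
    return shapes


shapes = trace_shapes()
for shape in shapes:
    print(shape)

# Number of values handed to the fully connected layers after flattening.
channels, height, width = shapes[-1]
flattened = channels * height * width
print("flattened features:", flattened)
```

Under these assumptions the frame shrinks from 66x200 pixels to a 1x18 map with 64 channels. A few fully connected layers then reduce that to one number, the steering command. Training minimizes the difference between this output and the steering recorded from the human driver.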
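As a worked illustration of estimating a future position, the simplest possible baseline assumes that an object keeps moving at its current velocity. The sketch below is only that baseline, with hypothetical function names and made-up coordinates. Real driving systems use far richer motion models.

```python
def estimate_velocity(prev_position, position, dt):
    """Velocity (vx, vy) in units per second from two positions taken dt seconds apart."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    return ((position[0] - prev_position[0]) / dt,
            (position[1] - prev_position[1]) / dt)


def predict_position(position, velocity, horizon):
    """Extrapolate a position `horizon` seconds ahead, assuming constant velocity."""
    return (position[0] + velocity[0] * horizon,
            position[1] + velocity[1] * horizon)


# Hypothetical pedestrian observed twice, 0.5 s apart (coordinates in metres).
previous = (10.0, 2.0)
current = (10.0, 1.5)

velocity = estimate_velocity(previous, current, dt=0.5)
future = predict_position(current, velocity, horizon=2.0)

print("estimated velocity:", velocity)
print("predicted position in 2 s:", future)
```

Here the pedestrian moves one metre per second toward the road, so two seconds later the baseline places them at y = -0.5. A position like this is what a planner would need to know before the pedestrian actually gets there.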
Autonomous driving for the poor man
You may not have a ton of data at hand, and maybe not even a car on which to run experiments. But that does not mean you should miss out on the fun. Udacity recently open-sourced their autonomous car simulator, in which you can train your own car to drive! The simulator is built in Unity, so you need to install Unity first and be somewhat familiar with it to retrieve the data. But once this is done, it takes neither a lot of code nor a lot of time to start developing your own self-driving car, at least virtually. You can also use training data from Grand Theft Auto V to create your own self-driving algorithm.
Many other simulation tools are available. For instance, OpenAI has open-sourced a simple car racing environment, CarRacing-v0.