The project
A neural network learned how to drive a car by observing how I drive! :) I must say that it's one of the coolest projects I have ever done. Udacity provided a simulator in which you drive a car on two tracks for a while to collect training data. Each sample consists of a steering angle and images from three front-facing cameras.
Then, in the autonomous driving mode, you are given an image from the central camera and must send back an appropriate steering angle, such that the car does not go off-track.
An elegant solution to this problem was described in a paper by nVidia from April 2016. I managed to replicate it in the simulator. Not without issues, though. The key takeaways for me were:
- The importance of balancing the training data, so that no range of steering angles is over-represented.
- The importance of randomly jittering the input images. To quote another paper: "ConvNets architectures have built-in invariance to small translations, scaling and rotations. When a dataset does not naturally contain those deformations, adding them synthetically will yield more robust learning to potential deformations in the test set." Both techniques are sketched in the code after this list.
- Not over-using dropout.
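Here's a minimal sketch of the first two points, assuming a NumPy array of steering angles and OpenCV-style images. The function names, the bin count, the per-bin cap, the shift range, and the steering correction per pixel are illustrative values picked for the example, not the exact ones I used.

```python
import numpy as np
import cv2

def balance_samples(steering, num_bins=25, max_per_bin=200, rng=None):
    """Return indices such that no steering-angle bin dominates the data."""
    rng = rng or np.random.default_rng()
    bins = np.linspace(-1.0, 1.0, num_bins + 1)
    keep = []
    for lo, hi in zip(bins[:-1], bins[1:]):
        idx = np.where((steering >= lo) & (steering < hi))[0]
        if len(idx) > max_per_bin:
            idx = rng.choice(idx, max_per_bin, replace=False)
        keep.extend(idx)
    return np.array(keep)

def jitter(image, angle, max_shift=40, angle_per_px=0.004, rng=None):
    """Shift the image horizontally by a random amount and correct the angle."""
    rng = rng or np.random.default_rng()
    dx = rng.uniform(-max_shift, max_shift)
    h, w = image.shape[:2]
    M = np.float32([[1, 0, dx], [0, 1, 0]])
    return cv2.warpAffine(image, M, (w, h)), angle + dx * angle_per_px
```

Shifting the camera image sideways while nudging the steering target is a cheap way to simulate the car drifting off-center and recovering, which the raw recordings barely contain.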
The model needed 35 epochs of training, each consisting of 24 batches of 2048 images with on-the-fly jittering. Processing one epoch took 104 seconds on Amazon's p2.xlarge instance and 826 seconds on my laptop: what took an hour on a Tesla K80 GPU would have taken my laptop over eight hours.
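To give an idea of how this fits together, here is a rough sketch of the training setup, assuming TensorFlow/Keras and reusing the `jitter` helper from the sketch above. `load_image`, `train_paths`, and `train_angles` are placeholders rather than my actual code, and the layer stack follows the NVIDIA paper (five convolutional layers feeding fully connected layers down to a single steering output) rather than my exact model.

```python
import cv2
import numpy as np
from tensorflow.keras import layers, models

def load_image(path):
    # Read a frame and resize it to the 66x200 input used in the NVIDIA paper.
    return cv2.resize(cv2.imread(path), (200, 66))

def batch_generator(paths, angles, batch_size=2048, rng=None):
    # Endless generator: every batch is sampled and jittered on the fly.
    rng = rng or np.random.default_rng()
    while True:
        idx = rng.choice(len(paths), batch_size)
        images, targets = [], []
        for i in idx:
            img, ang = jitter(load_image(paths[i]), angles[i])
            images.append(img)
            targets.append(ang)
        yield np.array(images), np.array(targets)

def build_model():
    model = models.Sequential([
        layers.Lambda(lambda x: x / 127.5 - 1.0, input_shape=(66, 200, 3)),
        layers.Conv2D(24, 5, strides=2, activation='relu'),
        layers.Conv2D(36, 5, strides=2, activation='relu'),
        layers.Conv2D(48, 5, strides=2, activation='relu'),
        layers.Conv2D(64, 3, activation='relu'),
        layers.Conv2D(64, 3, activation='relu'),
        layers.Flatten(),
        layers.Dense(100, activation='relu'),
        layers.Dense(50, activation='relu'),
        layers.Dense(10, activation='relu'),
        layers.Dense(1),
    ])
    model.compile(optimizer='adam', loss='mse')
    return model

model = build_model()
model.fit(batch_generator(train_paths, train_angles),
          steps_per_epoch=24, epochs=35)
```

Generating batches on the fly keeps memory usage flat and means no two epochs see exactly the same images.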
Results
Below are some sample results. The driving is not very smooth, but I blame that on myself not being a good driving model ;) The second track is especially interesting, because it differs from the one that the network was trained on. Interestingly enough, a MacBook Air did not have enough juice to run both the simulator and the model, even though the model is fairly small. I ended up having to create an ssh tunnel to my Linux laptop.