Autonomous Flight
In this work, we consider the task of depth estimation from a single image, with
the target application of obstacle avoidance and path planning for single-camera
equipped quadrotors. We use previously available training data, the NYU2 data
set, comprising a large number of images of indoor scenes and their corresponding
depth maps, and also assemble our own training data by walking around the ESAT
building with a Kinect sensor. We experiment with two feature extraction pipelines.
The first uses traditional dense-SIFT features, extracted from within uniform image
regions called superpixels; these features are aggregated using
Fisher vector pooling. The second uses the currently popular deep CNN features,
also extracted from within superpixels. Using these features and an SVM classifier,
superpixels are classified into close, medium, and far. We demonstrate 80% accuracy
on the NYU2 data set, and achieve up to 99% accuracy when trained on the ESAT
data set and tested on the held-out parts of the same data set. In a practical
deployment, it is feasible to scan an environment once, train our algorithm on those
images and depth maps, and then use the trained SVMs to allow a quadrotor to fly
around that environment with a single forward-facing camera.
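The labelling scheme above can be sketched as follows. This is a minimal illustration, not the method itself: it uses a uniform grid of cells as a stand-in for superpixels, and the depth thresholds separating close, medium, and far are hypothetical, since the abstract does not state the actual cut-offs.

```python
import numpy as np

# Hypothetical depth thresholds in metres; the real cut-offs used
# to define the close/medium/far classes are not given in the text.
CLOSE_MAX = 1.5
MEDIUM_MAX = 3.0

def label_cells(depth, rows=4, cols=4):
    """Split a depth map into a uniform grid of cells (a simple
    stand-in for superpixels) and label each cell close, medium,
    or far according to its mean depth."""
    h, w = depth.shape
    labels = np.empty((rows, cols), dtype=object)
    for i in range(rows):
        for j in range(cols):
            cell = depth[i * h // rows:(i + 1) * h // rows,
                         j * w // cols:(j + 1) * w // cols]
            m = cell.mean()
            if m < CLOSE_MAX:
                labels[i, j] = "close"
            elif m < MEDIUM_MAX:
                labels[i, j] = "medium"
            else:
                labels[i, j] = "far"
    return labels
```

In the full pipeline, these per-region labels would serve as training targets for the SVM, which predicts them from dense-SIFT or CNN features rather than from the depth map itself.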