Camera Based Object Detection for Indoor Scenes

Abstract

This master's thesis describes a practical implementation of a deep learning framework for object detection on a self-collected multiclass dataset. The work covers data collection, labelling, preprocessing, and the training of popular object detection architectures. The challenges of collecting a multiclass object detection dataset from indoor premises, and of annotating it, are presented together with possible solutions. The trained object detectors are evaluated in terms of precision, recall, F1-score, mean average precision (mAP), and processing speed. We experimented with multiple object detection architectures available in the TensorFlow object detection model zoo. The multiclass dataset collected from the indoor premises was used to train and evaluate modern convolutional object detection models. We studied two scenarios: (a) pretrained object detection models and (b) detection models fine-tuned on the self-collected multiclass dataset. The fine-tuned object detectors performed better than the pretrained detectors. In our experiments, region-based convolutional neural network architectures achieved superior detection accuracy on our dataset. The Faster region-based convolutional neural network (Faster R-CNN) architecture with a residual network feature extractor had the best detection accuracy. Single shot multibox detector (SSD) models were comparatively less precise in detection; however, they are faster in computation and easier to deploy on mobile and embedded devices. The region-based fully convolutional network (R-FCN) was found to be a suitable alternative for multiclass object detection when considering the speed/accuracy trade-off.
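As an illustration of the evaluation metrics named in the abstract, the following minimal sketch shows how precision, recall, and F1-score could be computed for one class in one image by matching detections to ground-truth boxes on intersection over union (IoU). It is not the thesis's evaluation code. The function names, the (x_min, y_min, x_max, y_max) box format, the greedy matching, and the 0.5 IoU threshold are assumptions made for the example. A full mAP computation would additionally rank detections by confidence and average over classes.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(
    detections: List[Box],
    ground_truth: List[Box],
    iou_threshold: float = 0.5,  # assumed threshold, not taken from the thesis
) -> Tuple[float, float, float]:
    """Greedily match each detection to at most one unmatched ground-truth box."""
    matched = set()
    true_positives = 0
    for det in detections:
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            score = iou(det, gt)
            if score > best_iou:
                best_iou, best_idx = score, idx
        if best_idx >= 0 and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1

    false_positives = len(detections) - true_positives
    false_negatives = len(ground_truth) - true_positives
    precision = true_positives / (true_positives + false_positives) if detections else 0.0
    recall = true_positives / (true_positives + false_negatives) if ground_truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


# Hypothetical example: three detections, two ground-truth objects.
example_detections = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
example_ground_truth = [(0, 0, 10, 10), (21, 21, 31, 31)]
p, r, f1 = precision_recall_f1(example_detections, example_ground_truth)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

In this made-up example, two of the three detections overlap a ground-truth box sufficiently, so one detection counts as a false positive and no ground-truth object is missed.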
