3 research outputs found

    Camera Based Object Detection for Indoor Scenes

    This master's thesis describes a practical implementation of a deep learning framework for object detection on a self-collected multiclass dataset. The work covers data collection, labelling, preprocessing, and the training of popular object detection architectures. The challenges of collecting a multiclass object detection dataset from indoor premises and of the annotation process are presented with possible solutions. The performance of the trained object detectors is measured in terms of precision, recall, F1-score, mAP, and processing speed. We experimented with multiple object detection architectures that were available in the TensorFlow object detection model zoo. The multiclass dataset collected from indoor premises was used to train and evaluate modern convolutional object detection models. We studied two scenarios: (a) a pretrained object detection model and (b) a detection model fine-tuned on the self-collected multiclass dataset. The fine-tuned object detectors performed better than the pretrained detectors. In our experiments, region-based convolutional neural network architectures had superior detection accuracy on our dataset. The Faster region-based convolutional neural network (Faster R-CNN) architecture with a residual network feature extractor had the best detection accuracy. Single shot multi-box detector (SSD) models are comparatively less precise in detection; however, they are faster in computation and easier to deploy on mobile and embedded devices. The region-based fully convolutional network (R-FCN) was found to be a suitable alternative for multiclass object detection when the speed/accuracy trade-off is considered.
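As a rough illustration of the precision, recall, and F1-score metrics named in the abstract above, the following minimal sketch scores detections for a single class in a single image. The abstract does not state how predictions were matched to ground truth, so the greedy matching, the IoU threshold of 0.5, the `(x_min, y_min, x_max, y_max)` box format, and the function names `iou` and `detection_scores` are all assumptions made for illustration, not the thesis's evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_scores(predictions, ground_truth, iou_threshold=0.5):
    """Return (precision, recall, f1) for one class in one image.

    predictions: list of (confidence, box) tuples.
    ground_truth: list of boxes.
    Assumed scheme: predictions are visited in descending confidence and
    each ground-truth box can be matched at most once.
    """
    matched = set()
    true_positives = 0
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1

    false_positives = len(predictions) - true_positives
    false_negatives = len(ground_truth) - true_positives
    detected = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    preds = [(0.9, (0, 0, 10, 10)), (0.8, (20, 20, 30, 30)), (0.3, (50, 50, 60, 60))]
    truth = [(0, 0, 10, 10), (21, 21, 31, 31)]
    print(detection_scores(preds, truth))
```

In this toy input, two of the three predictions overlap a ground-truth box by more than the threshold and the third matches nothing, so it counts as a false positive. mAP would additionally require sweeping the confidence threshold and averaging over classes, which this sketch does not attempt.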

    Computer Assisted Image Labeling for Object Detection Using Deep Learning

    Deep learning-based object detectors have shown outstanding performance, with state-of-the-art results on public benchmarks. However, they typically consist of millions of parameters and require a large number of training samples to tune these parameters appropriately. These samples are labeled by human annotators, which is a tedious, time-consuming, and expensive process. Moreover, object detectors have high computational costs in both the training and inference phases. This dissertation considers these two aspects of training and deploying deep learning object detectors. First, we study data labeling for the training phase and the robustness of object detectors to label noise. We classify possible label noise scenarios in 2D object detection and study the sensitivity of one-stage object detectors to label noise in the training phase. We then propose methods for efficient bounding box annotation based on human-machine collaboration, and we conduct extensive experiments to identify an efficient and effective bounding box annotation scheme for deep learning object detectors. Additionally, we created an easy-to-use, medium-sized, multiclass, fully labeled object detection dataset from indoor premises and released it publicly for registration-free use. Second, we study the practical problem of deploying object detection networks on resource-limited devices, with an efficient implementation for applications such as facial analysis, human detection and tracking, and path prediction of mobile objects. We implemented object detection in an image processing pipeline that integrates with other tasks for multiple applications, and we studied the optimal design process. We present the details of a system-level design that incorporates a multitasking network efficiently within a suitable system architecture.
