6 research outputs found

    DC-SPP-YOLO: Dense Connection and Spatial Pyramid Pooling Based YOLO for Object Detection

    Although the YOLOv2 approach is extremely fast at object detection, its backbone network has limited feature-extraction ability and fails to make full use of multi-scale local region features, which restricts improvement in detection accuracy. This paper therefore proposes DC-SPP-YOLO (Dense Connection and Spatial Pyramid Pooling Based YOLO), an approach for improving the object detection accuracy of YOLOv2. Specifically, dense connection of convolution layers is employed in the YOLOv2 backbone network to strengthen feature extraction and alleviate the vanishing-gradient problem. Moreover, an improved spatial pyramid pooling is introduced to pool and concatenate the multi-scale local region features, so that the network can learn object features more comprehensively. The DC-SPP-YOLO model is established and trained with a new loss function composed of mean square error and cross entropy. Experiments demonstrate that the mAP (mean Average Precision) of DC-SPP-YOLO on the PASCAL VOC and UA-DETRAC datasets is higher than that of YOLOv2; by strengthening feature extraction and using multi-scale local region features, DC-SPP-YOLO achieves better object detection accuracy than YOLOv2.
    Comment: 23 pages, 9 figures, 9 tables
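
    The "pool and concatenate" step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the kernel sizes (5, 9, 13), the stride of 1, the border handling, and the names max_pool_same and spp_block are all assumptions chosen for illustration, and plain Python lists stand in for tensors.

```python
from typing import List, Sequence

FeatureMap = List[List[float]]  # a single channel, indexed [row][col]


def max_pool_same(fmap: FeatureMap, k: int) -> FeatureMap:
    """Stride-1 max pooling with an odd k x k window; output keeps the input size."""
    if k % 2 == 0:
        raise ValueError("k must be odd so the window can be centred")
    h, w, r = len(fmap), len(fmap[0]), k // 2
    out: FeatureMap = []
    for i in range(h):
        row = []
        for j in range(w):
            # Clip the window at the borders (equivalent to padding with -inf).
            window = [
                fmap[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def spp_block(
    channels: List[FeatureMap],
    kernel_sizes: Sequence[int] = (5, 9, 13),  # assumed values, not from the abstract
) -> List[FeatureMap]:
    """Concatenate the input channels with their pooled copies along the channel axis."""
    out = list(channels)
    for k in kernel_sizes:
        out.extend(max_pool_same(c, k) for c in channels)
    return out


if __name__ == "__main__":
    fmap = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
    pooled = spp_block([fmap], kernel_sizes=(3,))
    print(len(pooled))  # channel count: the input plus one pooled copy
    print(pooled[1])
```

    Because every pooling branch keeps the spatial size, the branches can be stacked channel-wise, so each position carries features summarised over several neighbourhood sizes.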

    Pelee: A Real-Time Object Detection System on Mobile Devices

    There has been rising interest in running high-quality Convolutional Neural Network (CNN) models under strict constraints on memory and computational budget. A number of efficient architectures have been proposed in recent years, for example MobileNet, ShuffleNet, and NASNet-A. However, all of these architectures depend heavily on depthwise separable convolution, which lacks an efficient implementation in most deep learning frameworks. Meanwhile, few studies combine efficient models with fast object detection algorithms. This research explores the design of an efficient CNN architecture for both image classification and object detection tasks. We propose an efficient architecture named PeleeNet, which is built with conventional convolution instead. On the ImageNet ILSVRC 2012 dataset, our proposed PeleeNet achieves 0.6% higher accuracy and 11% lower computational cost than MobileNet, the state-of-the-art efficient architecture. It is also important to point out that PeleeNet is only 66% of the model size of MobileNet and 1/49 the size of VGG. We then propose a real-time object detection system on mobile devices. We combine PeleeNet with the Single Shot MultiBox Detector (SSD) method and optimize the architecture for speed. We also port SSD to iOS and provide an optimized code implementation. Our proposed detection system, named Pelee, achieves 70.9% mAP on the PASCAL VOC2007 dataset at 17 FPS on iPhone 6s and 23.6 FPS on iPhone 8. Compared to TinyYOLOv2, the most widely used computationally efficient object detection system, Pelee is more accurate (70.9% vs. 57.1%), 2.88 times lower in computational cost, and 2.92 times smaller in model size.
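
    For context on the trade-off the abstract mentions, the sketch below compares the multiply-accumulate (MAC) counts of a conventional convolution and a depthwise separable one, using the standard textbook counting formulas. It is not taken from the paper; the function names and the example layer shape are hypothetical, and bias terms are ignored.

```python
def standard_conv_macs(h: int, w: int, c_in: int, c_out: int, k: int) -> int:
    """MACs for a conventional k x k convolution producing an h x w x c_out output."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h: int, w: int, c_in: int, c_out: int, k: int) -> int:
    """MACs for a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 56 x 56 output, 128 -> 128 channels, 3 x 3 kernel.
    shape = dict(h=56, w=56, c_in=128, c_out=128, k=3)
    std = standard_conv_macs(**shape)
    sep = depthwise_separable_macs(**shape)
    print(f"standard: {std:,} MACs")
    print(f"depthwise separable: {sep:,} MACs")
    print(f"ratio: {sep / std:.3f}")  # equals 1/c_out + 1/k**2
```

    The theoretical saving is large, but as the abstract notes, the measured speed depends on how efficiently a framework implements the depthwise step, which is the motivation given for building PeleeNet from conventional convolutions.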