    Object Detection in 20 Years: A Survey

    Object detection, as one of the most fundamental and challenging problems in computer vision, has received great attention in recent years. Its development over the past two decades can be regarded as an epitome of computer vision history. If we think of today's object detection as a technical aesthetics under the power of deep learning, then turning back the clock 20 years we would witness the wisdom of the cold weapon era. This paper extensively reviews 400+ papers on object detection in the light of its technical evolution, spanning more than a quarter century (from the 1990s to 2019). A number of topics are covered in this paper, including the milestone detectors in history, detection datasets, metrics, fundamental building blocks of the detection system, speed-up techniques, and the recent state-of-the-art detection methods. This paper also reviews some important detection applications, such as pedestrian detection, face detection, and text detection, and makes an in-depth analysis of their challenges as well as technical improvements in recent years. Comment: This work has been submitted to the IEEE TPAMI for possible publication.

    Facial component-landmark detection with weakly-supervised LR-CNN

    © 2013 IEEE. In this paper, we propose a weakly supervised landmark-region-based convolutional neural network (LR-CNN) framework to detect facial components and landmarks simultaneously. Most existing coarse-to-fine facial detectors fail to detect landmarks accurately without large amounts of fully labeled data, which are costly to obtain. We can handle the task with a small amount of finely labeled data. First, deep convolutional generative adversarial networks are used to generate training samples with weak labels, as data preparation. Then, through weakly supervised learning, our LR-CNN model can be trained effectively with a small amount of finely labeled data and a large amount of generated weakly labeled data. Notably, our approach can handle cases where large occluded areas occur, as we localize visible facial components before predicting the corresponding landmarks. Detecting unblocked components first helps us focus on the informative area, resulting in better performance. Additionally, to improve performance on the above tasks, we design two models as follows: 1) we add AnchorAlign to the region proposal networks to accurately localize components and 2) we propose a two-branch model consisting of a classification branch and a regression branch to detect landmarks. Extensive evaluations on benchmark datasets indicate that our proposed approach is able to complete the multi-task facial detection and outperforms the state-of-the-art facial component and landmark detection algorithms.

    Recent advances in deep learning for object detection

    Object detection is a fundamental visual recognition problem in computer vision and has been widely studied in the past decades. Visual object detection aims to find objects of certain target classes with precise localization in a given image and assign each object instance a corresponding class label. Due to the tremendous successes of deep-learning-based image classification, object detection techniques using deep learning have been actively studied in recent years. In this paper, we give a comprehensive survey of recent advances in visual object detection with deep learning. By reviewing a large body of recent related work in the literature, we systematically analyze the existing object detection frameworks and organize the survey into three major parts: (i) detection components, (ii) learning strategies, and (iii) applications & benchmarks. In the survey, we cover in detail a variety of factors affecting detection performance, such as detector architectures, feature learning, proposal generation, and sampling strategies. Finally, we discuss several future directions to facilitate and spur future research on visual object detection with deep learning. Keywords: Object Detection, Deep Learning, Deep Convolutional Neural Network

    VSSA-NET: Vertical Spatial Sequence Attention Network for Traffic Sign Detection

    Although traffic sign detection has been studied for years and great progress has been made with the rise of deep learning techniques, many problems remain to be addressed. For complicated real-world traffic scenes, there are two main challenges. First, traffic signs are usually small objects, which makes them more difficult to detect than large ones. Second, without context information it is hard to distinguish false targets that resemble real traffic signs in complex street scenes. To handle these problems, we propose a novel end-to-end deep learning method for traffic sign detection in complex environments. Our contributions are as follows: 1) We propose a multi-resolution feature fusion network architecture which exploits densely connected deconvolution layers with skip connections, and can learn more effective features for small objects; 2) We frame traffic sign detection as a spatial sequence classification and regression task, and propose a vertical spatial sequence attention (VSSA) module to gain more context information for better detection performance. To comprehensively evaluate the proposed method, we conduct experiments on several traffic sign datasets as well as a general object detection dataset, and the results show the effectiveness of our proposed method.