4,950 research outputs found

    Object Detection in 20 Years: A Survey

    Full text link
    Object detection, as of one the most fundamental and challenging problems in computer vision, has received great attention in recent years. Its development in the past two decades can be regarded as an epitome of computer vision history. If we think of today's object detection as a technical aesthetics under the power of deep learning, then turning back the clock 20 years we would witness the wisdom of cold weapon era. This paper extensively reviews 400+ papers of object detection in the light of its technical evolution, spanning over a quarter-century's time (from the 1990s to 2019). A number of topics have been covered in this paper, including the milestone detectors in history, detection datasets, metrics, fundamental building blocks of the detection system, speed up techniques, and the recent state of the art detection methods. This paper also reviews some important detection applications, such as pedestrian detection, face detection, text detection, etc, and makes an in-deep analysis of their challenges as well as technical improvements in recent years.Comment: This work has been submitted to the IEEE TPAMI for possible publicatio

    Sequential Optimization for Efficient High-Quality Object Proposal Generation

    Full text link
    We are motivated by the need for a generic object proposal generation algorithm which achieves good balance between object detection recall, proposal localization quality and computational efficiency. We propose a novel object proposal algorithm, BING++, which inherits the virtue of good computational efficiency of BING but significantly improves its proposal localization quality. At high level we formulate the problem of object proposal generation from a novel probabilistic perspective, based on which our BING++ manages to improve the localization quality by employing edges and segments to estimate object boundaries and update the proposals sequentially. We propose learning the parameters efficiently by searching for approximate solutions in a quantized parameter space for complexity reduction. We demonstrate the generalization of BING++ with the same fixed parameters across different object classes and datasets. Empirically our BING++ can run at half speed of BING on CPU, but significantly improve the localization quality by 18.5% and 16.7% on both VOC2007 and Microhsoft COCO datasets, respectively. Compared with other state-of-the-art approaches, BING++ can achieve comparable performance, but run significantly faster.Comment: Accepted by TPAM

    ELASTIC: Improving CNNs with Dynamic Scaling Policies

    Full text link
    Scale variation has been a challenge from traditional to modern approaches in computer vision. Most solutions to scale issues have a similar theme: a set of intuitive and manually designed policies that are generic and fixed (e.g. SIFT or feature pyramid). We argue that the scaling policy should be learned from data. In this paper, we introduce ELASTIC, a simple, efficient and yet very effective approach to learn a dynamic scale policy from data. We formulate the scaling policy as a non-linear function inside the network's structure that (a) is learned from data, (b) is instance specific, (c) does not add extra computation, and (d) can be applied on any network architecture. We applied ELASTIC to several state-of-the-art network architectures and showed consistent improvement without extra (sometimes even lower) computation on ImageNet classification, MSCOCO multi-label classification, and PASCAL VOC semantic segmentation. Our results show major improvement for images with scale challenges. Our code is available here: https://github.com/allenai/elasticComment: CVPR 2019 oral, code available https://github.com/allenai/elasti

    Sequential optimization for efficient high-quality object proposal generation

    Full text link
    We are motivated by the need for a generic object proposal generation algorithm which achieves good balance between object detection recall, proposal localization quality and computational efficiency. We propose a novel object proposal algorithm, BING ++, which inherits the virtue of good computational efficiency of BING [1] but significantly improves its proposal localization quality. At high level we formulate the problem of object proposal generation from a novel probabilistic perspective, based on which our BING++ manages to improve the localization quality by employing edges and segments to estimate object boundaries and update the proposals sequentially. We propose learning the parameters efficiently by searching for approximate solutions in a quantized parameter space for complexity reduction. We demonstrate the generalization of BING++ with the same fixed parameters across different object classes and datasets. Empirically our BING++ can run at half speed of BING on CPU, but significantly improve the localization quality by 18.5 and 16.7 percent on both VOC2007 and Microhsoft COCO datasets, respectively. Compared with other state-of-the-art approaches, BING++ can achieve comparable performance, but run significantly faster
    • …
    corecore