4,950 research outputs found
Object Detection in 20 Years: A Survey
Object detection, as of one the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed up techniques, and the recent state of the art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc, and makes an in-deep
analysis of their challenges as well as technical improvements in recent years.Comment: This work has been submitted to the IEEE TPAMI for possible
publicatio
Sequential Optimization for Efficient High-Quality Object Proposal Generation
We are motivated by the need for a generic object proposal generation
algorithm which achieves good balance between object detection recall, proposal
localization quality and computational efficiency. We propose a novel object
proposal algorithm, BING++, which inherits the virtue of good computational
efficiency of BING but significantly improves its proposal localization
quality. At high level we formulate the problem of object proposal generation
from a novel probabilistic perspective, based on which our BING++ manages to
improve the localization quality by employing edges and segments to estimate
object boundaries and update the proposals sequentially. We propose learning
the parameters efficiently by searching for approximate solutions in a
quantized parameter space for complexity reduction. We demonstrate the
generalization of BING++ with the same fixed parameters across different object
classes and datasets. Empirically our BING++ can run at half speed of BING on
CPU, but significantly improve the localization quality by 18.5% and 16.7% on
both VOC2007 and Microhsoft COCO datasets, respectively. Compared with other
state-of-the-art approaches, BING++ can achieve comparable performance, but run
significantly faster.Comment: Accepted by TPAM
ELASTIC: Improving CNNs with Dynamic Scaling Policies
Scale variation has been a challenge from traditional to modern approaches in
computer vision. Most solutions to scale issues have a similar theme: a set of
intuitive and manually designed policies that are generic and fixed (e.g. SIFT
or feature pyramid). We argue that the scaling policy should be learned from
data. In this paper, we introduce ELASTIC, a simple, efficient and yet very
effective approach to learn a dynamic scale policy from data. We formulate the
scaling policy as a non-linear function inside the network's structure that (a)
is learned from data, (b) is instance specific, (c) does not add extra
computation, and (d) can be applied on any network architecture. We applied
ELASTIC to several state-of-the-art network architectures and showed consistent
improvement without extra (sometimes even lower) computation on ImageNet
classification, MSCOCO multi-label classification, and PASCAL VOC semantic
segmentation. Our results show major improvement for images with scale
challenges. Our code is available here: https://github.com/allenai/elasticComment: CVPR 2019 oral, code available https://github.com/allenai/elasti
Sequential optimization for efficient high-quality object proposal generation
We are motivated by the need for a generic object proposal generation algorithm which achieves good balance between object detection recall, proposal localization quality and computational efficiency. We propose a novel object proposal algorithm, BING ++, which inherits the virtue of good computational efficiency of BING [1] but significantly improves its proposal localization quality. At high level we formulate the problem of object proposal generation from a novel probabilistic perspective, based on which our BING++ manages to improve the localization quality by employing edges and segments to estimate object boundaries and update the proposals sequentially. We propose learning the parameters efficiently by searching for approximate solutions in a quantized parameter space for complexity reduction. We demonstrate the generalization of BING++ with the same fixed parameters across different object classes and datasets. Empirically our BING++ can run at half speed of BING on CPU, but significantly improve the localization quality by 18.5 and 16.7 percent on both VOC2007 and Microhsoft COCO datasets, respectively. Compared with other state-of-the-art approaches, BING++ can achieve comparable performance, but run significantly faster
- …