End-to-End Integration of a Convolutional Network, Deformable Parts Model and Non-Maximum Suppression
Deformable Parts Models and Convolutional Networks have each achieved notable
performance in object detection. Yet these two approaches find their strengths
in complementary areas: DPMs are well-versed in object composition, modeling
fine-grained spatial relationships between parts; likewise, ConvNets are adept
at producing powerful image features, having been discriminatively trained
directly on the pixels. In this paper, we propose a new model that combines
these two approaches, obtaining the advantages of each. We train this model
using a new structured loss function that considers all bounding boxes within
an image, rather than isolated object instances. This enables the non-maximal
suppression (NMS) operation, previously treated as a separate post-processing
stage, to be integrated into the model. This allows for discriminative training
of our combined ConvNet + DPM + NMS model in end-to-end fashion. We evaluate
our system on the PASCAL VOC 2007 and 2011 datasets, achieving competitive results
on both benchmarks.
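For readers unfamiliar with the NMS step mentioned above, the following is a minimal sketch of the conventional greedy, post-processing form of non-maximum suppression, which is the stage the abstract describes folding into the model. It is not the paper's integrated, trainable formulation. The function names, the `[x1, y1, x2, y2]` box layout, and the 0.5 IoU threshold are assumptions made for illustration.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    # Clip at zero so non-overlapping boxes get zero intersection area.
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first (hypothetical helper)."""
    order = np.argsort(scores)[::-1]  # candidate indices, best score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Drop every remaining box that overlaps the kept box too strongly.
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(greedy_nms(boxes, scores))
```

Because each box is kept or suppressed in isolation from the training objective, this stage is normally applied only at test time; a loss defined over all boxes in an image, as the abstract proposes, is what lets it be trained jointly.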
Deformable Part-based Fully Convolutional Network for Object Detection
Existing region-based object detectors are limited to regions with fixed box
geometry to represent objects, even if those are highly non-rectangular. In
this paper we introduce DP-FCN, a deep model for object detection which
explicitly adapts to shapes of objects with deformable parts. Without
additional annotations, it learns to focus on discriminative elements and to
align them, and simultaneously brings more invariance for classification and
geometric information to refine localization. DP-FCN is composed of three main
modules: a Fully Convolutional Network to efficiently maintain spatial
resolution, a deformable part-based RoI pooling layer to optimize positions of
parts and build invariance, and a deformation-aware localization module
explicitly exploiting displacements of parts to improve accuracy of bounding
box regression. We experimentally validate our model and show significant
gains. DP-FCN achieves state-of-the-art performances of 83.1% and 80.9% on
PASCAL VOC 2007 and 2012 with VOC data only. Comment: Accepted to BMVC 2017 (oral)
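The idea of letting parts shift inside a region to "optimize positions of parts" can be illustrated with a small sketch. This is not DP-FCN's actual layer: it assumes a single 2-D score map rather than per-part feature channels, an exhaustive search over integer displacements, and a quadratic deformation penalty weighted by a hypothetical parameter `lam`. The function name and all parameters are assumptions, and the region is assumed to lie inside the map.

```python
import numpy as np

def deformable_part_pool(score_map, roi, k=3, max_disp=2, lam=0.1):
    """Pool a k x k grid of parts inside roi = (x1, y1, x2, y2).

    Each part bin may shift by up to max_disp pixels; the shift maximizing
    (mean score in the shifted bin) - lam * (dx^2 + dy^2) is selected.
    Returns the pooled values and the chosen (dx, dy) for each part.
    """
    x1, y1, x2, y2 = roi
    h, w = score_map.shape
    bin_w = (x2 - x1) / k
    bin_h = (y2 - y1) / k
    pooled = np.full((k, k), -np.inf)
    disp = np.zeros((k, k, 2), dtype=int)
    for i in range(k):          # part row
        for j in range(k):      # part column
            bx1 = int(round(x1 + j * bin_w))
            bx2 = int(round(x1 + (j + 1) * bin_w))
            by1 = int(round(y1 + i * bin_h))
            by2 = int(round(y1 + (i + 1) * bin_h))
            for dy in range(-max_disp, max_disp + 1):
                for dx in range(-max_disp, max_disp + 1):
                    xa, xb = bx1 + dx, bx2 + dx
                    ya, yb = by1 + dy, by2 + dy
                    # Skip shifts that leave the map or give an empty bin.
                    if xa < 0 or ya < 0 or xb > w or yb > h or xa >= xb or ya >= yb:
                        continue
                    val = score_map[ya:yb, xa:xb].mean() - lam * (dx * dx + dy * dy)
                    if val > pooled[i, j]:
                        pooled[i, j] = val
                        disp[i, j] = (dx, dy)
    return pooled, disp

# A single part (k=1) is pulled toward a nearby high-scoring patch.
score_map = np.zeros((8, 8))
score_map[3:5, 3:5] = 10.0
pooled, disp = deformable_part_pool(score_map, (2, 2, 4, 4), k=1)
print(pooled[0, 0], disp[0, 0])
```

The chosen displacements are themselves informative: the abstract's deformation-aware localization module uses such part shifts as an extra signal for bounding box regression.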
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers on object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and the recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years. Comment: This work has been submitted to the IEEE TPAMI for possible
publication.