End-to-End Integration of a Convolutional Network, Deformable Parts Model and Non-Maximum Suppression
Deformable Parts Models (DPMs) and Convolutional Networks (ConvNets) have each achieved notable
performance in object detection. Yet these two approaches find their strengths
in complementary areas: DPMs are well-versed in object composition, modeling
fine-grained spatial relationships between parts; ConvNets, in turn, are adept
at producing powerful image features, having been discriminatively trained
directly on the pixels. In this paper, we propose a new model that combines
these two approaches, obtaining the advantages of each. We train this model
using a new structured loss function that considers all bounding boxes within
an image, rather than isolated object instances. This enables the non-maximum
suppression (NMS) operation, previously treated as a separate post-processing
stage, to be integrated into the model, which in turn allows discriminative training
of our combined ConvNet + DPM + NMS model in an end-to-end fashion. We evaluate
our system on the PASCAL VOC 2007 and 2011 datasets, achieving competitive results
on both benchmarks.
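
For readers unfamiliar with NMS, the following is a minimal sketch of the conventional greedy variant, applied as a separate post-processing step after detection. It is not the integrated formulation proposed in the paper, whose details the abstract does not give. The function names, the (x1, y1, x2, y2) box format, and the 0.5 overlap threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, ordered from highest to lowest score.

    Assumed illustrative interface: `boxes` is a list of (x1, y1, x2, y2)
    tuples and `scores` is a list of detection confidences.
    """
    # Visit candidates from the most to the least confident detection.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
```

In this example the second box overlaps heavily with the higher-scoring first box and is suppressed, while the third box, which overlaps with neither, is kept. Because each box is kept or discarded by a hard decision made outside the model, this step is normally excluded from training, which is the separation the abstract describes.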