Article thumbnail

Learning Detection with Diverse Proposals

By Samaneh Azadi, Jiashi Feng and Trevor Darrell

Abstract

To predict a set of diverse and informative proposals with enriched representations, this paper introduces a differentiable Determinantal Point Process (DPP) layer that is able to augment the object detection architectures. Most modern object detection architectures, such as Faster R-CNN, learn to localize objects by minimizing deviations from the ground-truth but ignore correlation between multiple proposals and object categories. Non-Maximum Suppression (NMS) as a widely used proposal pruning scheme ignores label- and instance-level relations between object candidates resulting in multi-labeled detections. In the multi-class case, NMS selects boxes with the largest prediction scores ignoring the semantic relation between categories of potential election. In contrast, our trainable DPP layer, allowing for Learning Detection with Diverse Proposals (LDDP), considers both label-level contextual information and spatial layout relationships between proposals without increasing the number of parameters of the network, and thus improves location and category specifications of final detected bounding boxes substantially during both training and inference schemes. Furthermore, we show that LDDP keeps it superiority over Faster R-CNN even if the number of proposals generated by LDPP is only ~30% as many as those for Faster R-CNN.Comment: Accepted to CVPR 201

Topics: Computer Science - Computer Vision and Pattern Recognition
Year: 2017
DOI identifier: 10.1109/cvpr.2017.779
OAI identifier: oai:arXiv.org:1704.03533

To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.

Suggested articles