4,718 research outputs found
What makes for effective detection proposals?
Current top performing object detectors employ detection proposals to guide
the search for objects, thereby avoiding exhaustive sliding window search
across images. Despite the popularity and widespread use of detection
proposals, it is unclear which trade-offs are made when using them during
object detection. We provide an in-depth analysis of twelve proposal methods
along with four baselines regarding proposal repeatability, ground truth
annotation recall on PASCAL, ImageNet, and MS COCO, and their impact on DPM,
R-CNN, and Fast R-CNN detection performance. Our analysis shows that for object
detection improving proposal localisation accuracy is as important as improving
recall. We introduce a novel metric, the average recall (AR), which rewards
both high recall and good localisation and correlates surprisingly well with
detection performance. Our findings show common strengths and weaknesses of
existing methods, and provide insights and metrics for selecting and tuning
proposal methods.Comment: TPAMI final version, duplicate proposals removed in experiment
How good are detection proposals, really?
Current top performing Pascal VOC object detectors employ detection proposals
to guide the search for objects thereby avoiding exhaustive sliding window
search across images. Despite the popularity of detection proposals, it is
unclear which trade-offs are made when using them during object detection. We
provide an in depth analysis of ten object proposal methods along with four
baselines regarding ground truth annotation recall (on Pascal VOC 2007 and
ImageNet 2013), repeatability, and impact on DPM detector performance. Our
findings show common weaknesses of existing methods, and provide insights to
choose the most adequate method for different settings
Semantic Image Retrieval via Active Grounding of Visual Situations
We describe a novel architecture for semantic image retrieval---in
particular, retrieval of instances of visual situations. Visual situations are
concepts such as "a boxing match," "walking the dog," "a crowd waiting for a
bus," or "a game of ping-pong," whose instantiations in images are linked more
by their common spatial and semantic structure than by low-level visual
similarity. Given a query situation description, our architecture---called
Situate---learns models capturing the visual features of expected objects as
well the expected spatial configuration of relationships among objects. Given a
new image, Situate uses these models in an attempt to ground (i.e., to create a
bounding box locating) each expected component of the situation in the image
via an active search procedure. Situate uses the resulting grounding to compute
a score indicating the degree to which the new image is judged to contain an
instance of the situation. Such scores can be used to rank images in a
collection as part of a retrieval system. In the preliminary study described
here, we demonstrate the promise of this system by comparing Situate's
performance with that of two baseline methods, as well as with a related
semantic image-retrieval system based on "scene graphs.
Sequential Optimization for Efficient High-Quality Object Proposal Generation
We are motivated by the need for a generic object proposal generation
algorithm which achieves good balance between object detection recall, proposal
localization quality and computational efficiency. We propose a novel object
proposal algorithm, BING++, which inherits the virtue of good computational
efficiency of BING but significantly improves its proposal localization
quality. At high level we formulate the problem of object proposal generation
from a novel probabilistic perspective, based on which our BING++ manages to
improve the localization quality by employing edges and segments to estimate
object boundaries and update the proposals sequentially. We propose learning
the parameters efficiently by searching for approximate solutions in a
quantized parameter space for complexity reduction. We demonstrate the
generalization of BING++ with the same fixed parameters across different object
classes and datasets. Empirically our BING++ can run at half speed of BING on
CPU, but significantly improve the localization quality by 18.5% and 16.7% on
both VOC2007 and Microhsoft COCO datasets, respectively. Compared with other
state-of-the-art approaches, BING++ can achieve comparable performance, but run
significantly faster.Comment: Accepted by TPAM
Sequential optimization for efficient high-quality object proposal generation
We are motivated by the need for a generic object proposal generation algorithm which achieves good balance between object detection recall, proposal localization quality and computational efficiency. We propose a novel object proposal algorithm, BING ++, which inherits the virtue of good computational efficiency of BING [1] but significantly improves its proposal localization quality. At high level we formulate the problem of object proposal generation from a novel probabilistic perspective, based on which our BING++ manages to improve the localization quality by employing edges and segments to estimate object boundaries and update the proposals sequentially. We propose learning the parameters efficiently by searching for approximate solutions in a quantized parameter space for complexity reduction. We demonstrate the generalization of BING++ with the same fixed parameters across different object classes and datasets. Empirically our BING++ can run at half speed of BING on CPU, but significantly improve the localization quality by 18.5 and 16.7 percent on both VOC2007 and Microhsoft COCO datasets, respectively. Compared with other state-of-the-art approaches, BING++ can achieve comparable performance, but run significantly faster
- …