DeepBox: Learning Objectness with Convolutional Networks
Existing object proposal approaches use primarily bottom-up cues to rank
proposals, while we believe that objectness is in fact a high-level construct.
We argue for a data-driven, semantic approach for ranking object proposals. Our
framework, which we call DeepBox, uses convolutional neural networks (CNNs) to
rerank proposals from a bottom-up method. We use a novel four-layer CNN
architecture that is as good as much larger networks on the task of evaluating
objectness while being much faster. We show that DeepBox significantly improves
over the bottom-up ranking, achieving the same recall with 500 proposals as
achieved by bottom-up methods with 2000. This improvement generalizes to
categories the CNN has never seen before and leads to a 4.5-point gain in
detection mAP. Our implementation achieves this performance while running at
260 ms per image.
Comment: ICCV 2015 Camera-ready versio
Object-Proposal Evaluation Protocol is 'Gameable'
Object proposals have quickly become the de facto pre-processing step in a
number of vision pipelines (for object detection, object discovery, and other
tasks). Their performance is usually evaluated on partially annotated datasets.
In this paper, we argue that the choice of using a partially annotated dataset
for evaluation of object proposals is problematic -- as we demonstrate via a
thought experiment, the evaluation protocol is 'gameable', in the sense that
progress under this protocol does not necessarily correspond to a "better"
category-independent object proposal algorithm.
To alleviate this problem, we: (1) introduce a nearly-fully annotated version
of the PASCAL VOC dataset, which serves as a test-bed to check if object proposal
techniques are overfitting to a particular list of categories; (2) perform an
exhaustive evaluation of object proposal methods on our introduced nearly-fully
annotated PASCAL dataset and perform cross-dataset generalization experiments;
and (3) introduce a diagnostic experiment to detect the bias capacity in an
object proposal algorithm. This tool circumvents the need to collect a densely
annotated dataset, which can be expensive and cumbersome to collect. Finally,
we plan to release an easy-to-use toolbox that combines various publicly
available implementations of object proposal algorithms and standardizes
proposal generation and evaluation, so that new methods can be added and
evaluated on different datasets. We hope that the results presented in the
paper will motivate the community to test the category independence of various
object proposal methods by carefully choosing the evaluation protocol.
Comment: 15 pages, 11 figures, 4 table
Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning
Object category localization is a challenging problem in computer vision.
Standard supervised training requires bounding box annotations of object
instances. This time-consuming annotation process is sidestepped in weakly
supervised learning. In this case, the supervised information is restricted to
binary labels that indicate the absence/presence of object instances in the
image, without their locations. We follow a multiple-instance learning approach
that iteratively trains the detector and infers the object locations in the
positive training images. Our main contribution is a multi-fold multiple
instance learning procedure, which prevents training from prematurely locking
onto erroneous object locations. This procedure is particularly important when
using high-dimensional representations, such as Fisher vectors and
convolutional neural network features. We also propose a window refinement
method, which improves the localization accuracy by incorporating an objectness
prior. We present a detailed experimental evaluation using the PASCAL VOC 2007
dataset, which verifies the effectiveness of our approach.
Comment: To appear in IEEE Transactions on Pattern Analysis and Machine
Intelligence (TPAMI
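The multi-fold procedure described in this abstract can be illustrated with a minimal sketch. The abstract only says that detector training and object localization alternate, and that a multi-fold scheme keeps training from locking onto wrong locations. The sketch assumes this means the positive images are split into folds and each fold is re-localized by a detector trained on the other folds. The ridge-regression detector, the function names, and the choice of window 0 as the initial location are all hypothetical stand-ins, not the authors' implementation.

```python
import numpy as np

def train_linear(pos, neg, reg=1.0):
    """Hypothetical stand-in detector: ridge regression onto +1/-1 labels."""
    X = np.vstack([pos, neg])
    y = np.concatenate([np.ones(len(pos)), -np.ones(len(neg))])
    X = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    return np.linalg.solve(X.T @ X + reg * np.eye(X.shape[1]), X.T @ y)

def score(w, windows):
    """Score every candidate window of one image with the linear detector."""
    return windows @ w[:-1] + w[-1]

def multifold_mil(pos_images, neg_windows, n_folds=3, n_iters=5, seed=0):
    """pos_images: list of (n_windows, dim) arrays, one per positive image.
    neg_windows: (n_neg, dim) array of windows from negative images.
    Returns the final detector weights and the selected window per image."""
    if n_folds < 2:
        raise ValueError("multi-fold training needs at least two folds")
    rng = np.random.default_rng(seed)
    n = len(pos_images)
    folds = rng.permutation(n) % n_folds  # random fold label per image
    selected = [0] * n                    # assumed initial location: window 0
    for _ in range(n_iters):
        new_selected = list(selected)
        for k in range(n_folds):
            train_idx = [i for i in range(n) if folds[i] != k]
            if not train_idx:
                continue
            # Train on the current locations of all images outside fold k.
            pos = np.array([pos_images[i][selected[i]] for i in train_idx])
            w = train_linear(pos, neg_windows)
            # Re-localize only the held-out images, so no image is scored
            # by a detector that was trained on its own selected window.
            for i in np.flatnonzero(folds == k):
                new_selected[i] = int(np.argmax(score(w, pos_images[i])))
        selected = new_selected
    pos = np.array([pos_images[i][selected[i]] for i in range(n)])
    return train_linear(pos, neg_windows), selected
```

Holding out each fold during re-localization is the design choice that matters here: with high-dimensional features, a detector trained on an image's own selected window tends to score that same window highest again, which is the lock-in the abstract refers to.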
How good are detection proposals, really?
Current top-performing Pascal VOC object detectors employ detection proposals
to guide the search for objects, thereby avoiding exhaustive sliding window
search across images. Despite the popularity of detection proposals, it is
unclear which trade-offs are made when using them during object detection. We
provide an in-depth analysis of ten object proposal methods along with four
baselines regarding ground truth annotation recall (on Pascal VOC 2007 and
ImageNet 2013), repeatability, and impact on DPM detector performance. Our
findings show common weaknesses of existing methods, and provide insights to
choose the most adequate method for different settings.
What makes for effective detection proposals?
Current top-performing object detectors employ detection proposals to guide
the search for objects, thereby avoiding exhaustive sliding window search
across images. Despite the popularity and widespread use of detection
proposals, it is unclear which trade-offs are made when using them during
object detection. We provide an in-depth analysis of twelve proposal methods
along with four baselines regarding proposal repeatability, ground truth
annotation recall on PASCAL, ImageNet, and MS COCO, and their impact on DPM,
R-CNN, and Fast R-CNN detection performance. Our analysis shows that, for object
detection, improving proposal localisation accuracy is as important as improving
recall. We introduce a novel metric, the average recall (AR), which rewards
both high recall and good localisation and correlates surprisingly well with
detection performance. Our findings show common strengths and weaknesses of
existing methods, and provide insights and metrics for selecting and tuning
proposal methods.
Comment: TPAMI final version, duplicate proposals removed in experiment
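As a rough illustration of the average recall (AR) metric mentioned in this abstract, the sketch below computes recall at a single intersection-over-union (IoU) threshold and then averages it over a range of thresholds. The abstract does not give the threshold range, so the sampling from 0.5 to 0.95 in steps of 0.05 is an assumption, as are the function names and the (x1, y1, x2, y2) box layout.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def best_overlaps(gt_boxes, proposals):
    """For each ground-truth box, the IoU of its best-matching proposal."""
    return [max((iou(g, p) for p in proposals), default=0.0) for g in gt_boxes]

def recall_at(gt_boxes, proposals, threshold):
    """Fraction of ground-truth boxes covered with IoU >= threshold."""
    overlaps = best_overlaps(gt_boxes, proposals)
    if not overlaps:
        return 0.0
    return sum(o >= threshold for o in overlaps) / len(overlaps)

def average_recall(gt_boxes, proposals, thresholds=None):
    """Mean recall over IoU thresholds (assumed range: 0.5 to 0.95)."""
    if thresholds is None:
        thresholds = [0.5 + 0.05 * i for i in range(10)]
    recalls = [recall_at(gt_boxes, proposals, t) for t in thresholds]
    return sum(recalls) / len(recalls)
```

Because recall at a stricter threshold only counts tightly localised proposals, averaging over thresholds rewards both covering many objects and covering them precisely, which matches the property the abstract attributes to AR.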