Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning
Object category localization is a challenging problem in computer vision.
Standard supervised training requires bounding box annotations of object
instances. This time-consuming annotation process is sidestepped in weakly
supervised learning. In this case, the supervised information is restricted to
binary labels that indicate the absence/presence of object instances in the
image, without their locations. We follow a multiple-instance learning approach
that iteratively trains the detector and infers the object locations in the
positive training images. Our main contribution is a multi-fold multiple
instance learning procedure, which prevents training from prematurely locking
onto erroneous object locations. This procedure is particularly important when
using high-dimensional representations, such as Fisher vectors and
convolutional neural network features. We also propose a window refinement
method, which improves the localization accuracy by incorporating an objectness
prior. We present a detailed experimental evaluation using the PASCAL VOC 2007
dataset, which verifies the effectiveness of our approach.
Comment: To appear in IEEE Transactions on Pattern Analysis and Machine
Intelligence (TPAMI)
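The multi-fold procedure described in this abstract can be illustrated with a minimal sketch. It assumes each positive image is a "bag" of window feature vectors and that the detector is a ridge-regression linear scorer; both are stand-ins chosen for brevity, and the names `multifold_mil`, `train_linear`, and `score` are hypothetical, not taken from the paper. The point shown is that the window of each positive image is re-selected by a detector trained on the other folds only, which is what keeps training from locking onto its own earlier choices.

```python
import numpy as np

def train_linear(X, y, reg=1.0):
    """Ridge-regression stand-in for the detector (an assumption, not the paper's classifier)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    A = Xb.T @ Xb + reg * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def score(w, X):
    """Linear detector score for each row of X."""
    return X @ w[:-1] + w[-1]

def multifold_mil(pos_bags, neg_windows, n_folds=3, n_iters=5, seed=0):
    """pos_bags: list of (n_windows, dim) arrays, one per positive image.
    neg_windows: (n_neg, dim) array of windows from negative images."""
    rng = np.random.default_rng(seed)
    n = len(pos_bags)
    selected = [0] * n  # arbitrary initial window per positive image
    folds = np.array_split(rng.permutation(n), n_folds)

    def fit(indices):
        # Train on the currently selected windows of the given positive images.
        X_pos = np.stack([pos_bags[i][selected[i]] for i in indices])
        X = np.vstack([X_pos, neg_windows])
        y = np.concatenate([np.ones(len(X_pos)), -np.ones(len(neg_windows))])
        return train_linear(X, y)

    for _ in range(n_iters):
        updated = list(selected)
        for held_out in folds:
            held = {int(i) for i in held_out}
            # Detector never sees the images it is about to re-localize.
            w_fold = fit([i for i in range(n) if i not in held])
            for i in held:
                updated[i] = int(np.argmax(score(w_fold, pos_bags[i])))
        selected = updated

    return fit(range(n)), selected  # final detector uses all positive images
```

`n_folds` must be at least 2 and no larger than the number of positive images, so that every fold leaves some images to train on.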
Object detection via a multi-region & semantic segmentation-aware CNN model
We propose an object detection system that relies on a multi-region deep
convolutional neural network (CNN) that also encodes semantic
segmentation-aware features. The resulting CNN-based representation aims at
capturing a diverse set of discriminative appearance factors and exhibits
localization sensitivity that is essential for accurate object localization. We
exploit the above properties of our recognition module by integrating it into an
iterative localization mechanism that alternates between scoring a box proposal
and refining its location with a deep CNN regression model. Thanks to the
efficient use of our modules, we detect objects with very high localization
accuracy. On the detection challenges of PASCAL VOC2007 and PASCAL VOC2012 we
achieve mAP of 78.2% and 73.9%, respectively, surpassing any other published
work by a significant margin.
Comment: Extended technical report -- short version to appear at ICCV 201
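The alternation between scoring a box proposal and refining its location can be sketched as a short loop. This is only an illustration under stated assumptions: `score_fn` and `refine_fn` are hypothetical callables standing in for the recognition module and the CNN regression model, and the function name `iterative_localization` is not from the paper.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

def iterative_localization(
    boxes: List[Box],
    score_fn: Callable[[Box], float],   # hypothetical: recognition score of a box
    refine_fn: Callable[[Box], Box],    # hypothetical: regressed, better-placed box
    n_iters: int = 2,
) -> List[Tuple[float, Box]]:
    """Alternate between refining each proposal and re-scoring it."""
    current = list(boxes)
    for _ in range(n_iters):
        current = [refine_fn(b) for b in current]  # refinement step
    # Score the final locations and return them best-first.
    scored = [(score_fn(b), b) for b in current]
    return sorted(scored, key=lambda t: t[0], reverse=True)
```

In a full detector the scores would typically also be used between iterations to discard weak proposals; that step is omitted here to keep the sketch small.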
Backtracking Spatial Pyramid Pooling (SPP)-based Image Classifier for Weakly Supervised Top-down Salient Object Detection
Top-down saliency models produce a probability map that peaks at target
locations specified by a task/goal such as object detection. They are usually
trained in a fully supervised setting involving pixel-level annotations of
objects. We propose a weakly supervised top-down saliency framework using only
binary labels that indicate the presence/absence of an object in an image.
First, the probabilistic contribution of each image region to the confidence of
a CNN-based image classifier is computed through a backtracking strategy to
produce top-down saliency. From a set of saliency maps of an image produced by
fast bottom-up saliency approaches, we select the best saliency map suitable
for the top-down task. The selected bottom-up saliency map is combined with the
top-down saliency map. Features having high combined saliency are used to train
a linear SVM classifier to estimate feature saliency. This is integrated with
combined saliency and further refined through a multi-scale
superpixel-averaging of the saliency map. We evaluate the performance of the
proposed weakly supervised top-down saliency and achieve comparable performance
with fully supervised approaches. Experiments are carried out on seven
challenging datasets and quantitative results are compared with 40 closely
related approaches across 4 different applications.
Comment: 14 pages, 7 figures
Detection-by-Localization: Maintenance-Free Change Object Detector
Recent research demonstrates that self-localization performance is a very
useful measure of likelihood-of-change (LoC) for change detection. In this
paper, this "detection-by-localization" scheme is studied in a novel
generalized task of object-level change detection. In our framework, a given
query image is segmented into object-level subimages (termed "scene parts"),
which are then converted to subimage-level pixel-wise LoC maps via the
detection-by-localization scheme. Our approach models a self-localization
system as a ranking function, outputting a ranked list of reference images,
without requiring relevance scores. Thanks to this new setting, we can
generalize our approach to a broad class of self-localization systems. Our
ranking-based self-localization model allows fusing self-localization results
from different modalities via an unsupervised rank fusion derived from the field
of multi-modal information retrieval (MMR).
Comment: 7 pages, 3 figures, Technical report
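Because the self-localization system is modeled as a ranking function without relevance scores, results from several modalities can be merged using ranks alone. The abstract does not say which unsupervised rank fusion is used; the sketch below swaps in reciprocal rank fusion, a common score-free method, purely as an illustration. The function name and the constant `k=60` are assumptions, not details from the paper.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence

def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[Hashable]], k: int = 60
) -> List[Hashable]:
    """Fuse ranked lists of reference-image IDs using only their rank positions."""
    fused: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            # Items near the top of any list contribute more to the fused score.
            fused[item] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by item for determinism.
    return sorted(fused, key=lambda item: (-fused[item], item))

# Example: three modalities each rank the same three reference images.
fused_order = reciprocal_rank_fusion(
    [["a", "b", "c"], ["b", "a", "c"], ["b", "c", "a"]]
)
# fused_order == ["b", "a", "c"]
```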
Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction
Object detection systems based on the deep convolutional neural network (CNN)
have recently made groundbreaking advances on several object detection
benchmarks. While the features learned by these high-capacity neural networks
are discriminative for categorization, inaccurate localization is still a major
source of error for detection. Building upon high-capacity CNN architectures,
we address the localization problem by 1) using a search algorithm based on
Bayesian optimization that sequentially proposes candidate regions for an
object bounding box, and 2) training the CNN with a structured loss that
explicitly penalizes the localization inaccuracy. In experiments, we
demonstrate that each of the proposed methods improves the detection
performance over the baseline method on the PASCAL VOC 2007 and 2012 datasets.
Furthermore, the two methods are complementary and significantly outperform the
previous state-of-the-art when combined.
Comment: CVPR 201