3,718 research outputs found
SIFTing the relevant from the irrelevant: Automatically detecting objects in training images
Many state-of-the-art object recognition systems rely on identifying the location of objects in images, in order to better learn its visual attributes. In this paper, we propose four simple yet powerful hybrid ROI detection methods (combining both local and global features), based on frequently occurring keypoints. We show that our methods demonstrate competitive performance in two different types of datasets, the Caltech101 dataset and the GRAZ-02 dataset, where the pairs of keypoint bounding box method achieved the best accuracies overall
Monocular SLAM Supported Object Recognition
In this work, we develop a monocular SLAM-aware object recognition system
that is able to achieve considerably stronger recognition performance, as
compared to classical object recognition systems that function on a
frame-by-frame basis. By incorporating several key ideas including multi-view
object proposals and efficient feature encoding methods, our proposed system is
able to detect and robustly recognize objects in its environment using a single
RGB camera in near-constant time. Through experiments, we illustrate the
utility of using such a system to effectively detect and recognize objects,
incorporating multiple object viewpoint detections into a unified prediction
hypothesis. The performance of the proposed recognition system is evaluated on
the UW RGB-D Dataset, showing strong recognition performance and scalable
run-time performance compared to current state-of-the-art recognition systems.Comment: Accepted to appear at Robotics: Science and Systems 2015, Rome, Ital
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
Existing deep convolutional neural networks (CNNs) require a fixed-size
(e.g., 224x224) input image. This requirement is "artificial" and may reduce
the recognition accuracy for the images or sub-images of an arbitrary
size/scale. In this work, we equip the networks with another pooling strategy,
"spatial pyramid pooling", to eliminate the above requirement. The new network
structure, called SPP-net, can generate a fixed-length representation
regardless of image size/scale. Pyramid pooling is also robust to object
deformations. With these advantages, SPP-net should in general improve all
CNN-based image classification methods. On the ImageNet 2012 dataset, we
demonstrate that SPP-net boosts the accuracy of a variety of CNN architectures
despite their different designs. On the Pascal VOC 2007 and Caltech101
datasets, SPP-net achieves state-of-the-art classification results using a
single full-image representation and no fine-tuning.
The power of SPP-net is also significant in object detection. Using SPP-net,
we compute the feature maps from the entire image only once, and then pool
features in arbitrary regions (sub-images) to generate fixed-length
representations for training the detectors. This method avoids repeatedly
computing the convolutional features. In processing test images, our method is
24-102x faster than the R-CNN method, while achieving better or comparable
accuracy on Pascal VOC 2007.
In ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2014, our
methods rank #2 in object detection and #3 in image classification among all 38
teams. This manuscript also introduces the improvement made for this
competition.Comment: This manuscript is the accepted version for IEEE Transactions on
Pattern Analysis and Machine Intelligence (TPAMI) 2015. See Changelo
Proposal Flow: Semantic Correspondences from Object Proposals
Finding image correspondences remains a challenging problem in the presence
of intra-class variations and large changes in scene layout. Semantic flow
methods are designed to handle images depicting different instances of the same
object or scene category. We introduce a novel approach to semantic flow,
dubbed proposal flow, that establishes reliable correspondences using object
proposals. Unlike prevailing semantic flow approaches that operate on pixels or
regularly sampled local regions, proposal flow benefits from the
characteristics of modern object proposals, that exhibit high repeatability at
multiple scales, and can take advantage of both local and geometric consistency
constraints among proposals. We also show that the corresponding sparse
proposal flow can effectively be transformed into a conventional dense flow
field. We introduce two new challenging datasets that can be used to evaluate
both general semantic flow techniques and region-based approaches such as
proposal flow. We use these benchmarks to compare different matching
algorithms, object proposals, and region features within proposal flow, to the
state of the art in semantic flow. This comparison, along with experiments on
standard datasets, demonstrates that proposal flow significantly outperforms
existing semantic flow methods in various settings.Comment: arXiv admin note: text overlap with arXiv:1511.0506
- …