689 research outputs found
Iterative Object and Part Transfer for Fine-Grained Recognition
The aim of fine-grained recognition is to identify sub-ordinate categories in
images like different species of birds. Existing works have confirmed that, in
order to capture the subtle differences across the categories, automatic
localization of objects and parts is critical. Most approaches for object and
part localization relied on the bottom-up pipeline, where thousands of region
proposals are generated and then filtered by pre-trained object/part models.
This is computationally expensive and not scalable once the number of
objects/parts becomes large. In this paper, we propose a nonparametric
data-driven method for object and part localization. Given an unlabeled test
image, our approach transfers annotations from a few similar images retrieved
in the training set. In particular, we propose an iterative transfer strategy
that gradually refine the predicted bounding boxes. Based on the located
objects and parts, deep convolutional features are extracted for recognition.
We evaluate our approach on the widely-used CUB200-2011 dataset and a new and
large dataset called Birdsnap. On both datasets, we achieve better results than
many state-of-the-art approaches, including a few using oracle (manually
annotated) bounding boxes in the test images.Comment: To appear in ICME 2017 as an oral pape
Multi-scale Deep Learning Architectures for Person Re-identification
Person Re-identification (re-id) aims to match people across non-overlapping
camera views in a public space. It is a challenging problem because many people
captured in surveillance videos wear similar clothes. Consequently, the
differences in their appearance are often subtle and only detectable at the
right location and scales. Existing re-id models, particularly the recently
proposed deep learning based ones match people at a single scale. In contrast,
in this paper, a novel multi-scale deep learning model is proposed. Our model
is able to learn deep discriminative feature representations at different
scales and automatically determine the most suitable scales for matching. The
importance of different spatial locations for extracting discriminative
features is also learned explicitly. Experiments are carried out to demonstrate
that the proposed model outperforms the state-of-the art on a number of
benchmarksComment: 9 pages, 3 figures, accepted by ICCV 201
- …