Intra-Camera Supervised Person Re-Identification
Existing person re-identification (re-id) methods mostly exploit a large set of cross-camera identity labelled training data. This requires a tedious data collection and annotation process, leading to poor scalability in practical re-id applications. On the other hand, unsupervised re-id methods need no identity label information, but they usually suffer from markedly inferior model performance. To overcome these fundamental limitations, we propose a novel person re-identification paradigm based on the idea of independent per-camera identity annotation. This eliminates the most time-consuming and tedious inter-camera identity labelling process, significantly reducing the amount of human annotation effort. Consequently, it gives rise to a more scalable and more feasible setting, which we call Intra-Camera Supervised (ICS) person re-id, for which we formulate a Multi-tAsk mulTi-labEl (MATE) deep learning method. Specifically, MATE is designed for self-discovering the cross-camera identity correspondence in a per-camera multi-task inference framework. Extensive experiments demonstrate the cost-effectiveness superiority of our method over alternative approaches on three large person re-id datasets. For example, MATE yields an 88.7% rank-1 score on Market-1501 in the proposed ICS person re-id setting, significantly outperforming unsupervised learning models and closely approaching conventional fully supervised learning competitors.
Intra-Camera Supervised Person Re-Identification: A New Benchmark
Existing person re-identification (re-id) methods rely mostly on a large set
of inter-camera identity labelled training data, requiring a tedious data
collection and annotation process and therefore leading to poor scalability in
practical re-id applications. To overcome this fundamental limitation, we
consider person re-identification without inter-camera identity association but
only with identity labels independently annotated within each individual
camera-view. This eliminates the most time-consuming and tedious inter-camera
identity labelling process and significantly reduces the amount of human
effort required during annotation. It hence gives rise to a more scalable and
more feasible learning scenario, which we call Intra-Camera Supervised (ICS)
person re-id. Under this ICS setting with weaker label supervision, we
formulate a Multi-Task Multi-Label (MTML) deep learning method. Given no
inter-camera association, MTML is specially designed for self-discovering the
inter-camera identity correspondence. This is achieved by inter-camera
multi-label learning under a joint multi-task inference framework. In addition,
MTML can also efficiently learn the discriminative re-id feature
representations by fully using the available identity labels within each
camera-view. Extensive experiments demonstrate the performance superiority of
our MTML model over the state-of-the-art alternative methods on three
large-scale person re-id datasets in the proposed intra-camera supervised
learning setting.
Comment: 9 pages, 3 figures, accepted by ICCV Workshop on Real-World
Recognition from Low-Quality Images and Videos, 2019
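The core idea shared by both abstracts above, self-discovering inter-camera identity correspondence from per-camera labels, can be illustrated with a toy sketch. This is not the papers' actual multi-label learning procedure; it is a hypothetical simplification that associates per-camera identity centroids by thresholded cosine similarity, with all function and variable names invented for illustration.

```python
from math import sqrt

def cosine(u, v):
    # Cosine similarity between two equal-length feature vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def associate(cam_a_centroids, cam_b_centroids, threshold=0.8):
    """Greedy one-way association of per-camera identity centroids.

    cam_a_centroids / cam_b_centroids: dicts mapping a per-camera
    identity label to its mean feature vector. Returns (label_a,
    label_b) pairs whose similarity clears the threshold. A toy
    stand-in for the cross-camera correspondence discovery step.
    """
    pairs = []
    for la, fa in cam_a_centroids.items():
        best_lb, best_sim = None, threshold
        for lb, fb in cam_b_centroids.items():
            sim = cosine(fa, fb)
            if sim > best_sim:
                best_lb, best_sim = lb, sim
        if best_lb is not None:
            pairs.append((la, best_lb))
    return pairs

# Toy example: identities a1 and b1 share a similar feature direction,
# so they are linked; a2 matches nothing above the threshold.
cam_a = {"a1": [1.0, 0.1, 0.0], "a2": [0.0, 1.0, 0.0]}
cam_b = {"b1": [0.9, 0.2, 0.1], "b2": [0.0, 0.1, 1.0]}
print(associate(cam_a, cam_b))
```

In the actual methods, the discovered associations feed back into training as extra multi-task/multi-label supervision rather than being a one-shot matching step.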
Improving Person Re-identification by Attribute and Identity Learning
Person re-identification (re-ID) and attribute recognition share a common
target at learning pedestrian descriptions. Their difference consists in the
granularity. Most existing re-ID methods only take identity labels of
pedestrians into consideration. However, we find the attributes, containing
detailed local descriptions, are beneficial in allowing the re-ID model to
learn more discriminative feature representations. In this paper, based on the
complementarity of attribute labels and ID labels, we propose an
attribute-person recognition (APR) network, a multi-task network which learns a
re-ID embedding and at the same time predicts pedestrian attributes. We
manually annotate attribute labels for two large-scale re-ID datasets, and
systematically investigate how person re-ID and attribute recognition benefit
from each other. In addition, we re-weight the attribute predictions
considering the dependencies and correlations among the attributes. The
experimental results on two large-scale re-ID benchmarks demonstrate that by
learning a more discriminative representation, APR achieves competitive re-ID
performance compared with the state-of-the-art methods. We use APR to speed up
the retrieval process by ten times with a minor accuracy drop of 2.92% on
Market-1501. Besides, we also apply APR on the attribute recognition task and
demonstrate improvement over the baselines.
Comment: Accepted to Pattern Recognition (PR)
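The multi-task formulation described above, one embedding trained jointly for identity classification and attribute prediction, boils down to a weighted sum of two losses. The sketch below is a hedged, dependency-free illustration of that composition, not the APR network itself; the balancing weight `lam` and all numbers are hypothetical.

```python
from math import log, exp

def softmax(logits):
    # Numerically stable softmax over one sample's identity logits.
    m = max(logits)
    exps = [exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def id_loss(logits, target):
    # Cross-entropy for the identity classification head.
    return -log(softmax(logits)[target])

def attr_loss(probs, labels):
    # Mean binary cross-entropy over the attribute head outputs.
    eps = 1e-12
    terms = [-(y * log(p + eps) + (1 - y) * log(1 - p + eps))
             for p, y in zip(probs, labels)]
    return sum(terms) / len(terms)

def joint_loss(id_logits, id_target, attr_probs, attr_labels, lam=0.5):
    # Weighted sum of the two task losses; lam is a hypothetical
    # balancing weight, not a value taken from the paper.
    return id_loss(id_logits, id_target) + lam * attr_loss(attr_probs, attr_labels)

loss = joint_loss([2.0, 0.5, -1.0], 0, [0.9, 0.2], [1, 0])
print(round(loss, 4))
```

Because both heads share the backbone embedding, gradients from the attribute terms push the re-ID features toward encoding the detailed local cues the abstract mentions.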
Occluded Person Re-identification
Person re-identification (re-id) suffers from a serious occlusion problem
when applied to crowded public places. In this paper, we propose to retrieve a
full-body person image by using a person image with occlusions. This differs
significantly from the conventional person re-id problem where it is assumed
that person images are detected without any occlusion. We thus call this new
problem occluded person re-identification. To address this new problem,
we propose a novel Attention Framework of Person Body (AFPB) based on deep
learning, consisting of 1) an Occlusion Simulator (OS) which automatically
generates artificial occlusions for full-body person images, and 2) multi-task
losses that force the neural network not only to discriminate a person's
identity but also to determine whether a sample is from the occluded data
distribution or the full-body data distribution. Experiments on a new occluded
person re-id dataset and three existing benchmarks modified to include
full-body person images and occluded person images show the superiority of the
proposed method.
Comment: 6 pages, 7 figures, IEEE International Conference on Multimedia and
Expo 2018
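The Occlusion Simulator component described above generates artificial occlusions on full-body images so the network can learn to tell occluded from full-body samples. A minimal sketch of that data-augmentation idea, assuming a plain 2-D pixel grid and a constant-colour rectangular occluder (the paper's simulator details may differ):

```python
import random

def simulate_occlusion(image, max_frac=0.5, rng=None):
    """Paste a random grey rectangle onto a full-body person image.

    image: 2-D list of pixel values (rows of ints). A hypothetical,
    minimal stand-in for an occlusion simulator that manufactures
    occluded training samples from full-body ones.
    """
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    h, w = len(image), len(image[0])
    oh = rng.randint(1, max(1, int(h * max_frac)))
    ow = rng.randint(1, max(1, int(w * max_frac)))
    top = rng.randint(0, h - oh)
    left = rng.randint(0, w - ow)
    out = [row[:] for row in image]  # copy; leave the input untouched
    for r in range(top, top + oh):
        for c in range(left, left + ow):
            out[r][c] = 128  # constant grey occluder
    return out

img = [[0] * 8 for _ in range(8)]
occ = simulate_occlusion(img)
print(sum(v == 128 for row in occ for v in row))  # occluded pixel count
```

Each synthetic pair (original, occluded) can then carry the binary "occluded vs. full-body" label used by the multi-task losses.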
Learning Deep Context-aware Features over Body and Latent Parts for Person Re-identification
Person Re-identification (ReID) is to identify the same person across
different cameras. It is a challenging task due to the large variations in
person pose, occlusion, background clutter, etc. How to extract powerful
features is a fundamental problem in ReID and is still an open problem today.
In this paper, we design a Multi-Scale Context-Aware Network (MSCAN) to learn
powerful features over full body and body parts, which can well capture the
local context knowledge by stacking multi-scale convolutions in each layer.
Moreover, instead of using predefined rigid parts, we propose to learn and
localize deformable pedestrian parts using Spatial Transformer Networks (STN)
with novel spatial constraints. The learned body parts can alleviate some
difficulties, e.g., pose variations and background clutter, in part-based
representation. Finally, we integrate the representation learning processes of
full body and body parts into a unified framework for person ReID through
multi-class person identification tasks. Extensive evaluations on current
challenging large-scale person ReID datasets, including the image-based
Market-1501, CUHK03 and the sequence-based MARS dataset, show that the proposed
method achieves state-of-the-art results.
Comment: Accepted by CVPR 2017
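The "stacking multi-scale convolutions in each layer" idea above works because each stacked layer widens the receptive field, so deeper stacks see more local context. Assuming dilated convolutions as the multi-scale mechanism (an assumption; the network's exact configuration is not given in the abstract), the growth is easy to compute:

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field of a stack of stride-1 dilated convolutions.

    Each layer adds (k - 1) * d pixels of context, starting from a
    single-pixel field. Illustrates why stacking multi-scale
    (dilated) convolutions enlarges the local context captured.
    """
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf

# Three 3x3 layers with dilations 1, 2, 3 (a hypothetical branch):
# 1 + 2 + 4 + 6 = 13 pixels of context per output location.
print(receptive_field([3, 3, 3], [1, 2, 3]))
```

Varying the dilation per branch is what lets a single layer aggregate context at several scales at once.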