Unsupervised Person Re-identification by Deep Learning Tracklet Association
© 2018, Springer Nature Switzerland AG. Most existing person re-identification (re-id) methods rely on supervised model learning on per-camera-pair manually labelled pairwise training data. This leads to poor scalability in practical re-id deployment due to the lack of exhaustive identity labelling of positive and negative image pairs for every camera pair. In this work, we address this problem by proposing an unsupervised re-id deep learning approach capable of incrementally discovering and exploiting the underlying re-id discriminative information from automatically generated person tracklet data from videos in an end-to-end model optimisation. We formulate a Tracklet Association Unsupervised Deep Learning (TAUDL) framework characterised by jointly learning per-camera (within-camera) tracklet association (labelling) and cross-camera tracklet correlation by maximising the discovery of the most likely tracklet relationships across camera views. Extensive experiments demonstrate the superiority of the proposed TAUDL model over the state-of-the-art unsupervised and domain adaptation re-id methods using six person re-id benchmarking datasets.
Temporal Continuity Based Unsupervised Learning for Person Re-Identification
Person re-identification (re-id) aims to match the same person from images
taken across multiple cameras. Most existing person re-id methods generally
require a large amount of identity-labeled data to act as a discriminative
guideline for representation learning. The difficulty of manually collecting
identity-labeled data leads to poor adaptability in practical scenarios. To
overcome this problem, we propose an unsupervised center-based clustering
approach capable of progressively learning and exploiting the underlying re-id
discriminative information from temporal continuity within a camera. We call
our framework Temporal Continuity based Unsupervised Learning (TCUL).
Specifically, TCUL simultaneously performs center-based clustering of the
unlabeled (target) dataset and fine-tunes a convolutional neural network (CNN)
pre-trained on an unrelated labeled (source) dataset to enhance the
discriminative capability of the CNN for the target dataset. Furthermore, it
exploits the temporally continuous nature of images within a camera jointly
with the spatial similarity of feature maps across cameras to generate reliable
pseudo-labels for training a re-identification model. As training progresses,
the number of reliable samples keeps growing adaptively, which in turn boosts
the representation ability of the CNN. Extensive experiments on three
large-scale person re-id benchmark datasets compare our framework with
state-of-the-art techniques and demonstrate the superiority of TCUL over
existing methods.
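The reliable pseudo-label selection mentioned in this abstract can be illustrated with a minimal sketch. It assumes cosine similarity to cluster centers and a fixed reliability threshold; the function names, the threshold value, and the toy features are all hypothetical and are not taken from the TCUL paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def assign_pseudo_labels(features, centers, threshold):
    """Return (sample_index, center_index) pairs for reliable samples.

    Assumption for illustration: a sample is "reliable" when its similarity
    to the nearest center is at least `threshold`.
    """
    reliable = []
    for i, feature in enumerate(features):
        sims = [cosine_similarity(feature, c) for c in centers]
        best = max(range(len(centers)), key=lambda k: sims[k])
        if sims[best] >= threshold:
            reliable.append((i, best))
    return reliable


# Toy data: four 2-D features and two cluster centers (hypothetical values).
features = [[1.0, 0.0], [0.9, 0.1], [0.5, 0.5], [0.0, 1.0]]
centers = [[1.0, 0.0], [0.0, 1.0]]
pseudo_labels = assign_pseudo_labels(features, centers, threshold=0.9)
```

In this toy setup the ambiguous third sample sits equally close to both centers and is left out of training until a looser threshold (or better features) makes it reliable.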
Interpretable and Generalizable Person Re-Identification with Query-Adaptive Convolution and Temporal Lifting
For person re-identification, existing deep networks often focus on
representation learning. However, without transfer learning, the learned model
is fixed as is, which is not adaptable for handling various unseen scenarios.
In this paper, beyond representation learning, we consider how to formulate
person image matching directly in deep feature maps. We treat image matching as
finding local correspondences in feature maps, and construct query-adaptive
convolution kernels on the fly to achieve local matching. In this way, the
matching process and results are interpretable, and this explicit matching is
more generalizable than representation features to unseen scenarios, such as
unknown misalignments, pose or viewpoint changes. To facilitate end-to-end
training of this architecture, we further build a class memory module to cache
feature maps of the most recent samples of each class, so as to compute image
matching losses for metric learning. Through direct cross-dataset evaluation,
the proposed Query-Adaptive Convolution (QAConv) method gains large
improvements over popular learning methods (about 10%+ mAP), and achieves
comparable results to many transfer learning methods. In addition, a model-free
score weighting method based on temporal co-occurrence, called TLift, is
proposed, which further improves performance and achieves state-of-the-art
results in cross-dataset person re-identification. Code is available at
https://github.com/ShengcaiLiao/QAConv.
Comment: This is the ECCV 2020 version, including the appendi
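The idea of matching images by local correspondences in feature maps, as described in the QAConv abstract, can be sketched roughly as follows. Each query location is treated as a 1x1 kernel that is correlated with every gallery location, and the strongest response is kept. Averaging those responses into one score is an assumption made here for illustration, and the function names and toy feature maps are hypothetical; this is not the code from the linked repository.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def local_matching_score(query_map, gallery_map):
    """Score two feature maps given as lists of local feature vectors.

    Every query location acts as a kernel applied to all gallery locations;
    the best response per query location is its local correspondence.
    Assumption: the final score is the mean of these best responses.
    """
    query = [l2_normalize(v) for v in query_map]
    gallery = [l2_normalize(v) for v in gallery_map]
    best_responses = []
    for kernel in query:
        responses = [sum(a * b for a, b in zip(kernel, loc)) for loc in gallery]
        best_responses.append(max(responses))
    return sum(best_responses) / len(best_responses), best_responses


# Toy maps with two locations each (hypothetical values).
query_map = [[1.0, 0.0], [0.0, 1.0]]
misaligned_same = [[0.0, 2.0], [3.0, 0.0]]  # same content, swapped locations
different = [[1.0, 1.0], [1.0, 1.0]]

score_same, matches_same = local_matching_score(query_map, misaligned_same)
score_diff, _ = local_matching_score(query_map, different)
```

Because each query location searches all gallery locations, the swapped layout in `misaligned_same` still yields a perfect score, which illustrates the robustness to misalignment that the abstract refers to. The per-location best responses are also returned so the match can be inspected.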