Temporal Model Adaptation for Person Re-Identification
Person re-identification is an open and challenging problem in computer
vision. The majority of efforts have been spent either on designing the best
feature representation or on learning the optimal matching metric. Most approaches
have neglected the problem of adapting the selected features or the learned
model over time. To address this problem, we propose a temporal model
adaptation scheme with a human in the loop. We first introduce a
similarity-dissimilarity learning method which can be trained in an incremental
fashion by means of a stochastic alternating direction method of multipliers
(ADMM) optimization procedure. Then, to achieve temporal adaptation with limited human
effort, we exploit a graph-based approach to present the user with only the most
informative probe-gallery matches that should be used to update the model.
Results on three datasets show that our approach performs on par with or even
better than state-of-the-art approaches while reducing the manual pairwise
labeling effort by about 80%.
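The incremental training pattern described above can be illustrated with a stochastic variant of ADMM. The objective below (an l1-regularized least-squares surrogate), the variable names, and the step sizes are illustrative stand-ins, not the authors' actual similarity-dissimilarity formulation; the sketch only shows the alternating primal/proximal/dual updates that let the model be refined one minibatch at a time.

```python
import numpy as np

def soft_threshold(v, t):
    # proximal operator of the l1 norm
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def stochastic_admm(A, b, lam=0.1, rho=1.0, lr=0.01, steps=500, batch=16, seed=0):
    """Minimize (1/2)||Ax - b||^2 + lam*||z||_1 subject to x = z,
    using a stochastic gradient step for the x-update (toy example)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    z = np.zeros(d)
    u = np.zeros(d)  # scaled dual variable
    for _ in range(steps):
        idx = rng.choice(n, size=min(batch, n), replace=False)
        Ab, bb = A[idx], b[idx]
        # x-update: one stochastic gradient step on the augmented Lagrangian
        grad = Ab.T @ (Ab @ x - bb) / len(idx) + rho * (x - z + u)
        x = x - lr * grad
        # z-update: exact proximal (soft-threshold) step for the l1 term
        z = soft_threshold(x + u, lam / rho)
        # dual ascent on the consensus constraint x = z
        u = u + x - z
    return z
```

Because each iteration touches only a minibatch, the same loop can keep running as new labeled pairs arrive, which is the property the abstract's incremental training relies on.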
Crossing Generative Adversarial Networks for Cross-View Person Re-identification
Person re-identification (re-id) refers to matching pedestrians
across disjoint, non-overlapping camera views. The most effective way to
match these pedestrians under significant visual variations is to seek
reliably invariant features that can describe the person of interest
faithfully. Most existing methods work in a supervised manner,
producing discriminative features by relying on labeled image pairs in
correspondence. However, annotating pair-wise images is prohibitively
labor-intensive, and thus impractical in large-scale camera networks. Moreover,
seeking comparable representations across camera views demands a flexible model
to address the complex distributions of images. In this work, we study the
co-occurrence statistic patterns between pairs of images, and propose a
crossing Generative Adversarial Network (Cross-GAN) for learning a joint
distribution over cross-image representations in an unsupervised manner. Given a
pair of person images, the proposed model consists of a variational
auto-encoder that encodes the pair into respective latent variables, a
cross-view alignment module that reduces the view disparity, and an adversarial layer to
seek the joint distribution of latent representations. The learned latent
representations are well-aligned to reflect the co-occurrence patterns of
paired images. We empirically evaluate the proposed model against challenging
datasets, and our results show the importance of joint invariant features in
improving matching rates for person re-id in comparison to semi-supervised and
unsupervised state-of-the-art methods.
Comment: 12 pages. arXiv admin note: text overlap with arXiv:1702.03431 by another author.
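The pairing idea above can be caricatured in a few lines: two per-view encoders map a matched image pair to latent codes, and an alignment penalty measures the remaining view disparity. The random linear encoders and squared-distance loss below are toy stand-ins for the paper's VAE encoders and adversarial layer, chosen only to make the structure concrete.

```python
import numpy as np

def encode(x, W):
    # toy stand-in for one camera view's encoder: a fixed nonlinear projection
    return np.tanh(x @ W)

def view_disparity(za, zb):
    # alignment penalty: mean squared distance between paired latent codes
    return float(np.mean((za - zb) ** 2))

# usage: a matched pair observed from two camera views
rng = np.random.default_rng(0)
Wa, Wb = rng.normal(size=(64, 16)), rng.normal(size=(64, 16))
xa, xb = rng.normal(size=64), rng.normal(size=64)
za, zb = encode(xa, Wa), encode(xb, Wb)
loss = view_disparity(za, zb)
```

In the full model this penalty would be minimized jointly with the reconstruction and adversarial objectives, pulling the two views' latent codes toward the shared, view-invariant representation the abstract describes.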
An Evaluation of Deep CNN Baselines for Scene-Independent Person Re-Identification
In recent years, a variety of proposed methods based on deep convolutional
neural networks (CNNs) have improved the state of the art for large-scale
person re-identification (ReID). While a large number of optimizations and
network improvements have been proposed, there has been relatively little
evaluation of the influence of training data and baseline network architecture.
In particular, it is usually assumed either that networks are trained on
labeled data from the deployment location (scene-dependent), or else adapted
with unlabeled data, both of which complicate system deployment. In this paper,
we investigate the feasibility of achieving scene-independent person ReID by
forming a large composite dataset for training. We present an in-depth
comparison of several CNN baseline architectures for both scene-dependent and
scene-independent ReID, across a range of training dataset sizes. We show that
scene-independent ReID can produce leading-edge results, competitive with
unsupervised domain adaptation techniques. Finally, we introduce a new dataset
for comparing within-camera and across-camera person ReID.
Comment: To be published in 2018 15th Conference on Computer and Robot Vision (CRV).
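One practical detail of forming a large composite training set, as described above, is that person IDs from different source datasets must not collide. A minimal sketch of the relabeling step (the function name and the `(image_path, person_id)` sample layout are assumptions for illustration, not from the paper):

```python
def merge_reid_datasets(datasets):
    """Concatenate several ReID datasets, offsetting person IDs so that
    identities from different sources never collide.

    `datasets` is a list of sample lists, each sample an
    (image_path, person_id) pair."""
    merged, offset = [], 0
    for samples in datasets:
        # assign each source identity a fresh, globally unique label
        ids = sorted({pid for _, pid in samples})
        remap = {pid: i + offset for i, pid in enumerate(ids)}
        merged.extend((path, remap[pid]) for path, pid in samples)
        offset += len(ids)
    return merged
```

Remapping rather than simply concatenating matters because most ReID losses treat the person ID as a classification label, so two different people sharing an ID across sources would corrupt training.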