Crossing Generative Adversarial Networks for Cross-View Person Re-identification
Person re-identification (\textit{re-id}) refers to matching pedestrians
across disjoint, non-overlapping camera views. The most effective way to
match pedestrians undergoing significant visual variations is to seek
reliably invariant features that describe the person of interest
faithfully. Most existing methods operate in a supervised manner,
producing discriminative features by relying on labelled image pairs in
correspondence. However, annotating pair-wise images is prohibitively
labor-intensive, and thus impractical for large-scale camera networks. Moreover,
seeking comparable representations across camera views demands a flexible model
to address the complex distributions of images. In this work, we study the
co-occurrence statistical patterns between pairs of images, and propose a
crossing Generative Adversarial Network (Cross-GAN) for learning a joint
distribution over cross-image representations in an unsupervised manner. Given a
pair of person images, the proposed model consists of the variational
auto-encoder to encode the pair into respective latent variables, a proposed
cross-view alignment to reduce the view disparity, and an adversarial layer to
seek the joint distribution of latent representations. The learned latent
representations are well-aligned to reflect the co-occurrence patterns of
paired images. We empirically evaluate the proposed model on challenging
datasets, and our results show the importance of joint invariant features in
improving the matching rates of person re-id in comparison with semi-supervised
and unsupervised state-of-the-art methods.
Comment: 12 pages. arXiv admin note: text overlap with arXiv:1702.03431 by
another author
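The cross-view alignment idea above can be illustrated with a toy sketch. This is not the authors' Cross-GAN (which uses variational auto-encoders and an adversarial layer); it is a minimal, self-contained numpy example in which two invented affine "encoders" (one per camera view) map a paired observation of the same person to latent codes, and the alignment objective penalises the disparity between the two codes. All dimensions and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W, b):
    # Toy deterministic stand-in for a per-view VAE encoder:
    # a single affine map followed by a tanh nonlinearity.
    return np.tanh(x @ W + b)

# Hypothetical sizes: 8-D image features, 4-D latent codes.
d_in, d_latent = 8, 4
W_a, b_a = rng.normal(size=(d_in, d_latent)), np.zeros(d_latent)
W_b, b_b = rng.normal(size=(d_in, d_latent)), np.zeros(d_latent)

# One person observed in two disjoint camera views: same identity,
# different appearance, simulated as two noisy copies of one vector.
person = rng.normal(size=d_in)
view_a = person + 0.1 * rng.normal(size=d_in)
view_b = person + 0.1 * rng.normal(size=d_in)

z_a = encode(view_a, W_a, b_a)
z_b = encode(view_b, W_b, b_b)

# Cross-view alignment objective: reduce the view disparity so that
# co-occurring pairs share one latent representation.
align_loss = float(np.mean((z_a - z_b) ** 2))
print(round(align_loss, 4))
```

In the paper this disparity term would be minimised jointly with the reconstruction and adversarial objectives; here it only shows what "well-aligned latent representations" means operationally.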
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The comprehensive problem-oriented review
of the advances in transfer learning with respect to the problem has not only
revealed the challenges in transfer learning for visual recognition, but also
the problems (e.g. eight of the seventeen problems) that have been scarcely
studied. This survey not only presents an up-to-date technical review for
researchers, but also a systematic approach and a reference for a machine
learning practitioner to categorise a real problem and to look up for a
possible solution accordingly
Cross-Modality Paired-Images Generation for RGB-Infrared Person Re-Identification
RGB-Infrared (IR) person re-identification is very challenging due to the
large cross-modality variations between RGB and IR images. The key is
to learn aligned features that bridge the RGB and IR modalities. However,
because correspondence labels between every pair of RGB and IR images are
unavailable, most methods try to alleviate the variations with set-level
alignment by reducing the distance between the entire RGB and IR sets. Such
set-level alignment may misalign some instances, which limits the
performance of RGB-IR Re-ID. Different from existing methods, in this paper,
we propose to generate cross-modality paired images and perform both global
set-level and fine-grained instance-level alignment. Our proposed method
enjoys several merits. First, it performs set-level alignment by
disentangling modality-specific and modality-invariant features; compared with
conventional methods, it explicitly removes the modality-specific features
so that the modality variation can be better reduced. Second, given
cross-modality unpaired images of a person, our method can generate
cross-modality paired images from the exchanged images. With them, we can
directly perform instance-level
alignment by minimizing the distance between every pair of images. Extensive
experimental results on two standard benchmarks demonstrate that the proposed
model performs favourably against state-of-the-art methods. In particular, on
the SYSU-MM01 dataset, our model achieves gains of 9.2% and 7.7% in Rank-1
accuracy and mAP, respectively. Code is available at
https://github.com/wangguanan/JSIA-ReID.
Comment: accepted by AAAI'2