Cross-Modality Paired-Images Generation for RGB-Infrared Person Re-Identification
RGB-Infrared (IR) person re-identification is very challenging due to the
large cross-modality variations between RGB and IR images. The key is to
learn aligned features that bridge the RGB and IR modalities. However,
because correspondence labels between individual pairs of RGB and IR images
are unavailable, most methods try to alleviate the variations with set-level
alignment, reducing the distance between the entire RGB and IR sets. Such
set-level alignment can misalign individual instances, which limits the
performance of RGB-IR Re-ID. Different from existing methods, in this paper
we propose to generate cross-modality paired images and perform both global
set-level and fine-grained instance-level alignment. Our proposed method
enjoys several merits. First, it performs set-level alignment by
disentangling modality-specific and modality-invariant features; compared
with conventional methods, ours explicitly removes the modality-specific
features, so the modality variation can be better reduced. Second, given
cross-modality unpaired images of a person, our method can generate
cross-modality paired images by exchanging the disentangled features. With
these pairs, we can directly perform instance-level alignment by minimizing
the distance of every pair of images. Extensive experimental results on two
standard benchmarks demonstrate that the proposed model performs favourably
against state-of-the-art methods. In particular, on the SYSU-MM01 dataset,
our model achieves gains of 9.2% and 7.7% in Rank-1 accuracy and mAP,
respectively. Code is available at https://github.com/wangguanan/JSIA-ReID.
Comment: accepted by AAAI 2020
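As a rough illustration of the paired-image idea, the sketch below disentangles each image into a modality-invariant identity code and a modality-specific style code, swaps the style codes across modalities to synthesize paired images, and pulls each real/generated pair together. Everything here is assumed: the Disentangler and Decoder architectures, the code dimensions, and the L1 instance-level loss are illustrative stand-ins, not the paper's actual generator or objectives (see the linked repository for those).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Disentangler(nn.Module):
    """Toy encoder splitting an image into a modality-invariant identity
    code and a modality-specific style code (sizes are illustrative)."""
    def __init__(self, dim=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, dim, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(dim, dim, 4, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.to_id = nn.Linear(dim, dim)     # modality-invariant part
        self.to_style = nn.Linear(dim, dim)  # modality-specific part

    def forward(self, x):
        h = self.backbone(x)
        return self.to_id(h), self.to_style(h)

class Decoder(nn.Module):
    """Toy decoder turning (identity, style) codes back into an image."""
    def __init__(self, dim=64, size=32):
        super().__init__()
        self.dim, self.size = dim, size
        self.fc = nn.Linear(2 * dim, dim * (size // 4) ** 2)
        self.up = nn.Sequential(
            nn.ConvTranspose2d(dim, dim, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(dim, 3, 4, 2, 1), nn.Tanh())

    def forward(self, id_code, style_code):
        h = self.fc(torch.cat([id_code, style_code], dim=1))
        return self.up(h.view(-1, self.dim, self.size // 4, self.size // 4))

def instance_alignment_loss(enc, dec, rgb, ir):
    """Swap style codes across modalities to synthesize paired images,
    then align each real/generated pair (L1 here for brevity)."""
    id_rgb, sty_rgb = enc(rgb)
    id_ir, sty_ir = enc(ir)
    fake_ir = dec(id_rgb, sty_ir)   # RGB identity rendered in IR style
    fake_rgb = dec(id_ir, sty_rgb)  # IR identity rendered in RGB style
    return (F.l1_loss(enc(fake_ir)[0], id_rgb)
            + F.l1_loss(enc(fake_rgb)[0], id_ir))

enc, dec = Disentangler(), Decoder()
rgb = torch.randn(4, 3, 32, 32)  # RGB views of four identities
ir = torch.randn(4, 3, 32, 32)   # IR views of the same identities
instance_alignment_loss(enc, dec, rgb, ir).backward()
```

Because the generated pair shares its identity code with a real image of the other modality, the pairwise loss supervises exactly the instance-level correspondence that unpaired training data lacks.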
Dual Gaussian-based Variational Subspace Disentanglement for Visible-Infrared Person Re-Identification
Visible-infrared person re-identification (VI-ReID) is a challenging and
essential task in night-time intelligent surveillance systems. In addition
to the intra-modality variance that RGB-RGB person re-identification mainly
overcomes, VI-ReID suffers from inter-modality variance caused by the
inherent heterogeneous gap. To solve this problem, we present a carefully
designed dual Gaussian-based variational auto-encoder (DG-VAE) that
disentangles the cross-modality features into an identity-discriminable and
an identity-ambiguous subspace, following a mixture-of-Gaussians (MoG) prior
and a standard Gaussian prior, respectively. Disentangling cross-modality
identity-discriminable features leads to more robust retrieval for VI-ReID.
To achieve efficient optimization like a conventional VAE, we theoretically
derive two variational inference terms for the MoG prior under the
supervised setting; these not only constrain the identity-discriminable
subspace so that the model explicitly handles the cross-modality
intra-identity variance, but also keep the MoG distribution from posterior
collapse. Furthermore, we propose a triplet swap reconstruction (TSR)
strategy to promote the above disentangling process. Extensive experiments
demonstrate that our method outperforms state-of-the-art methods on two
VI-ReID datasets.
Comment: Accepted by ACM MM 2020 as a poster. 12 pages, 10 appendices.
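To make the two priors concrete, the minimal sketch below computes the two KL penalties such a dual-Gaussian VAE would use, assuming diagonal Gaussian posteriors and unit-variance MoG components indexed by identity label. The function names, shapes, and the unit-variance assumption are ours for illustration; the paper's derived variational inference terms and the TSR strategy are more involved than this.

```python
import torch

def kl_diag_gaussian_to_gaussian(mu_q, logvar_q, mu_p):
    """Closed-form KL( N(mu_q, diag(exp(logvar_q))) || N(mu_p, I) )."""
    return 0.5 * torch.sum(
        logvar_q.exp() + (mu_q - mu_p) ** 2 - 1.0 - logvar_q, dim=1)

def dual_gaussian_kl(mu_id, logvar_id, mu_amb, logvar_amb,
                     prior_means, labels):
    """Two KL penalties: the identity-discriminable code is pulled toward
    its own class component of the MoG prior (supervised), while the
    identity-ambiguous code is pulled toward a standard Gaussian."""
    kl_id = kl_diag_gaussian_to_gaussian(mu_id, logvar_id,
                                         prior_means[labels])
    kl_amb = kl_diag_gaussian_to_gaussian(mu_amb, logvar_amb,
                                          torch.zeros_like(mu_amb))
    return (kl_id + kl_amb).mean()

n_ids, d, batch = 10, 16, 4
prior_means = torch.randn(n_ids, d)         # learnable MoG component means
labels = torch.randint(0, n_ids, (batch,))  # identity labels
mu_id, logvar_id = torch.randn(batch, d), torch.zeros(batch, d)
mu_amb, logvar_amb = torch.randn(batch, d), torch.zeros(batch, d)
loss = dual_gaussian_kl(mu_id, logvar_id, mu_amb, logvar_amb,
                        prior_means, labels)
```

Supervising the identity code with a per-class Gaussian component, rather than a single shared prior, is what lets the subspace stay identity-discriminable while the standard-Gaussian term absorbs identity-ambiguous variation.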