Multi-spectral object Re-identification (ReID) aims to retrieve specific
objects by leveraging complementary information from different image spectra.
It delivers great advantages over traditional single-spectral ReID in complex
visual environments. However, the significant distribution gap among different
image spectra poses great challenges for effective multi-spectral feature
representations. In addition, most current Transformer-based ReID methods
use only the global feature of class tokens for holistic retrieval, ignoring
discriminative local features. To address the above issues,
we go a step further by utilizing all the tokens of Transformers and propose a cyclic
token permutation framework for multi-spectral object ReID, dubbed TOP-ReID.
More specifically, we first deploy a multi-stream deep network based on vision
Transformers to preserve distinct information from different image spectra.
Then, we propose a Token Permutation Module (TPM) for cyclic multi-spectral
feature aggregation. It not only facilitates the spatial feature alignment
across different image spectra, but also allows the class token of each
spectrum to perceive the local details of other spectra. Meanwhile, we propose
a Complementary Reconstruction Module (CRM), which introduces dense token-level
reconstruction constraints to reduce the distribution gap across different
image spectra. With the above modules, our proposed framework can generate more
discriminative multi-spectral features for robust object ReID. Extensive
experiments on three ReID benchmarks (i.e., RGBNT201, RGBNT100 and MSVR310)
verify the effectiveness of our method. The code is available at
https://github.com/924973292/TOP-ReID.
Comment: This work is accepted by AAAI202
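
The cyclic pairing described for the Token Permutation Module can be illustrated with a minimal sketch. It is not the released implementation: it assumes each spectrum yields one class token and a list of patch tokens, and it assumes the class token of spectrum i attends to the patch tokens of spectrum (i + 1) mod N through plain dot-product attention without learned projections. The names `attend` and `cyclic_permute` are hypothetical.

```python
import math


def attend(query, tokens):
    """Dot-product attention of one query vector over a list of token vectors.

    Assumption: no learned query/key/value projections, single head.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * t for q, t in zip(query, tok)) / scale for tok in tokens]
    # Numerically stable softmax over the attention scores.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Weighted sum of the tokens, one output value per feature dimension.
    return [
        sum(w * tok[d] for w, tok in zip(weights, tokens))
        for d in range(len(query))
    ]


def cyclic_permute(cls_tokens, patch_tokens):
    """Let the class token of spectrum i attend to the patches of spectrum i + 1.

    cls_tokens:   one class-token vector per spectrum.
    patch_tokens: one list of patch-token vectors per spectrum.
    The cyclic shift by one position is an illustrative assumption.
    """
    n = len(cls_tokens)
    return [attend(cls_tokens[i], patch_tokens[(i + 1) % n]) for i in range(n)]


if __name__ == "__main__":
    # Toy inputs for three spectra with 2-dimensional tokens.
    cls = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    patches = [
        [[1.0, 1.0], [1.0, 1.0]],
        [[2.0, 2.0], [2.0, 2.0]],
        [[3.0, 3.0], [3.0, 3.0]],
    ]
    for fused in cyclic_permute(cls, patches):
        print(fused)
```

In this sketch every class token ends up summarizing the local tokens of a different spectrum, which is the cross-spectral perception the abstract attributes to the module; the actual framework may use learned projections and additional permutation directions.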