213 research outputs found
The process of regular window partition and reverse.
Performance comparison of our method with baselines on the Market1501, DukeMTMC-reID and MSMT17 dataset.
Example 1 of ranking results.
Extracting rich feature representations is a key challenge in person re-identification (Re-ID). Traditional methods based on convolutional neural networks (CNNs) can overlook part of the information when processing local regions of person images, which leads to incomplete feature extraction. To address this, this paper proposes a person Re-ID method based on a vision Transformer with a hierarchical structure and window shifting. For feature extraction, a hierarchical Transformer model is built by adopting the hierarchical construction commonly used in CNNs. Then, because local information in person images is important for complete feature extraction, self-attention is computed within window regions that are shifted between layers. Finally, experiments on three standard datasets (Market1501, DukeMTMC-reID and MSMT17) demonstrate the effectiveness and superiority of the proposed method.
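The window partition, reverse, and shifting steps named in the abstract and figure captions can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single-channel feature map stored as a list of rows, a cyclic (wrap-around) shift as is common in shifted-window Transformers, and the hypothetical helper names `window_partition`, `window_reverse`, and `cyclic_shift`. The attention computation and any masking of wrapped-around positions are omitted.

```python
def window_partition(x, ws):
    """Split an H x W grid (list of rows) into non-overlapping ws x ws windows,
    ordered row-major over window positions."""
    H, W = len(x), len(x[0])
    assert H % ws == 0 and W % ws == 0, "grid must be divisible by window size"
    return [
        [row[j:j + ws] for row in x[i:i + ws]]
        for i in range(0, H, ws)
        for j in range(0, W, ws)
    ]


def window_reverse(windows, ws, H, W):
    """Inverse of window_partition: stitch the windows back into an H x W grid."""
    per_row = W // ws  # number of windows in each row of windows
    x = [[None] * W for _ in range(H)]
    for k, win in enumerate(windows):
        i0, j0 = (k // per_row) * ws, (k % per_row) * ws
        for di, win_row in enumerate(win):
            x[i0 + di][j0:j0 + ws] = win_row
    return x


def cyclic_shift(x, s):
    """Roll the grid s positions up and left with wrap-around.
    A negative s rolls down and right, which undoes a positive shift."""
    H, W = len(x), len(x[0])
    return [[x[(i + s) % H][(j + s) % W] for j in range(W)] for i in range(H)]


# Assumed toy input: an 8 x 8 grid, window size 4, shift of half a window.
grid = [[r * 8 + c for c in range(8)] for r in range(8)]

regular = window_partition(grid, 4)                   # regular windows
shifted = window_partition(cyclic_shift(grid, 2), 4)  # shifted windows

# Reversing the partition, then the shift, recovers the original grid.
restored = cyclic_shift(window_reverse(shifted, 4, 8, 8), -2)
```

In this sketch the shifted windows straddle the borders of the regular windows, so attention computed inside them can mix information across neighbouring regular windows; the reverse step puts every position back where it started.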
The process of shifting window partition and reverse.
Comparison of computation efficiency among different methods.
Loss and top1 error curve with MSMT17 dataset.
Example 2 of ranking results.
Self-attention mechanism.
ROC curve with Market1501 dataset.
Overall model architecture.
- …
