Search CORE

343 research outputs found

Self-Supervised Gait Encoding with Locality-Aware Attention for Person Re-Identification

Author: Cheng Jun
Da Huang
Hu Bin
Hu Xiping
Rao Haocong
Tan Mingkui
Wang Siqi
Publication venue: 'International Joint Conferences on Artificial Intelligence'
Publication date: 21/08/2020
Field of study

Gait-based person re-identification (Re-ID) is valuable for safety-critical applications, and using only 3D skeleton data to extract discriminative gait features for person Re-ID is an emerging open topic. Existing methods either adopt hand-crafted features or learn gait features by traditional supervised learning paradigms. Unlike previous methods, we for the first time propose a generic gait encoding approach that can utilize unlabeled skeleton data to learn gait representations in a self-supervised manner. Specifically, we first propose to introduce self-supervision by learning to reconstruct input skeleton sequences in reverse order, which facilitates learning richer high-level semantics and better gait representations. Second, inspired by the fact that motion's continuity endows temporally adjacent skeletons with higher correlations ("locality"), we propose a locality-aware attention mechanism that encourages learning larger attention weights for temporally adjacent skeletons when reconstructing current skeleton, so as to learn locality when encoding gait. Finally, we propose Attention-based Gait Encodings (AGEs), which are built using context vectors learned by locality-aware attention, as final gait representations. AGEs are directly utilized to realize effective person Re-ID. Our approach typically improves existing skeleton-based methods by 10-20% Rank-1 accuracy, and it achieves comparable or even superior performance to multi-modal methods with extra RGB or depth information. Our codes are available at https://github.com/Kali-Hac/SGE-LA.Comment: Accepted at IJCAI 2020 Main Track. Sole copyright holder is IJCAI. Codes are available at https://github.com/Kali-Hac/SGE-L

arXiv.org e-Print Archive

Crossref

TranSG: Transformer-Based Skeleton Graph Prototype Contrastive Learning with Structure-Trajectory Prompted Reconstruction for Person Re-Identification

Author: Miao Chunyan
Rao Haocong
Publication venue
Publication date: 27/03/2023
Field of study

Person re-identification (re-ID) via 3D skeleton data is an emerging topic with prominent advantages. Existing methods usually design skeleton descriptors with raw body joints or perform skeleton sequence representation learning. However, they typically cannot concurrently model different body-component relations, and rarely explore useful semantics from fine-grained representations of body joints. In this paper, we propose a generic Transformer-based Skeleton Graph prototype contrastive learning (TranSG) approach with structure-trajectory prompted reconstruction to fully capture skeletal relations and valuable spatial-temporal semantics from skeleton graphs for person re-ID. Specifically, we first devise the Skeleton Graph Transformer (SGT) to simultaneously learn body and motion relations within skeleton graphs, so as to aggregate key correlative node features into graph representations. Then, we propose the Graph Prototype Contrastive learning (GPC) to mine the most typical graph features (graph prototypes) of each identity, and contrast the inherent similarity between graph representations and different prototypes from both skeleton and sequence levels to learn discriminative graph representations. Last, a graph Structure-Trajectory Prompted Reconstruction (STPR) mechanism is proposed to exploit the spatial and temporal contexts of graph nodes to prompt skeleton graph reconstruction, which facilitates capturing more valuable patterns and graph semantics for person re-ID. Empirical evaluations demonstrate that TranSG significantly outperforms existing state-of-the-art methods. We further show its generality under different graph modeling, RGB-estimated skeletons, and unsupervised scenarios.Comment: Accepted by CVPR 2023. Codes are available at https://github.com/Kali-Hac/TranSG. Supplemental material is included in the conference proceeding

arXiv.org e-Print Archive

Pose Uncertainty Aware Movement Synchrony Estimation via Spatial-Temporal Graph Transformer

Author: Barmaki Roghayeh
Bhat Anjana
Li Jicheng
Publication venue
Publication date: 01/08/2022
Field of study

Movement synchrony reflects the coordination of body movements between interacting dyads. The estimation of movement synchrony has been automated by powerful deep learning models such as transformer networks. However, instead of designing a specialized network for movement synchrony estimation, previous transformer-based works broadly adopted architectures from other tasks such as human activity recognition. Therefore, this paper proposed a skeleton-based graph transformer for movement synchrony estimation. The proposed model applied ST-GCN, a spatial-temporal graph convolutional neural network for skeleton feature extraction, followed by a spatial transformer for spatial feature generation. The spatial transformer is guided by a uniquely designed joint position embedding shared between the same joints of interacting individuals. Besides, we incorporated a temporal similarity matrix in temporal attention computation considering the periodic intrinsic of body movements. In addition, the confidence score associated with each joint reflects the uncertainty of a pose, while previous works on movement synchrony estimation have not sufficiently emphasized this point. Since transformer networks demand a significant amount of data to train, we constructed a dataset for movement synchrony estimation using Human3.6M, a benchmark dataset for human activity recognition, and pretrained our model on it using contrastive learning. We further applied knowledge distillation to alleviate information loss introduced by pose detector failure in a privacy-preserving way. We compared our method with representative approaches on PT13, a dataset collected from autism therapy interventions. Our method achieved an overall accuracy of 88.98% and surpassed its counterparts by a wide margin while maintaining data privacy.Comment: Accepted by 24th ACM International Conference on Multimodal Interaction (ICMI'22). 17 pages, 2 figure

arXiv.org e-Print Archive

Recommended from our members

A Review of Techniques on Gait-Based Person Re-Identification

Author: Li M
Qi M
Rahi B
Publication venue: Australia Academic Press (Scilight Press)
Publication date: 27/03/2023
Field of study

Copyright (c) 2023 Babak Rahi, Maozhen Li and Man Qi. Person re-identification at a distance across multiple non-overlapping cameras has been an active research area for years. In the past ten years, short-term Person re-identification techniques have made great strides in accuracy using only appearance features in limited environments. However, massive intra-class variations and inter-class confusion limit their ability to be used in practical applications. Moreover, appearance consistency can only be assumed in a short time span from one camera to the other. Since the holistic appearance will change drastically over days and weeks, the technique, as mentioned above, will be ineffective. Practical applications usually require a long-term solution in which the subject's appearance and clothing might have changed after the elapse of a significant period. Facing these problems, soft biometric features such as Gait has stirred much interest in the past years. Nevertheless, even Gait can vary with illness, ageing and emotional states, walking surfaces, shoe types, clothes types, carried objects (by the subject) and even environment clutters. Therefore, Gait is considered as a temporal cue that could provide biometric motion information. On the other hand, the shape of the human body could be viewed as a spatial signal which can produce valuable information. So extracting discriminative features from both spatial and temporal domains would benefit this research. This article examines the main approaches used in gait analysis for re-identification over the past decade. We identify several relevant dimensions of the problem and provide a taxonomic analysis of current research. We conclude by reviewing the performance levels achievable with current technology and providing a perspective on the most challenging and promising research directions.This research received no external funding

Brunel University Research Archive

Transformer for Object Re-Identification: A Survey

Author: Chen Shuoyi
Crandall David
Du Bo
Li Chenyue
Ye Mang
Zheng Wei-Shi
Publication venue
Publication date: 12/01/2024
Field of study

Object Re-Identification (Re-ID) aims to identify and retrieve specific objects from varying viewpoints. For a prolonged period, this field has been predominantly driven by deep convolutional neural networks. In recent years, the Transformer has witnessed remarkable advancements in computer vision, prompting an increasing body of research to delve into the application of Transformer in Re-ID. This paper provides a comprehensive review and in-depth analysis of the Transformer-based Re-ID. In categorizing existing works into Image/Video-Based Re-ID, Re-ID with limited data/annotations, Cross-Modal Re-ID, and Special Re-ID Scenarios, we thoroughly elucidate the advantages demonstrated by the Transformer in addressing a multitude of challenges across these domains. Considering the trending unsupervised Re-ID, we propose a new Transformer baseline, UntransReID, achieving state-of-the-art performance on both single-/cross modal tasks. Besides, this survey also covers a wide range of Re-ID research objects, including progress in animal Re-ID. Given the diversity of species in animal Re-ID, we devise a standardized experimental benchmark and conduct extensive experiments to explore the applicability of Transformer for this task to facilitate future research. Finally, we discuss some important yet under-investigated open issues in the big foundation model era, we believe it will serve as a new handbook for researchers in this field

arXiv.org e-Print Archive

Novel Architecture for Human Re-Identification with a Two-Stream Neural Network and Attention Mechanism

Author: Qi Man
Rahi Babak
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 09/11/2022
Field of study

This paper proposes a novel architecture that utilises an attention mechanism in conjunction with multi-stream convolutional neural networks (CNN) to obtain high accuracy in human re-identification (Reid). The proposed architecture consists of four blocks. First, the pre-processing block prepares the input data and feeds it into a spatial-temporal two-stream CNN (STC) with two fusion points that extract the spatial-temporal features. Next, the spatial-temporal attentional LSTM block (STA) automatically fine-tunes the extracted features and assigns weight to the more critical frames in the video sequence by using an attention mechanism. Extensive experiments on four of the most popular datasets support our architecture. Finally, the results are compared with the state of the art, which shows the superiority of this approach

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)