20 research outputs found

    Combining the Silhouette and Skeleton Data for Gait Recognition

    Full text link
    Gait recognition, a promising long-distance biometric technology, has attracted intense interest in computer vision. Existing work on gait recognition can be divided into appearance-based methods and model-based methods, which extract features from silhouettes and skeleton data, respectively. However, since appearance-based methods are strongly affected by clothing changes and carrying conditions, while model-based methods are limited by the accuracy of pose estimation, gait recognition remains challenging in practical applications. To integrate the advantages of the two approaches, this paper proposes a two-branch neural network (NN): a CNN-based branch taking silhouettes as input and a GCN-based branch taking skeletons as input. In addition, two new modules are proposed in the GCN-based branch for better gait representation. First, we present a simple yet effective fully connected graph convolution operator to integrate multi-scale graph convolutions and alleviate the dependence on natural human joint connections. Second, we deploy a multi-dimension attention module named STC-Att to learn spatial, temporal and channel-wise attention simultaneously. We evaluated the proposed two-branch neural network on the CASIA-B dataset. The experimental results show that our method achieves state-of-the-art performance in various conditions. Comment: The paper is under consideration at Computer Vision and Image Understanding.
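    The abstract names STC-Att as a module that learns spatial, temporal and channel-wise attention at once. As a rough illustration of that idea, the sketch below reweights a skeleton feature tensor of shape (N, C, T, V) with three independently pooled attention maps; the layer sizes and kernel choices are assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class STCAttention(nn.Module):
    """Illustrative spatial-temporal-channel attention over (N, C, T, V) skeleton
    features. A minimal sketch, not the paper's exact STC-Att module."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        hidden = max(channels // reduction, 1)
        # Channel attention from globally pooled features.
        self.channel_fc = nn.Sequential(
            nn.Linear(channels, hidden), nn.ReLU(),
            nn.Linear(hidden, channels), nn.Sigmoid(),
        )
        # Temporal and spatial (joint) attention from channel-pooled maps.
        self.temporal_conv = nn.Sequential(nn.Conv1d(1, 1, kernel_size=9, padding=4), nn.Sigmoid())
        self.spatial_conv = nn.Sequential(nn.Conv1d(1, 1, kernel_size=3, padding=1), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, t, v = x.shape
        a_c = self.channel_fc(x.mean(dim=(2, 3))).view(n, c, 1, 1)               # (N, C, 1, 1)
        a_t = self.temporal_conv(x.mean(dim=(1, 3)).unsqueeze(1)).view(n, 1, t, 1)  # (N, 1, T, 1)
        a_v = self.spatial_conv(x.mean(dim=(1, 2)).unsqueeze(1)).view(n, 1, 1, v)   # (N, 1, 1, V)
        return x * a_c * a_t * a_v  # apply all three attentions simultaneously

feats = torch.randn(8, 64, 30, 17)   # batch of 8, 64 channels, 30 frames, 17 joints
out = STCAttention(64)(feats)        # same shape, attention-reweighted
```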

    Condition-Adaptive Graph Convolution Learning for Skeleton-Based Gait Recognition

    Full text link
    Graph convolutional networks have been widely applied in skeleton-based gait recognition. A key challenge in this task is to distinguish the individual walking styles of different subjects across various views. Existing state-of-the-art methods employ uniform convolutions to extract features from diverse sequences and ignore the effects of viewpoint changes. To overcome these limitations, we propose a condition-adaptive graph (CAG) convolution network that can dynamically adapt to the specific attributes of each skeleton sequence and the corresponding view angle. In contrast to using fixed weights for all joints and sequences, we introduce a joint-specific filter learning (JSFL) module in the CAG method, which produces sequence-adaptive filters at the joint level. The adaptive filters capture fine-grained patterns that are unique to each joint, enabling the extraction of diverse spatial-temporal information about body parts. Additionally, we design a view-adaptive topology learning (VATL) module that generates adaptive graph topologies. These graph topologies are used to correlate the joints adaptively according to the specific view conditions. Thus, CAG can simultaneously adjust to various walking styles and viewpoints. Experiments on the two most widely used datasets (i.e., CASIA-B and OU-MVLP) show that CAG surpasses all previous skeleton-based methods. Moreover, the recognition performance can be enhanced by simply combining CAG with appearance-based methods, demonstrating the ability of CAG to provide useful complementary information. The source code will be available at https://github.com/OliverHxh/CAG. Comment: Accepted by the TIP journal.
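    The VATL module is described as generating graph topologies that adapt to each sequence's view condition. The sketch below is a hypothetical reading of that pattern, not the CAG code: a learnable base adjacency plus a per-sequence offset predicted from a global feature summary.

```python
import torch
import torch.nn as nn

class AdaptiveTopologyGraphConv(nn.Module):
    """Graph convolution with a sequence-conditioned adjacency, loosely in the
    spirit of view-adaptive topology learning. An assumed design for illustration."""
    def __init__(self, in_ch: int, out_ch: int, num_joints: int):
        super().__init__()
        self.base_adj = nn.Parameter(torch.eye(num_joints))     # learnable shared topology
        # Predict a per-sequence topology offset from a global feature summary.
        self.offset_fc = nn.Linear(in_ch, num_joints * num_joints)
        self.proj = nn.Linear(in_ch, out_ch)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, T, V, C) skeleton features
        n, t, v, c = x.shape
        summary = x.mean(dim=(1, 2))                            # (N, C) sequence context
        offset = self.offset_fc(summary).view(n, v, v)          # per-sequence adjacency offset
        adj = torch.softmax(self.base_adj + offset, dim=-1)     # row-normalized topology
        x = torch.einsum('nuv,ntvc->ntuc', adj, x)              # aggregate over joints
        return self.proj(x)

x = torch.randn(4, 30, 17, 64)
y = AdaptiveTopologyGraphConv(64, 128, 17)(x)   # (4, 30, 17, 128)
```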

    Reducing Training Demands for 3D Gait Recognition with Deep Koopman Operator Constraints

    Full text link
    Deep learning research has made many biometric recognition solutions viable, but it requires vast training data to achieve real-world generalization. Unlike other biometric traits, such as face and ear, gait samples cannot be easily crawled from the web to form massive unconstrained datasets. As the human body has been extensively studied for different digital applications, one can rely on prior shape knowledge to overcome data scarcity. This work follows the recent trend of fitting a 3D deformable body model to gait videos using deep neural networks to obtain disentangled shape and pose representations for each frame. To enforce temporal consistency in the network, we introduce a new Linear Dynamical Systems (LDS) module and loss based on Koopman operator theory, which provides unsupervised motion regularization for the periodic nature of gait, as well as a predictive capacity for extending gait sequences. We compare LDS to the traditional adversarial training approach and use the USF HumanID and CASIA-B datasets to show that LDS can obtain better accuracy with less training data. Finally, we also show that our 3D modeling approach is much better than other 3D gait approaches in overcoming viewpoint variation under normal, bag-carrying and clothing-change conditions.
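    The LDS module is described as a Koopman-based motion regularizer with predictive capacity. One common way to express that idea, shown below under assumed notation, is to fit a linear one-step operator K on the latent trajectory and penalize its prediction error; the fitted K can also be iterated to extend a sequence.

```python
import torch

def koopman_lds_loss(z: torch.Tensor) -> torch.Tensor:
    """One-step linear-prediction loss on a latent trajectory z of shape (T, D).
    A minimal sketch of Koopman-style regularization, not the paper's exact module:
    fit a linear operator K by least squares so that z[t+1] ~ K z[t], then
    penalize the residual."""
    z_prev, z_next = z[:-1], z[1:]                 # (T-1, D) consecutive pairs
    k_t = torch.linalg.pinv(z_prev) @ z_next       # (D, D) least-squares operator (K^T)
    pred = z_prev @ k_t                            # one-step predictions
    return torch.mean((pred - z_next) ** 2)

z = torch.randn(30, 16)        # latent pose codes for a 30-frame gait clip
loss = koopman_lds_loss(z)     # add to the training objective as a regularizer
```

    Because the operator is linear, rolling it forward (z @ K repeatedly) gives the predictive extension of a gait sequence that the abstract mentions.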

    Multimodal Human Pose Feature Fusion for Gait Recognition.

    Get PDF
    Gait recognition allows identifying people at a distance, based on the way they walk (i.e., gait), in a non-invasive way. Most approaches published in the last decades are dominated by the use of silhouettes or other appearance-based modalities to describe the gait cycle. In an attempt to exclude appearance data, many works address the use of the human pose as a modality to describe the walking movement. However, since the pose contains less information when used as a single modality, the performance achieved by such models is generally poorer. To overcome these limitations, we propose a multimodal setup that combines multiple pose representation models. To this end, we evaluate multiple fusion strategies to aggregate the features derived from each pose modality at every model stage. Moreover, we introduce a weighted sum with trainable weights that can adaptively learn the optimal balance among pose modalities. Our experimental results show that (a) our fusion strategies can effectively combine different pose modalities, improving on their baseline performance; and (b) using only human pose, our approach outperforms most of the silhouette-based state-of-the-art approaches. Concretely, we obtain 92.8% mean Top-1 accuracy on CASIA-B.
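    The trainable weighted sum described above is easy to make concrete. The sketch below, with illustrative names and shapes, softmax-normalizes one learnable weight per pose modality so the balance among modalities is learned end to end.

```python
import torch
import torch.nn as nn

class WeightedSumFusion(nn.Module):
    """Fuses same-sized feature vectors from several pose modalities with a
    softmax-normalized trainable weight per modality. A minimal sketch of the
    adaptive weighted-sum idea; names and shapes are illustrative."""
    def __init__(self, num_modalities: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_modalities))

    def forward(self, feats: list[torch.Tensor]) -> torch.Tensor:
        w = torch.softmax(self.logits, dim=0)        # learned balance among modalities
        return sum(wi * f for wi, f in zip(w, feats))

heatmap_feat = torch.randn(8, 256)    # e.g. features from a heatmap-based pose model
skeleton_feat = torch.randn(8, 256)   # e.g. features from a skeleton/graph pose model
fused = WeightedSumFusion(2)([heatmap_feat, skeleton_feat])   # (8, 256)
```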

    GAIT Technology for Human Recognition using CNN

    Get PDF
    Gait is a distinctive biometric characteristic that can be detected from a distance; as a result, it has several uses in social security, forensic identification, and crime prevention. Existing gait identification techniques use either a gait template, which makes it difficult to retain temporal information, or a gait sequence, which imposes unnecessary sequential constraints and loses the flexibility to portray a gait. We instead regard gait as a set of silhouette frames; based on this deep-set viewpoint, our technique is immune to frame permutations and can seamlessly combine frames from many videos taken in various contexts, such as different viewing angles, different outfits, or different carrying conditions. According to experiments, our single-model strategy obtains an average rank-1 accuracy of 96.1% on the CASIA-B gait dataset and an accuracy of 87.9% on the OU-MVLP gait dataset under typical walking conditions. Our model also demonstrates a great degree of robustness under numerous challenging circumstances. When carrying bags and wearing a coat while walking, it obtains accuracies on CASIA-B of 90.8% and 70.3%, respectively, greatly surpassing the best approach currently in use. Additionally, the suggested method achieves a satisfactory level of accuracy even when few frames are available in the test samples; for instance, it achieves 85.0% on CASIA-B with only 7 frames.
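    The deep-set viewpoint mentioned above implies a per-frame encoder followed by a symmetric pooling operator, so the result does not depend on frame order. The following is a simplified sketch of that recipe, not the paper's network; layer sizes are placeholders.

```python
import torch
import torch.nn as nn

class SetGaitEncoder(nn.Module):
    """Treats a gait sequence as an unordered set of silhouette frames: encode each
    frame independently, then max-pool over frames so the output is invariant to
    frame permutations. A simplified sketch of the set-based idea."""
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.frame_cnn = nn.Sequential(
            nn.Conv2d(1, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, feat_dim),
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (N, T, 1, H, W) silhouettes; T may vary across sequences
        n, t = frames.shape[:2]
        f = self.frame_cnn(frames.flatten(0, 1)).view(n, t, -1)   # per-frame features
        return f.max(dim=1).values                                # permutation-invariant pooling

clip = torch.randn(4, 7, 1, 64, 44)    # even a 7-frame sample is usable
emb = SetGaitEncoder()(clip)           # (4, 128) identity embedding
```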

    DyGait: Exploiting Dynamic Representations for High-performance Gait Recognition

    Full text link
    Gait recognition is a biometric technology that identifies humans through their walking patterns. Compared with other biometric technologies, gait is more difficult to disguise and can be used at long distances without the cooperation of subjects. Thus, it has unique potential and wide applications in crime prevention and social security. At present, most gait recognition methods extract features directly from video frames to establish representations. However, these architectures learn from different features equally and do not pay enough attention to dynamic features, i.e., representations of the dynamic parts of silhouettes over time (e.g., the legs). Since the dynamic parts of the human body are more informative during walking than other parts (e.g., bags), in this paper we propose a novel, high-performance framework named DyGait. To our knowledge, this is the first gait recognition framework designed specifically to focus on the extraction of dynamic features. Specifically, to take full advantage of the dynamic information, we propose a Dynamic Augmentation Module (DAM), which automatically establishes spatial-temporal feature representations of the dynamic parts of the human body. The experimental results show that our DyGait network outperforms other state-of-the-art gait recognition methods, achieving an average Rank-1 accuracy of 71.4% on the GREW dataset, 66.3% on Gait3D, 98.4% on CASIA-B and 98.3% on OU-MVLP.
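    The abstract does not spell out how DAM builds dynamic representations, so the sketch below is only an assumed reading of the idea: treat the temporal mean of a feature map as the static component and emphasize the residual, which is large for moving parts such as the legs.

```python
import torch

def dynamic_augment(feats: torch.Tensor) -> torch.Tensor:
    """Emphasize dynamic body parts by subtracting the temporal mean from each
    frame's feature map and adding the residual back to the original features.
    An assumed illustration of dynamic augmentation, not the official DAM."""
    # feats: (N, C, T, H, W) spatial-temporal silhouette features
    static = feats.mean(dim=2, keepdim=True)   # parts that barely move (e.g. a carried bag)
    dynamic = feats - static                   # motion residual (e.g. swinging legs)
    return feats + dynamic                     # re-weight toward dynamic information

x = torch.randn(2, 32, 30, 16, 11)
y = dynamic_augment(x)                         # same shape, dynamics emphasized
```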

    GPGait: Generalized Pose-based Gait Recognition

    Full text link
    Recent works on pose-based gait recognition have demonstrated the potential of using such simple information to achieve results comparable to silhouette-based methods. However, the generalization ability of pose-based methods across datasets is undesirably inferior to that of silhouette-based ones, which has received little attention but hinders the application of these methods in real-world scenarios. To improve the cross-dataset generalization of pose-based methods, we propose a Generalized Pose-based Gait recognition (GPGait) framework. First, a Human-Oriented Transformation (HOT) and a series of Human-Oriented Descriptors (HOD) are proposed to obtain a unified pose representation with discriminative multi-features. Then, given the slight variations in the unified representation after HOT and HOD, it becomes crucial for the network to extract local-global relationships between the keypoints. To this end, a Part-Aware Graph Convolutional Network (PAGCN) is proposed to enable efficient graph partition and local-global spatial feature extraction. Experiments on four public gait recognition datasets, CASIA-B, OUMVLP-Pose, Gait3D and GREW, show that our model demonstrates better and more stable cross-domain capabilities than existing skeleton-based methods, achieving recognition results comparable to silhouette-based ones. Code is available at https://github.com/BNU-IVC/FastPoseGait. Comment: ICCV camera-ready.
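    HOT is described as producing a unified pose representation across datasets. A hypothetical stand-in, shown below, normalizes keypoints by translating to a hip-centered origin and scaling by torso length; the joint indices depend on the pose format and are illustrative, not GPGait's actual transformation.

```python
import torch

def human_oriented_transform(kpts: torch.Tensor, hip: int = 0, neck: int = 1) -> torch.Tensor:
    """Normalize 2D keypoints into a unified, dataset-agnostic frame: translate so
    the hip is the origin and scale by the hip-neck (torso) distance. A hypothetical
    stand-in for the HOT step; joint indices depend on the pose format used."""
    # kpts: (N, T, V, 2) keypoint coordinates
    center = kpts[..., hip:hip + 1, :]                             # (N, T, 1, 2) hip position
    torso = (kpts[..., neck, :] - kpts[..., hip, :]).norm(dim=-1)  # (N, T) torso length
    return (kpts - center) / torso.clamp(min=1e-6)[..., None, None]

pose = torch.randn(4, 30, 17, 2)        # e.g. COCO-format keypoints
unified = human_oriented_transform(pose)
```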

    GaitFormer: Revisiting Intrinsic Periodicity for Gait Recognition

    Full text link
    Gait recognition aims to distinguish different walking patterns by analyzing video-level human silhouettes rather than relying on appearance information. Previous research on gait recognition has primarily focused on extracting local or global spatial-temporal representations, while overlooking the intrinsic periodic features of gait sequences, which, when fully utilized, can significantly enhance performance. In this work, we propose a plug-and-play strategy, called Temporal Periodic Alignment (TPA), which leverages the periodic nature and fine-grained temporal dependencies of gait patterns. The TPA strategy comprises two key components. The first is Adaptive Fourier-transform Position Encoding (AFPE), which adaptively converts features and discrete-time signals into embeddings that are sensitive to periodic walking patterns. The second is the Temporal Aggregation Module (TAM), which separates embeddings into trend and seasonal components and extracts meaningful temporal correlations to identify primary components while filtering out random noise. We present a simple and effective baseline method for gait recognition based on the TPA strategy. Extensive experiments on three popular public datasets (CASIA-B, OU-MVLP, and GREW) demonstrate that our method achieves state-of-the-art performance on multiple benchmarks.
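    The trend/seasonal split performed by TAM can be illustrated with a standard series decomposition: a moving average gives the slow trend and the residual carries the periodic walking component. The sketch below shows that decomposition only; the real TAM adds correlation extraction on top.

```python
import torch
import torch.nn.functional as F

def decompose_trend_seasonal(x: torch.Tensor, kernel: int = 5):
    """Split a temporal embedding sequence into a smooth trend (moving average)
    and a periodic/seasonal residual. A minimal sketch of the decomposition idea
    behind TAM; the actual module is more involved."""
    # x: (N, T, D) per-frame embeddings
    pad = kernel // 2
    xt = x.transpose(1, 2)                                   # (N, D, T) for 1D pooling
    trend = F.avg_pool1d(F.pad(xt, (pad, pad), mode='replicate'),
                         kernel_size=kernel, stride=1).transpose(1, 2)
    seasonal = x - trend                                     # periodic walking component
    return trend, seasonal

emb = torch.randn(2, 30, 64)
trend, seasonal = decompose_trend_seasonal(emb)              # both (2, 30, 64)
```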

    Context-Sensitive Temporal Feature Learning for Gait Recognition

    Full text link
    Although gait recognition has drawn increasing research attention recently, it remains challenging to learn discriminative temporal representations, since silhouette differences are quite subtle in the spatial domain. Inspired by the observation that humans can distinguish the gaits of different subjects by adaptively focusing on temporal clips at different time scales, we propose a context-sensitive temporal feature learning (CSTL) network for gait recognition. CSTL produces temporal features at three scales and adaptively aggregates them according to contextual information from local and global perspectives. Specifically, CSTL contains an adaptive temporal aggregation module that performs local relation modeling followed by global relation modeling to fuse the multi-scale features. Besides, to remedy the spatial feature corruption caused by temporal operations, CSTL incorporates a salient spatial feature learning (SSFL) module to select groups of discriminative spatial features. In particular, we use transformers to implement the global relation modeling and the SSFL module. To the best of our knowledge, this is the first work to adopt transformers in gait recognition. Extensive experiments on three datasets demonstrate state-of-the-art performance. Concretely, we achieve rank-1 accuracies of 98.7%, 96.2% and 88.7% under normal-walking, bag-carrying and coat-wearing conditions on CASIA-B, 97.5% on OU-MVLP and 50.6% on GREW. Comment: Submitted to TPAMI.
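    As a rough sketch of multi-scale temporal aggregation, the code below builds frame-level, short-clip and whole-sequence features and fuses them with per-frame weights predicted from context. It illustrates the general idea, not the CSTL architecture; layer choices are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleTemporalAggregation(nn.Module):
    """Builds temporal features at three scales (frame, short clip, whole sequence)
    and fuses them with context-dependent weights. A rough sketch of multi-scale
    temporal aggregation; not the CSTL implementation."""
    def __init__(self, dim: int):
        super().__init__()
        self.weight_fc = nn.Linear(dim, 3)   # per-frame weights for the three scales

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, T, D) per-frame features
        xt = x.transpose(1, 2)                                               # (N, D, T)
        short = F.avg_pool1d(F.pad(xt, (1, 1), mode='replicate'), 3, 1).transpose(1, 2)
        whole = x.mean(dim=1, keepdim=True).expand_as(x)                     # global context
        w = torch.softmax(self.weight_fc(x), dim=-1)                         # (N, T, 3)
        scales = torch.stack([x, short, whole], dim=-1)                      # (N, T, D, 3)
        return (scales * w.unsqueeze(2)).sum(dim=-1)                         # adaptive fusion

feats = torch.randn(2, 30, 128)
fused = MultiScaleTemporalAggregation(128)(feats)   # (2, 30, 128)
```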