Gait recognition and understanding based on hierarchical temporal memory using 3D gait semantic folding
Gait recognition and understanding systems have shown wide-ranging application prospects. However, their use of unstructured data from images and video limits their performance; e.g., they are easily influenced by view changes, occlusion, clothing, and object-carrying conditions. This paper addresses these problems using realistic 3-dimensional (3D) human structural data and a sequential pattern learning framework with a top-down attention-modulating mechanism based on Hierarchical Temporal Memory (HTM). First, an accurate 2-dimensional (2D)-to-3D human body pose and shape semantic parameter estimation method is proposed, which exploits the advantages of an instance-level body parsing model and a virtual dressing method. Second, by using gait semantic folding, the estimated body parameters are encoded into a sparse 2D matrix to construct the structural gait semantic image. To achieve time-based gait recognition, an HTM network is constructed to obtain the sequence-level gait sparse distribution representations (SL-GSDRs). A top-down attention mechanism is introduced to deal with various conditions, including multiple views, by refining the SL-GSDRs according to prior knowledge. The proposed gait learning model not only helps gait recognition tasks overcome the difficulties of real application scenarios but also provides structured gait semantic images for visual cognition. Experimental analyses on the CMU MoBo, CASIA B, TUM-IITKGP, and KY4D datasets show a significant performance gain in terms of accuracy and robustness.
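The "gait semantic folding" step above — encoding estimated body parameters as a sparse 2D matrix — can be illustrated with a minimal sketch. The grid size, bin count, and sparsity below are hypothetical choices for illustration; the paper's actual encoder differs.

```python
import numpy as np

def semantic_fold(params, grid=32, active_bits=4, bins=8):
    """Encode a vector of body semantic parameters (pose/shape values
    normalised to [0, 1]) as a sparse binary 2D matrix.

    Illustrative sketch only: each parameter is quantised into a bin and
    lights a small block of cells in its own row band, so the resulting
    matrix is a sparse distributed representation (an SDR-style code,
    as HTM consumes). Grid size and sparsity here are assumptions.
    """
    img = np.zeros((grid, grid), dtype=np.uint8)
    rows_per_param = max(1, grid // len(params))
    for i, p in enumerate(params):
        b = min(int(p * bins), bins - 1)        # quantise parameter
        r0 = i * rows_per_param                 # row band for parameter i
        c0 = b * (grid // bins)                 # column offset for its bin
        img[r0:r0 + 1, c0:c0 + active_bits] = 1
    return img

sgsi = semantic_fold([0.1, 0.5, 0.9, 0.3])  # → 32x32 matrix, 16 active bits
```

Because each parameter activates only a few cells, similar parameter vectors produce overlapping codes, which is what sequence learners like HTM exploit.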
Review of Person Re-identification Techniques
Person re-identification across different surveillance cameras with disjoint
fields of view has become one of the most interesting and challenging subjects
in the area of intelligent video surveillance. Although several methods have
been developed and proposed, certain limitations and unresolved issues remain.
In all of the existing re-identification approaches, feature vectors are
extracted from segmented still images or video frames. Different similarity or
dissimilarity measures have been applied to these vectors. Some methods have
used simple constant metrics, whereas others have utilised models to obtain
optimised metrics. Some have created models based on local colour or texture
information, and others have built models based on the gait of people. In
general, the main objective of all these approaches is to achieve
higher accuracy rates and lower computational costs. This study summarises
several developments in recent literature and discusses the various available
methods used in person re-identification. Specifically, their advantages and
disadvantages are discussed and compared.
Comment: Published 201
MSA-GCN: Multiscale Adaptive Graph Convolution Network for Gait Emotion Recognition
Gait emotion recognition plays a crucial role in intelligent systems. Most
of the existing methods recognize emotions by focusing on local actions over
time. However, they ignore that the effective distances of different emotions
in the time domain are different, and the local actions during walking are
quite similar. Thus, emotions should be represented by global states instead of
indirect local actions. To address these issues, a novel Multiscale Adaptive
Graph Convolution Network (MSA-GCN) is presented in this work, which
constructs dynamic temporal receptive fields and designs multiscale
information aggregation to recognize emotions. In our model, an adaptive
selective spatial-temporal graph convolution is designed to select the
convolution kernel dynamically to obtain the soft spatio-temporal features of
different emotions. Moreover, a Cross-Scale mapping Fusion Mechanism (CSFM) is
designed to construct an adaptive adjacency matrix to enhance information
interaction and reduce redundancy. Compared with previous state-of-the-art
methods, the proposed method achieves the best performance on two public
datasets, improving the mAP by 2%. We also conduct extensive ablation studies
to show the effectiveness of the different components of our method.
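The selective spatial-temporal convolution described above — dynamically choosing among temporal receptive fields — can be sketched as a soft gate over several temporal scales. This is a hypothetical hand-rolled stand-in: the real MSA-GCN operator is learned end to end, and the moving average below merely stands in for a temporal convolution.

```python
import numpy as np

def adaptive_temporal_conv(x, kernels=(3, 5, 7)):
    """Blend several temporal receptive fields with a soft gate.

    x: (T, C) sequence of per-frame joint features. Each candidate kernel
    size produces a smoothed branch; softmax weights derived from the
    globally pooled branches then mix them, so the effective temporal
    distance adapts to the input. All design choices here are assumptions.
    """
    T, C = x.shape
    branches = []
    for k in kernels:
        pad = k // 2
        xp = np.pad(x, ((pad, pad), (0, 0)), mode="edge")
        # moving average over a window of size k, same length as input
        branches.append(np.stack([xp[t:t + k].mean(0) for t in range(T)]))
    scores = np.array([b.mean() for b in branches])  # one score per branch
    w = np.exp(scores - scores.max())
    w /= w.sum()                                     # softmax gate
    return sum(wi * bi for wi, bi in zip(w, branches))

y = adaptive_temporal_conv(np.random.rand(20, 8))    # (20, 8) output
```

The gate keeps the output the same shape as the input, so the operator can be stacked like an ordinary temporal convolution.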
Combining the Silhouette and Skeleton Data for Gait Recognition
Gait recognition, a promising long-distance biometric technology, has aroused
intense interest in computer vision. Existing works on gait recognition can be
divided into appearance-based methods and model-based methods, which extract
features from silhouettes and skeleton data, respectively. However, since
appearance-based methods are greatly affected by clothing changes and carrying
conditions, and model-based methods are limited by the accuracy of pose
estimation approaches, gait recognition remains challenging in practical
applications. In order to integrate the advantages of such two approaches, a
two-branch neural network (NN) is proposed in this paper. Our method contains
two branches, namely a CNN-based branch taking silhouettes as input and a
GCN-based branch taking skeletons as input. In addition, two new modules are
proposed in the GCN-based branch for better gait representation. First, we
present a simple yet effective fully connected graph convolution operator to
integrate the multi-scale graph convolutions and alleviate the dependence on
natural human joint connections. Second, we deploy a multi-dimension attention
module named STC-Att to learn spatial, temporal and channel-wise attention
simultaneously. We evaluated the proposed two-branch neural network on the
CASIA-B dataset. The experimental results show that our method achieves
state-of-the-art performance in various conditions.
Comment: The paper is under consideration at Computer Vision and Image Understanding
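The two-branch design above — a CNN branch over silhouettes and a GCN branch over skeletons, fused into one descriptor — can be sketched as follows. This is a minimal illustration with hypothetical shapes; the paper's branches are deep learned networks, and only the point where the two modalities meet is shown here.

```python
import numpy as np

def gcn_layer(X, A, W):
    """One Kipf-style graph convolution: symmetrically normalised
    adjacency times joint features times weights, with ReLU.
    X: (J, C) joint features, A: (J, J) adjacency with self-loops,
    W: (C, C_out) weight matrix."""
    d = A.sum(1)
    A_norm = A / np.sqrt(np.outer(d, d))
    return np.maximum(A_norm @ X @ W, 0.0)

def two_branch_embedding(silhouette_feat, joints, A, W):
    """Concatenate an appearance embedding (stand-in for the CNN branch
    output) with a pooled skeleton embedding from a GCN layer."""
    skel = gcn_layer(joints, A, W).mean(0)   # pool over joints
    return np.concatenate([silhouette_feat, skel])

# tiny demo with hypothetical shapes: a 3-joint chain, 2-dim joint features
A = np.eye(3) + np.diag([1.0, 1.0], 1) + np.diag([1.0, 1.0], -1)
joints = np.array([[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]])
W = np.ones((2, 4)) * 0.1
emb = two_branch_embedding(np.ones(5), joints, A, W)  # (5 + 4,) descriptor
```

Concatenation is the simplest fusion; the combined descriptor then feeds whatever recognition head the model uses.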
Person recognition based on deep gait: a survey.
Gait recognition, also known as walking pattern recognition, has attracted deep interest in the computer vision and biometrics communities due to its potential to identify individuals from a distance, its range of potential applications, and its non-invasive nature. Since 2014, deep learning approaches have shown promising results in gait recognition by automatically extracting features. However, recognizing gait accurately is challenging due to covariate factors, the complexity and variability of environments, and human body representations. This paper provides a comprehensive overview of the advancements made in this field, along with the challenges and limitations associated with deep learning methods. It initially examines the various gait datasets used in the literature and analyzes the performance of state-of-the-art techniques. After that, a taxonomy of deep learning methods is presented to characterize and organize the research landscape in this field. Furthermore, the taxonomy highlights the basic limitations of deep learning methods in the context of gait recognition. The paper concludes by focusing on present challenges and suggesting several research directions to improve the performance of gait recognition in the future.
Learning to Estimate Critical Gait Parameters from Single-View RGB Videos with Transformer-Based Attention Network
Musculoskeletal diseases and cognitive impairments in patients lead to
difficulties in movement as well as negative effects on their psychological
health. Clinical gait analysis, a vital tool for early diagnosis and treatment,
traditionally relies on expensive optical motion capture systems. Recent
advances in computer vision and deep learning have opened the door to more
accessible and cost-effective alternatives. This paper introduces a novel
spatio-temporal Transformer network to estimate critical gait parameters from
RGB videos captured by a single-view camera. Empirical evaluations on a public
dataset of cerebral palsy patients indicate that the proposed framework
surpasses current state-of-the-art approaches and shows significant improvements
in predicting general gait parameters (including Walking Speed, Gait Deviation
Index - GDI, and Knee Flexion Angle at Maximum Extension), while utilizing
fewer parameters and alleviating the need for manual feature extraction.
Comment: Accepted at ISBI 2024 (21st IEEE International Symposium on Biomedical Imaging)
Condition-Adaptive Graph Convolution Learning for Skeleton-Based Gait Recognition
Graph convolutional networks have been widely applied in skeleton-based gait
recognition. A key challenge in this task is to distinguish the individual
walking styles of different subjects across various views. Existing
state-of-the-art methods employ uniform convolutions to extract features from
diverse sequences and ignore the effects of viewpoint changes. To overcome
these limitations, we propose a condition-adaptive graph (CAG) convolution
network that can dynamically adapt to the specific attributes of each skeleton
sequence and the corresponding view angle. In contrast to using fixed weights
for all joints and sequences, we introduce a joint-specific filter learning
(JSFL) module in the CAG method, which produces sequence-adaptive filters at
the joint level. The adaptive filters capture fine-grained patterns that are
unique to each joint, enabling the extraction of diverse spatial-temporal
information about body parts. Additionally, we design a view-adaptive topology
learning (VATL) module that generates adaptive graph topologies. These graph
topologies are used to correlate the joints adaptively according to the
specific view conditions. Thus, CAG can simultaneously adjust to various
walking styles and viewpoints. Experiments on the two most widely used datasets
(i.e., CASIA-B and OU-MVLP) show that CAG surpasses all previous skeleton-based
methods. Moreover, the recognition performance can be enhanced by simply
combining CAG with appearance-based methods, demonstrating the ability of CAG
to provide useful complementary information. The source code will be available
at https://github.com/OliverHxh/CAG.
Comment: Accepted by the TIP journal
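The view-adaptive topology idea above — correlating joints differently depending on the view condition — can be sketched as a view-conditioned correction to the anatomical adjacency. This is a hypothetical reading for illustration: the paper's VATL module learns the mapping end to end, whereas the low-rank correction and row normalisation below are assumptions.

```python
import numpy as np

def view_adaptive_topology(base_A, view_embedding, U, V):
    """Produce a sequence-specific graph topology by adding a bounded,
    view-conditioned correction to the skeleton adjacency, then
    row-normalising. base_A: (J, J) adjacency with self-loops;
    view_embedding: (d,) descriptor of the view condition;
    U, V: (J, d) projection matrices (hypothetical parameters)."""
    delta = np.tanh((U * view_embedding) @ V.T)   # (J, J) correction
    A = np.maximum(base_A + delta, 0.0)           # keep weights non-negative
    return A / A.sum(1, keepdims=True)

# demo: a 4-joint undirected chain with self-loops, random view condition
J, d = 4, 3
base_A = np.eye(J)
base_A[[0, 1, 2], [1, 2, 3]] = 1.0
base_A = np.maximum(base_A, base_A.T)
rng = np.random.default_rng(0)
U, V = rng.normal(size=(J, d)), rng.normal(size=(J, d))
A = view_adaptive_topology(base_A, rng.normal(size=d), U, V)
```

Different view embeddings yield different normalised adjacencies, so the downstream graph convolution effectively re-wires itself per sequence.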
GaitPT: Skeletons Are All You Need For Gait Recognition
The analysis of patterns of walking is an important area of research that has
numerous applications in security, healthcare, sports and human-computer
interaction. Lately, walking patterns have been regarded as a unique
fingerprinting method for automatic person identification at a distance. In
this work, we propose a novel gait recognition architecture called Gait Pyramid
Transformer (GaitPT) that leverages pose estimation skeletons to capture unique
walking patterns, without relying on appearance information. GaitPT adopts a
hierarchical transformer architecture that effectively extracts both spatial
and temporal features of movement in an anatomically consistent manner, guided
by the structure of the human skeleton. Our results show that GaitPT achieves
state-of-the-art performance compared to other skeleton-based gait recognition
works, in both controlled and in-the-wild scenarios. GaitPT obtains 82.6%
average accuracy on CASIA-B, surpassing other works by a margin of 6%.
Moreover, it obtains 52.16% Rank-1 accuracy on GREW, outperforming both
skeleton-based and appearance-based approaches.
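The anatomy-guided hierarchy above — attention over joint tokens followed by merging into coarser, anatomically meaningful units — can be sketched as a pooling pyramid. The 17-joint COCO-style grouping and the mean-pool merge below are assumptions for illustration, not GaitPT's exact partition or learned merge.

```python
import numpy as np

# Hypothetical joint grouping for a 17-joint COCO-style skeleton;
# the index sets are an assumption, not GaitPT's actual partition.
LIMBS = {
    "head":      [0, 1, 2, 3, 4],
    "left_arm":  [5, 7, 9],
    "right_arm": [6, 8, 10],
    "left_leg":  [11, 13, 15],
    "right_leg": [12, 14, 16],
}

def hierarchical_pool(joint_tokens):
    """Merge joint-level tokens into limb tokens and then a body token,
    mirroring a pyramid that coarsens along the anatomy: transformer
    attention at one stage would be followed by this kind of merge.
    joint_tokens: (17, C). Mean pooling stands in for the learned merge."""
    limb_tokens = np.stack([joint_tokens[idx].mean(0)
                            for idx in LIMBS.values()])
    body_token = limb_tokens.mean(0)
    return limb_tokens, body_token

tokens = np.random.rand(17, 8)            # one frame of joint tokens
limbs, body = hierarchical_pool(tokens)   # (5, 8) limb tokens, (8,) body token
```

Each pyramid stage shortens the token sequence, so attention at higher stages operates over fewer, more abstract units of the body.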