The AVA Multi-View Dataset for Gait Recognition
In this paper, we introduce a new multi-view dataset for gait recognition. The dataset was recorded in an indoor scenario, using a setup of six convergent cameras to produce multi-view videos, where each video depicts a walking human. Each sequence contains at least three complete gait cycles. The dataset contains videos of 20 walking subjects with a large variety of body sizes, who walk along straight and curved paths. The multi-view videos have been processed to produce foreground silhouettes. To validate our dataset, we have extended several appearance-based 2D gait recognition methods to work with 3D data, obtaining very encouraging results. The dataset, as well as the camera calibration information, is freely available for research purposes.
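Since the dataset ships calibrated multi-view silhouettes, one standard route from 2D silhouettes to 3D gait data is shape-from-silhouette (visual hull) carving. The sketch below illustrates that general technique, not the paper's own pipeline; the projection-matrix and voxel-grid inputs are assumptions for the example.

```python
import numpy as np

def carve_visual_hull(silhouettes, projections, voxels):
    """Shape-from-silhouette: keep voxels that project inside every silhouette.

    silhouettes: list of (H, W) binary masks, one per camera
    projections: list of (3, 4) camera projection matrices (from calibration)
    voxels:      (N, 3) voxel centres in world coordinates  [assumed inputs]
    """
    occupied = np.ones(len(voxels), dtype=bool)
    homog = np.hstack([voxels, np.ones((len(voxels), 1))])  # homogeneous coords
    for mask, P in zip(silhouettes, projections):
        uvw = homog @ P.T                                   # project to image plane
        u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
        v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
        inside = (u >= 0) & (u < mask.shape[1]) & (v >= 0) & (v < mask.shape[0])
        hit = np.zeros(len(voxels), dtype=bool)
        hit[inside] = mask[v[inside], u[inside]] > 0
        occupied &= hit                                     # carve away misses
    return occupied
```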
Sparse error gait image: a new representation for gait recognition
The performance of a gait recognition system depends heavily on the use of efficient feature representation and recognition modules. The first extracts features from an input image sequence to represent a user's distinctive gait pattern. The recognition module then compares the features of a probe user with those registered in the gallery database. This paper presents a novel gait feature representation, called the Sparse Error Gait Image (SEGI), derived from applying Robust Principal Component Analysis (RPCA) to Gait Energy Images (GEIs). GEIs obtained from the same user at different instants always present some differences. Applying RPCA yields a low-rank component and a sparse error component: the former captures the commonalities and the small differences between input GEIs, while the larger differences are captured by the sparse error component. The proposed SEGI representation exploits the latter for recognition purposes. This paper also proposes two simple approaches for the recognition module that exploit the SEGI, based on the computation of a Euclidean norm or the Euclidean distance. Using these simple recognition methods with the proposed SEGI representation, gait recognition results equivalent to the state of the art are obtained.
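To make the decomposition concrete, here is a minimal RPCA sketch via the standard principal component pursuit / inexact-ALM recipe; the parameter choices are conventional defaults, not the paper's settings. Vectorised GEIs are stacked as columns of D, and the columns of the recovered sparse component S would play the role of SEGIs.

```python
import numpy as np

def rpca(D, lam=None, max_iter=500, tol=1e-7):
    """Principal component pursuit via inexact ALM: D ~ L (low rank) + S (sparse)."""
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    norm_D = np.linalg.norm(D, 'fro')
    spec = np.linalg.norm(D, 2)                    # largest singular value
    Y = D / max(spec, np.abs(D).max() / lam)       # dual variable initialisation
    mu, rho = 1.25 / spec, 1.5
    S = np.zeros_like(D)
    for _ in range(max_iter):
        # low-rank update: singular value thresholding
        U, sig, Vt = np.linalg.svd(D - S + Y / mu, full_matrices=False)
        L = U @ (np.maximum(sig - 1.0 / mu, 0.0)[:, None] * Vt)
        # sparse update: entrywise soft thresholding
        R = D - L + Y / mu
        S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0.0)
        Z = D - L - S
        Y += mu * Z
        mu *= rho
        if np.linalg.norm(Z, 'fro') / norm_D < tol:
            break
    return L, S

# Usage sketch: columns of D are vectorised GEIs; the corresponding
# columns of S act as SEGIs, compared via Euclidean norm or distance.
```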
MetaGait: Learning to Learn an Omni Sample Adaptive Representation for Gait Recognition
Gait recognition, which aims at identifying individuals by their walking patterns, has recently drawn increasing research attention. However, gait recognition still suffers from the conflict between the limited binary visual clues of the silhouette and numerous covariates with diverse scales, which brings challenges to the model's adaptiveness. In this paper, we address this conflict by developing a novel MetaGait that learns to learn an omni sample adaptive representation. Towards this goal, MetaGait injects meta-knowledge, which can guide the model to perceive sample-specific properties, into the calibration network of the attention mechanism to improve adaptiveness from the omni-scale, omni-dimension, and omni-process perspectives. Specifically, we leverage the meta-knowledge across the entire process, where Meta Triple Attention and Meta Temporal Pooling are presented, respectively, to adaptively capture omni-scale dependencies from the spatial/channel/temporal dimensions simultaneously and to adaptively aggregate temporal information by integrating the merits of three complementary temporal aggregation methods.
Extensive experiments demonstrate the state-of-the-art performance of the proposed MetaGait. On CASIA-B, we achieve rank-1 accuracies of 98.7%, 96.0%, and 89.3% under the three conditions, respectively. On OU-MVLP, we achieve a rank-1 accuracy of 92.4%.

Comment: Accepted by ECCV 2022.
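The core idea, an auxiliary meta-network generating sample-specific calibration for an attention branch, can be illustrated loosely as follows. This is a schematic sketch under our own simplifications (channel attention only, arbitrary dimensions), not the paper's Meta Triple Attention.

```python
import torch
import torch.nn as nn

class MetaChannelAttention(nn.Module):
    """Illustrative only: a meta-network produces sample-adaptive
    calibration weights for a channel-attention branch."""
    def __init__(self, channels, meta_dim=16):
        super().__init__()
        self.meta = nn.Sequential(            # meta-knowledge from global statistics
            nn.Linear(channels, meta_dim), nn.ReLU(),
            nn.Linear(meta_dim, channels))
        self.gate = nn.Sigmoid()

    def forward(self, x):                     # x: (B, C, T, H, W) silhouette features
        ctx = x.mean(dim=(2, 3, 4))           # per-sample channel statistics
        calib = self.gate(self.meta(ctx))     # sample-specific calibration weights
        return x * calib[:, :, None, None, None]
```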
Learning Gait Representation from Massive Unlabelled Walking Videos: A Benchmark
Gait depicts individuals' unique and distinguishing walking patterns and has become one of the most promising biometric features for human identification. As a fine-grained recognition task, gait recognition is easily affected by many factors and usually requires a large amount of completely annotated data, which is costly and hard to obtain. This paper proposes a large-scale self-supervised benchmark for gait recognition with contrastive learning, aiming to learn a general gait representation from massive unlabelled walking videos for practical applications by offering informative walking priors and diverse real-world variations. Specifically, we collect a large-scale unlabelled gait dataset, GaitLU-1M, consisting of 1.02M walking sequences, and propose a conceptually simple yet empirically powerful baseline model, GaitSSB. Experimentally, we evaluate the pre-trained model on four widely used gait benchmarks, CASIA-B, OU-MVLP, GREW and Gait3D, with and without transfer learning. The unsupervised results are comparable to or even better than those of the early model-based and GEI-based methods. After transfer learning, our method outperforms existing methods by a large margin in most cases. Theoretically, we discuss the critical issues for a gait-specific contrastive framework and present some insights for further study. As far as we know, GaitLU-1M is the first large-scale unlabelled gait dataset, and GaitSSB is the first method that achieves remarkable unsupervised results on the aforementioned benchmarks. The source code of GaitSSB will be integrated into OpenGait, which is available at https://github.com/ShiqiYu/OpenGait.
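A generic contrastive objective of the kind such self-supervised gait frameworks typically optimise is sketched below; this is a minimal InfoNCE illustration, not GaitSSB's exact loss. The inputs are assumed to be embeddings of two augmented views of each walking sequence.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.07):
    """Symmetric InfoNCE over a batch of paired embeddings.

    z1, z2: (B, D) embeddings of two augmentations of the same B sequences;
    matching rows are positives, all other pairs in the batch are negatives.
    """
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau                       # (B, B) cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))
```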
Grassmann Learning for Recognition and Classification
Computational performance associated with high-dimensional data is a common challenge for real-world classification and recognition systems. Subspace learning has received considerable attention as a means of finding an efficient low-dimensional representation that leads to better classification and efficient processing. A Grassmann manifold is a smooth space whose points represent subspaces and where the relationship between points is defined through mappings of orthogonal matrices. Grassmann learning involves embedding high-dimensional subspaces and kernelizing the embedding onto a projection space where distance computations can be performed effectively. In this dissertation, Grassmann learning and its benefits for action classification and face recognition, in terms of accuracy and performance, are investigated and evaluated.

Grassmannian Sparse Representation (GSR) and Grassmannian Spectral Regression (GRASP) are proposed as Grassmann-inspired subspace learning algorithms. GSR is a novel subspace learning algorithm that combines the benefits of Grassmann manifolds with sparse representations, using least-squares loss with ℓ1-norm minimization for improved classification. GRASP is a novel subspace learning algorithm that leverages the benefits of Grassmann manifolds and Spectral Regression in a framework that supports high discrimination between classes and achieves computational benefits by using manifold modeling and avoiding eigen-decomposition. The effectiveness of GSR and GRASP is demonstrated for computationally intensive classification problems: (a) multi-view action classification using the IXMAS Multi-View dataset, the i3DPost Multi-View dataset, and the WVU Multi-View dataset; (b) 3D action classification using the MSRAction3D and MSRGesture3D datasets; and (c) face recognition using the AT&T Face Database, Labeled Faces in the Wild (LFW), and the Extended Yale Face Database B (YALE).

Additional contributions include the definition of Motion History Surfaces (MHS) and Motion Depth Surfaces (MDS) as descriptors suitable for activity representation in video sequences and 3D depth sequences. An in-depth analysis of Grassmann metrics is applied to high-dimensional data with different levels of noise and data distributions, which reveals that standardized Grassmann kernels are favorable over geodesic metrics on a Grassmann manifold. Finally, an extensive performance analysis is made that supports Grassmann subspace learning as an effective approach for classification and recognition.
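For concreteness, the basic Grassmann constructions such methods build on (representing a data set as a subspace, a projection kernel, and a geodesic distance from principal angles) can be sketched as below. This illustrates the general machinery, not GSR or GRASP themselves.

```python
import numpy as np

def grassmann_point(X, p):
    """Orthonormal basis (thin SVD) of the p-dim subspace spanned by data X (d, n)."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :p]                          # a point on the Grassmannian G(p, d)

def projection_kernel(Y1, Y2):
    """Projection kernel ||Y1^T Y2||_F^2 between two orthonormal bases."""
    return np.linalg.norm(Y1.T @ Y2, 'fro') ** 2

def geodesic_distance(Y1, Y2):
    """Arc-length distance from the principal angles between two subspaces."""
    cosines = np.clip(np.linalg.svd(Y1.T @ Y2, compute_uv=False), -1.0, 1.0)
    return np.linalg.norm(np.arccos(cosines))
```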
Robust gait recognition under variable covariate conditions
Gait is a weak biometric when compared to face, fingerprint or iris because it can be easily affected by various conditions. These are known as the covariate conditions and include clothing, carrying, speed, shoes and view, among others. In the presence of variable covariate conditions, gait recognition is a hard problem yet to be solved, with no working system reported.
In this thesis, a novel gait representation, the Gait Flow Image (GFI), is proposed to extract more discriminative information from a gait sequence. GFI captures the relative motion of body parts in different directions as separate motion descriptors. Compared to existing model-free gait representations, GFI is more discriminative and robust to changes in covariate conditions.
In this thesis, gait recognition approaches are evaluated without the assumption of cooperative subjects, i.e. both the gallery and the probe sets consist of gait sequences under different and unknown covariate conditions. The results indicate that the performance of existing approaches drops drastically under this more realistic set-up. It is argued that selecting gait features that are invariant to changes in covariate conditions is the key to developing a gait recognition system that does not require subject cooperation. To this end, the Gait Entropy Image (GEnI) is proposed to perform automatic feature selection on each pair of gallery and probe gait sequences. Moreover, an Adaptive Component and Discriminant Analysis is formulated which seamlessly integrates the feature selection method with subspace analysis for fast and robust recognition.
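The GEnI itself has a compact definition in the usual formulation: the per-pixel Shannon entropy of the silhouette value over a gait cycle. A minimal sketch, assuming aligned, size-normalised binary silhouettes:

```python
import numpy as np

def gait_entropy_image(silhouettes):
    """GEnI: per-pixel Shannon entropy over a cycle of binary silhouettes.

    silhouettes: (T, H, W) array of {0, 1} foreground masks, assumed to be
    aligned and size-normalised over one gait cycle.
    """
    p = silhouettes.mean(axis=0)          # P(pixel is foreground) over the cycle
    eps = 1e-12                           # avoid log(0) at always-on/off pixels
    return -(p * np.log2(p + eps) + (1 - p) * np.log2(1 - p + eps))
```

Dynamic regions (legs, arms) have foreground probabilities near 0.5 and thus high entropy, while static regions score near zero, which is what makes the GEnI useful for selecting covariate-robust features.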
Among the various factors that affect the performance of gait recognition, change in viewpoint poses the biggest problem and is treated separately. A novel approach to this problem is proposed in this thesis, using the Gait Flow Image in a cross-view gait recognition framework where the view angle of a probe gait sequence is unknown. A Gaussian Process classification technique is formulated to estimate the view angle of each probe gait sequence. To measure the similarity of gait sequences across view angles, the correlation of gait sequences from different views is modelled using Canonical Correlation Analysis, and the correlation strength is used as a similarity measure. This differs from existing approaches, which reconstruct gait features in different views through 2D view transformation or 3D calibration. Without explicit reconstruction, the proposed method can cope with feature mismatch across views and is more robust against feature noise.
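To illustrate the cross-view similarity idea, here is a minimal sketch that uses canonical correlation strength between feature sets from two views as a matching score. scikit-learn's CCA is used for brevity; the feature matrices, component count, and averaging scheme are illustrative assumptions, not the thesis's exact formulation.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def cross_view_similarity(view_a_feats, view_b_feats, n_comp=5):
    """Mean canonical correlation between gait features from two views.

    view_a_feats, view_b_feats: (n_samples, n_features) matrices of
    paired gait features observed under two different view angles.
    """
    cca = CCA(n_components=n_comp)
    a, b = cca.fit_transform(view_a_feats, view_b_feats)
    # correlation strength of each canonical pair, averaged as the score
    corrs = [np.corrcoef(a[:, i], b[:, i])[0, 1] for i in range(n_comp)]
    return float(np.mean(corrs))
```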
Deep Generative Modelling of Human Behaviour
Human action is naturally intelligible as a time-varying graph of connected joints constrained by locomotor anatomy and physiology. Its prediction allows the anticipation of actions, with applications across healthcare, physical rehabilitation and training, robotics, navigation, manufacture, entertainment, and security. In this thesis we investigate deep generative approaches to the problem of understanding human action. We show that learning the generative qualities of the distribution may render discriminative tasks more robust to distributional shift and real-world variations in data quality. We further build, from the bottom up, a novel stochastic deep generative model tailored to the problem of human motion, and demonstrate many of its state-of-the-art properties, such as anomaly detection, imputation in the face of incomplete examples, and synthesis (including conditional synthesis) of new samples, on massive open-source human motion datasets, compared against multiple baselines derived from the most relevant literature.
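As a toy illustration of how a generative model of motion supports such discriminative uses, the sketch below scores anomalies with the negative ELBO of a minimal VAE over flattened pose sequences. The architecture and the flattened-input representation are our own simplifications for the example, not the thesis's model.

```python
import torch
import torch.nn as nn

class MotionVAE(nn.Module):
    """Minimal VAE over flattened pose sequences (illustrative only).

    x: (B, T * J * 3) flattened sequences of J 3-D joints over T frames.
    """
    def __init__(self, dim, latent=32, hidden=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)
        self.dec = nn.Sequential(nn.Linear(latent, hidden), nn.ReLU(),
                                 nn.Linear(hidden, dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterise
        return self.dec(z), mu, logvar

def elbo_anomaly_score(model, x):
    """Negative ELBO as an anomaly score: higher means less typical motion."""
    recon, mu, logvar = model(x)
    rec = ((recon - x) ** 2).sum(dim=1)                       # reconstruction term
    kl = 0.5 * (mu.pow(2) + logvar.exp() - 1 - logvar).sum(dim=1)
    return rec + kl
```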