25 research outputs found
Context-Sensitive Temporal Feature Learning for Gait Recognition
Although gait recognition has drawn increasing research attention recently,
it remains challenging to learn a discriminative temporal representation, since
the silhouette differences are quite subtle in the spatial domain. Inspired by the
observation that humans can distinguish the gaits of different subjects by
adaptively focusing on temporal clips with different time scales, we propose a
context-sensitive temporal feature learning (CSTL) network for gait
recognition. CSTL produces temporal features in three scales, and adaptively
aggregates them according to the contextual information from local and global
perspectives. Specifically, CSTL contains an adaptive temporal aggregation
module that subsequently performs local relation modeling and global relation
modeling to fuse the multi-scale features. Besides, in order to remedy the
spatial feature corruption caused by temporal operations, CSTL incorporates a
salient spatial feature learning (SSFL) module to select groups of
discriminative spatial features. Particularly, we utilize transformers to
implement the global relation modeling and the SSFL module. To the best of our
knowledge, this is the first work that adopts transformer in gait recognition.
Extensive experiments conducted on three datasets demonstrate the
state-of-the-art performance. Concretely, we achieve rank-1 accuracies of
98.7%, 96.2% and 88.7% under normal-walking, bag-carrying and coat-wearing
conditions on CASIA-B, 97.5% on OU-MVLP, and 50.6% on GREW.
Comment: Submitted to TPAMI
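The multi-scale aggregation idea can be illustrated with a minimal numpy sketch. This is not the authors' architecture (CSTL uses transformers for global relation modeling); the average-pooling windows and the context-based scoring below are illustrative assumptions only:

```python
import numpy as np

def temporal_features(frames, scale):
    """Average-pool per-frame features over a causal window of `scale` frames."""
    T, C = frames.shape
    return np.stack([frames[max(0, t - scale + 1): t + 1].mean(axis=0)
                     for t in range(T)])  # (T, C)

def aggregate_multiscale(frames, scales=(1, 3, 5)):
    """Fuse per-scale temporal features with softmax weights derived from
    a global context vector (a stand-in for CSTL's relation modeling)."""
    feats = np.stack([temporal_features(frames, s) for s in scales])  # (S, T, C)
    context = frames.mean(axis=0)                                     # (C,) global context
    scores = np.einsum('stc,c->st', feats, context)                   # (S, T) per-scale scores
    weights = np.exp(scores) / np.exp(scores).sum(axis=0, keepdims=True)
    return (weights[..., None] * feats).sum(axis=0)                   # (T, C) fused features
```

With `scale=1` the pooling is the identity, so the sketch degenerates gracefully to the raw frame features at the finest scale.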
GaitFormer: Revisiting Intrinsic Periodicity for Gait Recognition
Gait recognition aims to distinguish different walking patterns by analyzing
video-level human silhouettes, rather than relying on appearance information.
Previous research on gait recognition has primarily focused on extracting local
or global spatial-temporal representations, while overlooking the intrinsic
periodic features of gait sequences, which, when fully utilized, can
significantly enhance performance. In this work, we propose a plug-and-play
strategy, called Temporal Periodic Alignment (TPA), which leverages the
periodic nature and fine-grained temporal dependencies of gait patterns. The
TPA strategy comprises two key components. The first component is Adaptive
Fourier-transform Position Encoding (AFPE), which adaptively converts features
and discrete-time signals into embeddings that are sensitive to periodic
walking patterns. The second component is the Temporal Aggregation Module
(TAM), which separates embeddings into trend and seasonal components, and
extracts meaningful temporal correlations to identify primary components, while
filtering out random noise. We present a simple and effective baseline method
for gait recognition, based on the TPA strategy. Extensive experiments
conducted on three popular public datasets (CASIA-B, OU-MVLP, and GREW)
demonstrate that our proposed method achieves state-of-the-art performance on
multiple benchmarks.
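The trend/seasonal split performed by TAM can be sketched with a classic moving-average decomposition. This is a minimal numpy illustration, not the paper's exact formulation; the window size and edge padding are assumptions:

```python
import numpy as np

def decompose(signal, window=5):
    """Split a 1-D temporal signal into a trend (edge-padded moving average)
    and a seasonal residual, as in classic series decomposition."""
    pad = window // 2
    padded = np.pad(signal, pad, mode='edge')
    kernel = np.ones(window) / window
    trend = np.convolve(padded, kernel, mode='valid')  # same length as input
    seasonal = signal - trend                          # periodic + noise residual
    return trend, seasonal
```

By construction the two components sum back to the original signal, and a constant signal yields a zero seasonal part.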
GaitGS: Temporal Feature Learning in Granularity and Span Dimension for Gait Recognition
Gait recognition is an emerging biometric recognition technology that
identifies and verifies individuals based on their walking patterns. However,
many current methods are limited in their use of temporal information. In order
to fully harness the potential of gait recognition, it is crucial to consider
temporal features at various granularities and spans. Hence, in this paper, we
propose a novel framework named GaitGS, which aggregates temporal features in
the granularity dimension and span dimension simultaneously. Specifically,
Multi-Granularity Feature Extractor (MGFE) is proposed to focus on capturing
the micro-motion and macro-motion information at the frame level and unit level
respectively. Moreover, we present Multi-Span Feature Learning (MSFL) module to
generate global and local temporal representations. On three popular gait
datasets, extensive experiments demonstrate the state-of-the-art performance of
our method, which achieves Rank-1 accuracies of 92.9% (+0.5%), 52.0%
(+1.4%), and 97.5% (+0.8%) on CASIA-B, GREW, and OU-MVLP, respectively. The
source code will be released soon.
Comment: 14 pages, 6 figures
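The granularity dimension can be given a rough numpy sketch, frame-level micro-motion versus unit-level macro-motion. The `unit` size and the difference/mean operators here are illustrative assumptions, not the GaitGS modules:

```python
import numpy as np

def micro_macro(frames, unit=4):
    """Frame-level (micro) and unit-level (macro) temporal features.
    Micro: frame-to-frame differences; macro: mean over non-overlapping
    units of `unit` consecutive frames."""
    T, C = frames.shape
    micro = np.diff(frames, axis=0, prepend=frames[:1])        # (T, C)
    usable = (T // unit) * unit                                # drop trailing remainder
    macro = frames[:usable].reshape(-1, unit, C).mean(axis=1)  # (T // unit, C)
    return micro, macro
```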
Robust gait recognition under variable covariate conditions
PhD thesis
Gait is a weak biometric when compared to face, fingerprint or iris because it can be easily
affected by various conditions. These are known as the covariate conditions and include clothing,
carrying, speed, shoes and view among others. In the presence of variable covariate conditions
gait recognition is a hard problem yet to be solved with no working system reported.
In this thesis, a novel gait representation, the Gait Flow Image (GFI), is proposed to extract
more discriminative information from a gait sequence. GFI extracts the relative motion of body
parts in different directions in separate motion descriptors. Compared to the existing model-free
gait representations, GFI is more discriminative and robust to changes in covariate conditions.
In this thesis, gait recognition approaches are evaluated without the assumption on cooperative
subjects, i.e. both the gallery and the probe sets consist of gait sequences under different
and unknown covariate conditions. The results indicate that the performance of the existing approaches
drops drastically under this more realistic set-up. It is argued that selecting the gait
features which are invariant to changes in covariate conditions is the key to developing a gait
recognition system without subject cooperation. To this end, the Gait Entropy Image (GEnI) is
proposed to perform automatic feature selection on each pair of gallery and probe gait sequences.
Moreover, an Adaptive Component and Discriminant Analysis is formulated which seamlessly
integrates the feature selection method with subspace analysis for fast and robust recognition.
Among various factors that affect the performance of gait recognition, change in viewpoint
poses the biggest problem and is treated separately. A novel approach to address this problem is
proposed in this thesis by using Gait Flow Image in a cross view gait recognition framework with
the view angle of a probe gait sequence unknown. A Gaussian Process classification technique
is formulated to estimate the view angle of each probe gait sequence. To measure the similarity
of gait sequences across view angles, the correlation of gait sequences from different views is
modelled using Canonical Correlation Analysis and the correlation strength is used as a similarity
measure. This differs from existing approaches, which reconstruct gait features in different views
through 2D view transformation or 3D calibration. Without explicit reconstruction, the proposed
method can cope with feature mis-match across views and is more robust against feature noise.
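The Gait Entropy Image can be computed directly from a cycle of binary silhouettes as the per-pixel Shannon entropy of the foreground probability, a minimal sketch consistent with the description above:

```python
import numpy as np

def gait_entropy_image(silhouettes, eps=1e-12):
    """Gait Entropy Image (GEnI): per-pixel Shannon entropy of binary
    silhouettes over one gait cycle. Static regions (always background
    or always foreground) score ~0; dynamic regions (swinging limbs)
    score high, so GEnI highlights features robust to appearance
    covariates such as clothing and carrying condition."""
    p = np.asarray(silhouettes, dtype=float).mean(axis=0)  # foreground probability per pixel
    return -(p * np.log2(p + eps) + (1 - p) * np.log2(1 - p + eps))
```

A pixel that is foreground in exactly half the frames attains the maximum entropy of 1 bit, while a pixel that never changes scores essentially zero.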
Clothing and carrying condition invariant gait recognition based on rotation forest
This paper proposes a gait recognition method that is invariant to a large number of challenging factors, chiefly unpredictable variation in clothing and carrying conditions. The method introduces an averaged gait key-phase image (AGKI), computed by averaging each of the five key-phases of the gait periods of a gait sequence. It analyses the AGKIs using high-pass and low-pass Gaussian filters, each at three cut-off frequencies, to achieve robustness against unpredictable variation in clothing and carrying conditions in addition to other covariate factors, e.g., walking speed, segmentation noise, shadows under the feet, and changes in hair style and ground surface. The optimal cut-off frequencies of the Gaussian filters are determined from an analysis of the focus values of the filtered subjects' silhouettes. The method applies rotation forest ensemble learning to enhance both individual accuracy and diversity within the ensemble for an improved identification rate. Extensive experiments on public datasets demonstrate the efficacy of the proposed method.
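The high-pass/low-pass filtering step can be sketched in plain numpy, with sigma standing in for the cut-off frequency that the paper selects from silhouette focus values. The separable implementation and edge padding here are assumptions:

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """1-D Gaussian kernel, normalised to sum to 1."""
    radius = radius or int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def low_high_pass(image, sigma):
    """Separable Gaussian low-pass of a 2-D image; the high-pass component
    is the residual, so the two always sum back to the original image."""
    k = gaussian_kernel(sigma)
    pad = len(k) // 2
    padded = np.pad(image, pad, mode='edge')
    low = np.apply_along_axis(lambda r: np.convolve(r, k, mode='valid'), 1, padded)
    low = np.apply_along_axis(lambda c: np.convolve(c, k, mode='valid'), 0, low)
    return low, image - low
```

A larger sigma corresponds to a lower cut-off frequency: more of the silhouette's fine detail (where clothing and carrying changes concentrate) moves into the high-pass component.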
Combining the Silhouette and Skeleton Data for Gait Recognition
Gait recognition, a promising long-distance biometric technology, has aroused
intense interest in computer vision. Existing works on gait recognition can be
divided into appearance-based methods and model-based methods, which extract
features from silhouettes and skeleton data, respectively. However, since
appearance-based methods are greatly affected by clothing changing and carrying
condition, and model-based methods are limited by the accuracy of pose
estimation approaches, gait recognition remains challenging in practical
applications. In order to integrate the advantages of these two approaches, a
two-branch neural network (NN) is proposed in this paper. Our method contains
two branches, namely a CNN-based branch taking silhouettes as input and a
GCN-based branch taking skeletons as input. In addition, two new modules are
proposed in the GCN-based branch for better gait representation. First, we
present a simple yet effective fully connected graph convolution operator to
integrate the multi-scale graph convolutions and alleviate the dependence on
natural human joint connections. Second, we deploy a multi-dimension attention
module named STC-Att to learn spatial, temporal and channel-wise attention
simultaneously. We evaluated the proposed two-branch neural network on the
CASIA-B dataset. The experimental results show that our method achieves
state-of-the-art performance in various conditions.
Comment: The paper is under consideration at Computer Vision and Image Understanding
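The fully connected graph convolution admits a compact illustration: if every joint attends equally to every other joint, the graph mixing collapses to a mean over joints. This single-scale numpy sketch is a reduction for intuition; the paper's operator also integrates multi-scale graph convolutions:

```python
import numpy as np

def fc_graph_conv(x, w_self, w_global):
    """Fully connected graph convolution: each joint combines its own
    feature with the mean over all joints, so the layer does not rely
    on the natural skeletal adjacency.
    x: (J, C) joint features; w_self, w_global: (C, C') weights."""
    return x @ w_self + x.mean(axis=0, keepdims=True) @ w_global  # (J, C')
```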
Uniscale and multiscale gait recognition in realistic scenario
The performance of a gait recognition method is affected by numerous challenging
factors that degrade its reliability as a behavioural biometric for subject identification in
realistic scenarios. Thus, for effective visual surveillance, this thesis presents five gait
recognition methods that address various challenging factors to reliably identify a subject in
realistic scenarios with low computational complexity. It presents a gait recognition method
that analyses the spatio-temporal motion of a subject with statistical and physical parameters
using Procrustes shape analysis and elliptic Fourier descriptors (EFD). It introduces a
part-based EFD analysis to achieve invariance to carrying conditions, and the use of physical
parameters enables it to achieve invariance to across-day gait variation. Although the
spatio-temporal deformation of a subject's shape in gait sequences provides better
discriminative power than its kinematics, the inclusion of dynamic motion characteristics
improves the identification rate. Therefore, the thesis presents a gait recognition method
which combines the spatio-temporal shape and dynamic motion characteristics of a subject to
achieve robustness against more challenging factors than related state-of-the-art methods. A
region-based gait recognition method that analyses a subject's shape in image and feature
spaces is presented to achieve invariance to clothing variation and carrying conditions. To
take into account the arbitrary moving directions of a subject in realistic scenarios, a gait
recognition method must be robust against variation in view. Hence, the thesis presents a
robust view-invariant multiscale gait recognition method. Finally, the thesis proposes a gait
recognition method based on low spatial and low temporal resolution video sequences captured
by CCTV. The computational complexity of each method is analysed. Experimental analyses on
public datasets demonstrate the efficacy of the proposed methods.
Recognizing complex faces and gaits via novel probabilistic models
In the field of computer vision, developing automated systems to recognize people
under unconstrained scenarios is a partially solved problem. In unconstrained
scenarios, a number of common variations and complexities, such as occlusion,
illumination and cluttered background, impose vast uncertainty on the recognition
process. Among the various biometrics that have been emerging recently, this
dissertation focuses on two of them, namely face and gait recognition.
Firstly, we address the problem of recognizing faces with major occlusions amidst
other variations such as pose, scale, expression and illumination using a novel
PRObabilistic Component based Interpretation Model (PROCIM) inspired by key
psychophysical principles that are closely related to reasoning under uncertainty.
The model employs Bayesian Networks to establish, learn, interpret and
exploit intrinsic similarity mappings from the face domain. Then, by incorporating
efficient inference strategies, robust decisions are made for successfully recognizing
faces under uncertainty. PROCIM reports improved recognition rates over recent
approaches.
Secondly, we address the emerging gait recognition problem and show that
PROCIM can be easily adapted to the gait domain as well. We scientifically
define and formulate sub-gaits and propose a novel modular training scheme to
efficiently learn subtle sub-gait characteristics from the gait domain. Our results
show that the proposed model is robust to several uncertainties and yields
significant recognition performance. Apart from PROCIM, we finally show how
simple component-based gait reasoning can be coherently modeled using the
recently prominent Markov Logic Networks (MLNs) by intuitively fusing imaging,
logic and graphs.
We have discovered that the face and gait domains exhibit interesting similarity
mappings between object entities and their components. We have proposed intuitive
probabilistic methods to model these mappings to perform recognition under various
uncertainty elements. Extensive experimental validation justifies the robustness
of the proposed methods over state-of-the-art techniques.
Covariate-invariant gait recognition using random subspace method and its extensions
Compared with other biometric traits like fingerprint or iris, the most significant advantage of gait is that it can be used for remote human identification without cooperation from the subjects. The technology of gait recognition may play an important role in crime prevention, law enforcement, etc. Yet the performance of automatic gait recognition may be affected by covariate factors such as speed, carrying condition, elapsed time, shoe, walking surface, clothing, camera viewpoint, video quality, etc. In this thesis, we propose a random subspace method (RSM) based classifier ensemble framework and its extensions for robust gait recognition.
Covariates change the human gait appearance in different ways. For example, speed may change the appearance of human arms or legs; camera viewpoint alters the human visual appearance in a global manner; carrying condition and clothing may change the appearance of any part of the human body (depending on what is being carried/worn). Due to the unpredictable nature of covariates, it is difficult to collect representative training data for all of them. We claim that overfitting may be the main problem hampering the performance of gait recognition algorithms that rely on learning. First, for speed-invariant gait recognition, we employ a basic RSM model, which can reduce generalisation errors by combining a large number of weak classifiers at the decision level (i.e., by majority voting).
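The basic RSM ensemble described above can be given a minimal sketch: nearest-neighbour weak classifiers on random feature subsets, combined by decision-level majority voting. The classifier type, subspace size and ensemble size below are illustrative assumptions:

```python
import numpy as np

def rsm_predict(gallery, labels, probe, n_classifiers=50, subdim=8, seed=0):
    """Random subspace method: each weak classifier is a nearest-neighbour
    matcher restricted to a random subset of feature dimensions; the final
    identity is chosen by majority vote, which reduces overfitting to
    covariate-specific features."""
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(n_classifiers):
        dims = rng.choice(gallery.shape[1], size=subdim, replace=False)
        d = np.linalg.norm(gallery[:, dims] - probe[dims], axis=1)
        votes.append(labels[np.argmin(d)])           # weak classifier's vote
    vals, counts = np.unique(votes, return_counts=True)
    return vals[np.argmax(counts)]                   # majority decision
```

Because a covariate that corrupts some feature dimensions only misleads the weak classifiers whose random subspaces fall in the corrupted region, the majority of votes can still be correct.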
We find that the performance of RSM decreases when the intra-class variations are large. In RSM, although weak classifiers with lower dimensionality tend to have better generalisation ability, they may have to contend with the underfitting problem if the dimensionality is too low. We thus enhance the RSM-based weak classifiers by extending RSM to multimodal-RSM. In tackling the elapsed time covariate, we use face information to enhance the RSM-based gait classifiers before the decision-level fusion. We find significant performance gain can be achieved when lower weight is assigned to the face information. We also employ a weak form of multimodal-RSM for gait recognition from low quality videos (with low resolution and low frame-rate) when other modalities are unavailable. In this case, model-based information is used to enhance the RSM-based weak classifiers. Then we point out the relationship of base classifier accuracy, classifier ensemble accuracy, and diversity among the base classifiers. By incorporating the model-based information (with lower weight) into the RSM-based weak classifiers, the diversity of the classifiers, which is positively correlated to the ensemble accuracy, can be enhanced.
In contrast to multimodal systems, large intra-class variations may have a significant impact on unimodal systems. We model the effect of various unknown covariates as a partial feature corruption problem with unknown locations in the spatial domain. Under some assumptions in an ideal-case analysis, we provide the theoretical basis of the RSM-based classifier ensemble in the application of covariate-invariant gait recognition. However, in real cases these assumptions may not hold precisely, and performance may suffer when the intra-class variations are large. We propose a criterion to address this issue: in the decision-level fusion stage, for a query gait with unknown covariates, we dynamically suppress the ratio of false votes to true votes before the majority voting. Two strategies are employed, i.e., local enhancing (LE), which can increase true votes, and the proposed hybrid decision-level fusion (HDF), which can decrease false votes. Based on this criterion, the proposed RSM-based HDF (RSM-HDF) framework achieves very competitive performance in tackling covariates such as walking surface, clothing, and elapsed time, which were deemed open questions.
The camera viewpoint factor differs from other covariates: it alters the human appearance in a global manner. By employing unitary projection (UP), we form a new space in which samples of the same subject from different views lie closer together. However, UP may also give rise to a large number of feature distortions. We treat these distortions as corrupted features with unknown locations in the new space (after UP), and use the RSM-HDF framework to address this issue. Robust view-invariant gait recognition can be achieved by the UP-RSM-HDF framework.
In this thesis, we propose an RSM-based classifier ensemble framework and its extensions to realise covariate-invariant gait recognition. It is less sensitive to most covariate factors, such as speed, shoe, carrying condition, walking surface, video quality, clothing, elapsed time and camera viewpoint, and it outperforms other state-of-the-art algorithms significantly on all the major public gait databases. Specifically, our method achieves very competitive performance against (large changes in) view, clothing, walking surface and elapsed time, which were deemed the most difficult covariate factors.