View-invariant gait person re-identification with spatial and temporal attention
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.

Person re-identification at a distance across multiple non-overlapping cameras has
been an active research area for years. In the past ten years, short-term person Re-Id
techniques have made great strides in terms of accuracy using only appearance features
in limited environments. However, massive intra-class variations and inter-class
confusion limit their ability to be used in practical applications. Moreover, appearance
consistency can only be assumed in a short time span from one camera to the other.
Since the holistic appearance will change drastically over days and weeks, the techniques
mentioned above become ineffective. Practical applications usually require a
long-term solution in which the subject appearance and clothing might have changed
after a significant period has elapsed. To address these problems, soft biometric features
such as gait have been proposed in the past. Nevertheless, even gait can vary with
illness, ageing, changes in emotional state, walking surface, shoe type, clothing,
objects carried by the subject and even clutter in the scene. Therefore,
gait is considered a temporal cue that can provide biometric motion information.
On the other hand, the shape of the human body could be viewed as a spatial signal
which can produce valuable information. Thus, extracting discriminative features from
both the spatial and temporal domains would be highly beneficial to this research. Therefore,
this thesis focuses on finding the most robust method to tackle the gait-based human re-identification problem and solve it for practical applications. In real-world
surveillance scenarios, the human gait cycle is frequently abnormal. These abnormalities
include, but are not limited to, changes in temporal and spatial characteristics such as
walking speed, broken gait phases and, most importantly, varied camera angles. We
first performed an extensive literature study of spatial and temporal gait feature extraction
methods, with a focus on deep learning. Next, we conducted a comparative
study and proposed a spatial-temporal approach for gait feature extraction using the
fusion of multiple modalities, including optical flow, raw silhouettes and RGB images.
This approach was tested on two of the most challenging publicly available gait recognition
datasets, TUM-GAID and CASIA-B, with excellent results presented in Chapter 3.
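The fusion step described above can be sketched in a few lines. This is a minimal illustration of late fusion of per-modality descriptors, not the thesis's actual network: the encoder, dimensions and normalisation below are illustrative assumptions.

```python
import numpy as np

def extract_features(frames):
    """Stand-in for a per-modality encoder: flatten each frame and
    average over time to get one descriptor per sequence."""
    return frames.reshape(frames.shape[0], -1).mean(axis=0)

def late_fuse(modalities):
    """Late fusion: L2-normalise each modality's descriptor, then concatenate."""
    normed = [m / (np.linalg.norm(m) + 1e-8) for m in modalities]
    return np.concatenate(normed)

rng = np.random.default_rng(0)
# toy 30-frame sequences for the three modalities named in the abstract
flow = extract_features(rng.normal(size=(30, 64, 64, 2)))  # optical flow
sil = extract_features(rng.normal(size=(30, 64, 64, 1)))   # raw silhouettes
rgb = extract_features(rng.normal(size=(30, 64, 64, 3)))   # RGB frames
fused = late_fuse([flow, sil, rgb])
print(fused.shape)  # (24576,)
```

Normalising each modality before concatenation keeps any one modality from dominating the fused descriptor purely through scale.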
Furthermore, a modern spatial-temporal attention mechanism was proposed and
tested on the CASIA-B and OULP datasets, which learns salient features independent of
the gait cycle and view variations. The spatial attention layer in the proposed method
extracts spatial feature maps using a two-layered architecture and combines them by
late fusion. Using these feature maps, it discriminatively attends to identity-related
salient regions in silhouette sequences. The temporal attention layer
consists of an LSTM that encodes the temporal motion for silhouette sequences. It
uses the encoded output vectors in the temporal attention architecture to focus on the
most critical timesteps in the gait cycle and discard the rest. Furthermore, we improved
the performance of our method by mapping our extracted spatial-temporal gait
features to a discriminative null space for use in our Siamese architecture for cross-matching.
We also conducted an element-removal (ablation) experiment on each segment of our
spatial-temporal attention network to gain insight into each component's contribution to performance. Our method showed outstanding robustness against abnormal
gait cycles as well as viewpoint variations on both benchmark datasets.
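The temporal attention idea can be illustrated with a small sketch: given per-timestep encodings (in the thesis, LSTM outputs over a silhouette sequence), a scoring vector weights the most critical timesteps and suppresses the rest. The scoring vector and dimensions here are illustrative assumptions, not the thesis's trained parameters.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def temporal_attention(encoded, w):
    """Pool a (T, D) sequence of per-timestep encodings into one (D,)
    descriptor, weighting timesteps by attention scores."""
    scores = softmax(encoded @ w)  # (T,) weights summing to 1
    return scores @ encoded        # weighted sum over time

rng = np.random.default_rng(1)
h = rng.normal(size=(25, 128))  # e.g. 25 timesteps of LSTM hidden states
w = rng.normal(size=128)        # scoring vector (learned in practice)
g = temporal_attention(h, w)
print(g.shape)  # (128,)
```

Because the softmax concentrates mass on high-scoring timesteps, uninformative phases of the gait cycle contribute little to the pooled descriptor.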
Inferring Facial and Body Language
Machine analysis of human facial and body language is a challenging topic in computer
vision, impacting on important applications such as human-computer interaction and visual
surveillance. In this thesis, we present research building towards computational frameworks
capable of automatically understanding facial expression and behavioural body language.
The thesis work commences with a thorough examination of issues surrounding facial
representation based on Local Binary Patterns (LBP). Extensive experiments with different
machine learning techniques demonstrate that LBP features are efficient and effective for
person-independent facial expression recognition, even in low-resolution settings. We then
present and evaluate a conditional mutual information based algorithm to efficiently learn the
most discriminative LBP features, and show the best recognition performance is obtained by
using SVM classifiers with the selected LBP features. However, the recognition is performed
on static images without exploiting the temporal behaviour of facial expressions.
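As a concrete illustration of the representation involved, a basic single-scale LBP descriptor can be computed as below. The thesis additionally divides the face into regions and selects discriminative features by conditional mutual information, which this sketch omits.

```python
import numpy as np

def lbp_histogram(img):
    """Basic 8-neighbour LBP: threshold each pixel's 3x3 neighbourhood at
    the centre value, read the signs as an 8-bit code, histogram the codes."""
    h, w = img.shape
    centre = img[1:-1, 1:-1]
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # clockwise neighbour offsets, each contributing one bit of the code
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neigh >= centre).astype(np.uint8) << bit
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return hist / hist.sum()  # normalised 256-bin descriptor

rng = np.random.default_rng(2)
face = rng.integers(0, 256, size=(48, 48), dtype=np.uint8)  # toy face patch
desc = lbp_histogram(face)
print(desc.shape)  # (256,)
```

The histogram discards pixel positions within the patch, which is part of why LBP features remain usable at low resolution; region-wise histograms restore coarse spatial layout.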
Subsequently we present a method to capture and represent temporal dynamics of facial
expression by discovering the underlying low-dimensional manifold. Locality Preserving Projections
(LPP) is exploited to learn the expression manifold in the LBP based appearance
feature space. By deriving a universal discriminant expression subspace using a supervised
LPP, we can effectively align manifolds of different subjects on a generalised expression manifold.
Different linear subspace methods are comprehensively evaluated in expression subspace
learning. We formulate and evaluate a Bayesian framework for dynamic facial expression
recognition employing the derived manifold representation. However, the manifold representation
only addresses temporal correlations of the whole face image; it does not consider
spatial-temporal correlations among different facial regions.

We then employ Canonical Correlation Analysis (CCA) to capture correlations among face
parts. To overcome the inherent limitations of classical CCA for image data, we introduce
and formalise a novel Matrix-based CCA (MCCA), which can better measure correlations in
2D image data. We show this technique can provide superior performance in regression and
recognition tasks, whilst requiring significantly fewer canonical factors. All the above work
focuses on facial expressions. However, the face is usually perceived not as an isolated object
but as an integrated part of the whole body, and the visual channel combining facial and
bodily expressions is most informative.
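To make the CCA step concrete, classical canonical correlations can be computed from the cross-covariance of two whitened views, as sketched below on synthetic paired data. The regularisation and dimensions are illustrative assumptions, and MCCA itself, which operates on 2D image matrices directly, is not reproduced here.

```python
import numpy as np

def cca_correlations(X, Y, k=2, reg=1e-6):
    """Classical CCA: canonical correlations between two paired views
    (e.g. face features X and body features Y), via SVD of the whitened
    cross-covariance. Returns the top-k correlations."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / n
    Wx = np.linalg.inv(np.linalg.cholesky(Sxx))  # whitens view X
    Wy = np.linalg.inv(np.linalg.cholesky(Syy))  # whitens view Y
    return np.linalg.svd(Wx @ Sxy @ Wy.T, compute_uv=False)[:k]

rng = np.random.default_rng(3)
z = rng.normal(size=(200, 1))  # shared latent factor linking the two views
X = z @ rng.normal(size=(1, 5)) + 0.1 * rng.normal(size=(200, 5))
Y = z @ rng.normal(size=(1, 4)) + 0.1 * rng.normal(size=(200, 4))
r = cca_correlations(X, Y)
print(r)  # first canonical correlation close to 1
```

With a single shared latent factor, only the first canonical correlation is large; the remainder reflect noise, which is the property a joint face-body feature space exploits.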
Finally, we investigate two understudied problems in body language analysis: gait-based
gender discrimination and affective body gesture recognition. To effectively combine face
and body cues, CCA is adopted to establish the relationship between the two modalities, and
derive a semantic joint feature space for the feature-level fusion. Experiments on large data
sets demonstrate that our multimodal systems achieve superior performance in gender
discrimination and affective state analysis.

This work was supported by a research studentship of Queen Mary, the International Travel Grant of the Royal Academy of Engineering, and the Royal Society International Joint Project.
Intelligent Sensors for Human Motion Analysis
The book, "Intelligent Sensors for Human Motion Analysis," contains 17 articles published in the Special Issue of the Sensors journal. These articles deal with many aspects of human movement analysis. New techniques and methods for pose estimation, gait recognition, and fall detection have been proposed and verified. Some of them will trigger further research, and some may become the backbone of commercial systems.