An original framework for understanding human actions and body language by using deep neural networks
Advances in both Computer Vision (CV) and Artificial Neural Networks (ANNs) have enabled the development of efficient automatic systems for analysing people's behaviour.
By studying hand movements it is possible to recognize gestures, often used by people to communicate information in a non-verbal way.
These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively.
The processing of body movements, in turn, plays a key role in the action recognition and affective computing fields. The former is essential to understand how people act in an environment, while the latter tries to interpret people's emotions based on their poses and movements;
both are essential tasks in many computer vision applications, including event recognition and video surveillance.
In this Ph.D. thesis, an original framework for understanding actions and body language is presented. The framework is composed of three main modules: in the first one, a method based on Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) is proposed for the recognition of sign language and semaphoric hand gestures; the second module presents a solution based on 2D skeletons and two-branch stacked LSTM-RNNs for action recognition in video sequences; finally, in the last module, a solution for basic non-acted emotion recognition using 3D skeletons and Deep Neural Networks (DNNs) is provided.
The performance of LSTM-RNNs is explored in depth because of their ability to model the long-term contextual information of temporal sequences, which makes them suitable for analysing body movements.
All the modules were tested on challenging datasets that are well known in the literature, showing remarkable results compared with current methods.
Low Complexity Radar Gesture Recognition Using Synthetic Training Data
Developments in radio detection and ranging (radar) technology have made hand gesture recognition feasible. In heat map-based gesture recognition, feature images are large and require complex neural networks to extract information. Machine learning methods typically require large amounts of data, and collecting hand gestures with radar is time- and energy-consuming. Therefore, a low computational complexity algorithm for hand gesture recognition based on a frequency-modulated continuous-wave (FMCW) radar and a synthetic hand gesture feature generator are proposed. In the low computational complexity algorithm, a two-dimensional Fast Fourier Transform is applied to the raw radar data to generate a range-Doppler matrix. After that, background modelling is applied to separate the dynamic object from the static background. Then the bin with the highest magnitude in the range-Doppler matrix is selected to locate the target and obtain its range and velocity. The bins at this location along the antenna dimension can be used to calculate the angle of the target using Fourier beam steering. In the synthetic generator, the Blender software is used to generate different hand gestures and trajectories, and the range, velocity and angle of targets are then extracted directly from the trajectory. The experimental results demonstrate that the average recognition accuracy of the model on the test set can reach 89.13% when the synthetic data are used as the training set and the real data are used as the test set. This indicates that the generation of synthetic data can make a meaningful contribution in the pre-training phase.
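The processing chain described in this abstract (2D FFT, background separation, peak-bin selection, Fourier beam steering) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The array layout `(antennas, chirps, samples)`, the mean-subtraction stand-in for background modelling, the half-wavelength antenna spacing, the phase sign convention, and the helper `synthetic_frame` are all assumptions made for illustration.

```python
import numpy as np

def synthetic_frame(n_ant=4, n_chirps=32, n_samples=64,
                    range_bin=10, doppler_bin=3, angle_rad=0.3):
    """Hypothetical test signal: one moving target plus static clutter."""
    a = np.arange(n_ant)[:, None, None]
    c = np.arange(n_chirps)[None, :, None]
    s = np.arange(n_samples)[None, None, :]
    # Phase grows with sample index (range), chirp index (velocity)
    # and antenna index (angle, assuming half-wavelength spacing).
    phase = (range_bin * s / n_samples + doppler_bin * c / n_chirps
             + 0.5 * np.sin(angle_rad) * a)
    target = np.exp(2j * np.pi * phase)
    clutter = 5.0 * np.exp(2j * np.pi * 20 * s / n_samples)  # same in every chirp
    return target + clutter

def range_doppler(frame):
    """frame: complex array of shape (n_antennas, n_chirps, n_samples)."""
    # Assumed simplification of background modelling: subtract the mean over
    # chirps, which removes returns that do not change from chirp to chirp.
    dynamic = frame - frame.mean(axis=1, keepdims=True)
    # 2D FFT: range along fast time (samples), Doppler along slow time (chirps).
    rd = np.fft.fft(dynamic, axis=2)
    return np.fft.fftshift(np.fft.fft(rd, axis=1), axes=1)

def locate_target(rd, n_angle_bins=64):
    """Return (range bin, signed Doppler bin, angle in radians) of the peak."""
    power = (np.abs(rd) ** 2).sum(axis=0)  # combine antennas
    doppler_idx, range_idx = np.unravel_index(np.argmax(power), power.shape)
    # Fourier beam steering: zero-padded FFT across the antenna dimension.
    snapshot = rd[:, doppler_idx, range_idx]
    spectrum = np.fft.fftshift(np.fft.fft(snapshot, n=n_angle_bins))
    angle_idx = int(np.argmax(np.abs(spectrum)))
    spatial_freq = (angle_idx - n_angle_bins // 2) / n_angle_bins
    angle = float(np.arcsin(np.clip(2.0 * spatial_freq, -1.0, 1.0)))
    return int(range_idx), int(doppler_idx) - rd.shape[1] // 2, angle

rd = range_doppler(synthetic_frame())
range_idx, doppler_idx, angle = locate_target(rd)
```

Converting the bin indices to metres and metres per second would require the radar's bandwidth, chirp timing and carrier frequency, which the abstract does not state, so the sketch leaves them as bin indices.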
Gesture Recognition Using Hidden Markov Models Augmented with Active Difference Signatures
With the recent invention of depth sensors, human gesture recognition has gained significant interest in the fields of computer vision and human-computer interaction. Robust gesture recognition is a difficult problem because of the spatiotemporal variations in gesture formation, subject size, subject location, image fidelity, and subject occlusion. Gesture boundary detection, or the automatic detection of the onset and offset of a gesture in a sequence of gestures, is critical to achieving robust gesture recognition. Existing gesture recognition methods perform gesture segmentation either by using resting frames in a gesture sequence or by using additional information such as audio, depth images, or RGB images. This ancillary information introduces high latency in gesture segmentation and recognition, making it inappropriate for real-time applications. This thesis proposes a novel method to recognize time-varying human gestures from continuous video streams. The proposed method passes skeleton joint information into a Hidden Markov Model augmented with active difference signatures to achieve state-of-the-art gesture segmentation and recognition.
Active body parts are used to calculate the likelihood of previously unseen data to facilitate gesture segmentation. Active difference signatures are used to describe temporal motion as well as static differences from a canonical resting position. Geometric features, such as joint angles, and joint topological distances are used along with active difference signatures as salient feature descriptors. These feature descriptors serve as unique signatures which identify hidden states in a Hidden Markov Model. The Hidden Markov Model is able to identify gestures in a robust fashion which is tolerant to spatiotemporal and human-to-human variation in gesture articulation.
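As an illustration of the geometric features mentioned above, a joint angle can be computed from three skeleton joint positions. This is a generic sketch, not the thesis's feature extractor. The function name `joint_angle` and the 3D tuple representation of joints are assumptions, and the active difference signatures themselves are not reproduced here.

```python
import math

def joint_angle(a, b, c):
    """Angle in radians at joint b, between the segments b->a and b->c."""
    u = [ai - bi for ai, bi in zip(a, b)]
    v = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    if norm == 0.0:
        raise ValueError("coincident joints: angle is undefined")
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return math.acos(max(-1.0, min(1.0, dot / norm)))

# Hypothetical shoulder, elbow and wrist positions forming a right angle.
elbow_angle = joint_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

Because the angle depends only on relative joint positions, a feature of this kind is unaffected by where the subject stands in the frame, which is one reason such descriptors are commonly used with skeleton data.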
The proposed method is evaluated on both isolated and continuous datasets. An accuracy of 80.7% is achieved on the isolated MSR3D dataset and a mean Jaccard index of 0.58 is achieved on the continuous ChaLearn dataset. These results improve upon existing gesture recognition methods, which achieve a Jaccard index of 0.43 on the ChaLearn dataset. Comprehensive experiments investigate feature selection, parameter optimization, and algorithmic methods to help understand the contributions of the proposed method.
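For readers unfamiliar with the metric, the Jaccard index of a predicted gesture segment against its ground-truth segment is the number of frames they share divided by the number of frames covered by either. The sketch below assumes segments are given as collections of frame indices. The function name and the convention for two empty segments are illustrative choices, and the exact averaging protocol used on ChaLearn is not reproduced.

```python
def jaccard_index(pred_frames, true_frames):
    """Intersection over union of two sets of frame indices."""
    pred, true = set(pred_frames), set(true_frames)
    union = pred | true
    if not union:
        # Assumed convention: two empty segments count as a perfect match.
        return 1.0
    return len(pred & true) / len(union)

# Predicted gesture spans frames 10-19, ground truth spans frames 15-24:
# 5 shared frames out of 15 covered frames.
score = jaccard_index(range(10, 20), range(15, 25))
```

A mean Jaccard index such as the one reported above is then an average of per-segment scores, so it penalises both late onsets and early offsets in the predicted boundaries.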
Multi-User Gesture Recognition with Radar Technology
The aim of this work is the development of a radar system for consumer applications. It is capable of tracking multiple people in a room and offers a touchless human-machine interface for purposes that range from entertainment to hygiene.