3,704 research outputs found

    An original framework for understanding human actions and body language by using deep neural networks

    The evolution of the fields of Computer Vision (CV) and Artificial Neural Networks (ANNs) has allowed the development of efficient automatic systems for the analysis of people's behaviour. By studying hand movements it is possible to recognize gestures, which people often use to communicate information non-verbally. These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively. The processing of body movements, in turn, plays a key role in the action recognition and affective computing fields. The former is essential to understand how people act in an environment, while the latter tries to interpret people's emotions based on their poses and movements; both are essential tasks in many computer vision applications, including event recognition and video surveillance. In this Ph.D. thesis, an original framework for understanding actions and body language is presented. The framework is composed of three main modules: in the first, a method based on Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) for the recognition of sign language and semaphoric hand gestures is proposed; the second module presents a solution based on 2D skeleton and two-branch stacked LSTM-RNNs for action recognition in video sequences; finally, the last module provides a solution for basic non-acted emotion recognition using 3D skeleton and Deep Neural Networks (DNNs). The performance of LSTM-RNNs is explored in depth, owing to their ability to model the long-term contextual information of temporal sequences, which makes them suitable for analysing body movements. All the modules were tested on challenging datasets that are well known in the state of the art, showing remarkable results compared to current methods in the literature.

    Unsupervised Behaviour Analysis and Magnification (uBAM) using Deep Learning

    Motor behaviour analysis is essential to biomedical research and clinical diagnostics as it provides a non-invasive strategy for identifying motor impairment and its change caused by interventions. State-of-the-art instrumented movement analysis is time- and cost-intensive, since it requires placing physical or virtual markers. Besides the effort required for marking keypoints or annotations necessary for training or finetuning a detector, users need to know the interesting behaviour beforehand to provide meaningful keypoints. We introduce unsupervised behaviour analysis and magnification (uBAM), an automatic deep learning algorithm for analysing behaviour by discovering and magnifying deviations. A central aspect is unsupervised learning of posture and behaviour representations to enable an objective comparison of movement. Besides discovering and quantifying deviations in behaviour, we also propose a generative model for visually magnifying subtle behaviour differences directly in a video without requiring a detour via keypoints or annotations. Essential for this magnification of deviations even across different individuals is a disentangling of appearance and behaviour. Evaluations on rodents and human patients with neurological diseases demonstrate the wide applicability of our approach. Moreover, combining optogenetic stimulation with our unsupervised behaviour analysis shows its suitability as a non-invasive diagnostic tool correlating function to brain plasticity.
    Comment: Published in Nature Machine Intelligence (2021), https://rdcu.be/ch6p

    A psychometric measure of working memory capacity for configured body movement

    Working memory (WM) models have traditionally assumed at least two domain-specific storage systems for verbal and visuo-spatial information. We review data that suggest the existence of an additional slave system devoted to the temporary storage of body movements, and present a novel instrument for its assessment: the movement span task. The movement span task assesses individuals' ability to remember and reproduce meaningless configurations of the body. During the encoding phase of a trial, participants watch short videos of meaningless movements presented in sets varying in size from one to five items. Immediately after encoding, they are prompted to reenact as many items as possible. The movement span task was administered to 90 participants along with standard tests of verbal WM and visuo-spatial WM, and a gesture classification test in which participants judged whether a speaker's gestures were congruent or incongruent with his accompanying speech. Performance on the gesture classification task was not related to standard measures of verbal or visuo-spatial working memory capacity, but was predicted by scores on the movement span task. Results suggest the movement span task can serve as an assessment of individual differences in WM capacity for body-centric information.