
    Learning Bodily and Temporal Attention in Protective Movement Behavior Detection

    For people with chronic pain, the assessment of protective behavior during physical functioning is essential to understand their subjective pain-related experiences (e.g., fear and anxiety toward pain and injury) and how they deal with such experiences (avoidance of, or reliance on, specific body joints), with the ultimate goal of guiding intervention. Advances in deep learning (DL) can enable the development of such interventions. Using the EmoPain MoCap dataset, we investigate how attention-based DL architectures can improve the detection of protective behavior by capturing the most informative temporal and body-configurational cues characterizing specific movements and the strategies used to perform them. We propose an end-to-end deep learning architecture named BodyAttentionNet (BANet). BANet is designed to learn the time steps and body parts that are most informative for detecting protective behavior. The approach addresses the variety of ways people (including healthy people) execute a movement, independently of the type of movement analyzed. Through extensive comparison experiments with other state-of-the-art machine learning techniques used with motion capture data, we show statistically significant improvements achieved by using these attention mechanisms. In addition, the BANet architecture requires far fewer parameters than the state of the art for comparable, if not higher, performance. Comment: 7 pages, 3 figures, 2 tables, code available; accepted at ACII 2019.
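    As a rough illustration of the idea, the sketch below combines a bodily attention step (weighting joints) with a temporal attention step (weighting frames). It is a minimal stand-in, not the authors' BANet: the class name, tensor shapes, layer sizes, and pooling order are all assumptions.

        # Minimal sketch of bodily + temporal attention over motion-capture
        # windows. NOT the authors' exact BANet; shapes and sizes are assumed.
        import torch
        import torch.nn as nn

        class ToyBodyTemporalAttention(nn.Module):
            def __init__(self, feat_dim=3, n_classes=2):
                super().__init__()
                self.joint_score = nn.Linear(feat_dim, 1)   # scores each joint
                self.frame_score = nn.Linear(feat_dim, 1)   # scores each frame
                self.classifier = nn.Linear(feat_dim, n_classes)

            def forward(self, x):                # x: (batch, time, joints, feat)
                a_body = torch.softmax(self.joint_score(x), dim=2)
                x = (a_body * x).sum(dim=2)      # pool joints -> (batch, time, feat)
                a_time = torch.softmax(self.frame_score(x), dim=1)
                x = (a_time * x).sum(dim=1)      # pool frames -> (batch, feat)
                return self.classifier(x)

        x = torch.randn(4, 180, 13, 3)           # 4 windows, 180 frames, 13 joints
        print(ToyBodyTemporalAttention()(x).shape)   # torch.Size([4, 2])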

    Gait recognition based on shape and motion analysis of silhouette contours

    This paper presents a three-phase gait recognition method that analyses the spatio-temporal shape and dynamic motion (STS-DM) characteristics of a human subject's silhouettes to identify the subject in the presence of most of the challenging factors that affect existing gait recognition systems. In phase 1, phase-weighted magnitude spectra of the Fourier descriptor of the silhouette contours at ten phases of a gait period are used to analyse the spatio-temporal changes of the subject's shape. A component-based Fourier descriptor, based on anatomical studies of the human body, is used to achieve robustness against shape variations caused by all common types of small carrying conditions: with folded hands, at the subject's back, and in an upright position. In phase 2, a full-body shape and motion analysis is performed by fitting ellipses to contour segments at ten phases of a gait period and using histogram matching, with the Bhattacharyya distance between the ellipse parameters serving as the dissimilarity score. In phase 3, dynamic time warping is used to analyse the angular rotation pattern of the subject's leading knee, with arm swing taken into account over a gait period, to achieve identification that is invariant to walking speed, limited clothing variations, hairstyle changes, and shadows under the feet. The match scores generated in the three phases are fused using weight-based score-level fusion for robust identification in the presence of missing and distorted frames and occlusion in the scene. Experimental analyses on various publicly available data sets show that STS-DM outperforms several state-of-the-art gait recognition methods.
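    For concreteness, here is a minimal numpy sketch of a contour Fourier descriptor of the kind used in phase 1; the normalisation below is a simplified stand-in for the paper's phase-weighted, component-based spectra, and the toy contours are invented for the example.

        # Translation- and scale-invariant Fourier descriptor of a closed
        # silhouette contour (simplified; not the paper's exact variant).
        import numpy as np

        def fourier_descriptor(contour, n_coeffs=32):
            """contour: (N, 2) ordered (x, y) boundary points."""
            z = contour[:, 0] + 1j * contour[:, 1]   # complex boundary signal
            mag = np.abs(np.fft.fft(z))
            # Dropping the DC term removes translation; dividing by the
            # first harmonic removes scale.
            return mag[1:n_coeffs + 1] / mag[1]

        t = np.linspace(0, 2 * np.pi, 256, endpoint=False)
        a = np.stack([40 * np.cos(t), 90 * np.sin(t)], axis=1)         # silhouette A
        b = np.stack([42 * np.cos(t), 88 * np.sin(t)], axis=1) + 10.0  # shifted B
        # Small descriptor distance despite the translation between A and B.
        print(np.linalg.norm(fourier_descriptor(a) - fourier_descriptor(b)))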

    Chronic-Pain Protective Behavior Detection with Deep Learning

    In chronic pain rehabilitation, physiotherapists adapt physical activity to patients' performance based on their expression of protective behavior, gradually exposing them to feared but harmless and essential everyday activities. As rehabilitation moves outside the clinic, technology should automatically detect such behavior to provide similar support. Previous works have shown the feasibility of automatic protective behavior detection (PBD) within a specific activity. In this paper, we investigate the use of deep learning for PBD across activity types, using wearable motion capture and surface electromyography data collected from healthy participants and people with chronic pain. We approach the problem by continuously detecting protective behavior within an activity rather than estimating its overall presence. The best performance reaches a mean F1 score of 0.82 with leave-one-subject-out cross-validation. When protective behavior is modelled per activity type, the mean F1 score is 0.77 for bend-down, 0.81 for one-leg-stand, 0.72 for sit-to-stand, 0.83 for stand-to-sit, and 0.67 for reach-forward. This performance reaches an excellent level of agreement with the average experts' rating performance, suggesting potential for personalized chronic pain management at home. We analyze various parameters characterizing our approach to understand how the results could generalize to other PBD datasets and different levels of ground-truth granularity. Comment: 24 pages, 12 figures, 7 tables. Accepted by ACM Transactions on Computing for Healthcare.
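    The leave-one-subject-out protocol behind these numbers can be sketched as follows. The classifier and synthetic data are placeholders (the paper uses deep networks on motion-capture and sEMG windows); only the evaluation loop mirrors the text.

        # Leave-one-subject-out cross-validation with a per-fold F1 score.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.metrics import f1_score
        from sklearn.model_selection import LeaveOneGroupOut

        rng = np.random.default_rng(0)
        X = rng.normal(size=(600, 24))            # 600 windows, 24 features
        y = rng.integers(0, 2, size=600)          # protective (1) vs. not (0)
        subjects = np.repeat(np.arange(12), 50)   # 12 participants, 50 windows each

        scores = []
        for tr, te in LeaveOneGroupOut().split(X, y, groups=subjects):
            clf = RandomForestClassifier(random_state=0).fit(X[tr], y[tr])
            scores.append(f1_score(y[te], clf.predict(X[te])))
        print(f"mean F1 over held-out subjects: {np.mean(scores):.2f}")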

    An original framework for understanding human actions and body language by using deep neural networks

    The evolution of the fields of Computer Vision (CV) and Artificial Neural Networks (ANNs) has allowed the development of efficient automatic systems for the analysis of people's behaviour. By studying hand movements, it is possible to recognize gestures, which people often use to communicate information in a non-verbal way. These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest, due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively. The processing of body movements, meanwhile, plays a key role in the action recognition and affective computing fields. The former is essential to understand how people act in an environment, while the latter tries to interpret people's emotions based on their poses and movements; both are essential tasks in many computer vision applications, including event recognition and video surveillance. In this Ph.D. thesis, an original framework for understanding actions and body language is presented. The framework is composed of three main modules: in the first, a method based on Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) for the recognition of sign language and semaphoric hand gestures is proposed; the second module presents a solution based on a 2D skeleton and two-branch stacked LSTM-RNNs for action recognition in video sequences; finally, the last module provides a solution for basic non-acted emotion recognition using a 3D skeleton and Deep Neural Networks (DNNs). The performance of LSTM-RNNs is explored in depth, due to their ability to model the long-term contextual information of temporal sequences, which makes them suitable for analysing body movements. All the modules were tested on challenging datasets that are well known in the state of the art, showing remarkable results compared to current literature methods.
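    A minimal sketch of the second module's two-branch stacked-LSTM idea is given below. The branch inputs (raw joint positions versus frame-to-frame differences), layer sizes, and class count are assumptions for illustration, not the thesis's exact design.

        # Two-branch stacked LSTM over 2D skeleton sequences (illustrative).
        import torch
        import torch.nn as nn

        class TwoBranchLSTM(nn.Module):
            def __init__(self, n_joints=18, hidden=64, n_classes=10):
                super().__init__()
                d = n_joints * 2                             # (x, y) per joint
                self.pose_branch = nn.LSTM(d, hidden, num_layers=2, batch_first=True)
                self.motion_branch = nn.LSTM(d, hidden, num_layers=2, batch_first=True)
                self.head = nn.Linear(2 * hidden, n_classes)

            def forward(self, pose):                         # (batch, time, d)
                motion = pose[:, 1:] - pose[:, :-1]          # temporal differences
                _, (h_p, _) = self.pose_branch(pose)
                _, (h_m, _) = self.motion_branch(motion)
                # Concatenate the final hidden state of each branch.
                return self.head(torch.cat([h_p[-1], h_m[-1]], dim=1))

        logits = TwoBranchLSTM()(torch.randn(8, 60, 36))     # 8 clips, 60 frames
        print(logits.shape)                                  # torch.Size([8, 10])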

    A dynamic texture based approach to recognition of facial actions and their temporal models

    In this work, we propose a dynamic-texture-based approach to the recognition of facial Action Units (AUs, atomic facial gestures) and their temporal models (i.e., sequences of temporal segments: neutral, onset, apex, and offset) in near-frontal-view face videos. Two approaches to modeling the dynamics and the appearance in the face region of an input video are compared: an extended version of Motion History Images (MHIs) and a novel method based on Nonrigid Registration using Free-Form Deformations (FFDs). The extracted motion representation is used to derive motion orientation histogram descriptors in both the spatial and temporal domains. Per AU, a combination of discriminative, frame-based GentleBoost ensemble learners and dynamic, generative Hidden Markov Models detects the presence of the AU in question and its temporal segments in an input image sequence. When tested for recognition of all 27 lower and upper face AUs, occurring alone or in combination in 264 sequences from the MMI facial expression database, the proposed method achieved an average event recognition accuracy of 89.2 percent for the MHI method and 94.3 percent for the FFD method. The generalization performance of the FFD method was tested using the Cohn-Kanade database. Finally, we also explored performance on spontaneous expressions in the Sensitive Artificial Listener data set.
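    Of the two motion representations compared, the Motion History Image is simple enough to sketch directly (the FFD-based registration variant needs a dedicated registration library and is omitted); the threshold and decay values below are illustrative, not the paper's settings.

        # Basic Motion History Image: recent motion bright, older motion fading.
        import numpy as np

        def update_mhi(mhi, prev_frame, frame, tau=15, thresh=20):
            """One MHI step over a pair of grayscale frames."""
            moving = np.abs(frame.astype(int) - prev_frame.astype(int)) > thresh
            return np.where(moving, tau, np.maximum(mhi - 1, 0))  # set or decay

        frames = (np.random.rand(16, 64, 64) * 255).astype(np.uint8)  # toy video
        mhi = np.zeros((64, 64), dtype=int)
        for prev, cur in zip(frames, frames[1:]):
            mhi = update_mhi(mhi, prev, cur)
        # Orientation histograms of the MHI gradient would then feed the
        # GentleBoost + HMM stage described in the abstract.
        print(mhi.max())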

    From the Lab to the Real World: Affect Recognition Using Multiple Cues and Modalities

    The interdisciplinary concept of the dissipative soliton is unfolded in connection with ultrafast fibre lasers. Different mode-locking techniques, as well as experimental realizations of dissipative-soliton fibre lasers, are surveyed briefly, with an emphasis on their energy scalability. Basic topics of dissipative-soliton theory are elucidated in connection with the concepts of energy scalability and stability. It is shown that the parametric space of the dissipative soliton has reduced dimensionality and a comparatively simple structure, which simplifies the analysis and optimization of ultrafast fibre lasers. The main destabilization scenarios are described, and the limits of energy scalability are connected with the impact of optical turbulence and stimulated Raman scattering. The fast and slow dynamics of vector dissipative solitons are also exposed.

    Timing is everything: A spatio-temporal approach to the analysis of facial actions

    This thesis presents a fully automatic facial expression analysis system based on the Facial Action Coding System (FACS). FACS is the best-known and most commonly used system for describing facial activity in terms of facial muscle actions (i.e., action units, AUs). We present our research on the analysis of the morphological, spatio-temporal, and behavioural aspects of facial expressions. In contrast with most other researchers in the field, who use appearance-based techniques, we use a geometric-feature-based approach. We argue that this approach is more suitable for analysing the temporal dynamics of facial expressions. Our system is capable of explicitly exploring the temporal aspects of facial expressions from an input colour video in terms of their onset (start), apex (peak), and offset (end). The fully automatic system presented here detects 20 facial points in the first frame and tracks them throughout the video. From the tracked points we compute geometry-based features, which serve as the input to the remainder of our system. The AU activation detection system uses GentleBoost feature selection and a Support Vector Machine (SVM) classifier to find which AUs were present in an expression. The temporal dynamics of active AUs are recognised by a hybrid GentleBoost-SVM-Hidden Markov Model classifier. The system is capable of analysing 23 of the 27 existing AUs with high accuracy. The main contributions of the work presented in this thesis are the following: we have created a method for fully automatic AU analysis with state-of-the-art recognition results; we have proposed, for the first time, a method for recognising the four temporal phases of an AU; we have built the largest comprehensive database of facial expressions to date; and we present, for the first time in the literature, two studies on the automatic distinction between posed and spontaneous expressions.
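    As a rough illustration of the geometric-feature approach, the sketch below turns tracked facial points into displacement and inter-point-distance features. The point count matches the thesis's 20 tracked points, but the specific features are illustrative assumptions, not its exact scheme.

        # Geometry-based features from tracked facial points (illustrative).
        import numpy as np

        def geometric_features(points, ref_points):
            """points, ref_points: (n_points, 2) tracked vs. first-frame locations."""
            disp = points - ref_points                   # per-point displacement
            # Pairwise inter-point distances capture shape deformation.
            diff = points[:, None, :] - points[None, :, :]
            dists = np.sqrt((diff ** 2).sum(-1))
            iu = np.triu_indices(len(points), k=1)
            return np.concatenate([disp.ravel(), dists[iu]])

        first = np.random.rand(20, 2)                    # frame 1: 20 points
        later = first + 0.01 * np.random.randn(20, 2)    # a tracked later frame
        f = geometric_features(later, first)
        print(f.shape)                                   # (230,): 40 + 190 features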