16 research outputs found

    EMOPAIN Challenge 2020: Multimodal Pain Evaluation from Facial and Bodily Expressions.

    The EmoPain 2020 Challenge is the first international competition aimed at creating a uniform platform for comparing machine learning and multimedia processing methods for automatic chronic pain assessment from human expressive behaviour, and for the identification of pain-related behaviours. The objective of the challenge is to promote research into assistive technologies that help improve the quality of life of people with chronic pain via real-time monitoring and feedback, helping them manage their condition and remain physically active. The challenge also aims to encourage the use of the relatively underutilised, albeit vital, bodily expression signals for automatic pain and pain-related emotion recognition. This paper presents a description of the challenge, the competition guidelines, the benchmarking dataset, and the baseline systems' architecture and performance on the three sub-tasks: pain estimation from facial expressions, pain recognition from multimodal movement, and protective movement behaviour detection. Funding: EPSRC Emotion & Pain Project grant; NIHR Nottingham Biomedical Research Centre.

    A curriculum learning approach for pain intensity recognition from facial expressions


    Protective Behavior Detection in Chronic Pain Rehabilitation: From Data Preprocessing to Learning Model

    Chronic pain (CP) rehabilitation extends beyond physiotherapist-directed clinical sessions and primarily takes place in people's everyday lives. Unfortunately, self-directed rehabilitation is difficult because patients need to deal with both their pain and the mental barriers that pain imposes on routine functional activities. Physiotherapists adjust patients' exercise plans and advice in clinical sessions based on the amount of protective behavior (i.e., a sign of anxiety about movement) displayed by the patient. The goal of such adjustments is to help patients overcome their fears and maintain physical functioning. Unfortunately, physiotherapists' support is absent during self-directed rehabilitation, also called self-management, which people conduct in their daily lives. To be effective, technology for chronic-pain self-management should be able to detect protective behavior in order to facilitate personalized support. This thesis therefore addresses the key challenges of ubiquitous automatic protective behavior detection (PBD). Our investigation takes advantage of an available dataset (EmoPain) containing movement and muscle activity data of healthy people and people with CP engaged in typical everyday activities. First, we examine data augmentation methods and segmentation parameters using various vanilla neural networks in order to enable activity-independent PBD within pre-segmented activity instances. Second, by incorporating temporal and bodily attention mechanisms, we improve PBD performance and support the theoretical/clinical understanding that the attention of a person with CP shifts between body parts perceived as risky during feared movements. Third, we use human activity recognition (HAR) to improve continuous PBD in data comprising various activity types. The approaches proposed above are validated against a ground truth established by majority voting among expert annotators. Unfortunately, such majority-voted ground truth causes information loss, whereas learning directly from all annotators is vulnerable to noise from disagreements. In the final study, we therefore improve learning from multiple annotators by leveraging agreement information for regularization.
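    The final study's idea of using inter-annotator agreement to regularise learning can be illustrated with a short sketch. The following PyTorch-style snippet (hypothetical tensor shapes and weighting rule, not the thesis's actual formulation) weights a per-sample cross-entropy loss by how strongly the annotators agree, so that ambiguous windows contribute less to training than clear-cut ones.

    # Minimal PyTorch-style sketch of agreement-weighted training from multiple
    # annotators. Tensor shapes and the weighting rule are illustrative
    # assumptions, not the thesis's actual formulation.
    import torch
    import torch.nn as nn

    def agreement_weighted_loss(logits, annotator_labels):
        """logits: (batch, 2); annotator_labels: (batch, n_annotators) in {0, 1}."""
        # Majority vote provides the training target; inter-annotator agreement
        # (fraction voting with the majority) weights each sample's loss, so
        # ambiguous windows contribute less than clear-cut ones.
        votes = annotator_labels.float().mean(dim=1)      # fraction voting "protective"
        majority = (votes >= 0.5).long()                  # hard target per window
        agreement = torch.max(votes, 1.0 - votes)         # in [0.5, 1.0]
        per_sample = nn.functional.cross_entropy(logits, majority, reduction="none")
        return (agreement * per_sample).mean()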

    Deep Multi Temporal Scale Networks for Human Motion Analysis

    The movement of human beings appears to arise from a complex motor system that contains signals at different hierarchical levels. For example, an action such as "grasping a glass on a table" is a high-level action, but to perform this task the body needs several motor inputs, including the activation of different joints of the body (shoulder, arm, hand, fingers, etc.). Each of these joints/muscles has a different size, responsiveness, and precision, with a complex, non-linearly stratified temporal dimension in which every muscle has its own temporal scale: parts such as the fingers respond much faster to brain input than more voluminous body parts such as the shoulder. The coordination achieved when we perform an action produces smooth, effective, and expressive movement in a complex, multiple-temporal-scale cognitive task. Following this layered structure, the human body can be described as a kinematic tree consisting of connected joints. Although it is nowadays well known that human movement and its perception are characterised by multiple temporal scales, very few works in the literature study this particular property. In this thesis, we focus on the analysis of human movement using data-driven techniques, in particular on the non-verbal aspects of human movement, with an emphasis on full-body movements. Data-driven methods can interpret the information in the data by searching for rules, associations, or patterns that represent the relationships between input (e.g., the human action acquired with sensors) and output (e.g., the type of action performed). Furthermore, these models may represent a new research frontier, as they can analyse large volumes of data and focus on aspects that even an expert user might miss. The literature on data-driven models proposes two families of methods that can process time series and human movement. The first family, called shallow models, extracts features from the time series that help the learning algorithm find associations in the data; these features are identified and designed by domain experts who can select the best ones for the problem at hand. The second family avoids this expert-driven extraction phase, since the models themselves can identify the best set of features to optimise learning. In this thesis, we provide a method that applies the multiple-temporal-scale property of the human motion domain to deep learning models, the only data-driven models that can be extended to handle this property. We ask two questions: what happens if we apply knowledge about how human movements are performed to deep learning models, and can this knowledge improve current automatic recognition standards? To prove the validity of our study, we collected data and tested our hypothesis in specially designed experiments. The results support both the proposal and the need for deep multi-scale models as a tool to better understand human movement and its multiple-time-scale nature.
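    The multiple-temporal-scale property described above can be given to a deep model by letting parallel temporal convolution branches, each with a different receptive field, process the same joint trajectories. The sketch below is a minimal PyTorch illustration of that idea; the kernel sizes, channel counts, and class name are illustrative assumptions, not the architecture developed in the thesis.

    # Minimal PyTorch sketch of the multiple-temporal-scale idea: parallel temporal
    # convolution branches, each with a different kernel size (a different temporal
    # receptive field), applied to the same sequence of joint coordinates.
    import torch
    import torch.nn as nn

    class MultiTemporalScaleBlock(nn.Module):
        def __init__(self, in_channels, out_channels, kernel_sizes=(3, 9, 27)):
            super().__init__()
            self.branches = nn.ModuleList(
                nn.Conv1d(in_channels, out_channels, k, padding=k // 2)
                for k in kernel_sizes
            )

        def forward(self, x):                 # x: (batch, channels, time)
            # Each branch sees the same signal at a different temporal scale;
            # concatenating them lets later layers mix fast (finger-like) and
            # slow (shoulder-like) dynamics.
            return torch.cat([torch.relu(b(x)) for b in self.branches], dim=1)

    # Example: 25 joints x 3 coordinates = 75 input channels, 200 frames.
    features = MultiTemporalScaleBlock(75, 32)(torch.randn(8, 75, 200))
    print(features.shape)                     # torch.Size([8, 96, 200])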

    Multisensory Perception and Learning: Linking Pedagogy, Psychophysics, and Human–Computer Interaction

    In this review, we discuss how specific sensory channels can mediate the learning of properties of the environment. In recent years, schools have increasingly been using multisensory technology for teaching; however, this use is not yet sufficiently grounded in neuroscientific and pedagogical evidence. Research has recently renewed our understanding of the role of communication between sensory modalities during development. In the current review, we outline four principles, based on theoretical models of multisensory development and embodiment, that will aid technological development to foster in-depth perceptual and conceptual learning of mathematics. We also discuss how a multidisciplinary approach offers a unique contribution to the development of new practical solutions for learning in school, with scientists, engineers, and pedagogical experts offering their interdisciplinary points of view on this topic. At the end of the review, we present our results, showing that multiple sensory inputs and sensorimotor associations in multisensory technology can improve the discrimination of angles and could also be used for educational purposes. Finally, we present an application, 'RobotAngle', developed for primary (i.e., elementary) school children, which uses sounds and body movements to teach angles.

    Sensitive Pictures: Emotional Interpretation in the Museum

    Museums are interested in designing emotional visitor experiences to complement traditional interpretation. HCI is interested in the relationship between Affective Computing and Affective Interaction. We describe Sensitive Pictures, an emotional visitor experience co-created with the Munch art museum. Visitors choose emotions, locate associated paintings in the museum, experience an emotional story while viewing them, and self-report their response. A subsequent interview with a portrayal of the artist employs computer vision to estimate emotional responses from facial expressions. Visitors are given a souvenir postcard visualizing their emotional data. A study of 132 members of the public (39 interviewed) illuminates key themes: designing emotional provocations; capturing emotional responses; engaging visitors with their data; a tendency for them to align their views with the system's interpretation; and integrating these elements into emotional trajectories. We consider how Affective Computing can hold up a mirror to our emotions during Affective Interaction. (Accepted for publication at CHI 2022.)

    MAFW: A Large-scale, Multi-modal, Compound Affective Database for Dynamic Facial Expression Recognition in the Wild

    Dynamic facial expression recognition (FER) databases provide important data support for affective computing and its applications. However, most FER databases are annotated with a few basic, mutually exclusive emotional categories and contain only one modality, e.g., videos. Such limited labels and single modality cannot accurately reflect human emotions or meet the needs of real-world applications. In this paper, we propose MAFW, a large-scale multi-modal compound affective database with 10,045 video-audio clips captured in the wild. Each clip is annotated with a compound emotional category and a couple of sentences that describe the subjects' affective behaviors in the clip. For the compound emotion annotation, each clip is categorized into one or more of 11 widely-used emotions, i.e., anger, disgust, fear, happiness, neutral, sadness, surprise, contempt, anxiety, helplessness, and disappointment. To ensure the high quality of the labels, we filter out unreliable annotations with an Expectation Maximization (EM) algorithm, and then obtain 11 single-label emotion categories and 32 multi-label emotion categories. To the best of our knowledge, MAFW is the first in-the-wild multi-modal database annotated with compound emotions and emotion-related captions. Additionally, we propose a novel Transformer-based expression snippet feature learning method that recognizes compound emotions by leveraging the expression-change relations among different emotions and modalities. Extensive experiments on the MAFW database show the advantages of the proposed method over other state-of-the-art methods for both uni- and multi-modal FER. Our MAFW database is publicly available at https://mafw-database.github.io/MAFW. (Accepted by ACM MM 2022.)
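    MAFW's label cleaning relies on an Expectation Maximization step over the raw annotations. As a rough illustration of how EM can aggregate noisy multi-annotator labels, the following numpy sketch implements a simplified Dawid-Skene-style procedure (per-annotator confusion matrices estimated in the M-step, per-item class posteriors updated in the E-step); it is a generic textbook scheme, not the paper's exact filtering algorithm.

    # Simplified Dawid-Skene-style EM for aggregating noisy multi-annotator labels
    # (numpy). A generic scheme illustrating EM-based label cleaning; it is not
    # MAFW's exact filtering procedure.
    import numpy as np

    def em_aggregate(labels, n_classes, n_iters=50):
        """labels: (n_items, n_annotators) array of non-negative integer class ids."""
        n_items, n_annot = labels.shape
        # Initialise per-item class posteriors with vote frequencies.
        post = np.zeros((n_items, n_classes))
        for i in range(n_items):
            post[i] = np.bincount(labels[i], minlength=n_classes)
        post /= post.sum(axis=1, keepdims=True)

        for _ in range(n_iters):
            # M-step: class priors and one confusion matrix per annotator.
            prior = post.mean(axis=0)
            conf = np.full((n_annot, n_classes, n_classes), 1e-6)
            for a in range(n_annot):
                for i in range(n_items):
                    conf[a, :, labels[i, a]] += post[i]
                conf[a] /= conf[a].sum(axis=1, keepdims=True)
            # E-step: update posteriors given priors and confusion matrices.
            log_post = np.tile(np.log(prior + 1e-12), (n_items, 1))
            for a in range(n_annot):
                log_post += np.log(conf[a][:, labels[:, a]].T)
            post = np.exp(log_post - log_post.max(axis=1, keepdims=True))
            post /= post.sum(axis=1, keepdims=True)
        return post  # low-confidence items can then be filtered out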

    Bridging the gap between emotion and joint action

    Our daily human life is filled with a myriad of joint action moments, be it children playing, adults working together (e.g., team sports), or strangers navigating through a crowd. Joint action brings individuals (and the embodiment of their emotions) together, in space and in time. Yet little is known about how individual emotions propagate through embodied presence in a group, and how joint action changes individual emotion. In fact, the multi-agent component is largely missing from neuroscience-based approaches to emotion, and, conversely, joint action research has not yet found a way to include emotion as one of the key parameters for modelling socio-motor interaction. In this review, we first identify this gap and then compile evidence from various branches of science showing a strong entanglement between emotion and acting together. We propose an integrative approach to bridge the gap, highlight five research avenues to do so in behavioral neuroscience and digital sciences, and address some of the key challenges in the area faced by modern societies.

    Modelling person-specific and multi-scale facial dynamics for automatic personality and depression analysis

    ‘To know oneself is true progress.’ While one's identity is difficult to describe fully, a key part of it is one's personality. Accurately understanding personality can benefit various aspects of human life. There is convergent evidence that personality traits are marked by non-verbal facial expressions of emotion, which in theory means that automatic personality assessment from facial behaviours is possible. This thesis therefore aims to develop video-based automatic personality analysis approaches. Specifically, two video-level dynamic facial behaviour representations are proposed for automatic personality trait estimation, namely a person-specific representation and a spectral representation, which address three issues that frequently occur in existing automatic personality analysis approaches: (1) attempting to infer personality traits from very short video segments or even a single frame; (2) the lack of a proper way to retain multi-scale long-term temporal information; and (3) the lack of methods to encode person-specific facial dynamics that are relatively stable over time but differ across individuals. The thesis starts by extending the dynamic image algorithm to model the preceding and succeeding short-term face dynamics of each frame in a video, which achieved good performance in estimating valence/arousal intensities and demonstrated the good dynamic encoding ability of such a representation. The thesis then proposes a novel Rank Loss that trains a network to produce a similar per-frame dynamic representation from a still image alone; this way, the network can learn generic facial dynamics from unlabelled face videos in a self-supervised manner. Building on this approach, the person-specific representation encoding is proposed: it first freezes the well-trained generic network and incorporates a set of intermediate filters, which are trained again, on person-specific videos only, using the same self-supervised learning approach. As a result, the learned filters' weights are person-specific and can be concatenated into a 1-D video-level person-specific representation. Meanwhile, the thesis also proposes a spectral analysis approach to retain multi-scale video-level facial dynamics. This approach uses automatically detected human behaviour primitives as a low-dimensional descriptor for each frame, and converts long, variable-length time-series behaviour signals into compact, length-independent spectral representations of the video-level multi-scale temporal dynamics of expressive behaviours. Consequently, the combination of the two representations, which contains both multi-scale and person-specific video-level facial dynamics, can be applied to automatic personality estimation. The thesis conducts a series of experiments to validate the proposed approaches: (1) arousal/valence intensity estimation is conducted on both a controlled face video dataset (SEMAINE) and an in-the-wild face video dataset (Aff-Wild2) to evaluate the dynamic encoding capability of the proposed Rank Loss; (2) the proposed automatic personality trait recognition systems (spectral representation and person-specific representation) are evaluated on face video datasets labelled with either 'Big-Five' apparent personality traits (ChaLearn) or self-reported personality traits (VHQ); and (3) depression studies are evaluated on the VHQ dataset, which is labelled with PHQ-9 depression scores.
The experimental results on the automatic personality trait and depression severity estimation tasks show the person-specific representation's good performance on the personality task and the spectral vector's superior performance on the depression task. In particular, the proposed person-specific approach achieved performance similar to the state-of-the-art method on the apparent personality trait recognition task and at least 15% PCC improvement over other approaches on the self-reported personality trait recognition task. Meanwhile, the proposed spectral representation outperforms the person-specific approach on depression severity estimation. In addition, the thesis found that adding personality trait labels/predictions to the behaviour descriptors improved depression severity estimation results.
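    The spectral representation described in this abstract hinges on turning variable-length behaviour-primitive time series into a fixed-size frequency-domain descriptor. The numpy sketch below illustrates one straightforward way to do that (magnitude spectrum per primitive, resampled onto a fixed number of bins); the bin count and normalisation are illustrative assumptions rather than the thesis's exact configuration.

    # Minimal numpy sketch of a spectral (frequency-domain) video-level descriptor:
    # per-frame behaviour-primitive intensities of any length are mapped to a
    # fixed-size representation.
    import numpy as np

    def spectral_representation(primitives, n_bins=32):
        """primitives: (n_frames, n_primitives) per-frame behaviour descriptors."""
        spectrum = np.abs(np.fft.rfft(primitives, axis=0))   # (n_freqs, n_primitives)
        spectrum /= primitives.shape[0]                      # normalise out video length
        # Resample each primitive's magnitude spectrum onto n_bins points so that
        # videos of different lengths yield descriptors of identical size.
        freqs = np.linspace(0.0, 1.0, spectrum.shape[0])
        target = np.linspace(0.0, 1.0, n_bins)
        resampled = np.stack(
            [np.interp(target, freqs, spectrum[:, p]) for p in range(spectrum.shape[1])],
            axis=1,
        )
        return resampled.ravel()                             # 1-D video-level descriptor

    # Two videos of different lengths map to descriptors of the same size.
    print(spectral_representation(np.random.rand(900, 16)).shape)   # (512,)
    print(spectral_representation(np.random.rand(450, 16)).shape)   # (512,)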