70 research outputs found

    A dataset of continuous affect annotations and physiological signals for emotion analysis

    Get PDF
    From a computational viewpoint, emotions continue to be intriguingly hard to understand. In research, direct, real-time inspection in realistic settings is not possible. Discrete, indirect, post-hoc recordings are therefore the norm. As a result, proper emotion assessment remains a problematic issue. The Continuously Annotated Signals of Emotion (CASE) dataset provides a solution as it focusses on real-time continuous annotation of emotions, as experienced by the participants, while watching various videos. For this purpose, a novel, intuitive joystick-based annotation interface was developed, that allowed for simultaneous reporting of valence and arousal, that are instead often annotated independently. In parallel, eight high quality, synchronized physiological recordings (1000 Hz, 16-bit ADC) were made of ECG, BVP, EMG (3x), GSR (or EDA), respiration and skin temperature. The dataset consists of the physiological and annotation data from 30 participants, 15 male and 15 female, who watched several validated video-stimuli. The validity of the emotion induction, as exemplified by the annotation and physiological data, is also presented.Comment: Dataset available at: https://rmc.dlr.de/download/CASE_dataset/CASE_dataset.zi

    Temporal Dynamics in the Interpersonal Behaviour and Perceptions of Romantic Partners

    Get PDF
    The current study investigates romantic partners’ perceptions of their own and their partner’s interpersonal behaviour as they unfold continuously over the course of an interaction. In particular, we examine the types of biases that affect partners’ perceptions of themselves and each other. Using a joystick-based assessment program, 59 romantic couples rated moment-to moment fluctuations in dominance and affiliation in themselves and their partners as they watched a video of their own behaviour in a previous, relatively unstructured collaborative task. To provide a benchmark of comparison, independent observers applied the same coding technique to provide continuous ratings of behaviour for each participant. Partners also completed several paper-and-pencil measures, which tapped their levels of trait dominance and affiliation, as well as other characteristics of each person and their relationship. Joystick data for both participants and observers were used to calculate overall (mean) and moment-to-moment indices of dominance and affiliation. Indices were also derived across partners for overall and moment-to-moment reciprocity and correspondence. Findings for overall levels of dominance and affiliation suggest that partners’ perceptions reflected several types of bias. For instance, apart from the presence of the well-established positivity and projection biases, our results for perceptions of one’s partner’s dominance revealed a presumed reciprocity bias in which the partner was perceived to have behaved in a manner more opposite to one’s own overall level of dominance than was actually true. Nonetheless, we found that such biases occurred in the context of considerable accurate understanding in perceptions of oneself and one’s partner. At the moment-to-moment level, findings revealed that partners were able to reliably track interpersonal behaviour for both targets as it was unfolding continuously across time. However, large individual differences in partners’ moment-to-moment tracking ability emerged. Finally, we demonstrated that partners’ interpersonal behaviours and perceptions were related to relationship satisfaction in important ways. Specifically, partners’ overall affiliation during the task and the extent to which their affiliation levels matched each other were related positively to relationship satisfaction. In addition, partners who were able to pass control skillfully throughout the course of the interaction, a phenomenon termed reciprocity, reported being more satisfied. Similarly, couples with greater ability to track moment-to-moment fluctuations in their levels of dominance also reported higher relationship satisfaction

    Developing an Affect-Aware Rear-Projected Robotic Agent

    Get PDF
    Social (or Sociable) robots are designed to interact with people in a natural and interpersonal manner. They are becoming an integrated part of our daily lives and have achieved positive outcomes in several applications such as education, health care, quality of life, entertainment, etc. Despite significant progress towards the development of realistic social robotic agents, a number of problems remain to be solved. First, current social robots either lack enough ability to have deep social interaction with human, or they are very expensive to build and maintain. Second, current social robots have yet to reach the full emotional and social capabilities necessary for rich and robust interaction with human beings. To address these problems, this dissertation presents the development of a low-cost, flexible, affect-aware rear-projected robotic agent (called ExpressionBot), that is designed to support verbal and non-verbal communication between the robot and humans, with the goal of closely modeling the dynamics of natural face-to-face communication. The developed robotic platform uses state-of-the-art character animation technologies to create an animated human face (aka avatar) that is capable of showing facial expressions, realistic eye movement, and accurate visual speech, and then project this avatar onto a face-shaped translucent mask. The mask and the projector are then rigged onto a neck mechanism that can move like a human head. Since an animation is projected onto a mask, the robotic face is highly flexible research tool, mechanically simple, and low-cost to design, build and maintain compared with mechatronic and android faces. The results of our comprehensive Human-Robot Interaction (HRI) studies illustrate the benefits and values of the proposed rear-projected robotic platform over a virtual-agent with the same animation displayed on a 2D computer screen. The results indicate that ExpressionBot is well accepted by users, with some advantages in expressing facial expressions more accurately and perceiving mutual eye gaze contact. To improve social capabilities of the robot and create an expressive and empathic social agent (affect-aware) which is capable of interpreting users\u27 emotional facial expressions, we developed a new Deep Neural Networks (DNN) architecture for Facial Expression Recognition (FER). The proposed DNN was initially trained on seven well-known publicly available databases, and obtained significantly better than, or comparable to, traditional convolutional neural networks or other state-of-the-art methods in both accuracy and learning time. Since the performance of the automated FER system highly depends on its training data, and the eventual goal of the proposed robotic platform is to interact with users in an uncontrolled environment, a database of facial expressions in the wild (called AffectNet) was created by querying emotion-related keywords from different search engines. AffectNet contains more than 1M images with faces and 440,000 manually annotated images with facial expressions, valence, and arousal. Two DNNs were trained on AffectNet to classify the facial expression images and predict the value of valence and arousal. Various evaluation metrics show that our deep neural network approaches trained on AffectNet can perform better than conventional machine learning methods and available off-the-shelf FER systems. We then integrated this automated FER system into spoken dialog of our robotic platform to extend and enrich the capabilities of ExpressionBot beyond spoken dialog and create an affect-aware robotic agent that can measure and infer users\u27 affect and cognition. Three social/interaction aspects (task engagement, being empathic, and likability of the robot) are measured in an experiment with the affect-aware robotic agent. The results indicate that users rated our affect-aware agent as empathic and likable as a robot in which user\u27s affect is recognized by a human (WoZ). In summary, this dissertation presents the development and HRI studies of a perceptive, and expressive, conversational, rear-projected, life-like robotic agent (aka ExpressionBot or Ryan) that models natural face-to-face communication between human and emapthic agent. The results of our in-depth human-robot-interaction studies show that this robotic agent can serve as a model for creating the next generation of empathic social robots

    Presence 2005: the eighth annual international workshop on presence, 21-23 September, 2005 University College London (Conference proceedings)

    Get PDF
    OVERVIEW (taken from the CALL FOR PAPERS) Academics and practitioners with an interest in the concept of (tele)presence are invited to submit their work for presentation at PRESENCE 2005 at University College London in London, England, September 21-23, 2005. The eighth in a series of highly successful international workshops, PRESENCE 2005 will provide an open discussion forum to share ideas regarding concepts and theories, measurement techniques, technology, and applications related to presence, the psychological state or subjective perception in which a person fails to accurately and completely acknowledge the role of technology in an experience, including the sense of 'being there' experienced by users of advanced media such as virtual reality. The concept of presence in virtual environments has been around for at least 15 years, and the earlier idea of telepresence at least since Minsky's seminal paper in 1980. Recently there has been a burst of funded research activity in this area for the first time with the European FET Presence Research initiative. What do we really know about presence and its determinants? How can presence be successfully delivered with today's technology? This conference invites papers that are based on empirical results from studies of presence and related issues and/or which contribute to the technology for the delivery of presence. Papers that make substantial advances in theoretical understanding of presence are also welcome. The interest is not solely in virtual environments but in mixed reality environments. Submissions will be reviewed more rigorously than in previous conferences. High quality papers are therefore sought which make substantial contributions to the field. Approximately 20 papers will be selected for two successive special issues for the journal Presence: Teleoperators and Virtual Environments. PRESENCE 2005 takes place in London and is hosted by University College London. The conference is organized by ISPR, the International Society for Presence Research and is supported by the European Commission's FET Presence Research Initiative through the Presencia and IST OMNIPRES projects and by University College London

    Machine learning for automatic analysis of affective behaviour

    Get PDF
    The automated analysis of affect has been gaining rapidly increasing attention by researchers over the past two decades, as it constitutes a fundamental step towards achieving next-generation computing technologies and integrating them into everyday life (e.g. via affect-aware, user-adaptive interfaces, medical imaging, health assessment, ambient intelligence etc.). The work presented in this thesis focuses on several fundamental problems manifesting in the course towards the achievement of reliable, accurate and robust affect sensing systems. In more detail, the motivation behind this work lies in recent developments in the field, namely (i) the creation of large, audiovisual databases for affect analysis in the so-called ''Big-Data`` era, along with (ii) the need to deploy systems under demanding, real-world conditions. These developments led to the requirement for the analysis of emotion expressions continuously in time, instead of merely processing static images, thus unveiling the wide range of temporal dynamics related to human behaviour to researchers. The latter entails another deviation from the traditional line of research in the field: instead of focusing on predicting posed, discrete basic emotions (happiness, surprise etc.), it became necessary to focus on spontaneous, naturalistic expressions captured under settings more proximal to real-world conditions, utilising more expressive emotion descriptions than a set of discrete labels. To this end, the main motivation of this thesis is to deal with challenges arising from the adoption of continuous dimensional emotion descriptions under naturalistic scenarios, considered to capture a much wider spectrum of expressive variability than basic emotions, and most importantly model emotional states which are commonly expressed by humans in their everyday life. In the first part of this thesis, we attempt to demystify the quite unexplored problem of predicting continuous emotional dimensions. This work is amongst the first to explore the problem of predicting emotion dimensions via multi-modal fusion, utilising facial expressions, auditory cues and shoulder gestures. A major contribution of the work presented in this thesis lies in proposing the utilisation of various relationships exhibited by emotion dimensions in order to improve the prediction accuracy of machine learning methods - an idea which has been taken on by other researchers in the field since. In order to experimentally evaluate this, we extend methods such as the Long Short-Term Memory Neural Networks (LSTM), the Relevance Vector Machine (RVM) and Canonical Correlation Analysis (CCA) in order to exploit output relationships in learning. As it is shown, this increases the accuracy of machine learning models applied to this task. The annotation of continuous dimensional emotions is a tedious task, highly prone to the influence of various types of noise. Performed real-time by several annotators (usually experts), the annotation process can be heavily biased by factors such as subjective interpretations of the emotional states observed, the inherent ambiguity of labels related to human behaviour, the varying reaction lags exhibited by each annotator as well as other factors such as input device noise and annotation errors. In effect, the annotations manifest a strong spatio-temporal annotator-specific bias. Failing to properly deal with annotation bias and noise leads to an inaccurate ground truth, and therefore to ill-generalisable machine learning models. This deems the proper fusion of multiple annotations, and the inference of a clean, corrected version of the ``ground truth'' as one of the most significant challenges in the area. A highly important contribution of this thesis lies in the introduction of Dynamic Probabilistic Canonical Correlation Analysis (DPCCA), a method aimed at fusing noisy continuous annotations. By adopting a private-shared space model, we isolate the individual characteristics that are annotator-specific and not shared, while most importantly we model the common, underlying annotation which is shared by annotators (i.e., the derived ground truth). By further learning temporal dynamics and incorporating a time-warping process, we are able to derive a clean version of the ground truth given multiple annotations, eliminating temporal discrepancies and other nuisances. The integration of the temporal alignment process within the proposed private-shared space model deems DPCCA suitable for the problem of temporally aligning human behaviour; that is, given temporally unsynchronised sequences (e.g., videos of two persons smiling), the goal is to generate the temporally synchronised sequences (e.g., the smile apex should co-occur in the videos). Temporal alignment is an important problem for many applications where multiple datasets need to be aligned in time. Furthermore, it is particularly suitable for the analysis of facial expressions, where the activation of facial muscles (Action Units) typically follows a set of predefined temporal phases. A highly challenging scenario is when the observations are perturbed by gross, non-Gaussian noise (e.g., occlusions), as is often the case when analysing data acquired under real-world conditions. To account for non-Gaussian noise, a robust variant of Canonical Correlation Analysis (RCCA) for robust fusion and temporal alignment is proposed. The model captures the shared, low-rank subspace of the observations, isolating the gross noise in a sparse noise term. RCCA is amongst the first robust variants of CCA proposed in literature, and as we show in related experiments outperforms other, state-of-the-art methods for related tasks such as the fusion of multiple modalities under gross noise. Beyond private-shared space models, Component Analysis (CA) is an integral component of most computer vision systems, particularly in terms of reducing the usually high-dimensional input spaces in a meaningful manner pertaining to the task-at-hand (e.g., prediction, clustering). A final, significant contribution of this thesis lies in proposing the first unifying framework for probabilistic component analysis. The proposed framework covers most well-known CA methods, such as Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Locality Preserving Projections (LPP) and Slow Feature Analysis (SFA), providing further theoretical insights into the workings of CA. Moreover, the proposed framework is highly flexible, enabling novel CA methods to be generated by simply manipulating the connectivity of latent variables (i.e. the latent neighbourhood). As shown experimentally, methods derived via the proposed framework outperform other equivalents in several problems related to affect sensing and facial expression analysis, while providing advantages such as reduced complexity and explicit variance modelling.Open Acces
    • …
    corecore