1,662 research outputs found

    Recognising Complex Mental States from Naturalistic Human-Computer Interactions

    Get PDF
    New advances in computer vision techniques will revolutionize the way we interact with computers, as they, together with other improvements, will help us build machines that understand us better. The face is the main non-verbal channel for human-human communication and contains valuable information about emotion, mood, and mental state. Affective computing researchers have investigated widely how facial expressions can be used for automatically recognizing affect and mental states. Nowadays, physiological signals can be measured by video-based techniques, which can also be utilised for emotion detection. Physiological signals, are an important indicator of internal feelings, and are more robust against social masking. This thesis focuses on computer vision techniques to detect facial expression and physiological changes for recognizing non-basic and natural emotions during human-computer interaction. It covers all stages of the research process from data acquisition, integration and application. Most previous studies focused on acquiring data from prototypic basic emotions acted out under laboratory conditions. To evaluate the proposed method under more practical conditions, two different scenarios were used for data collection. In the first scenario, a set of controlled stimulus was used to trigger the user’s emotion. The second scenario aimed at capturing more naturalistic emotions that might occur during a writing activity. In the second scenario, the engagement level of the participants with other affective states was the target of the system. For the first time this thesis explores how video-based physiological measures can be used in affect detection. Video-based measuring of physiological signals is a new technique that needs more improvement to be used in practical applications. A machine learning approach is proposed and evaluated to improve the accuracy of heart rate (HR) measurement using an ordinary camera during a naturalistic interaction with computer

    Recognising Complex Mental States from Naturalistic Human-Computer Interactions

    Get PDF
    New advances in computer vision techniques will revolutionize the way we interact with computers, as they, together with other improvements, will help us build machines that understand us better. The face is the main non-verbal channel for human-human communication and contains valuable information about emotion, mood, and mental state. Affective computing researchers have investigated widely how facial expressions can be used for automatically recognizing affect and mental states. Nowadays, physiological signals can be measured by video-based techniques, which can also be utilised for emotion detection. Physiological signals, are an important indicator of internal feelings, and are more robust against social masking. This thesis focuses on computer vision techniques to detect facial expression and physiological changes for recognizing non-basic and natural emotions during human-computer interaction. It covers all stages of the research process from data acquisition, integration and application. Most previous studies focused on acquiring data from prototypic basic emotions acted out under laboratory conditions. To evaluate the proposed method under more practical conditions, two different scenarios were used for data collection. In the first scenario, a set of controlled stimulus was used to trigger the user’s emotion. The second scenario aimed at capturing more naturalistic emotions that might occur during a writing activity. In the second scenario, the engagement level of the participants with other affective states was the target of the system. For the first time this thesis explores how video-based physiological measures can be used in affect detection. Video-based measuring of physiological signals is a new technique that needs more improvement to be used in practical applications. A machine learning approach is proposed and evaluated to improve the accuracy of heart rate (HR) measurement using an ordinary camera during a naturalistic interaction with computer

    Explainable automated recognition of emotional states from canine facial expressions: the case of positive anticipation and frustration.

    Get PDF
    In animal research, automation of affective states recognition has so far mainly addressed pain in a few species. Emotional states remain uncharted territories, especially in dogs, due to the complexity of their facial morphology and expressions. This study contributes to fill this gap in two aspects. First, it is the first to address dog emotional states using a dataset obtained in a controlled experimental setting, including videos from (n = 29) Labrador Retrievers assumed to be in two experimentally induced emotional states: negative (frustration) and positive (anticipation). The dogs' facial expressions were measured using the Dogs Facial Action Coding System (DogFACS). Two different approaches are compared in relation to our aim: (1) a DogFACS-based approach with a two-step pipeline consisting of (i) a DogFACS variable detector and (ii) a positive/negative state Decision Tree classifier; (2) An approach using deep learning techniques with no intermediate representation. The approaches reach accuracy of above 71% and 89%, respectively, with the deep learning approach performing better. Secondly, this study is also the first to study explainability of AI models in the context of emotion in animals. The DogFACS-based approach provides decision trees, that is a mathematical representation which reflects previous findings by human experts in relation to certain facial expressions (DogFACS variables) being correlates of specific emotional states. The deep learning approach offers a different, visual form of explainability in the form of heatmaps reflecting regions of focus of the network's attention, which in some cases show focus clearly related to the nature of particular DogFACS variables. These heatmaps may hold the key to novel insights on the sensitivity of the network to nuanced pixel patterns reflecting information invisible to the human eye

    Affective Computing

    Get PDF
    This book provides an overview of state of the art research in Affective Computing. It presents new ideas, original results and practical experiences in this increasingly important research field. The book consists of 23 chapters categorized into four sections. Since one of the most important means of human communication is facial expression, the first section of this book (Chapters 1 to 7) presents a research on synthesis and recognition of facial expressions. Given that we not only use the face but also body movements to express ourselves, in the second section (Chapters 8 to 11) we present a research on perception and generation of emotional expressions by using full-body motions. The third section of the book (Chapters 12 to 16) presents computational models on emotion, as well as findings from neuroscience research. In the last section of the book (Chapters 17 to 22) we present applications related to affective computing

    Evaluating child engagement in digital story stems using facial data

    Get PDF
    Engagement is a key factor in understanding people’s psychology and behaviours and is an understudied topic in children. The area of focus in this thesis is child engagement in the story-stems used in child Attachment evaluations such as the Manchester Child Attachment Task (MCAST). Due to the high cost and time required for conducting Attachment assessments, automated assessments are being developed. These present story-stems in a cost-effective way on a laptop screen to digitalise the interaction between the child and the story, without disrupting the storytelling. However, providing such tests via computer relies on the child being engaged in the digital story-stem. If they are not engaged, then the tests will not be successful and the collected data will be of poor-quality, which will not allow for successful detection of Attachment status. Therefore, the aim of this research is to investigate a range of aspects of child engagement to understand how to engage children in story-stems, and how to measure their engagement levels. This thesis focuses on measuring the levels of child engagement in digital story-stems and specifically on understanding the effect of multimedia digital story-stems on children’s engagement levels to create a better and more engaging digital story-stem. Data sources used in this thesis include the observation of each child’s facial behaviours and a questionnaire with Smiley-o-meter scale. Measurement tools are developed and validated through analyses of facial data from children when watching digital story-stems with different presentation and voice types. Results showed that facial data analysis, using eye-tracking measures and facial action units (AUs) recognition, can be used to measure children’s engagement levels in the context of viewing digital story-stems. Using eye-tracking measures, engaged children have longer fixation durations in both mean and sum of fixation durations, which reflect that a child was deeply engaged in the story-stems. Facial AU recognition had better performance in a binary classification for discriminating engaged or disengaged children than eye-tracking measurements. The most frequently occurring facial action units taken from the engaged classes show that children’s facial action units indicated signs of fear, which suggest that children felt anxiety and distress while watching the story-stems. These feeling of anxiety and distress show that children have a strong emotional engagement and can locate themselves in the story-stems, showing that they were strong engaged. A further contribution in this thesis was to investigate the best way of creating an engaging story-stem. Results showed that an animated video narrated by a female expressive voice was most engaging. Compared to the live-action MCAST video, data showed that children were more engaged in the animated videos. Voice gender and voice expressiveness were two factors of the quality of storytelling voice that were evaluated and both affected children’s engagement levels. The distribution of child engagement across different voice types was compared to find the best storytelling voice type for story-stem design. A female expressive voice had a better performance for displaying the ‘distress’ in the story-stem than other voice types and engaged children more in the story-stems. The quality of the storytelling voice used to narrate story-stems and animated videos both significantly affected children’s levels of engagement. Such digital story-stems make children more engaged in the digital MCAST test

    Facial expression recognition in the wild : from individual to group

    Get PDF
    The progress in computing technology has increased the demand for smart systems capable of understanding human affect and emotional manifestations. One of the crucial factors in designing systems equipped with such intelligence is to have accurate automatic Facial Expression Recognition (FER) methods. In computer vision, automatic facial expression analysis is an active field of research for over two decades now. However, there are still a lot of questions unanswered. The research presented in this thesis attempts to address some of the key issues of FER in challenging conditions mentioned as follows: 1) creating a facial expressions database representing real-world conditions; 2) devising Head Pose Normalisation (HPN) methods which are independent of facial parts location; 3) creating automatic methods for the analysis of mood of group of people. The central hypothesis of the thesis is that extracting close to real-world data from movies and performing facial expression analysis on movies is a stepping stone in the direction of moving the analysis of faces towards real-world, unconstrained condition. A temporal facial expressions database, Acted Facial Expressions in the Wild (AFEW) is proposed. The database is constructed and labelled using a semi-automatic process based on closed caption subtitle based keyword search. Currently, AFEW is the largest facial expressions database representing challenging conditions available to the research community. For providing a common platform to researchers in order to evaluate and extend their state-of-the-art FER methods, the first Emotion Recognition in the Wild (EmotiW) challenge based on AFEW is proposed. An image-only based facial expressions database Static Facial Expressions In The Wild (SFEW) extracted from AFEW is proposed. Furthermore, the thesis focuses on HPN for real-world images. Earlier methods were based on fiducial points. However, as fiducial points detection is an open problem for real-world images, HPN can be error-prone. A HPN method based on response maps generated from part-detectors is proposed. The proposed shape-constrained method does not require fiducial points and head pose information, which makes it suitable for real-world images. Data from movies and the internet, representing real-world conditions poses another major challenge of the presence of multiple subjects to the research community. This defines another focus of this thesis where a novel approach for modeling the perception of mood of a group of people in an image is presented. A new database is constructed from Flickr based on keywords related to social events. Three models are proposed: averaging based Group Expression Model (GEM), Weighted Group Expression Model (GEM_w) and Augmented Group Expression Model (GEM_LDA). GEM_w is based on social contextual attributes, which are used as weights on each person's contribution towards the overall group's mood. Further, GEM_LDA is based on topic model and feature augmentation. The proposed framework is applied to applications of group candid shot selection and event summarisation. The application of Structural SIMilarity (SSIM) index metric is explored for finding similar facial expressions. The proposed framework is applied to the problem of creating image albums based on facial expressions, finding corresponding expressions for training facial performance transfer algorithms
    • …
    corecore