13,423 research outputs found

    Automatic Recognition of Emotions and Membership in Group Videos

    Automatic affect analysis and understanding has become a well-established research area over the last two decades. However, little attention has been paid to the analysis of affect expressed in group settings, either in the form of affect expressed by the whole group collectively or affect expressed by each individual member of the group. This paper presents a framework which, in group settings, automatically classifies the affect expressed by each individual group member along both the arousal and valence dimensions. We first introduce a novel Volume Quantised Local Zernike Moments Fisher Vectors (vQLZM-FV) descriptor to represent the facial behaviours of individuals in the spatio-temporal domain, and then propose a method to recognise the group membership of each individual (i.e., which group the individual in question is part of) using their face and body behavioural cues. We conduct a set of experiments on a newly collected dataset that contains fourteen recordings of four groups, each consisting of four people watching affective movie stimuli. Our experimental results show that (1) the proposed vQLZM-FV outperforms the other feature representations in affect recognition, and (2) group membership can be recognised using non-verbal face and body features, indicating that individuals influence each other's behaviours within a group setting.
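    The vQLZM-FV descriptor aggregates local facial features with Fisher Vector encoding. As a rough illustration of that aggregation step only (not the authors' implementation; descriptor dimensions and data below are placeholders), the following sketch encodes a set of local descriptors with a GMM-based Fisher Vector using scikit-learn:

```python
# Sketch: generic Fisher Vector encoding of local descriptors (not the authors' vQLZM code).
# Assumes local descriptors (e.g. local Zernike-moment features) are extracted elsewhere.
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_gmm(train_descriptors, n_components=16, seed=0):
    """Fit a diagonal-covariance GMM on pooled local descriptors."""
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="diag", random_state=seed)
    gmm.fit(train_descriptors)
    return gmm

def fisher_vector(descriptors, gmm):
    """Encode a set of local descriptors (N x D) as one Fisher Vector."""
    N, D = descriptors.shape
    q = gmm.predict_proba(descriptors)                        # soft assignments, N x K
    mu, sigma2, w = gmm.means_, gmm.covariances_, gmm.weights_
    diff = (descriptors[:, None, :] - mu) / np.sqrt(sigma2)   # N x K x D
    # Gradients w.r.t. component means and variances, weight-normalised.
    g_mu = (q[:, :, None] * diff).sum(axis=0) / (N * np.sqrt(w)[:, None])
    g_sig = (q[:, :, None] * (diff ** 2 - 1)).sum(axis=0) / (N * np.sqrt(2 * w)[:, None])
    fv = np.hstack([g_mu.ravel(), g_sig.ravel()])
    # Power and L2 normalisation, as is common for FV representations.
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    return fv / (np.linalg.norm(fv) + 1e-12)

# Illustrative usage with random placeholder descriptors.
rng = np.random.default_rng(0)
gmm = fit_gmm(rng.normal(size=(1000, 16)))
print(fisher_vector(rng.normal(size=(200, 16)), gmm).shape)   # (2 * 16 * 16,)
```

    Per-clip Fisher Vectors of this kind are then typically fed to a linear classifier or regressor for arousal/valence prediction.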

    Affect Analysis and Membership Recognition in Group Settings

    PhD Thesis. Emotions play an important role in our day-to-day lives in various ways, including, but not limited to, how we humans communicate and behave. Machines can interact with humans more naturally and intelligently if they are able to recognise and understand humans' emotions and express their own. To achieve this goal, over the past two decades researchers have paid a great deal of attention to the analysis of affective states, which has been studied extensively across various fields such as neuroscience, psychology, cognitive science, and computer science. Most existing works focus on affect analysis in individual settings, where there is one person in an image or a video. In the real world, however, people are very often with others or interact in group settings. In this thesis, we focus on affect analysis in group settings. Affect analysis in group settings differs from that in individual settings and poses more challenges, due to the dynamic interactions between group members, the various occlusions among people in the scene, and the complex context, e.g., who people are with, where they are, and the mutual influences among people in the group. Because of these challenges, a number of open issues still need further investigation in order to advance the state of the art and explore methodologies for affect analysis in group settings. These open topics include, but are not limited to: (1) is it possible to transfer the methods used for the affect recognition of a person in individual settings to the affect recognition of each individual in group settings? (2) is it possible to recognise the affect of one individual using the expressed behaviours of another member of the same group (i.e., cross-subject affect recognition)? (3) can non-verbal behaviours be used for the recognition of contextual information in group settings? In this thesis, we investigate affect analysis in group settings and propose methods to explore the aforementioned research questions step by step. Firstly, we propose a method for individual affect recognition in both individual and group videos, which is also used for social context prediction, i.e., whether a person is alone or within a group. Secondly, we introduce a novel framework for cross-subject affect analysis in group videos. Specifically, we analyse the correlation of affect among group members and investigate the automatic recognition of the affect of one subject using the behaviours expressed by another subject in the same group or in a different group. Furthermore, we propose methods for contextual information prediction in group settings, i.e., group membership recognition: recognising which group a person belongs to. Comprehensive experiments are conducted using two datasets, one containing individual videos and one containing group videos. The experimental results show that (1) the methods used for affect recognition of a person in individual settings can be transferred to group settings; (2) the affect of one subject in a group can be better predicted using the expressive behaviours of another subject within the same group than using those of a subject from a different group; and (3) contextual information (i.e., whether a person is alone or within a group, and group membership) can be predicted successfully using non-verbal behaviours.
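    The thesis frames group membership recognition as a supervised classification problem over non-verbal face and body features. A minimal, hedged sketch of that formulation (with synthetic placeholder features and a generic linear SVM, not the thesis's actual descriptors or classifier) might look as follows:

```python
# Sketch: group-membership recognition as multi-class classification over
# non-verbal behavioural feature vectors. Feature extraction is assumed to
# happen elsewhere; shapes and data below are illustrative placeholders.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(160, 64))          # one behavioural descriptor per clip
y = np.repeat(np.arange(4), 40)         # which of 4 groups the person was in

clf = make_pipeline(StandardScaler(), LinearSVC(C=1.0, max_iter=5000))
scores = cross_val_score(clf, X, y, cv=5)
print("membership accuracy: %.2f +/- %.2f" % (scores.mean(), scores.std()))
```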

    Looking Beyond a Clever Narrative: Visual Context and Attention are Primary Drivers of Affect in Video Advertisements

    Emotion evoked by an advertisement plays a key role in influencing brand recall and eventual consumer choices, and automatic ad affect recognition has several useful applications. However, content-based feature representations give no insight into how affect is modulated by aspects such as the ad's scene setting, salient object attributes, and their interactions; nor do such approaches inform us about how humans prioritize visual information for ad understanding. Our work addresses these lacunae by decomposing video content into detected objects, coarse scene structure, object statistics, and actively attended objects identified via eye gaze. We measure the importance of each of these information channels by systematically incorporating the related information into ad affect prediction models. Contrary to the popular notion that ad affect hinges on the narrative and the clever use of linguistic and social cues, we find that actively attended objects and the coarse scene structure encode affective information better than individual scene objects or conspicuous background elements.
    Comment: Accepted for publication in the Proceedings of the 20th ACM International Conference on Multimodal Interaction, Boulder, CO, US.
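    The channel-importance analysis described above amounts to training the same affect predictor on different feature channels and comparing their predictive power. A rough, hedged sketch of that comparison (with synthetic placeholder features and ratings, not the paper's data, features, or models) is:

```python
# Sketch: comparing information channels (scene structure, object statistics,
# gaze-attended objects) by fitting the same affect regressor on each channel.
# Channel names and synthetic data are illustrative, not the paper's features.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_ads = 120
channels = {
    "scene_structure": rng.normal(size=(n_ads, 32)),
    "object_stats":    rng.normal(size=(n_ads, 20)),
    "gazed_objects":   rng.normal(size=(n_ads, 40)),
}
valence = rng.normal(size=n_ads)        # per-ad affect rating (placeholder)

for name, X in channels.items():
    r2 = cross_val_score(SVR(kernel="rbf"), X, valence, cv=5, scoring="r2").mean()
    print(f"{name:15s}  mean R^2 = {r2:.3f}")
```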

    High-Level Concepts for Affective Understanding of Images

    This paper aims to bridge the affective gap between image content and the emotional response it elicits in the viewer by using High-Level Concepts (HLCs). In contrast to previous work that relied solely on low-level features or used convolutional neural networks (CNNs) as black boxes, we use HLCs generated by pretrained CNNs in an explicit way to investigate the relations/associations between these HLCs and a (small) set of Ekman's emotion classes. As a proof of concept, we first propose a linear admixture model for these relations, and the resulting computational framework allows us to determine the associations between each emotion class and certain HLCs (objects and places). This linear model is further extended to a nonlinear model using support vector regression (SVR) that aims to predict the viewer's emotional response using both low-level image features and HLCs extracted from images. These class-specific regressors are then assembled into a regressor ensemble that provides a flexible and effective predictor of viewers' emotional responses from images. Experimental results demonstrate that our results are comparable to those of existing methods, while providing a clear view of the association between HLCs and emotion classes that is ostensibly missing in most existing work.
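    The class-specific regression step can be pictured as one SVR per emotion class trained on concatenated low-level and HLC features. The sketch below is only a hedged illustration with synthetic placeholders, not the paper's actual features, kernels, or training protocol:

```python
# Sketch: class-specific SVR regressors over low-level features plus
# high-level concept (HLC) scores, assembled into a simple ensemble.
# All data and labels here are synthetic placeholders.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n_img = 200
low_level = rng.normal(size=(n_img, 50))       # e.g. colour/texture features
hlc_scores = rng.random(size=(n_img, 30))      # e.g. CNN object/place probabilities
X = np.hstack([low_level, hlc_scores])

emotions = ["happiness", "sadness", "fear", "anger", "surprise", "disgust"]
Y = rng.random(size=(n_img, len(emotions)))    # per-image response per emotion

# One regressor per Ekman class, all sharing the same feature representation.
regressors = {e: SVR(kernel="rbf").fit(X, Y[:, i]) for i, e in enumerate(emotions)}

def predict_response(x):
    """Return the predicted intensity for each emotion class for one image."""
    return {e: float(r.predict(x[None, :])[0]) for e, r in regressors.items()}

print(predict_response(X[0]))
```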

    Personalized face and gesture analysis using hierarchical neural networks

    The video-based computational analysis of human face and gesture signals encompasses a myriad of challenging research problems involving computer vision, machine learning, and human-computer interaction. In this thesis, we focus on the following challenges: a) the classification of hand and body gestures, along with the temporal localization of their occurrence in a continuous stream; b) the recognition of facial expressivity levels in people with Parkinson's disease using multimodal feature representations; c) the prediction of student learning outcomes in intelligent tutoring systems using affect signals; and d) the personalization of machine learning models, which can adapt to subject- and group-specific nuances in facial and gestural behavior. Specifically, we first conduct a quantitative comparison of two approaches to the problem of segmenting and classifying gestures on two benchmark gesture datasets: a method that simultaneously segments and classifies gestures versus a cascaded method that performs the tasks sequentially. Second, we introduce a framework that computationally predicts an accurate score for facial expressivity and validate it on a dataset of interview videos of people with Parkinson's disease. Third, based on a unique dataset of videos of students interacting with MathSpring, an intelligent tutoring system, collected by our collaborative research team, we build models to predict learning outcomes from the students' facial affect signals. Finally, we propose a novel solution to a relatively unexplored area in automatic face and gesture analysis research: the personalization of models to individuals and groups. We develop hierarchical Bayesian neural networks to overcome the challenges posed by group- or subject-specific variations in face and gesture signals. We successfully validate our formulation on the problems of personalized subject-specific gesture classification, context-specific facial expressivity recognition, and student-specific learning outcome prediction, and we demonstrate the flexibility of our hierarchical framework by validating the utility of both fully connected and recurrent neural architectures.
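    One simple way to picture the shared versus subject-specific split in such personalized models is a common backbone with one lightweight head per subject. The PyTorch sketch below is a deliberate, hedged non-Bayesian simplification, not the thesis's hierarchical Bayesian formulation, and all dimensions are placeholders:

```python
# Sketch (PyTorch): shared backbone with per-subject output heads, a
# simplified stand-in for hierarchical personalisation of face/gesture models.
import torch
import torch.nn as nn

class PersonalizedNet(nn.Module):
    def __init__(self, in_dim, hidden, n_classes, n_subjects):
        super().__init__()
        # Parameters shared across all subjects.
        self.backbone = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        # One lightweight, subject-specific head per subject.
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, n_classes) for _ in range(n_subjects)]
        )

    def forward(self, x, subject_id):
        h = self.backbone(x)
        return self.heads[subject_id](h)

model = PersonalizedNet(in_dim=64, hidden=32, n_classes=5, n_subjects=10)
x = torch.randn(8, 64)                    # a batch from one subject (placeholder)
logits = model(x, subject_id=3)
print(logits.shape)                       # torch.Size([8, 5])
```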

    Bridging the empathy gap: Effects of brief mindfulness training on helping outgroup members in need

    Witnessing others in need can be felt similarly to experiencing need oneself (empathy), and it motivates assistance of those in need (prosocial action). It is well documented that empathy can occur automatically, but when those in need are not members of a social ingroup, empathy and prosocial action are undermined. One major ingroup-outgroup division, in America and in other countries, is based on race. Although most people condemn racial discrimination, empathy and prosocial action are often lower, however unintentionally, in interracial contexts. In light of this empathy gap, it is important to identify psychological factors that could bolster empathy and prosocial action toward racial outgroup members in need. This dissertation asked whether mindfulness training (MT) - cultivating present-centered, receptive attention to one's ongoing experiences - increases social sensitivity toward racial outgroup members; it builds on pilot research indicating that a brief mindfulness induction increased empathy and prosocial action in such contexts. Healthy, self-identifying White women were randomized to either a brief (4-day) mindfulness training or a structurally equivalent sham mindfulness training (ST). Pre-post electroencephalographic (EEG) measures of empathy toward video stimuli of outgroup members expressing sadness were assessed via prefrontal alpha-frequency oscillations (i.e., frontal alpha asymmetry). Pre-post scenario-based spontaneous prosocial action toward Black individuals in need, and a pre-post 14-day ecological momentary assessment (EMA) of empathy and prosocial action toward Black individuals (and individuals of other races), were also conducted. Mindfulness training was expected to increase EEG- and EMA-based empathy toward Black individuals in need, as well as to increase prosocial action toward such individuals in scenario and daily-life (EMA) contexts. Contrary to what was hypothesized, MT reduced post-intervention empathic simulation, relative to ST, as measured by frontal alpha asymmetry. Consistent with the hypotheses, however, MT increased empathic concern for outgroup members expressing sadness during video observation, and increased post-intervention scenario-based prosocial action. The hypothesis that MT would predict increases in pre- to post-intervention daily EMA-based prosocial action was not supported. Providing somewhat convergent evidence, trait mindfulness predicted more frequent pre-intervention scenario-based and daily prosocial action toward outgroup members; trait mindfulness was not related to pre-intervention video-based EEG and self-reported empathy outcomes. Together these results suggest that mindfulness can enhance some indicators of empathy and prosocial behavior in interracial contexts. Mechanisms and implications of the findings are discussed.
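    Frontal alpha asymmetry is conventionally computed as the difference in log alpha-band power between homologous right and left frontal electrodes. The sketch below is only a generic illustration of that computation; channel names, sampling rate, and data are placeholders, not the dissertation's recordings or preprocessing pipeline:

```python
# Sketch: frontal alpha asymmetry from two EEG channels (e.g. F3/F4),
# computed as ln(right alpha power) - ln(left alpha power) via Welch's method.
import numpy as np
from scipy.signal import welch

def alpha_power(signal, fs, band=(8.0, 13.0)):
    """Mean power spectral density of one channel within the alpha band."""
    freqs, psd = welch(signal, fs=fs, nperseg=2 * int(fs))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return psd[mask].mean()

def frontal_alpha_asymmetry(left, right, fs):
    """Higher values index relatively greater left-frontal activity."""
    return np.log(alpha_power(right, fs)) - np.log(alpha_power(left, fs))

fs = 256
t = np.arange(0, 60, 1 / fs)
left = np.random.randn(t.size)            # placeholder for channel F3
right = np.random.randn(t.size)           # placeholder for channel F4
print(frontal_alpha_asymmetry(left, right, fs))
```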

    Your fellows matter: Affect analysis across subjects in group videos

    Automatic affect analysis has become a well-established research area in the last two decades, and recent works have started moving from individual to group scenarios. However, little attention has been paid to investigating how individuals in a group influence each other's affective states. In this paper, we propose a novel framework for cross-subject affect analysis in group videos. Specifically, we analyse the correlation of affect among group members and investigate the automatic recognition of the affect of one subject using the behaviours expressed by another subject in the same group. A set of experiments is conducted using a recently collected database aimed at affect analysis in group settings. Our results show that (1) people in the same group share more information, in terms of behaviours and emotions, than people in different groups; and (2) the affect of one subject in a group can be better predicted using the expressive behaviours of another subject within the same group than using those of a subject from a different group. This work is of great importance for affect recognition in group settings: when the information of one subject is unavailable due to occlusion, head/body pose, etc., we can predict his/her affect by employing the expressive behaviours of the other subject(s).
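    The cross-subject setup can be pictured as training a regressor that maps one subject's behavioural features to another subject's affect labels, then comparing same-group against different-group pairings. The sketch below is only a hedged illustration of that comparison with synthetic placeholder arrays, not the paper's features, labels, or models:

```python
# Sketch: cross-subject affect prediction, contrasting same-group and
# different-group partners. All arrays are synthetic placeholders.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
frames = 500
feat_subject_a = rng.normal(size=(frames, 40))       # behaviour of subject A
valence_same_group = rng.normal(size=frames)         # affect of a same-group partner
valence_other_group = rng.normal(size=frames)        # affect of an other-group subject

for name, y in [("same group", valence_same_group),
                ("different group", valence_other_group)]:
    score = cross_val_score(SVR(), feat_subject_a, y, cv=5, scoring="r2").mean()
    print(f"predicting {name}: mean R^2 = {score:.3f}")
```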

    Utilising semantic technologies for intelligent indexing and retrieval of digital images

    The proliferation of digital media has led to huge interest in classifying and indexing media objects for generic search and usage. In particular, we are witnessing colossal growth in digital image repositories that are difficult to navigate using free-text search mechanisms, which often return inaccurate matches as they rely, in principle, on statistical analysis of query keyword recurrence in the image annotation or surrounding text. In this paper we present a semantically enabled image annotation and retrieval engine designed to satisfy the requirements of the commercial image-collections market in terms of both the accuracy and the efficiency of the retrieval process. Our search engine relies on methodically structured ontologies for image annotation, thus allowing more intelligent reasoning about image content and subsequently obtaining a more accurate set of results and a richer set of alternatives matching the original query. We also show how our well-analysed and well-designed domain ontology contributes to the implicit expansion of user queries, as well as the exploitation of lexical databases for explicit semantic-based query expansion.
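    Ontology-driven query expansion generally means enriching the user's keywords with related concepts (synonyms, narrower terms) drawn from the ontology before matching against image annotations. The sketch below is only a toy, hedged illustration of that idea; a real system such as the one described would query an OWL/RDF ontology (e.g. via SPARQL) rather than the in-memory dictionary used here:

```python
# Sketch: toy ontology-based query expansion for image retrieval.
# The dictionary stands in for a proper domain ontology; all terms are illustrative.
toy_ontology = {
    "vehicle": {"synonyms": ["automobile"], "narrower": ["car", "bus", "truck"]},
    "car":     {"synonyms": ["motorcar"],   "narrower": ["hatchback", "sedan"]},
}

def expand_query(terms, ontology, depth=1):
    """Return the query terms plus their synonyms and narrower concepts."""
    expanded = set(terms)
    frontier = list(terms)
    for _ in range(depth):
        next_frontier = []
        for term in frontier:
            entry = ontology.get(term, {})
            for related in entry.get("synonyms", []) + entry.get("narrower", []):
                if related not in expanded:
                    expanded.add(related)
                    next_frontier.append(related)
        frontier = next_frontier
    return sorted(expanded)

print(expand_query(["vehicle"], toy_ontology, depth=2))
```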