
    Keeping track of emotions: audiovisual integration for emotion recognition and compensation for sensory degradations captured by perceptual strategies

    The majority of emotional expressions are multimodal and dynamic in nature. Emotion recognition, therefore, requires integration of these multimodal signals. Sensory impairments, which are common in older adults, likely affect emotion recognition, yet it is unknown how they do so. As more people reach old age, accompanied by an increase in the prevalence of sensory impairments, it is urgent to comprehensively understand audiovisual integration, especially in older individuals. My thesis sought to create a basic understanding of audiovisual integration for emotion recognition and to study how audiovisual interactions change with simulated sensory impairments. A secondary aim was to understand how age affects these outcomes. To systematically address these aims, I examined how well observers recognize emotions presented via videos, and how emotion recognition accuracy and perceptual strategies, assessed via eye-tracking, vary under changing availability and reliability of the visual and auditory information. The research presented in my thesis shows that audiovisual integration and compensation abilities remain intact with age, despite a general decline in recognition accuracy. Compensation for degraded audio is possible by relying more on visual signals, but not vice versa. Older observers adapt their perceptual strategies in a different, perhaps less efficient, manner than younger observers. Importantly, I demonstrate that it is crucial to use additional measurements besides recognition accuracy in order to understand audiovisual integration mechanisms. Measurements such as eye-tracking make it possible to examine whether the reliance on visual and auditory signals alters with age and (simulated) sensory impairments, even in the absence of a change in accuracy.
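    The perceptual strategies mentioned above are commonly quantified from eye-tracking data as the share of looking time spent in predefined areas of interest (AOIs). The sketch below illustrates that kind of AOI analysis with made-up conditions, AOIs, and column names; it is a generic example, not the thesis's actual pipeline.

```python
# Illustrative AOI analysis for eye-tracking data (hypothetical column names,
# not the thesis's actual pipeline): share of fixation time spent on the face
# region versus elsewhere, per audiovisual condition.
import pandas as pd

# Each row is one fixation: the trial condition it came from, the area of
# interest (AOI) it landed in, and how long it lasted (in milliseconds).
fixations = pd.DataFrame({
    "condition": ["AV", "AV", "AV-degraded-audio", "AV-degraded-audio"],
    "aoi":       ["face", "body", "face", "body"],
    "duration_ms": [420, 180, 610, 90],
})

# Total fixation time per condition and AOI, then normalise within each
# condition so the values read as "share of looking time".
totals = fixations.groupby(["condition", "aoi"])["duration_ms"].sum()
proportions = totals / totals.groupby(level="condition").transform("sum")
print(proportions)
# A larger share of looking time on the face when audio is degraded would be
# consistent with increased reliance on the visual signal.
```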

    Arousal and Valence Prediction in Spontaneous Emotional Speech: Felt versus Perceived Emotion

    In this paper, we describe emotion recognition experiments carried out for spontaneous affective speech with the aim of comparing the added value of annotation of felt emotion versus annotation of perceived emotion. Using speech material available in the TNO-GAMING corpus (a corpus containing audiovisual recordings of people playing videogames), speech-based affect recognizers were developed that can predict Arousal and Valence scalar values. Two types of recognizers were developed in parallel: one trained with felt emotion annotations (generated by the gamers themselves) and one trained with perceived/observed emotion annotations (generated by a group of observers). The experiments showed that, in speech, with the methods and features currently used, observed emotions are easier to predict than felt emotions. The results suggest that recognition performance strongly depends on how and by whom the emotion annotations are carried out.
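    To make the setup concrete, the sketch below shows the general shape of such a speech-based arousal predictor: a regressor trained on per-utterance acoustic feature vectors against scalar annotations. The features, model choice, and data are placeholders, not the recognizers or the TNO-GAMING material used in the paper.

```python
# Minimal sketch of a speech-based arousal regressor, assuming precomputed
# acoustic feature vectors (e.g. prosodic/spectral descriptors) and one scalar
# label per utterance. Features, model, and data are placeholders.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))             # 200 utterances x 30 acoustic features
y_arousal = rng.uniform(-1, 1, size=200)   # scalar annotations (felt OR perceived)

X_train, X_test, y_train, y_test = train_test_split(X, y_arousal, random_state=0)

# One regressor per annotation type (felt vs. perceived) would be trained in
# the same way; comparing their held-out errors mirrors the paper's comparison.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
model.fit(X_train, y_train)
print("R^2 on held-out utterances:", model.score(X_test, y_test))
```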

    Audiovisual integration of emotional signals from others' social interactions

    Audiovisual perception of emotions has typically been examined using displays of a solitary character (e.g., the face-voice and/or body-sound of one actor). However, in real life humans often face more complex multisensory social situations, involving more than one person. Here we ask whether the audiovisual facilitation in emotion recognition previously found in simpler social situations extends to more complex and ecological situations. Stimuli consisting of the biological motion and voice of two interacting agents were used in two experiments. In Experiment 1, participants were presented with visual, auditory, auditory filtered/noisy, and audiovisual congruent and incongruent clips. We asked participants to judge whether the two agents were interacting happily or angrily. In Experiment 2, another group of participants repeated the same task, as in Experiment 1, while trying to ignore either the visual or the auditory information. The findings from both experiments indicate that when the reliability of the auditory cue was decreased, participants weighted the visual cue more heavily in their emotional judgments. This in turn translated into increased emotion recognition accuracy for the multisensory condition. Our findings thus point to a common mechanism of multisensory integration of emotional signals irrespective of social stimulus complexity.
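    The reliability-dependent weighting described here is often formalised as variance-weighted (maximum-likelihood) cue combination, in which each cue's weight is proportional to its inverse variance. The snippet below is a generic illustration of that principle, not the authors' analysis; the numbers are arbitrary.

```python
# Generic reliability-weighted (maximum-likelihood) cue combination: each cue's
# weight is proportional to its inverse variance, so degrading the auditory cue
# (raising its variance) automatically shifts weight onto the visual cue.
# This illustrates the principle, not the model used in the paper.
def combine_cues(visual_estimate, visual_var, auditory_estimate, auditory_var):
    w_visual = (1.0 / visual_var) / (1.0 / visual_var + 1.0 / auditory_var)
    w_auditory = 1.0 - w_visual
    return w_visual * visual_estimate + w_auditory * auditory_estimate

# Clear audio: both cues contribute equally here (equal variances).
print(combine_cues(0.8, 0.2, 0.2, 0.2))   # 0.5
# Noisy/filtered audio: the visual cue dominates the combined judgment.
print(combine_cues(0.8, 0.2, 0.2, 2.0))   # ~0.745
```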

    Emotion recognition ability in English among L1 and LX users of English

    This study focuses on individual differences in emotion recognition ability among 356 first language (L1) and 564 foreign language (LX) users of English. Recognizing emotions can be particularly challenging in LX contexts. Depending on their linguistic profile, individuals may interpret input very differently, and LX learners and users have been found to perform significantly worse than native control groups (Rintell 1984) in tests of emotion recognition ability. In the present article, we investigate the effect of three independent variables, namely, L1 versus LX status, proficiency in English, and cultural background, on emotion recognition ability. We used an online survey in which participants had to identify the emotion portrayed by a native English-speaking actress in six audiovisual clips. Despite LX users having lower proficiency scores, English-L1 users' and LX users' emotion recognition ability scores were broadly similar. A significant positive relationship was found between LX proficiency and emotion recognition ability. A similar but only marginally significant relationship emerged among L1 users. A significant effect of L1 culture was found on emotion recognition ability scores, with Asian LX users scoring significantly lower than European LX users. It thus seems that audiovisual input allows advanced LX users to recognize emotions in the LX as well as L1 users do. That said, LX proficiency and L1 culture do have an effect on emotion recognition ability.

    AVEC 2019 workshop and challenge: state-of-mind, detecting depression with AI, and cross-cultural affect recognition

    The Audio/Visual Emotion Challenge and Workshop (AVEC 2019) "State-of-Mind, Detecting Depression with AI, and Cross-cultural Affect Recognition" is the ninth competition event aimed at the comparison of multimedia processing and machine learning methods for automatic audiovisual health and emotion analysis, with all participants competing strictly under the same conditions. The goal of the Challenge is to provide a common benchmark test set for multimodal information processing and to bring together the health and emotion recognition communities, as well as the audiovisual processing communities, to compare the relative merits of various approaches to health and emotion recognition from real-life data. This paper presents the major novelties introduced this year, the challenge guidelines, the data used, and the performance of the baseline systems on the three proposed tasks: state-of-mind recognition, depression assessment with AI, and cross-cultural affect sensing, respectively.
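    For the dimensional affect tasks in the AVEC series, predictions are commonly scored with the concordance correlation coefficient (CCC), which penalises both poor correlation and systematic offset or scale errors. The sketch below is a plain NumPy implementation for reference, not the official challenge scoring script.

```python
# Concordance correlation coefficient (CCC), a common score for value/time-
# continuous affect prediction. Plain NumPy sketch for reference only.
import numpy as np

def ccc(gold: np.ndarray, pred: np.ndarray) -> float:
    """CCC = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)."""
    gold_mean, pred_mean = gold.mean(), pred.mean()
    covariance = np.mean((gold - gold_mean) * (pred - pred_mean))
    return 2 * covariance / (gold.var() + pred.var() + (gold_mean - pred_mean) ** 2)

# Example: a prediction trace that follows the gold trace perfectly but with a
# constant offset is penalised by CCC even though Pearson correlation is 1.
gold = np.array([0.1, 0.4, 0.6, 0.3, 0.2])
pred = gold + 0.3
print(ccc(gold, pred))   # ~0.40, well below 1.0
```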

    AVEC 2011 – the first international Audio/Visual Emotion Challenge

    The Audio/Visual Emotion Challenge and Workshop (AVEC 2011) is the first competition event aimed at comparison of multimedia processing and machine learning methods for automatic audio, visual and audiovisual emotion analysis, with all participants competing under strictly the same conditions. This paper first describes the challenge participation conditions. Next follows the data used – the SEMAINE corpus – and its partitioning into train, development, and test partitions for the challenge with labelling in four dimensions, namely activity, expectation, power, and valence. Further, audio and video baseline features are introduced as well as baseline results that use these features for the three sub-challenges of audio, video, and audiovisual emotion recognition.
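    The partition structure described here (fit on train, tune on development, report on test, with one label per affective dimension) maps onto a simple per-dimension pipeline. The sketch below uses synthetic features, assumed binary labels, and a generic classifier purely for illustration; it is not the actual SEMAINE baseline system.

```python
# Placeholder sketch of a per-dimension baseline respecting the train /
# development / test partitioning described above. Features, labels, and the
# classifier are synthetic stand-ins, not the challenge's baseline system.
import numpy as np
from sklearn.linear_model import LogisticRegression

DIMENSIONS = ["activity", "expectation", "power", "valence"]

rng = np.random.default_rng(1)
partitions = {
    name: (
        rng.normal(size=(n, 20)),                              # placeholder audio/video features
        {d: rng.integers(0, 2, size=n) for d in DIMENSIONS},   # assumed binary labels per dimension
    )
    for name, n in [("train", 300), ("devel", 100), ("test", 100)]
}

for dim in DIMENSIONS:
    X_train, y_train = partitions["train"][0], partitions["train"][1][dim]
    X_test, y_test = partitions["test"][0], partitions["test"][1][dim]
    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    # In the real challenge the development partition is used for tuning and
    # the test labels are withheld; here everything is random noise.
    print(dim, "test accuracy:", round(clf.score(X_test, y_test), 2))
```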

    AVEC 2017--Real-life depression, and affect recognition workshop and challenge

    The Audio/Visual Emotion Challenge and Workshop (AVEC 2017) “Real-life depression, and affect” will be the seventh competition event aimed at comparison of multimedia processing and machine learning methods for automatic audiovisual depression and emotion analysis, with all participants competing under strictly the same conditions. The goal of the Challenge is to provide a common benchmark test set for multimodal information processing and to bring together the depression and emotion recognition communities, as well as the audiovisual processing communities, to compare the relative merits of the various approaches to depression and emotion recognition from real-life data. This paper presents the novelties introduced this year, the challenge guidelines, the data used, and the performance of the baseline system on the two proposed tasks: dimensional emotion recognition (time and value-continuous), and dimensional depression estimation (value-continuous).