
    Learning spontaneity to improve emotion recognition in speech

    We investigate the effect and usefulness of spontaneity (i.e., whether a given speech sample is spontaneous or not) in the context of emotion recognition from speech. We hypothesize that emotional content in speech is interrelated with its spontaneity, and use spontaneity classification as an auxiliary task to the problem of emotion recognition. We propose two supervised learning settings that utilize spontaneity to improve speech emotion recognition: a hierarchical model that performs spontaneity detection before performing emotion recognition, and a multitask learning model that jointly learns to recognize both spontaneity and emotion. Through various experiments on the well-known IEMOCAP database, we show that using spontaneity detection as an additional task yields significant improvement over emotion recognition systems that are unaware of spontaneity. We achieve state-of-the-art emotion recognition accuracy (4-class, 69.1%) on the IEMOCAP database, outperforming several relevant and competitive baselines.
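    The multitask setting described above can be pictured as a shared speech encoder feeding two classification heads, one for emotion and one for spontaneity, trained with a weighted sum of both losses. The sketch below (PyTorch) is only an illustration of that idea: the encoder architecture, feature dimensions, and loss weight are assumptions, not the authors' implementation.

```python
# Minimal multitask sketch (assumed architecture): a shared encoder with an
# emotion head (main task) and a spontaneity head (auxiliary task).
import torch
import torch.nn as nn

class MultitaskEmotionNet(nn.Module):
    def __init__(self, feat_dim=40, hidden_dim=128, num_emotions=4):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        self.emotion_head = nn.Linear(hidden_dim, num_emotions)  # main task
        self.spontaneity_head = nn.Linear(hidden_dim, 2)         # auxiliary task

    def forward(self, x):                 # x: (batch, time, feat_dim)
        _, h = self.encoder(x)
        h = h.squeeze(0)                  # final hidden state: (batch, hidden_dim)
        return self.emotion_head(h), self.spontaneity_head(h)

def multitask_loss(emo_logits, spon_logits, emo_y, spon_y, alpha=0.3):
    # alpha weights the auxiliary spontaneity loss (value assumed).
    ce = nn.CrossEntropyLoss()
    return ce(emo_logits, emo_y) + alpha * ce(spon_logits, spon_y)
```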

    Controlling for Confounders in Multimodal Emotion Classification via Adversarial Learning

    Various psychological factors affect how individuals express emotions. Yet, when we collect data intended for use in building emotion recognition systems, we often do so with paradigms designed solely to elicit emotional behavior. Algorithms trained on such data are unlikely to function outside of controlled environments because our emotions naturally change as a function of these other factors. In this work, we study how the multimodal expressions of emotion change when an individual is under varying levels of stress. We hypothesize that stress produces modulations that can hide the true underlying emotions of individuals and that we can make emotion recognition algorithms more generalizable by controlling for variations in stress. To this end, we use adversarial networks to decorrelate stress modulations from emotion representations. We study how stress alters acoustic and lexical emotional predictions, paying special attention to how modulations due to stress affect the transferability of learned emotion recognition models across domains. Our results show that stress is indeed encoded in trained emotion classifiers and that this encoding varies across levels of emotions and across the lexical and acoustic modalities. Our results also show that emotion recognition models that control for stress during training generalize better to new domains than models that do not. We conclude that it is necessary to consider the effect of extraneous psychological factors when building and testing emotion recognition models. Comment: 10 pages, ICMI 201
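    A common way to realize this kind of adversarial decorrelation is a gradient reversal layer between the shared representation and a stress discriminator. The sketch below uses that construction purely as an illustration; the paper does not publish this code, and the layer sizes, number of stress levels, and reversal strength are assumptions.

```python
# Adversarial decorrelation via gradient reversal (assumed construction): the
# stress head learns to predict stress, while the reversed gradient pushes the
# shared representation to become uninformative about stress.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class EmotionWithStressAdversary(nn.Module):
    def __init__(self, in_dim=768, hid=256, num_emotions=4, num_stress_levels=3):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hid), nn.ReLU())
        self.emotion_head = nn.Linear(hid, num_emotions)
        self.stress_head = nn.Linear(hid, num_stress_levels)  # adversary

    def forward(self, features, lambd=1.0):
        z = self.encoder(features)
        return self.emotion_head(z), self.stress_head(GradReverse.apply(z, lambd))
```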

    Affective games: a multimodal classification system

    Affective gaming is a relatively new field of research that exploits human emotions to influence gameplay for an enhanced player experience. Changes in a player’s psychology are reflected in their behaviour and physiology; hence, recognition of such variation is a core element in affective games. Complementary sources of affect offer more reliable recognition, especially in contexts where one modality is partial or unavailable. As multimodal recognition systems, affect-aware games are subject to the practical difficulties met by traditional trained classifiers. In addition, inherent game-related challenges in data collection and performance arise while attempting to sustain an acceptable level of immersion. Most existing scenarios employ sensors that offer limited freedom of movement, resulting in less realistic experiences. Recent advances now offer technology that allows players to communicate more freely and naturally with the game, and even to control it without the use of input devices. However, the affective game industry is still in its infancy and needs to catch up with the current life-like level of adaptation provided by graphics and animation.

    Habitual reflexivity and skilled action

    Theorists have used the concept of habitus to explain how skilled agents are capable of responding in an infinite number of ways to the infinite number of possible situations that they encounter in their field of practice. On some perspectives, habitus represents a form of regulated improvisation that functions below the threshold of consciousness. However, Bourdieu (1990) argued that rational and conscious computation may be required in situations of ‘crises’ where habitus proves insufficient as a basis for our actions. In the current paper, I draw on a range of evidence indicating that conscious intervention (including self-reflective sensory consciousness) is required not only at points of crisis but also as skilled performers engage in the mundane actions and practices that characterise their everyday training and performance regimes. The interaction of conscious learning and unconscious schemata leads to the development of a reflexive habitus, which allows performers to refine and adapt embodied movement patterns over time.

    Automatic Identification of Emotional Information in Spanish TV Debates and Human-Machine Interactions

    Automatic emotion detection is a very attractive field of research that can help build more natural human–machine interaction systems. However, several issues arise when real scenarios are considered, such as the tendency toward neutrality, which makes it difficult to obtain balanced datasets, or the lack of standards for the annotation of emotional categories. Moreover, the intrinsic subjectivity of emotional information increases the difficulty of obtaining valuable data to train machine learning-based algorithms. In this work, two different real scenarios were tackled: human–human interactions in TV debates and human–machine interactions with a virtual agent. For comparison purposes, an analysis of the emotional information was conducted in both scenarios, and a profiling of the speakers associated with each task was carried out. Furthermore, different classification experiments show that deep learning approaches can be useful for detecting speakers’ emotional information, mainly for arousal, valence, and dominance levels, reaching a 0.7 F1-score. The research presented in this paper was conducted as part of the AMIC and EMPATHIC projects, which received funding from the Spanish Ministry of Science under grants TIN2017-85854-C4-3-R and PDC2021-120846-C43 and from the European Union’s Horizon 2020 research and innovation program under grant agreement No. 769872. The first author also received a PhD scholarship from the University of the Basque Country UPV/EHU, PIF17/310.
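    For reference, an evaluation of the kind reported here (an F1-score for arousal, valence, and dominance) can be computed with scikit-learn as sketched below; the macro averaging and the assumption that each dimension has been discretized into class labels are illustrative choices, not details taken from the paper.

```python
# Hedged sketch: per-dimension F1 for arousal/valence/dominance predictions,
# assuming each dimension has already been discretized into class labels.
from sklearn.metrics import f1_score

def report_f1(y_true, y_pred):
    """y_true, y_pred: dicts mapping dimension name -> list of class labels."""
    for dim in ("arousal", "valence", "dominance"):
        score = f1_score(y_true[dim], y_pred[dim], average="macro")
        print(f"{dim}: macro-F1 = {score:.2f}")
```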

    The recognition of acted interpersonal stance in police interrogations and the influence of actor proficiency

    This paper reports on judgement studies regarding the perception of interpersonal stances taken by humans playing the role of a suspect in a police interrogation setting. Our project aims at building believable embodied conversational characters to play the role of suspects in a serious game for learning interrogation strategies. The main question we ask is: do human judges agree on the way they perceive the various aspects of stance taking, such as friendliness and dominance? Four types of stances were acted by eight amateur actors. Short recordings were shown in an online survey to subjects, who were asked to describe them using a selection of adjectives. Results of this annotation task are reported in this paper. We explain how we computed inter-rater agreement with Krippendorff’s alpha statistic using a set-theoretical distance metric. Results show that observers agreed more on some stance types than on others. Some actors are better than others, but validity (recognizing the intended stance) and inter-rater agreement do not always go hand in hand. We further investigate the effect that actor expertise has on the perception of the acted stance, comparing fragments from amateur actors with fragments from professional actors taken from popular TV shows.
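    The agreement computation mentioned above, Krippendorff's alpha with a set-theoretical distance over the adjective sets chosen by each observer, can be sketched as follows. The Jaccard distance used here is an illustrative choice of set metric, and the pairwise formulation is a simplified re-implementation, not the authors' code.

```python
# Hedged sketch of Krippendorff's alpha where each annotation is a *set* of
# adjectives and disagreement is measured by a set distance (Jaccard here).
from itertools import combinations

def jaccard_distance(a, b):
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)

def krippendorff_alpha(units, dist=jaccard_distance):
    """units: list of lists; each inner list holds the annotations (adjective
    sets) that the raters assigned to one recording."""
    units = [u for u in units if len(u) > 1]      # only pairable units
    n = sum(len(u) for u in units)
    # Observed disagreement: within-unit pairwise distances, weighted per unit.
    d_obs = sum(
        2 * sum(dist(a, b) ** 2 for a, b in combinations(u, 2)) / (len(u) - 1)
        for u in units
    ) / n
    # Expected disagreement: pairwise distances across all annotations.
    all_vals = [v for u in units for v in u]
    d_exp = 2 * sum(dist(a, b) ** 2 for a, b in combinations(all_vals, 2)) / (n * (n - 1))
    return 1.0 - d_obs / d_exp if d_exp > 0 else 1.0
```

    For example, krippendorff_alpha([[{"friendly", "warm"}, {"friendly"}], [{"dominant"}, {"dominant", "aggressive"}]]) returns roughly 0.67: the raters mostly agree within each recording, so observed disagreement falls well below what chance pairing of their adjective sets would produce.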

    Speech emotion recognition in Spanish TV Debates

    Emotion recognition from speech is an active field of study that can help build more natural human-machine interaction systems. Even though the advancement of deep learning technology has brought improvements in this task, it remains very challenging. For instance, in real-life scenarios, factors such as the tendency toward neutrality or the ambiguous definition of emotion can make labeling difficult, causing the dataset to be severely imbalanced and not very representative. In this work, we considered a real-life scenario to carry out a series of emotion classification experiments. Specifically, we worked with a labeled corpus consisting of a set of audios from Spanish TV debates and their respective transcriptions. First, an analysis of the emotional information within the corpus was conducted. Then, different data representations were analyzed in order to choose the best one for our task: spectrograms and UniSpeech-SAT were used for audio representation and DistilBERT for text representation. As a final step, multimodal machine learning was used with the aim of improving the obtained classification results by combining acoustic and textual information. The research presented in this paper was conducted as part of the AMIC PdC project, which received funding from the Spanish Ministry of Science under grants TIN2017-85854-C4-3-R, PID2021-126061OB-C42 and PDC2021-120846-C43, and was also partially funded by the European Union’s Horizon 2020 research and innovation program under grant agreement No. 823907 (MENHIR).
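    A late-fusion version of the audio-plus-text setup described above can be sketched with the Hugging Face transformers library: mean-pool the UniSpeech-SAT and DistilBERT hidden states and feed the concatenation to a small classifier. The checkpoint names, the multilingual DistilBERT variant, and the fusion classifier below are assumptions for illustration, not the configuration reported in the paper.

```python
# Hedged sketch of acoustic+textual fusion: pooled UniSpeech-SAT and DistilBERT
# embeddings concatenated into a small emotion classifier.
import torch
import torch.nn as nn
from transformers import AutoFeatureExtractor, AutoModel, AutoTokenizer

audio_fe = AutoFeatureExtractor.from_pretrained("microsoft/unispeech-sat-base")
audio_enc = AutoModel.from_pretrained("microsoft/unispeech-sat-base")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-multilingual-cased")
text_enc = AutoModel.from_pretrained("distilbert-base-multilingual-cased")

def embed(waveform_16k, transcript):
    """Return one fused feature vector for a single utterance."""
    with torch.no_grad():
        a_in = audio_fe(waveform_16k, sampling_rate=16000, return_tensors="pt")
        a_vec = audio_enc(**a_in).last_hidden_state.mean(dim=1)   # (1, 768)
        t_in = tokenizer(transcript, return_tensors="pt", truncation=True)
        t_vec = text_enc(**t_in).last_hidden_state.mean(dim=1)    # (1, 768)
    return torch.cat([a_vec, t_vec], dim=-1)                      # (1, 1536)

# Small fusion classifier over the concatenated embedding (shape assumed).
fusion_classifier = nn.Sequential(
    nn.Linear(1536, 256), nn.ReLU(), nn.Dropout(0.3), nn.Linear(256, 4)
)
```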

    Emotions Management within Organizations

    Emotions management in organizations is meant to equip employees to manage their emotional resources so that they can adapt appropriately to the organizational environment and to the demands of their work. The study of emotions in organizations aims to understand and optimize employees’ emotional condition. Efficient leaders are interested in managing emotions, being aware of and able to re-evaluate the factors that positively activate employees’ emotional lives. Emotions management is accomplished at two main levels: a personal or subjective level (represented by a person’s capacity for self-control, emotional intelligence, and ability to manage positive and negative emotions) and an interpersonal or social level, centred on regulating emotional exchanges between employees and leaders and between employees and clients. Given their practical application, their contribution to increased work performance, and the benefits they bring to the organizational environment, the concepts through which emotions management operates (positive and negative emotions, emotional intelligence, emotional self-control, emotional labour, etc.) are of considerable interest both to theorists and to practitioners.
    Keywords: emotions management, emotional labour, emotional contagion, emotional intelligence, organizational group

    Recognising Complex Mental States from Naturalistic Human-Computer Interactions

    New advances in computer vision techniques will revolutionize the way we interact with computers, as they, together with other improvements, will help us build machines that understand us better. The face is the main non-verbal channel for human-human communication and contains valuable information about emotion, mood, and mental state. Affective computing researchers have widely investigated how facial expressions can be used for automatically recognizing affect and mental states. Nowadays, physiological signals can be measured by video-based techniques, which can also be utilised for emotion detection. Physiological signals are an important indicator of internal feelings and are more robust against social masking. This thesis focuses on computer vision techniques to detect facial expressions and physiological changes for recognizing non-basic and natural emotions during human-computer interaction. It covers all stages of the research process, from data acquisition to integration and application. Most previous studies focused on acquiring data from prototypic basic emotions acted out under laboratory conditions. To evaluate the proposed method under more practical conditions, two different scenarios were used for data collection. In the first scenario, a set of controlled stimuli was used to trigger the user’s emotion. The second scenario aimed at capturing more naturalistic emotions that might occur during a writing activity; here, the engagement level of the participants, along with other affective states, was the target of the system. For the first time, this thesis explores how video-based physiological measures can be used in affect detection. Video-based measurement of physiological signals is a new technique that needs further improvement before it can be used in practical applications. A machine learning approach is proposed and evaluated to improve the accuracy of heart rate (HR) measurement using an ordinary camera during a naturalistic interaction with a computer.
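    The last contribution, improving camera-based heart rate measurement with machine learning, can be pictured as a simple pipeline: extract a photoplethysmographic signal from the face region, estimate HR from its dominant frequency, and learn a regression correction against reference readings. The green-channel signal, band-pass range, and choice of regressor below are assumptions for illustration, not the thesis's exact method.

```python
# Hedged sketch: estimate heart rate from a facial green-channel signal, then
# learn a correction model against ground-truth HR (e.g. from a contact sensor).
import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.ensemble import GradientBoostingRegressor

def estimate_hr(green_signal, fps=30.0):
    """green_signal: mean green intensity of the face ROI, one value per frame."""
    b, a = butter(3, [0.7 / (fps / 2), 4.0 / (fps / 2)], btype="band")  # ~42-240 bpm
    filtered = filtfilt(b, a, green_signal - np.mean(green_signal))
    freqs = np.fft.rfftfreq(len(filtered), d=1.0 / fps)
    spectrum = np.abs(np.fft.rfft(filtered))
    return 60.0 * freqs[np.argmax(spectrum)]      # dominant frequency, in bpm

def train_corrector(raw_estimates, reference_hr):
    """Learn to correct raw video-based estimates using reference measurements."""
    model = GradientBoostingRegressor()
    model.fit(np.asarray(raw_estimates).reshape(-1, 1), reference_hr)
    return model
```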