153,902 research outputs found

    CAKE: Compact and Accurate K-dimensional representation of Emotion

    Numerous models describing human emotional states have been built by the psychology community. Alongside, Deep Neural Networks (DNNs) are reaching excellent performance and are becoming useful feature extraction tools in many computer vision tasks. Inspired by works from the psychology community, we first study the link between the compact two-dimensional representation of emotion known as arousal-valence, and the discrete emotion classes (e.g. anger, happiness, sadness, etc.) used in the computer vision community. This makes it possible to assess the benefits, in terms of discrete emotion inference, of adding an extra dimension to arousal-valence (usually named dominance). Building on these observations, we propose CAKE, a 3-dimensional representation of emotion learned in a multi-domain fashion, achieving accurate emotion recognition on several public datasets. Moreover, we visualize how emotion boundaries are organized inside DNN representations and show that DNNs are implicitly learning arousal-valence-like descriptions of emotions. Finally, we use the CAKE representation to compare the quality of the annotations of different public datasets.
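The abstract's point about adding dominance to arousal-valence can be illustrated with a nearest-prototype lookup. This is only a sketch, not the CAKE method: the prototype coordinates and the `nearest_emotion` helper are hypothetical placeholders chosen for illustration and are not taken from the paper or any dataset.

```python
import math

# Hypothetical (valence, arousal, dominance) prototypes in [-1, 1].
# Placeholder values for illustration only, not from the paper.
PROTOTYPES = {
    "happiness": (0.8, 0.5, 0.4),
    "sadness": (-0.7, -0.4, -0.4),
    "anger": (-0.6, 0.7, 0.6),
    "fear": (-0.6, 0.7, -0.6),
}


def nearest_emotion(point, dims=3):
    """Return the label whose prototype is closest to `point`.

    Only the first `dims` coordinates are compared, so dims=2 uses
    valence and arousal alone, while dims=3 also uses dominance.
    """
    def distance(label):
        return math.dist(point[:dims], PROTOTYPES[label][:dims])

    return min(PROTOTYPES, key=distance)


sample = (-0.5, 0.6, -0.5)
label_2d = nearest_emotion(sample, dims=2)  # anger and fear tie in 2-D
label_3d = nearest_emotion(sample, dims=3)  # dominance separates them
```

With these placeholder values, anger and fear share the same valence and arousal, so the two-dimensional lookup cannot tell them apart (the tie is broken by dictionary order), whereas the third coordinate resolves the sample to fear.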

    Using fuzzy logic to handle the semantic descriptions of music in a content-based retrieval system

    This paper explores the potential use of fuzzy logic for semantic music recommendation. We show that a set of affective/emotive, structural and kinaesthetic descriptors can be used to formulate a query which allows the retrieval of intended music. A semantic music recommendation system was built, based on an elaborate study of potential users and an analysis of the semantic descriptors that best characterize the user’s understanding of music. Significant relationships between expressive and structural semantic descriptions of music were found. Fuzzy logic was then applied to handle the quality ratings associated with the semantic descriptions. A working semantic music recommendation system was tested and evaluated. Real-world testing revealed high user satisfaction.
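One way fuzzy logic could handle quality ratings attached to semantic descriptors is sketched below. It is a minimal illustration under stated assumptions, not the system described in the abstract: the 0-10 rating scale, the ramp-shaped membership function `high`, the use of `min` as the fuzzy AND, and the track data are all hypothetical.

```python
def high(rating):
    """Membership of a 0-10 rating in the fuzzy set 'high'.

    Assumed shape: 0 up to a rating of 5, rising linearly to 1 at 10.
    """
    return min(1.0, max(0.0, (rating - 5.0) / 5.0))


def fuzzy_and(*degrees):
    """Combine membership degrees with the standard min t-norm."""
    return min(degrees)


def score(ratings, wanted):
    """Degree to which a track satisfies all wanted descriptors."""
    return fuzzy_and(*(high(ratings[d]) for d in wanted))


def rank(tracks, wanted):
    """Order track names from best to worst match for the query."""
    return sorted(tracks, key=lambda name: score(tracks[name], wanted),
                  reverse=True)


# Hypothetical descriptor ratings, for illustration only.
tracks = {
    "track_a": {"cheerful": 9.0, "fast": 8.0},
    "track_b": {"cheerful": 10.0, "fast": 5.0},
    "track_c": {"cheerful": 6.0, "fast": 7.0},
}

ranking = rank(tracks, ["cheerful", "fast"])
```

Using `min` means a track is only as good a match as its weakest descriptor, so a track rated very cheerful but not fast ranks below one that is moderately both.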

    Emotional State Categorization from Speech: Machine vs. Human

    This paper presents our investigations on emotional state categorization from speech signals, comparing a psychologically inspired computational model against human performance under the same experimental setup. Based on psychological studies, we propose a multistage categorization strategy which allows an automatic categorization model to be established flexibly for a given emotional speech categorization task. We apply the strategy to the Serbian Emotional Speech Corpus (GEES) and the Danish Emotional Speech Corpus (DES), where human performance was reported in previous psychological studies. Our work is the first attempt to apply machine learning to the GEES corpus, for which only human recognition rates were available prior to our study. Unlike the previous work on the DES corpus, our work focuses on a comparison to human performance under the same experimental settings. Our studies suggest that psychology-inspired systems yield behaviours that, to a great extent, resemble what humans perceived, and that their performance is close to that of humans under the same experimental setup. Furthermore, our work also uncovers some differences between machines and humans in terms of emotional state recognition from speech. Comment: 14 pages, 15 figures, 12 tables

    Four not six: revealing culturally common facial expressions of emotion

    As a highly social species, humans generate complex facial expressions to communicate a diverse range of emotions. Since Darwin’s work, identifying which of these complex patterns are common across cultures and which are culture-specific has remained a central question in psychology, anthropology, philosophy, and more recently machine vision and social robotics. Classic approaches to addressing this question typically tested the cross-cultural recognition of theoretically motivated facial expressions representing six emotions, and reported universality. Yet, variable recognition accuracy across cultures suggests a narrower cross-cultural communication, supported by sets of simpler expressive patterns embedded in more complex facial expressions. We explore this hypothesis by modelling the facial expressions of over 60 emotions across two cultures, and segregating out the latent expressive patterns. Using a multi-disciplinary approach, we first map the conceptual organization of a broad spectrum of emotion words by building semantic networks in two cultures. For each emotion word in each culture, we then model and validate its corresponding dynamic facial expression, producing over 60 culturally valid facial expression models. We then apply to the pooled models a multivariate data reduction technique, revealing four latent and culturally common facial expression patterns that each communicate specific combinations of valence, arousal and dominance. We then reveal the face movements that accentuate each latent expressive pattern to create complex facial expressions. Our data question the widely held view that six facial expression patterns are universal, instead suggesting four latent expressive patterns with direct implications for emotion communication, social psychology, cognitive neuroscience, and social robotics.

    Using fuzzy logic to handle the users' semantic descriptions in a music retrieval system

    This paper investigates the potential application of fuzzy logic to semantic music recommendation. We show that a set of affective/emotive, structural and kinaesthetic descriptors can be used to formulate a query which allows the retrieval of intended music. A semantic music recommendation system was built, based on an elaborate study of potential users of music information retrieval systems. This study analysed the descriptors that best characterize the user's understanding of music. Significant relationships between expressive and structural descriptions of music were found. A straightforward fuzzy logic methodology was then applied to handle the quality ratings associated with the descriptions. Rigorous real-world testing of the semantic music recommendation system revealed high user satisfaction.

    EMPATH: A Neural Network that Categorizes Facial Expressions

    There are two competing theories of facial expression recognition. Some researchers have suggested that it is an example of "categorical perception." In this view, expression categories are considered to be discrete entities with sharp boundaries, and discrimination of nearby pairs of expressive faces is enhanced near those boundaries. Other researchers, however, suggest that facial expression perception is more graded and that facial expressions are best thought of as points in a continuous, low-dimensional space, where, for instance, "surprise" expressions lie between "happiness" and "fear" expressions due to their perceptual similarity. In this article, we show that a simple yet biologically plausible neural network model, trained to classify facial expressions into six basic emotions, predicts data used to support both of these theories. Without any parameter tuning, the model matches a variety of psychological data on categorization, similarity, reaction times, discrimination, and recognition difficulty, both qualitatively and quantitatively. We thus explain many of the seemingly complex psychological phenomena related to facial expression perception as natural consequences of the tasks' implementations in the brain.