    Hybrid Fusion Based Interpretable Multimodal Emotion Recognition with Limited Labelled Data

    This paper proposes a multimodal emotion recognition system, VIsual Spoken Textual Additive Net (VISTA Net), to classify emotions reflected by multimodal input containing image, speech, and text into discrete classes. A new interpretability technique, K-Average Additive exPlanation (KAAP), has also been developed that identifies the important visual, spoken, and textual features leading to the prediction of a particular emotion class. VISTA Net fuses information from the image, speech, and text modalities using a hybrid of early and late fusion, automatically adjusting the weights of their intermediate outputs while computing the weighted average. The KAAP technique computes the contribution of each modality and its corresponding features toward predicting a particular emotion class. To mitigate the scarcity of multimodal emotion datasets labeled with discrete emotion classes, we have constructed the large-scale IIT-R MMEmoRec dataset consisting of images, corresponding speech and text, and emotion labels ('angry,' 'happy,' 'hate,' and 'sad'). VISTA Net achieves 95.99% emotion recognition accuracy on the IIT-R MMEmoRec dataset when using the visual, audio, and textual modalities together, outperforming configurations that use any one or two modalities.
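    As a rough illustration of the adaptive weighted-average fusion described in this abstract, the sketch below shows a learnable-weight additive fusion layer in PyTorch. The module name, tensor shapes, and the softmax normalisation are assumptions for illustration only, not the published VISTA Net implementation.

        # Minimal sketch of additive late fusion with learnable modality weights.
        # Per-modality logits are assumed to come from separate image, speech,
        # and text encoders (not shown here).
        import torch
        import torch.nn as nn

        class AdditiveFusion(nn.Module):
            def __init__(self):
                super().__init__()
                self.modality_weights = nn.Parameter(torch.ones(3))  # image, speech, text

            def forward(self, image_logits, speech_logits, text_logits):
                # Normalise so the fused output is a weighted average of the three inputs.
                w = torch.softmax(self.modality_weights, dim=0)
                return w[0] * image_logits + w[1] * speech_logits + w[2] * text_logits

        # Example: fuse three per-modality predictions over four emotion classes.
        fusion = AdditiveFusion()
        fused = fusion(torch.randn(8, 4), torch.randn(8, 4), torch.randn(8, 4))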

    A Comparative Emotions-detection Review for Non-intrusive Vision-Based Facial Expression Recognition

    Affective computing advocates for the development of systems and devices that can recognize, interpret, process, and simulate human emotion. In computing, the field seeks to enhance the user experience by finding less intrusive automated solutions. However, initiatives in this area focus on solitary emotions, which limits the scalability of the approaches. Further, reviews conducted in this area have also focused on solitary emotions, presenting challenges to future researchers when adopting these recommendations. This review aims to highlight gaps in the application areas of Facial Expression Recognition techniques by conducting a comparative analysis of the emotion detection datasets, algorithms, and results reported in existing studies. The systematic review adopted the PRISMA model and analyzed eighty-three publications. Findings from the review show that different emotions call for different Facial Expression Recognition techniques, which should be considered when conducting Facial Expression Recognition. Keywords: Facial Expression Recognition, Emotion Detection, Image Processing, Computer Vision.

    Interpretable Explainability in Facial Emotion Recognition and Gamification for Data Collection

    Training facial emotion recognition models requires large sets of data and costly annotation processes. To alleviate this problem, we developed a gamified method of acquiring annotated facial emotion data without an explicit labeling effort by humans. The game, which we named Facegame, challenges players to imitate a displayed image of a face that portrays a particular basic emotion. Every round played creates new data consisting of a set of facial features and landmarks, already annotated with the emotion label of the target facial expression. Such an approach effectively creates a robust, sustainable, and continuous machine learning training process. We evaluated Facegame with an experiment that revealed several contributions to the field of affective computing. First, the gamified data collection approach gave us access to a rich variation of facial expressions for each basic emotion, owing to the natural variations in the players' facial expressions and their expressive abilities. We report improved accuracy when the collected data were used to enrich well-known in-the-wild facial emotion datasets and subsequently used for training facial emotion recognition models. Second, the natural language prescription method used by Facegame constitutes a novel approach to interpretable explainability that can be applied to any facial emotion recognition model. Finally, we observed significant improvements in the facial emotion perception and expression skills of the players through repeated game play. Comment: 8 pages, 8 figures, 2022 10th International Conference on Affective Computing and Intelligent Interaction (ACII).

    “I can haz emoshuns?”: understanding anthropomorphosis of cats among internet users

    The attribution of human-like traits to non-human animals, termed anthropomorphism, can lead to misunderstandings of animal behaviour, which can result in risks to both human and animal wellbeing and welfare. In this paper, through an interdisciplinary collaboration between social computing and animal behaviour researchers, we investigated whether a simple image-tagging application could improve the understanding of how people ascribe intentions and emotions to the behaviour of their domestic cats. A web-based application, Tagpuss, was developed to present casual users with photographs drawn from a database of 1,631 images of domestic cats and ask them to ascribe an emotion to the cat portrayed in each image. Over five thousand people actively participated in the study in the space of four weeks, generating over 50,000 tags. The results indicate that Tagpuss can be used to identify cat behaviours that lay people find difficult to distinguish. This highlights the need for further expert scientific exploration focused on educating cat owners to identify possible problems with their cat's welfare.

    A Wearable Emotion Detection and Video/Image Recording System

    A system for sensing an important event/occasion and automatically capturing video/images is disclosed. The system detects an important event/occasion by analyzing physiological signals and deriving the corresponding emotional state of the user. The system includes a wearable computing device having an emotion recognition system, a sensor array, and a capturing device. Alternatively, the sensor array may be attached to the arm of the user when it is not integrated with the wearable computing device. The capturing device starts recording a video/image of the surroundings when an event/occasion is detected. The user presets the settings of the capturing device, which include shutter speed, aperture, ISO, autofocus, and white balance. The video/image of the surroundings is captured based on these preset settings. Finally, the captured video/image files may be transferred to the user's profile on a website or to his/her personal computer over a Wi-Fi connection.
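    The disclosed flow (monitor physiological signals, infer the emotional state, trigger capture with the user's preset settings) can be sketched as a simple loop. All names, thresholds, and the settings dictionary below are hypothetical placeholders, not the actual device firmware.

        # Hypothetical sketch of the event-triggered capture loop.
        CAPTURE_SETTINGS = {            # preset by the user in advance
            "shutter_speed": "1/250",
            "aperture": "f/2.8",
            "iso": 400,
            "autofocus": True,
            "white_balance": "auto",
        }

        def is_important_event(emotion, arousal):
            # Placeholder rule: record when a strong emotional response is detected.
            return emotion in {"surprise", "joy", "fear"} and arousal > 0.7

        def monitor(sensor_array, emotion_model, camera, uploader):
            for sample in sensor_array.stream():               # e.g. heart rate, EDA
                emotion, arousal = emotion_model.predict(sample)
                if is_important_event(emotion, arousal):
                    clip = camera.capture(**CAPTURE_SETTINGS)  # video/image of surroundings
                    uploader.send(clip)                        # e.g. over Wi-Fi to the user's profile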

    A Hierarchical Emotion Classification Technique for Thai Reviews

    Emotion classification is an interesting problem in affective computing that can be applied to various tasks, such as speech synthesis, image processing, and text processing. With the increasing amount of textual data on the Internet, customer reviews that express opinions and emotions about products have become important feedback for companies. Emotion classification aims to identify an emotion label for each review. This research investigated three approaches to emotion classification of opinions in the Thai language, written in an unstructured, free-form, or informal style. Different sets of features were studied in detail and analyzed. The experimental results showed that a hierarchical approach, where the subjectivity of the review is determined first, then the polarity of the opinion is identified, and finally the emotion label is assigned, yielded the highest performance, with precision, recall, and F-measure of 0.691, 0.743, and 0.709, respectively.
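    The hierarchical approach described above (subjectivity first, then polarity, then the emotion label) amounts to a cascade of classifiers. The sketch below assumes three placeholder models with a scikit-learn-style predict interface and illustrative label names; it is not the authors' implementation.

        # Minimal sketch of the hierarchical decision cascade for one review.
        def classify_review(features, subjectivity_clf, polarity_clf, emotion_clfs):
            # Step 1: is the review subjective at all?
            if subjectivity_clf.predict([features])[0] == "objective":
                return "neutral"
            # Step 2: positive or negative opinion?
            polarity = polarity_clf.predict([features])[0]
            # Step 3: fine-grained emotion label, conditioned on the polarity branch.
            return emotion_clfs[polarity].predict([features])[0]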