Hybrid Fusion Based Interpretable Multimodal Emotion Recognition with Limited Labelled Data
This paper proposes a multimodal emotion recognition system, VIsual Spoken
Textual Additive Net (VISTA Net), to classify emotions reflected by multimodal
input containing image, speech, and text into discrete classes. A new
interpretability technique, K-Average Additive exPlanation (KAAP), has also
been developed that identifies important visual, spoken, and textual features
leading to predicting a particular emotion class. The VISTA Net fuses
information from image, speech, and text modalities using a hybrid of early and
late fusion. It automatically adjusts the weights of the modalities' intermediate outputs
while computing the weighted average. The KAAP technique computes the
contribution of each modality and corresponding features toward predicting a
particular emotion class. To mitigate the insufficiency of multimodal emotion
datasets labeled with discrete emotion classes, we have constructed a
large-scale IIT-R MMEmoRec dataset consisting of images, corresponding speech
and text, and emotion labels ('angry,' 'happy,' 'hate,' and 'sad'). The VISTA
Net has achieved 95.99% emotion recognition accuracy on the IIT-R MMEmoRec
dataset when using the visual, audio, and textual modalities together,
outperforming configurations that use any one or two of the modalities.
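The weighted late-fusion step described above can be sketched as follows; this is a minimal illustration of combining per-modality outputs with learnable, softmax-normalised weights, not the actual VISTA Net architecture, and all names and values are hypothetical:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def weighted_late_fusion(logits_by_modality, raw_weights):
    """Combine per-modality class logits with adjustable weights.

    logits_by_modality: list of (num_classes,) arrays, one per modality.
    raw_weights: unconstrained weights (e.g. learned parameters),
    normalised here with a softmax so they sum to 1.
    """
    w = softmax(np.asarray(raw_weights, dtype=float))
    stacked = np.stack(logits_by_modality)       # (num_modalities, num_classes)
    fused = (w[:, None] * stacked).sum(axis=0)   # weighted average, (num_classes,)
    return fused, w

# Hypothetical logits for the classes ('angry', 'happy', 'hate', 'sad').
visual = np.array([0.1, 2.0, 0.3, 0.2])
speech = np.array([0.4, 1.5, 0.1, 0.6])
text   = np.array([0.2, 1.8, 0.2, 0.3])
fused, w = weighted_late_fusion([visual, speech, text], [0.5, 0.2, 0.3])
print(int(fused.argmax()))  # → 1, i.e. 'happy'
```

In a trained model the raw weights would be parameters updated by backpropagation; here they are fixed only to show the normalisation and averaging.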
A Comparative Emotions-detection Review for Non-intrusive Vision-Based Facial Expression Recognition
Affective computing advocates for the development of systems and devices that can recognize, interpret, process, and simulate human emotion. In computing, the field seeks to enhance the user experience by finding less intrusive automated solutions. However, initiatives in this area have focused on solitary emotions, which limits the scalability of the approaches. Reviews conducted in this area have likewise focused on solitary emotions, presenting challenges to future researchers who adopt their recommendations. This review aims to highlight gaps in the application areas of Facial Expression Recognition techniques by conducting a comparative analysis of the emotion-detection datasets, algorithms, and results reported in existing studies. The systematic review adopted the PRISMA model and analyzed eighty-three publications. Findings show that different emotions call for different Facial Expression Recognition techniques, a factor that should be analyzed when conducting Facial Expression Recognition.
Keywords: Facial Expression Recognition, Emotion Detection, Image Processing, Computer Vision
Interpretable Explainability in Facial Emotion Recognition and Gamification for Data Collection
Training facial emotion recognition models requires large sets of data and
costly annotation processes. To alleviate this problem, we developed a gamified
method of acquiring annotated facial emotion data without an explicit labeling
effort by humans. The game, which we named Facegame, challenges the players to
imitate a displayed image of a face that portrays a particular basic emotion.
Every round played by the player creates new data that consists of a set of
facial features and landmarks, already annotated with the emotion label of the
target facial expression. Such an approach effectively creates a robust,
sustainable, and continuous machine learning training process. We evaluated
Facegame with an experiment that revealed several contributions to the field of
affective computing. First, the gamified data collection approach allowed us to
access a rich variation of facial expressions of each basic emotion due to the
natural variations in the players' facial expressions and their expressive
abilities. We report improved accuracy when the collected data were used to
enrich well-known in-the-wild facial emotion datasets and subsequently used
for training facial emotion recognition models. Second, the natural language
prescription method used by the Facegame constitutes a novel approach for
interpretable explainability that can be applied to any facial emotion
recognition model. Finally, we observed significant improvements in the facial
emotion perception and expression skills of the players through repeated game
play.
Comment: 8 pages, 8 figures, 2022 10th International Conference on Affective
Computing and Intelligent Interaction (ACII)
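As a rough illustration of the self-annotating data each Facegame round yields, a round record might pair extracted facial landmarks with the target emotion label. The record fields and helper below are hypothetical, not taken from the paper:

```python
from dataclasses import dataclass

@dataclass
class GameRound:
    """One played round: the features arrive already carrying the target label."""
    player_id: str
    landmarks: list[tuple[float, float]]  # (x, y) positions of facial landmarks
    target_emotion: str                   # label of the displayed target face

def collect_round(dataset, player_id, landmarks, target_emotion):
    """Append a self-annotated training example; no explicit human labeling step."""
    dataset.append(GameRound(player_id, landmarks, target_emotion))
    return dataset

dataset = []
collect_round(dataset, "p1", [(0.31, 0.42), (0.58, 0.41)], "happy")
collect_round(dataset, "p2", [(0.30, 0.44), (0.57, 0.40)], "happy")
print(len(dataset), dataset[0].target_emotion)  # → 2 happy
```

The key property shown is that the label comes from the game's target image rather than from an annotator, which is what makes the collection process continuous and low-cost.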
“I can haz emoshuns?”: understanding anthropomorphosis of cats among internet users
The attribution of human-like traits to non-human animals, termed anthropomorphism, can lead to misunderstandings of animal behaviour, which can pose risks to both human and animal wellbeing and welfare. In this paper, through an inter-disciplinary collaboration between social computing and animal behaviour researchers, we investigated whether a simple image-tagging application could improve the understanding of how people ascribe intentions and emotions to the behaviour of their domestic cats. A web-based application, Tagpuss, was developed to present casual users with photographs drawn from a database of 1631 images of domestic cats and ask them to ascribe an emotion to the cat portrayed in each image. Over five thousand people actively participated in the study in the space of four weeks, generating over 50,000 tags. Results indicate that Tagpuss can be used to identify cat behaviours that lay-people find difficult to distinguish, highlighting targets for further expert scientific exploration focused on educating cat owners to identify possible problems with their cat's welfare.
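One way to operationalise "behaviours that lay-people find difficult to distinguish" is to flag images whose tag distributions show high disagreement, for example via Shannon entropy. This is a sketch of the idea, not the paper's actual analysis, and the tag sets are invented:

```python
from collections import Counter
import math

def tag_entropy(tags):
    """Shannon entropy (bits) of the emotion tags assigned to one image.

    0 means all taggers agree; higher values mean more disagreement.
    """
    counts = Counter(tags)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical tag sets for two cat photos.
unanimous = ["content"] * 10
disputed = ["content"] * 4 + ["fearful"] * 3 + ["angry"] * 3

print(tag_entropy(unanimous) == 0.0)   # → True: easy for lay-people to read
print(tag_entropy(disputed) > 1.0)     # → True: a candidate for expert review
```

Images ranked by entropy would surface the ambiguous behaviours that merit expert follow-up.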
A Wearable Emotion Detection and Video/Image Recording System
A system for sensing an important event/occasion and automatically capturing video/images is disclosed. The system detects an important event/occasion by analyzing physiological signals and derives the corresponding emotional state of the user. The system includes a wearable computing device having an emotion recognition system, a sensor array, and a capturing device. Alternatively, the sensor array may be attached to the arm of the user when it is not integrated with the wearable computing device. The capturing device starts recording video/images of the surroundings when an event/occasion is detected. The user presets the settings of the capturing device, which include shutter speed, aperture, ISO, autofocus, and white balance. The video/images of the surroundings are captured based on these preset settings. Finally, the captured video/image files may be transferred to the user's profile on a website or to his/her personal computer over a Wi-Fi connection.
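The trigger-and-capture logic can be sketched as a baseline-relative threshold on a physiological signal. The heart-rate signal, the 1.3× ratio, and the preset values are illustrative assumptions; the disclosure does not specify them:

```python
def should_capture(heart_rate_bpm, baseline_bpm, threshold_ratio=1.3):
    """Flag an 'important event' when the signal rises well above baseline."""
    return heart_rate_bpm >= baseline_bpm * threshold_ratio

# User-preset capture settings, applied when the trigger fires.
preset = {
    "shutter_speed": "1/250",
    "aperture": "f/2.8",
    "iso": 400,
    "autofocus": True,
    "white_balance": "auto",
}

if should_capture(heart_rate_bpm=102, baseline_bpm=70):
    # 102 >= 70 * 1.3 = 91, so the trigger fires here.
    print("recording with shutter", preset["shutter_speed"])
```

A real implementation would fuse several sensor channels and smooth them over time rather than thresholding a single instantaneous reading.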
A Hierarchical Emotion Classification Technique for Thai Reviews
Emotion classification is an interesting problem in affective computing that can be applied to various tasks, such as speech synthesis, image processing, and text processing. Textual data on the Internet is growing rapidly, especially customer reviews that express opinions and emotions about products; these reviews are important feedback for companies. Emotion classification aims to identify an emotion label for each review. This research investigated three approaches for emotion classification of opinions in the Thai language, written in an unstructured, free-form, or informal style. Different sets of features were studied in detail and analyzed. The experimental results showed that a hierarchical approach, in which the subjectivity of the review is determined first, then the polarity of the opinion is identified, and finally the emotion label is assigned, yielded the highest performance, with precision, recall, and F-measure of 0.691, 0.743, and 0.709, respectively.
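The hierarchical cascade (subjectivity, then polarity, then emotion) can be sketched as below. The keyword rules are toy stand-ins for the paper's trained Thai-language models, included only to make the control flow concrete:

```python
def classify_review(text, is_subjective, polarity_of, emotion_of):
    """Three-stage hierarchical emotion classification (sketch).

    Each later stage only runs if the earlier stage admits the review.
    """
    if not is_subjective(text):
        return "objective"               # stage 1: filter out non-opinions
    polarity = polarity_of(text)         # stage 2: 'positive' or 'negative'
    return emotion_of(text, polarity)    # stage 3: fine-grained emotion label

# Toy stage classifiers (hypothetical; real stages would be trained models).
def is_subjective(t):
    return any(w in t for w in ("love", "hate", "great", "terrible"))

def polarity_of(t):
    return "positive" if any(w in t for w in ("love", "great")) else "negative"

def emotion_of(t, polarity):
    if polarity == "positive":
        return "happy"
    return "angry" if "terrible" in t else "sad"

print(classify_review("I love this phone", is_subjective, polarity_of, emotion_of))
# → happy
```

Conditioning each stage on the previous one lets every classifier solve a narrower problem, which is the structural advantage the abstract reports for the hierarchical approach.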