Explainable Multimodal Emotion Reasoning
Multimodal emotion recognition is an active research topic in artificial
intelligence. Its primary objective is to integrate multiple modalities (such as
acoustic, visual, and lexical cues) to identify human emotional states.
Current works generally assume accurate emotion labels for benchmark datasets
and focus on developing more effective architectures. But due to the inherent
subjectivity of emotions, existing datasets often lack high annotation
consistency, resulting in potentially inaccurate labels. Consequently, models
built on these datasets may struggle to meet the demands of practical
applications. To address this issue, it is crucial to enhance the reliability
of emotion annotations. In this paper, we propose a novel task called
"Explainable Multimodal Emotion Reasoning (EMER)". In contrast to
previous works that primarily focus on predicting emotions, EMER takes a step
further by providing explanations for these predictions. The prediction is
considered correct as long as the reasoning process behind the predicted
emotion is plausible. This paper presents our initial efforts on EMER, where we
introduce a benchmark dataset, establish baseline models, and define evaluation
metrics. Meanwhile, we observe the necessity of integrating multi-faceted
capabilities to deal with EMER. Therefore, we propose the first multimodal
large language model (LLM) in affective computing, called AffectGPT.
We aim to tackle the long-standing challenge of label ambiguity and chart a
path toward more reliable techniques. Furthermore, EMER offers an opportunity
to evaluate the audio-video-text understanding capabilities of recent
multimodal LLMs. To facilitate further research, we make the code and data
available at: https://github.com/zeroQiaoba/AffectGPT
MER 2023: Multi-label Learning, Modality Robustness, and Semi-Supervised Learning
Over the past few decades, multimodal emotion recognition has made remarkable
progress with the development of deep learning. However, existing technologies
struggle to meet the demands of practical applications. To improve
robustness, we launch a Multimodal Emotion Recognition Challenge (MER 2023) to
motivate global researchers to build innovative technologies that can further
accelerate and foster research. For this year's challenge, we present three
distinct sub-challenges: (1) MER-MULTI, in which participants recognize both
discrete and dimensional emotions; (2) MER-NOISE, in which noise is added to
test videos for modality robustness evaluation; (3) MER-SEMI, which provides
large amounts of unlabeled samples for semi-supervised learning. In this paper,
we test a variety of multimodal features and provide a competitive baseline for
each sub-challenge. Our system achieves an F1 score of 77.57% and a mean
squared error (MSE) of 0.82 for MER-MULTI, an F1 score of 69.82% and an MSE of
1.12 for MER-NOISE, and an F1 score of 86.75% for MER-SEMI. Baseline
code is available at https://github.com/zeroQiaoba/MER2023-Baseline
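The abstract above reports an F1 score for discrete emotion labels and an MSE for dimensional emotion values. As a minimal sketch of how two such metrics can be computed, the snippet below uses a support-weighted average of per-class F1 scores; the averaging mode, the function names, and the sample labels and values are assumptions made for illustration and are not taken from the MER 2023 baseline code.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 (averaging mode is an assumption)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n  # weight each class by its number of true samples
    return total / len(y_true)


def mean_squared_error(v_true, v_pred):
    """Mean squared error between true and predicted dimensional values."""
    if not v_true or len(v_true) != len(v_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(v_true, v_pred)) / len(v_true)


# Hypothetical toy data, not drawn from any benchmark.
labels_true = ["happy", "sad", "happy", "angry"]
labels_pred = ["happy", "sad", "sad", "angry"]
valence_true = [1.0, -2.0, 0.5]
valence_pred = [0.5, -1.0, 0.5]

f1 = weighted_f1(labels_true, labels_pred)
mse = mean_squared_error(valence_true, valence_pred)
print(f"weighted F1 = {f1:.4f}, MSE = {mse:.4f}")
```

A higher F1 is better while a lower MSE is better, which is why the two numbers in the abstract are reported separately for the sub-challenges that include dimensional emotions.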
Facial Expression Analysis under Partial Occlusion: A Survey
Automatic machine-based Facial Expression Analysis (FEA) has made substantial
progress in the past few decades driven by its importance for applications in
psychology, security, health, entertainment and human-computer interaction. The
vast majority of completed FEA studies are based on non-occluded faces
collected in a controlled laboratory environment. Automatic expression
recognition tolerant to partial occlusion remains less understood, particularly
in real-world scenarios. In recent years, a growing number of studies have
investigated techniques for handling partial occlusion in FEA, so a
comprehensive account of these developments and of the state of the art is
timely. This survey provides such a comprehensive review of
recent advances in dataset creation, algorithm development, and investigations
of the effects of occlusion critical for robust performance in FEA systems. It
outlines existing challenges in overcoming partial occlusion and discusses
possible opportunities in advancing the technology. To the best of our
knowledge, it is the first FEA survey dedicated to occlusion and aimed at
promoting better informed and benchmarked future work.
Comment: Authors' pre-print of the article accepted for publication in ACM
Computing Surveys (accepted on 02-Nov-2017)