Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives
Over the past few years, adversarial training has become an extremely active
research topic and has been successfully applied to various Artificial
Intelligence (AI) domains. Because it is a potentially crucial technique for the
development of the next generation of emotional AI systems, we herein provide a
comprehensive overview of the application of adversarial training to affective
computing and sentiment analysis. Various representative adversarial training
algorithms are explained and discussed accordingly, aimed at tackling diverse
challenges associated with emotional AI systems. Further, we highlight a range
of potential future research directions. We expect that this overview will help
facilitate the development of adversarial training for affective computing and
sentiment analysis in both the academic and industrial communities.
Cross-Attention is Not Enough: Incongruity-Aware Multimodal Sentiment Analysis and Emotion Recognition
Fusing multiple modalities for affective computing tasks has proven effective
for performance improvement. However, how multimodal fusion works is not well
understood, and its use in the real world usually results in large model sizes.
In this work, on sentiment and emotion analysis, we first analyze how the
salient affective information in one modality can be affected by the other in
crossmodal attention. We find that inter-modal incongruity exists at the latent
level due to crossmodal attention. Based on this finding, we propose a
lightweight model via Hierarchical Crossmodal Transformer with Modality Gating
(HCT-MG), which determines a primary modality according to its contribution to
the target task and then hierarchically incorporates auxiliary modalities to
alleviate inter-modal incongruity and reduce information redundancy. The
experimental evaluation on three benchmark datasets (CMU-MOSI, CMU-MOSEI, and
IEMOCAP) verifies the efficacy of our approach, showing that it: 1) outperforms
major prior work by achieving competitive results and can successfully
recognize hard samples; 2) mitigates the inter-modal incongruity at the latent
level when modalities have mismatched affective tendencies; 3) reduces model
size to less than 1M parameters while outperforming existing models of similar
sizes.
Emotion Embeddings — Learning Stable and Homogeneous Abstractions from Heterogeneous Affective Datasets
Human emotion is expressed in many communication modalities and media formats
and so its computational study is equally diversified into natural language
processing, audio signal analysis, computer vision, etc. Similarly, the large
variety of representation formats used in previous research to describe
emotions (polarity scales, basic emotion categories, dimensional approaches,
appraisal theory, etc.) has led to an ever-proliferating diversity of
datasets, predictive models, and software tools for emotion analysis. Because
of these two distinct types of heterogeneity, at the expressional and
representational levels, there is a dire need to unify previous work on
increasingly diverging data and label types. This article presents such a
unifying computational model. We propose a training procedure that learns a
shared latent representation for emotions, so-called emotion embeddings,
independent of different natural languages, communication modalities, media or
representation label formats, and even disparate model architectures.
Experiments on a wide range of heterogeneous affective datasets indicate that
this approach yields the desired interoperability for the sake of reusability,
interpretability and flexibility, without penalizing prediction quality. Code
and data are archived under https://doi.org/10.5281/zenodo.7405327.
The Many Moods of Emotion
This paper presents a novel approach to the facial expression generation
problem. Building upon the assumption of the psychological community that
emotion is intrinsically continuous, we first design our own continuous emotion
representation with a 3-dimensional latent space derived from a neural network
trained on discrete emotion classification. The so-obtained representation can
be used to annotate large in-the-wild datasets and later to train a
Generative Adversarial Network. We first show that our model is able to map
back to discrete emotion classes with objectively and subjectively better
image quality than usual discrete approaches. We also show that we are able
to cover the larger space of possible facial expressions, generating the many
moods of emotion. Moreover, two axes in this space may be found to generate
similar expression changes as in traditional continuous representations such as
arousal-valence. Finally, we show from visual interpretation that the third
remaining dimension is highly related to the well-known dominance dimension
from psychology.
Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate Label Spaces
We combine multi-task learning and semi-supervised learning by inducing a
joint embedding space between disparate label spaces and learning transfer
functions between label embeddings, enabling us to jointly leverage unlabelled
data and auxiliary, annotated datasets. We evaluate our approach on a variety
of sequence classification tasks with disparate label spaces. We outperform
strong single-task and multi-task baselines and achieve a new state-of-the-art for
topic-based sentiment analysis.
Comment: To appear at NAACL 2018 (long