144 research outputs found
Distribution-based Emotion Recognition in Conversation
Automatic emotion recognition in conversation (ERC) is crucial for
emotion-aware conversational artificial intelligence. This paper proposes a
distribution-based framework that formulates ERC as a sequence-to-sequence
problem for emotion distribution estimation. The inherent ambiguity of emotions
and the subjectivity of human perception lead to disagreements in emotion
labels, which is handled naturally in our framework from the perspective of
uncertainty estimation in emotion distributions. A Bayesian training loss is
introduced to improve the uncertainty estimation by conditioning each emotional
state on an utterance-specific Dirichlet prior distribution. Experimental
results on the IEMOCAP dataset show that the ERC system outperforms the
single-utterance-based system, and that the proposed distribution-based ERC
methods achieve not only better classification accuracy but also improved
uncertainty estimation.
Comment: To appear in SLT 202
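The closed-form behavior of a Dirichlet-conditioned emotion state can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: the function name, the identity of the uncertainty score (subjective-logic vacuity), and the example concentration values are all assumptions.

```python
# Hypothetical sketch: an evidential head predicts Dirichlet concentration
# parameters alpha for each utterance; the expected emotion distribution and a
# simple uncertainty score then follow in closed form.

def dirichlet_emotion_estimate(alpha):
    """Given concentrations alpha_k > 0 for K emotion classes, return the
    expected emotion distribution and a vacuity-style uncertainty score."""
    s = sum(alpha)                   # total evidence (Dirichlet precision)
    mean = [a / s for a in alpha]    # expected emotion distribution
    return mean, len(alpha) / s      # low total evidence -> high uncertainty

# Confident prediction: strong evidence concentrated on one class.
probs, u = dirichlet_emotion_estimate([9.0, 1.0, 1.0, 1.0])
# Ambiguous utterance: near-uniform concentrations, little evidence.
probs2, u2 = dirichlet_emotion_estimate([1.1, 1.0, 1.0, 1.0])
```

A disagreement-prone utterance yields a flatter distribution and a larger uncertainty score, which is the behavior the abstract attributes to distribution-based ERC.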
Emotion Recognition in Conversation using Probabilistic Soft Logic
Creating agents that can both appropriately respond to conversations and
understand complex human linguistic tendencies and social cues has been a
long-standing challenge in the NLP community. A recent pillar of research
revolves around emotion recognition in conversation (ERC), a sub-field of
emotion
recognition that focuses on conversations or dialogues that contain two or more
utterances. In this work, we explore an approach to ERC that exploits the use
of neural embeddings along with complex structures in dialogues. We implement
our approach in a framework called Probabilistic Soft Logic (PSL), a
declarative templating language that uses first-order-like logical rules that,
when combined with data, define a particular class of graphical model.
Additionally, PSL provides functionality for the incorporation of results from
neural models into PSL models. This allows our model to take advantage of
advanced neural methods, such as sentence embeddings, and logical reasoning
over the structure of a dialogue. We compare our method with state-of-the-art
purely neural ERC systems and observe almost a 20% improvement. With these
results, we provide an extensive qualitative and quantitative analysis over the
DailyDialog conversation dataset.
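The PSL mechanism described above can be illustrated with one ground rule. This is a sketch of the general hinge-loss relaxation PSL uses, not the authors' actual rule set; the rule "Similar(u1,u2) & Emotion(u1,e) -> Emotion(u2,e)" and all truth values are invented for illustration.

```python
# Illustrative: PSL relaxes logical rules over [0,1] truth values with
# Lukasiewicz logic and penalizes each ground rule by its distance to
# satisfaction. Neural outputs (e.g. embedding similarities) supply the
# soft truth values.

def distance_to_satisfaction(body_truths, head_truth):
    """Lukasiewicz relaxation of (b1 & ... & bn) -> h on truth values in [0,1]."""
    body = max(0.0, sum(body_truths) - (len(body_truths) - 1))  # soft AND
    return max(0.0, body - head_truth)  # 0 when the implication holds

# Rule: Similar(u1, u2) & Emotion(u1, joy) -> Emotion(u2, joy)
penalty = distance_to_satisfaction([0.9, 0.8], head_truth=0.4)  # violated
ok = distance_to_satisfaction([0.9, 0.8], head_truth=0.9)       # satisfied
```

Inference then minimizes the weighted sum of such penalties over all ground rules, which is how dialogue structure and neural evidence interact in a PSL model.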
InstructERC: Reforming Emotion Recognition in Conversation with a Retrieval Multi-task LLMs Framework
The development of emotion recognition in dialogue (ERC) has been
consistently hindered by the complexity of pipeline designs, leading to ERC
models that often overfit to specific datasets and dialogue patterns. In this
study, we propose a novel approach, namely
InstructERC, which reformulates the ERC task from a discriminative framework to
a generative framework based on Large Language Models (LLMs). InstructERC has
two significant contributions: Firstly, InstructERC introduces a simple yet
effective retrieval template module, which helps the model explicitly integrate
multi-granularity dialogue supervision information by concatenating the
historical dialog content, label statement, and emotional domain demonstrations
with high semantic similarity. Furthermore, we introduce two additional emotion
alignment tasks, namely speaker identification and emotion prediction tasks, to
implicitly model the dialogue role relationships and future emotional
tendencies in conversations. Our LLM-based plug-and-play plugin framework
significantly outperforms all previous models and achieves comprehensive SOTA
on three commonly used ERC datasets. Extensive analysis of parameter-efficient
and data-scaling experiments provides empirical guidance for applying
InstructERC in practical scenarios. Our code will be released after blind
review.
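The retrieval-template idea, concatenating dialogue history, a label statement, and semantically similar demonstrations into one generative prompt, can be sketched as follows. The function, field names, and prompt wording are our own illustrative assumptions, not the released InstructERC template.

```python
# Hypothetical sketch of a retrieval-template prompt builder: historical
# dialogue, a label statement, and retrieved demonstrations with high
# semantic similarity are concatenated into a single LLM prompt.

def build_erc_prompt(history, target_utterance, labels, demonstrations):
    """Concatenate multi-granularity supervision into one generative prompt."""
    demo_block = "\n".join(
        f"Utterance: {d['text']} -> Emotion: {d['emotion']}"
        for d in demonstrations
    )
    history_block = "\n".join(f"{spk}: {utt}" for spk, utt in history)
    label_statement = f"Choose one emotion from: {', '.join(labels)}."
    return (
        f"{label_statement}\n\n"
        f"Similar examples:\n{demo_block}\n\n"
        f"Dialogue:\n{history_block}\n"
        f"Target: {target_utterance}\nEmotion:"
    )

prompt = build_erc_prompt(
    history=[("A", "I lost my keys again."), ("B", "Oh no, not again!")],
    target_utterance="I lost my keys again.",
    labels=["happy", "sad", "angry", "neutral"],
    demonstrations=[{"text": "I failed the exam.", "emotion": "sad"}],
)
```

The LLM then completes the prompt with a label token, which is what makes the formulation generative rather than discriminative.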
Utilizing External Knowledge to Enhance Semantics in Emotion Detection
Enabling machines to recognize emotions in conversation is challenging, mainly
because the information in human dialogue innately conveys emotions through
long-term experience, abundant knowledge, context, and the intricate patterns
between affective states. We address the task of emotion recognition in
conversations using external knowledge to enhance semantics. We propose KES, a
new framework that incorporates different elements of external knowledge and
conversational semantic role labeling, and builds upon them to learn
interactions between the interlocutors participating in a conversation. We
design a self-attention layer specialized for semantic text features enhanced
with external commonsense knowledge. Two separate LSTM-based networks then
track the individual internal state and the contextual external state. We
evaluate the proposed model on three emotion detection in conversation
datasets. The experimental results show that our model outperforms
state-of-the-art approaches on most of the tested datasets.
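The knowledge-enhanced self-attention step can be illustrated in miniature. This is a plain scaled dot-product attention over a mix of utterance and knowledge vectors, written without any framework; it is a generic sketch, not the KES layer itself, and the tied (identity) Q/K/V projections and example vectors are our assumptions.

```python
# Minimal sketch: self-attention over utterance features together with
# retrieved commonsense-knowledge vectors, so each utterance representation
# is re-weighted by its relevance to the injected knowledge.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with identity Q/K/V projections."""
    d = len(vectors[0])
    out = []
    for q in vectors:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in vectors]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, vectors))
                    for j in range(d)])
    return out

# Two utterance feature vectors plus one commonsense-knowledge vector.
enhanced = self_attention([[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]])
```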
Efficient Cross-Task Prompt Tuning for Few-Shot Conversational Emotion Recognition
Emotion Recognition in Conversation (ERC) has been widely studied due to its
importance in developing emotion-aware empathetic machines. The rise of
pre-trained language models (PLMs) has further pushed the limit of ERC
performance. However, most recent works on ERC using PLMs are heavily
data-driven and require fine-tuning the entire PLM. To improve both sample
and computational efficiency, we propose a derivative-free optimization method
called Cross-Task Prompt Tuning (CTPT) for few-shot conversational emotion
recognition. Unlike existing methods that learn independent knowledge from
individual tasks, CTPT leverages sharable cross-task knowledge by exploiting
external knowledge from other source tasks to improve learning performance
under the few-shot setting. Moreover, CTPT only needs to optimize a vector
under the low intrinsic dimensionality without gradient, which is highly
parameter-efficient compared with existing approaches. Experiments on five
different contextual conversation datasets demonstrate that our CTPT method
achieves superior results in both few-shot scenarios and zero-shot transfer.
Comment: Findings of EMNLP 202
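The derivative-free, low-dimensional search that CTPT relies on can be sketched with a simple accept-if-better random perturbation loop. This is a generic stand-in, not the authors' optimizer: the function name, the hill-climbing strategy, and the toy surrogate loss are all illustrative assumptions.

```python
# Illustrative sketch: optimize a low-dimensional vector z (which would
# parameterize a soft prompt) without gradients, by keeping random
# perturbations only when they reduce the loss.
import random

def derivative_free_optimize(loss_fn, dim, steps=200, sigma=0.1, seed=0):
    rng = random.Random(seed)
    z = [0.0] * dim
    best = loss_fn(z)
    for _ in range(steps):
        cand = [zi + rng.gauss(0.0, sigma) for zi in z]
        cand_loss = loss_fn(cand)
        if cand_loss < best:          # accept only improving perturbations
            z, best = cand, cand_loss
    return z, best

# Toy surrogate standing in for a few-shot ERC validation loss.
target = [0.3, -0.2, 0.5]
loss = lambda z: sum((zi - ti) ** 2 for zi, ti in zip(z, target))
z_opt, final = derivative_free_optimize(loss, dim=3)
```

Because only the low-dimensional vector is searched and no gradients flow through the PLM, this style of tuning is both parameter-efficient and compatible with black-box models.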
UniSA: Unified Generative Framework for Sentiment Analysis
Sentiment analysis is a crucial task that aims to understand people's
emotional states and predict emotional categories based on multimodal
information. It consists of several subtasks, such as emotion recognition in
conversation (ERC), aspect-based sentiment analysis (ABSA), and multimodal
sentiment analysis (MSA). However, unifying all subtasks in sentiment analysis
presents numerous challenges, including modality alignment, unified
input/output forms, and dataset bias. To address these challenges, we propose a
Task-Specific Prompt method to jointly model subtasks and introduce a
multimodal generative framework called UniSA. Additionally, we organize the
benchmark datasets of main subtasks into a new Sentiment Analysis Evaluation
benchmark, SAEval. We design novel pre-training tasks and training methods to
enable the model to learn generic sentiment knowledge among subtasks to improve
the model's multimodal sentiment perception ability. Our experimental results
show that UniSA performs comparably to the state-of-the-art on all subtasks and
generalizes well to various subtasks in sentiment analysis.
Comment: Accepted to ACM MM 202
UniMSE: Towards Unified Multimodal Sentiment Analysis and Emotion Recognition
Multimodal sentiment analysis (MSA) and emotion recognition in conversation
(ERC) are key research topics for computers to understand human behaviors. From
a psychological perspective, emotions are the expression of affect or feelings
during a short period, while sentiments are formed and held for a longer
period. However, most existing works study sentiment and emotion separately and
do not fully exploit the complementary knowledge behind the two. In this paper,
we propose a multimodal sentiment knowledge-sharing framework (UniMSE) that
unifies MSA and ERC tasks from features, labels, and models. We perform
modality fusion at the syntactic and semantic levels and introduce contrastive
learning between modalities and samples to better capture the difference and
consistency between sentiments and emotions. Experiments on four public
benchmark datasets, MOSI, MOSEI, MELD, and IEMOCAP, demonstrate the
effectiveness of the proposed method, which achieves consistent improvements
over state-of-the-art methods.
Comment: Accepted to the EMNLP 2022 main conference
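The inter-modality contrastive objective mentioned above can be sketched with an InfoNCE-style loss that pulls paired text/audio representations together and pushes mismatched pairs apart. This is a hedged illustration of the general technique, not the exact UniMSE loss; the modality names, temperature, and example vectors are our assumptions.

```python
# Illustrative InfoNCE-style contrastive loss between two modalities: each
# text vector should be most similar to its paired audio vector among all
# audio vectors in the batch.
import math

def info_nce(text_vecs, audio_vecs, temperature=0.1):
    """Average cross-entropy of each text vector against its paired audio."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    loss = 0.0
    for i, t in enumerate(text_vecs):
        logits = [dot(t, a) / temperature for a in audio_vecs]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += -(logits[i] - log_z)   # true pair sits at index i
    return loss / len(text_vecs)

aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Correctly paired modalities yield a much smaller loss than shuffled pairs, which is the pressure that aligns sentiment and emotion representations across modalities and samples.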