Extending the EmotiNet Knowledge Base to Improve the Automatic Detection of Implicitly Expressed Emotions from Text
Sentiment analysis is one of the recent, highly dynamic fields in Natural
Language Processing. Most existing approaches are based on word-level
analysis of texts and are mostly able to detect only explicit expressions of
sentiment. However, in many cases, emotions are not expressed by using
words with an affective meaning (e.g. happy), but by describing real-life
situations, which readers (based on their commonsense knowledge) detect
as being related to a specific emotion. Given the challenges of detecting
emotions from contexts in which no lexical clue is present, in this article we
present a comparative analysis between the performance of well-established
methods for emotion detection (supervised and lexical knowledge-based) and
a method we propose and extend, which is based on commonsense knowledge
stored in the EmotiNet knowledge base. Our extensive evaluations show
that, in the context of this task, the approach based on EmotiNet is the
most appropriate.
JRC.G.2 - Global security and crisis management
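The knowledge-based approach can be illustrated with a toy sketch: real-life situations, once reduced to action chains, are matched against a commonsense knowledge base that maps them to emotions. The triples and labels below are hypothetical placeholders, not EmotiNet's actual contents, and the parsing step that extracts such chains from raw text is assumed to have already happened.

```python
# Toy stand-in for a commonsense knowledge base in the spirit of
# EmotiNet: action chains (subject, action, object) mapped to the
# emotion a reader would infer. All triples here are illustrative.
KB = {
    ("i", "pass", "exam"): "joy",
    ("i", "lose", "wallet"): "sadness",
    ("dog", "bite", "me"): "fear",
}

def detect_implicit_emotion(triple, kb=KB, default="unknown"):
    """Look up an extracted (subject, action, object) chain.

    A real system would first parse free text into such triples and
    generalise lemmas via an ontology; this sketch assumes that step
    is already done and falls back to `default` on a miss.
    """
    return kb.get(triple, default)
```

The point of the sketch is the lookup itself: no affective word appears in the input, yet an emotion label is still recovered from situational knowledge.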
Exploiting Pseudo Future Contexts for Emotion Recognition in Conversations
With the extensive accumulation of conversational data on the Internet,
emotion recognition in conversations (ERC) has received increasing attention.
Previous efforts on this task mainly focus on leveraging contextual and
speaker-specific features, or integrating heterogeneous external commonsense
knowledge. Among them, some heavily rely on future contexts, which, however,
are not always available in real-life scenarios. This fact inspires us to
generate pseudo future contexts to improve ERC. Specifically, for an utterance,
we generate its future context with pre-trained language models, potentially
containing extra beneficial knowledge in a conversational form homogeneous with
the historical ones. These characteristics make pseudo future contexts easily
fused with historical contexts and historical speaker-specific contexts,
yielding a conceptually simple framework systematically integrating
multi-contexts. Experimental results on four ERC datasets demonstrate our
method's superiority. Further in-depth analyses reveal that pseudo future
contexts can rival real ones to some extent, especially in relatively
context-independent conversations.
Comment: 15 pages, accepted by ADMA 202
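The generation-and-fusion step described above can be sketched as follows. Here `generator` stands in for any pre-trained language model wrapper that continues a text prompt (the stub used below is purely illustrative), and plain list concatenation is a placeholder for the paper's actual fusion mechanism.

```python
def generate_pseudo_future(history, generator, k=1):
    """Produce k pseudo future utterances by letting a language
    model continue the conversation so far."""
    prompt = " ".join(history)
    return [generator(prompt) for _ in range(k)]

def build_multi_context(history, speaker_history, generator, k=1):
    """Fuse historical, speaker-specific, and pseudo future contexts
    into one sequence for the ERC model (naive concatenation here)."""
    pseudo_future = generate_pseudo_future(history, generator, k)
    return history + speaker_history + pseudo_future

# Stub generator: a real system would call a pre-trained LM instead.
stub_lm = lambda prompt: "I see, that sounds stressful."
contexts = build_multi_context(
    ["A: I failed my exam.", "B: Oh no, what happened?"],
    ["A: I failed my exam."],
    stub_lm,
)
```

Because the generated continuation is itself conversational text, it is homogeneous with the historical utterances and can be fed through the same encoder.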
IEST: WASSA-2018 Implicit Emotions Shared Task
Past shared tasks on emotions use data with both overt expressions of
emotions (I am so happy to see you!) as well as subtle expressions where the
emotions have to be inferred, for instance from event descriptions. Further,
most datasets do not focus on the cause or the stimulus of the emotion. Here,
for the first time, we propose a shared task where systems have to predict the
emotions in a large automatically labeled dataset of tweets without access to
words denoting emotions. Based on this intention, we call this the Implicit
Emotion Shared Task (IEST) because the systems have to infer the emotion mostly
from the context. Every tweet has an occurrence of an explicit emotion word
that is masked. The tweets are collected in a manner such that they are likely
to include a description of the cause of the emotion - the stimulus.
Altogether, 30 teams submitted results, with macro F1 scores ranging from 21%
to 71%. A MaxEnt baseline using bag-of-words and bigram features obtains an F1
score of 60%; this baseline was available to the participants during the
development phase. A study with human annotators suggests that automatic
methods outperform human predictions, possibly by homing in on subtle textual
clues not used by humans.
Corpora, resources, and results are available at the shared task website at
http://implicitemotions.wassa2018.com.
Comment: Accepted at Proceedings of the 9th Workshop on Computational
Approaches to Subjectivity, Sentiment and Social Media Analysis
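The masking step at the heart of the task setup can be sketched as below; the emotion word list and the placeholder token are illustrative choices, not the exact resources used to build the IEST data.

```python
import re

# Illustrative seed list; the real dataset was built from a larger
# inventory of explicit emotion words.
EMOTION_WORDS = {"happy", "sad", "angry", "afraid", "surprised", "disgusted"}

def mask_emotion_word(tweet, placeholder="[#TRIGGERWORD#]"):
    """Replace the first explicit emotion word with a placeholder,
    so systems must infer the emotion from the remaining context."""
    pattern = re.compile(r"\b(" + "|".join(EMOTION_WORDS) + r")\b", re.I)
    return pattern.sub(placeholder, tweet, count=1)
```

After masking, a classifier sees only the surrounding description, e.g. the cause of the emotion, which is exactly the implicit-emotion setting the task targets.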
Text-image synergy for multimodal retrieval and annotation
Text and images are the two most common data modalities found on the Internet. Understanding the synergy between text and images, that is, seamlessly analyzing information from these modalities, may be trivial for humans but is challenging for software systems. In this dissertation we study problems where deciphering text-image synergy is crucial for finding solutions. We propose methods and ideas that establish semantic connections between text and images in multimodal contents, and empirically show their effectiveness in four interconnected problems: Image Retrieval, Image Tag Refinement, Image-Text Alignment, and Image Captioning. Our promising results and observations open up interesting avenues for future research involving text-image data understanding.
The dissertation presents and empirically evaluates methods that establish semantic connections between text and image in multimodal data, across four interconnected problems:
• Image Retrieval. Whether images are found for text-based queries depends strongly on whether the text near an image matches the query. Images without textual context, or with thematically fitting context that lacks direct keyword matches to the query, often cannot be found. As a remedy, we propose combining three kinds of information: visual information (in the form of automatically generated image descriptions), textual information (keywords from previous search queries), and commonsense knowledge.
• Image Tag Refinement. Object recognition by computer vision frequently yields false detections and incoherences, yet correctly identifying image content is an important prerequisite for retrieving images via textual queries. To reduce the error-proneness of object recognition, we propose incorporating commonsense knowledge: additional image annotations that common sense deems thematically fitting can prevent many erroneous and incoherent detections.
• Image-Text Alignment. On web pages with text and image content (such as news sites, blog posts, and social media articles), images are usually placed at semantically meaningful positions in the text flow. We exploit this to propose a framework that selects relevant images and associates them with the appropriate passages of a text.
• Image Captioning. Images that accompany text in multimodal content to improve readability typically carry captions that fit the context of the surrounding text. Whereas captions are usually generated from the image alone, we propose to also take this context into account and introduce context-aware image caption generation.
Our promising observations and results open up interesting opportunities for further research on the computational understanding of text-image interplay.
Disentangled Variational Autoencoder for Emotion Recognition in Conversations
In Emotion Recognition in Conversations (ERC), the emotions of target
utterances are closely dependent on their context. Therefore, existing works
train the model to generate the response of the target utterance, which aims to
recognise emotions leveraging contextual information. However, adjacent
response generation ignores long-range dependencies and provides limited
affective information in many cases. In addition, most ERC models learn a
unified distributed representation for each utterance, which lacks
interpretability and robustness. To address these issues, we propose a
VAD-disentangled Variational AutoEncoder (VAD-VAE), which first introduces a
target utterance reconstruction task based on Variational Autoencoder, then
disentangles three affect representations Valence-Arousal-Dominance (VAD) from
the latent space. We also enhance the disentangled representations by
introducing VAD supervision signals from a sentiment lexicon and minimising the
mutual information between VAD distributions. Experiments show that VAD-VAE
outperforms the state-of-the-art model on two datasets. Further analysis proves
the effectiveness of each proposed module and the quality of disentangled VAD
representations. The code is available at
https://github.com/SteveKGYang/VAD-VAE.
Comment: Accepted by IEEE Transactions on Affective Computing
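The disentangling idea can be illustrated with a minimal sketch: the latent vector is partitioned into valence, arousal, and dominance sub-spaces plus a residual part, and each VAD part is supervised against lexicon-derived targets. The dimensions and the plain squared-error loss are simplified placeholders; the actual VAD-VAE also uses reconstruction and mutual-information terms.

```python
def split_latent(z, vad_dim=1):
    """Partition latent z into (valence, arousal, dominance, residual).

    vad_dim is the size of each affect sub-space; everything beyond
    the three VAD slices is left as an unconstrained residual.
    """
    v = z[:vad_dim]
    a = z[vad_dim:2 * vad_dim]
    d = z[2 * vad_dim:3 * vad_dim]
    rest = z[3 * vad_dim:]
    return v, a, d, rest

def vad_supervision_loss(pred_vad, lexicon_vad):
    """Mean squared error against lexicon-derived VAD targets,
    standing in for the paper's supervision signal."""
    return sum((p - t) ** 2 for p, t in zip(pred_vad, lexicon_vad)) / len(pred_vad)
```

Keeping the three affect dimensions in dedicated slices is what makes the representation inspectable: each slice can be read off and evaluated on its own.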
Through the Lens of Core Competency: Survey on Evaluation of Large Language Models
From pre-trained language model (PLM) to large language model (LLM), the
field of natural language processing (NLP) has witnessed steep performance
gains and wide practical uses. The evaluation of a research field guides its
direction of improvement. However, LLMs are extremely hard to thoroughly
evaluate for two reasons. First, traditional NLP tasks have become inadequate
given the excellent performance of LLMs. Second, existing evaluation tasks
are difficult to keep up with the wide range of applications in real-world
scenarios. To tackle these problems, existing works proposed various benchmarks
to better evaluate LLMs. To clarify the numerous evaluation tasks in both
academia and industry, we investigate multiple papers concerning LLM
evaluations. We summarize four core competencies of LLMs: reasoning,
knowledge, reliability, and safety. For every competency, we introduce its
definition, corresponding benchmarks, and metrics. Under this competency
architecture, similar tasks are combined to reflect corresponding ability,
while new tasks can also be easily added into the system. Finally, we give our
suggestions on the future direction of LLM evaluation.
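The competency architecture described above lends itself to a simple registry sketch: each benchmark attaches to one of the four competencies, so related tasks group together and new tasks slot in without restructuring. The benchmark entry shown is an illustrative example, not the survey's full mapping.

```python
# The four core competencies named in the survey.
COMPETENCIES = {"reasoning", "knowledge", "reliability", "safety"}

# competency -> list of {"name": ..., "metric": ...} benchmark records
registry = {c: [] for c in COMPETENCIES}

def register_benchmark(competency, name, metric):
    """Attach a benchmark (with its metric) to a competency;
    reject anything outside the fixed competency set."""
    if competency not in COMPETENCIES:
        raise ValueError(f"unknown competency: {competency}")
    registry[competency].append({"name": name, "metric": metric})

# Example entry (hypothetical placement for illustration).
register_benchmark("reasoning", "GSM8K", "accuracy")
```

The fixed top level plus open-ended per-competency lists mirrors the survey's claim that similar tasks combine under one ability while new tasks can be added easily.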
A Survey on Semantic Processing Techniques
Semantic processing is a fundamental research domain in computational
linguistics. In the era of powerful pre-trained language models and large
language models, the advancement of research in this domain appears to be
decelerating. However, the study of semantics is multi-dimensional in
linguistics. The research depth and breadth of computational semantic
processing can be largely improved with new technologies. In this survey, we
analyze five semantic processing tasks, namely word sense disambiguation,
anaphora resolution, named entity recognition, concept extraction, and
subjectivity detection. We study relevant theoretical research in these fields,
advanced methods, and downstream applications. We connect the surveyed tasks
with downstream applications because this may inspire future scholars to fuse
these low-level semantic processing tasks with high-level natural language
processing tasks. The review of theoretical research may also inspire new tasks
and technologies in the semantic processing domain. Finally, we compare the
different semantic processing techniques and summarize their technical trends,
application trends, and future directions.
Comment: Published at Information Fusion, Volume 101, 2024, 101988, ISSN
1566-2535. The equal-contribution mark is missing in the published version due
to the publication policies. Please contact Prof. Erik Cambria for details.