Explaining (Sarcastic) Utterances to Enhance Affect Understanding in Multimodal Dialogues

Authors: Akhtar Md Shad; Chakraborty Tanmoy; Kumar Shivani; Mondal Ishani
Publication date: 22/11/2022

Conversations emerge as the primary medium for exchanging ideas and conceptions. From the listener's perspective, identifying various affective qualities, such as sarcasm, humour, and emotions, is paramount for comprehending the true connotation of the emitted utterance. However, one of the major hurdles faced in learning these affect dimensions is the presence of figurative language, viz. irony, metaphor, or sarcasm. We hypothesize that any detection system constituting the exhaustive and explicit presentation of the emitted utterance would improve the overall comprehension of the dialogue. To this end, we explore the task of Sarcasm Explanation in Dialogues (SED), which aims to unfold the hidden irony behind sarcastic utterances. We propose MOSES, a deep neural network, which takes a multimodal (sarcastic) dialogue instance as an input and generates a natural language sentence as its explanation. Subsequently, we leverage the generated explanation for various natural language understanding tasks in a conversational dialogue setup, such as sarcasm detection, humour identification, and emotion recognition. Our evaluation shows that MOSES outperforms the state-of-the-art system for SED by an average of ~2% on different evaluation metrics, such as ROUGE, BLEU, and METEOR. Further, we observe that leveraging the generated explanation advances three downstream tasks for affect classification: an average improvement of ~14% F1-score in the sarcasm detection task and ~2% in the humour identification and emotion recognition tasks. We also perform extensive analyses to assess the quality of the results.

Comment: Accepted at AAAI 2023. 11 pages; 14 tables; 3 figures

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Computational Sarcasm Analysis on Social Media: A Systematic Review

Authors: Hasan Kamrul; Kabir Mohsinul; Kader Faria Binte; Mahmud Hasan; Nujat Nafisa Hossain; Sogir Tasmia Binte
Publication date: 20/09/2022

Sarcasm can be defined as saying or writing the opposite of what one truly wants to express, usually to insult, irritate, or amuse someone. Because of the obscure nature of sarcasm in textual data, detecting it is difficult and of great interest to the sentiment analysis research community. Though the research in sarcasm detection spans more than a decade, some significant advancements have been made recently, including employing unsupervised pre-trained transformers in multimodal environments and integrating context to identify sarcasm. In this study, we aim to provide a brief overview of recent advancements and trends in computational sarcasm research for the English language. We describe relevant datasets, methodologies, trends, issues, challenges, and tasks relating to sarcasm that go beyond detection. Our study provides well-summarized tables of sarcasm datasets, sarcastic features and their extraction methods, and performance analysis of various approaches, which can help researchers in related domains understand current state-of-the-art practices in sarcasm detection.

Comment: 50 pages, 3 tables. Submitted to 'Data Mining and Knowledge Discovery' for possible publication

arXiv.org e-Print Archive

Multi-source Semantic Graph-based Multimodal Sarcasm Explanation Generation

Authors: Jia Mengzhao; Jing Liqiang; Nie Liqiang; Ouyang Kun; Song Xuemeng
Publication date: 28/06/2023

Multimodal Sarcasm Explanation (MuSE) is a new yet challenging task, which aims to generate a natural language sentence for a multimodal social post (an image as well as its caption) to explain why it contains sarcasm. Although the existing pioneer study has achieved great success with the BART backbone, it overlooks the gap between the visual feature space and the decoder semantic space, the object-level metadata of the image, as well as the potential external knowledge. To address these limitations, in this work, we propose a novel mulTi-source sEmantic grAph-based Multimodal sarcasm explanation scheme, named TEAM. In particular, TEAM extracts the object-level semantic meta-data instead of the traditional global visual features from the input image. Meanwhile, TEAM resorts to ConceptNet to obtain the external related knowledge concepts for the input text and the extracted object meta-data. Thereafter, TEAM introduces a multi-source semantic graph that comprehensively characterizes the multi-source (i.e., caption, object meta-data, external knowledge) semantic relations to facilitate the sarcasm reasoning. Extensive experiments on a publicly released dataset, MORE, verify the superiority of our model over cutting-edge methods.

Comment: Accepted by ACL 2023 main conference

arXiv.org e-Print Archive

TextMI: Textualize Multimodal Information for Integrating Non-verbal Cues in Pre-trained Language Models

Authors: Hasan Md Kamrul; Hoque Ehsan; Islam Md Saiful; Khan Mohammed Ibrahim; Lee Sangwu; Naim Iftekhar; Rahman Wasifur
Publication date: 29/03/2023

Pre-trained large language models have recently achieved ground-breaking performance in a wide variety of language understanding tasks. However, the same model cannot be applied to multimodal behavior understanding tasks (e.g., video sentiment/humor detection) unless non-verbal features (e.g., acoustic and visual) can be integrated with language. Jointly modeling multiple modalities significantly increases the model complexity and makes the training process data-hungry. While an enormous amount of text data is available via the web, collecting large-scale multimodal behavioral video datasets is extremely expensive, both in terms of time and money. In this paper, we investigate whether large language models alone can successfully incorporate non-verbal information when it is presented in textual form. We present a way to convert the acoustic and visual information into corresponding textual descriptions and concatenate them with the spoken text. We feed this augmented input to a pre-trained BERT model and fine-tune it on three downstream multimodal tasks: sentiment, humor, and sarcasm detection. Our approach, TextMI, significantly reduces model complexity, adds interpretability to the model's decision, and can be applied to a diverse set of tasks while achieving superior (multimodal sarcasm detection) or near-SOTA (multimodal sentiment analysis and multimodal humor detection) performance. We propose TextMI as a general, competitive baseline for multimodal behavioral analysis tasks, particularly in a low-resource setting.
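
The textualization step this abstract describes (turning acoustic and visual cues into short descriptions and concatenating them with the spoken text) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `Segment` fields, the pitch and smile thresholds, the wording of the descriptions, and the literal `" [SEP] "` joiner are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One spoken segment with hypothetical, pre-extracted non-verbal features."""
    transcript: str
    pitch_hz: float     # assumed acoustic feature: mean pitch in Hz
    smile_score: float  # assumed visual feature: smile intensity in [0, 1]


def describe_acoustic(pitch_hz: float) -> str:
    # Illustrative thresholds only; a real system would calibrate them on data.
    if pitch_hz >= 220.0:
        return "The speaker talks in a high-pitched voice."
    if pitch_hz <= 120.0:
        return "The speaker talks in a low-pitched voice."
    return "The speaker talks in a neutral pitch."


def describe_visual(smile_score: float) -> str:
    # Illustrative threshold only.
    if smile_score >= 0.5:
        return "The speaker is smiling."
    return "The speaker is not smiling."


def textualize(segment: Segment, sep: str = " [SEP] ") -> str:
    """Concatenate the spoken text with textual descriptions of non-verbal cues."""
    parts = [
        segment.transcript,
        describe_acoustic(segment.pitch_hz),
        describe_visual(segment.smile_score),
    ]
    return sep.join(parts)


if __name__ == "__main__":
    example = Segment("Oh great, another meeting.", pitch_hz=250.0, smile_score=0.1)
    print(textualize(example))
```

The resulting string is plain text, so under these assumptions it could be passed to any text-only classifier; the choice of separator and description templates would need to match whatever tokenizer and model are actually used.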

arXiv.org e-Print Archive

A Multimodal Approach to Sarcasm Detection on Social Media

Author: Das Dipto
Publication venue: BearWorks
Publication date: 01/08/2019

In recent times, a major share of human communication takes place online, mainly because of the ease of communication on social networking sites (SNSs). Due to the variety and large number of users, SNSs have drawn the attention of the computer science (CS) community, particularly the affective computing (also known as emotional AI), information retrieval, natural language processing, and data mining groups. Researchers are trying to make computers understand the nuances of human communication, including sentiment and sarcasm. Emotion or sentiment detection requires more insight into the communication than factual information retrieval does. Sarcasm detection is more difficult still than categorizing sentiment because, in sarcasm, the intended meaning of the user's expression is opposite to its literal meaning. Because of its complex nature, it is often difficult even for humans to detect sarcasm without proper context. However, people on social media succeed in detecting sarcasm despite interacting with strangers across the world. That motivates us to investigate the human process of detecting sarcasm on social media, where abundant context information is often unavailable and the users communicating with each other are rarely well-acquainted. We have conducted a qualitative study to examine the patterns of users conveying sarcasm on social media. Whereas most sarcasm detection systems work on a word-by-word basis, we focused on the holistic sentiment conveyed by the post. We argue that relying on word-level information limits a system's performance to the domain of the dataset used to train it, and that such a system might not perform well for non-English languages. As an endeavor to make our system less dependent on text data, we proposed a multimodal approach for sarcasm detection. We showed the applicability of images and reaction emoticons as other sources of hints about the sentiment of the post. Our research showed superior results from a multimodal approach when compared to a unimodal approach. Multimodal sarcasm detection systems such as the one presented in this research, with the inclusion of more modes or sources of data, might lead to a better sarcasm detection model.

Missouri State University: BearWorks

Argumentation by figurative language in verbal communication: a pragmatic perspective

Author: Dae-Young Kim
Publication date: 17/05/2013

This thesis has two goals. The first is to explain, within a pragmatic perspective, how figurative language (i.e. metaphor and irony) performs argumentation. Based on the argumentation theory (AT) of Perelman and Olbrechts-Tyteca (1958), argumentation is defined as the process of justifying something in an organized or logical way, which is composed of one or more claims and shows one or more grounds for maintaining them. The second goal is to examine the hearer’s interpretation of figurative utterances in argumentation. The theoretical foundation of this discussion is based on experientialist epistemology (i.e. experientialism) and cognitive pragmatics in the form of Relevance Theory (RT). In pursuit of those goals, I present four main innovations. First, I argue that the status of metaphor should be viewed as ‘what is implicated’, rather than ‘what is said’. Second, I propose an explanation, relying on the notion of ‘incongruity’, of some exceptional cases of irony that the standard RT approach does not treat. Third, I propose the integration of AT concepts within RT. This approach contributes to a more economical explanation of communication as argumentation, by a single principle of relevance, while incorporating argumentative concepts such as doxa, topoi and polyphony. Finally, I apply this integrated approach to analysing real cases of commercial advertisement by metaphor or irony, or both. This includes explaining connection and overlapping, two ways in which metaphor and irony can work together.

Sussex Research Online

Computational sarcasm detection and understanding in online communication

Author: Oprea Silviu Vlad
Publication venue: The University of Edinburgh
Publication date: 25/04/2023

The presence of sarcasm in online communication has motivated an increasing number of computational investigations of sarcasm across the scientific community. In this thesis, we build upon these investigations. Pointing out their limitations, we bring four contributions that span two research directions: sarcasm detection and sarcasm understanding. Sarcasm detection is the task of building computational models optimised for recognising sarcasm in a given text. These models are often built in a supervised learning paradigm, relying on datasets of texts labelled for sarcasm. We bring two contributions in this direction. First, we question the effectiveness of previous methods used to label texts for sarcasm. We argue that the labels they produce might not coincide with the sarcastic intention of the authors of the texts that they are labelling. In response, we suggest a new method, and we use it to build iSarcasm, a novel dataset of sarcastic and non-sarcastic tweets. We show that previous models achieve considerably lower performance on iSarcasm than on previous datasets, while human annotators achieve considerably higher performance than the models, pointing out the need for more effective models. Therefore, as a second contribution, we organise a competition that invites the community to create such models. Sarcasm understanding is the task of explicating the phenomena that are subsumed under the umbrella of sarcasm through computational investigation. We bring two contributions in this direction. First, we conduct an analysis of the socio-demographic ecology of sarcastic exchanges between human interlocutors. We find that the effectiveness of such exchanges is influenced by the socio-demographic similarity between the interlocutors, with factors such as English language nativeness, age, and gender being particularly influential. We suggest that future social analysis tools should account for these factors. Second, we challenge the motivation of a recent endeavour of the community, namely that of augmenting dialogue systems with the ability to generate sarcastic responses. Through a series of social experiments, we provide guidelines for dialogue systems concerning the appropriateness of generating sarcastic responses, and the formulation of such responses. Through our work, we aim to encourage the community to consider computational investigations of sarcasm interdisciplinarily, at the intersection of natural language processing and computational social science.

Edinburgh Research Archive

Automated Moderation: Detecting Irony in a Norwegian Facebook Comment Section using a Longformer Transformer Model with a Context Encoded Dataset

Author: Hatlebakk Torstein
Publication venue: The University of Bergen
Publication date: 01/01/2022

Irony is a complex phenomenon of human communication and, due to its contextual nature, has been notoriously difficult for machine learning algorithms to detect. This thesis establishes a practical definition of irony grounded in the environment of Facebook comment sections. The definition is used together with a Norwegian-language pre-trained BERT model, converted to a long version that supports longer text inputs, and a Norwegian Facebook comment dataset that includes the text of the contextual article and reply comments. It was found that the long BERT model trained on the dataset with context included outperformed the short BERT models trained on datasets of the same number of comments or more, but without the contextual information encoded.

Master's Thesis in Information Science (INFO390, MASV-INF)

University of Bergen

NORA - Norwegian Open Research Archives