210 research outputs found
Detecting Sarcasm in Multimodal Social Platforms
Sarcasm is a peculiar form of sentiment expression, where the surface
sentiment differs from the implied sentiment. The detection of sarcasm in
social media platforms has been applied in the past mainly to textual
utterances where lexical indicators (such as interjections and intensifiers),
linguistic markers, and contextual information (such as user profiles, or past
conversations) were used to detect the sarcastic tone. However, modern social
media platforms allow users to create multimodal messages where audiovisual content
is integrated with the text, making the analysis of any single mode in isolation
incomplete. In our work, we first study the relationship between the textual and
visual aspects in multimodal posts from three major social media platforms,
i.e., Instagram, Tumblr and Twitter, and we run a crowdsourcing task to
quantify the extent to which images are perceived as necessary by human
annotators. Moreover, we propose two different computational frameworks to
detect sarcasm that integrate the textual and visual modalities. The first
approach exploits visual semantics trained on an external dataset, and
concatenates the semantics features with state-of-the-art textual features. The
second method adapts a visual neural network initialized with parameters
trained on ImageNet to multimodal sarcastic posts. Results show the positive
effect of combining modalities for the detection of sarcasm across platforms
and methods.
Comment: 10 pages, 3 figures, final version published in the Proceedings of
ACM Multimedia 201
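The first approach in the abstract above concatenates visual semantic features with textual features. A minimal sketch of that kind of feature-level fusion follows; the function names, the toy vectors, and the per-modality L2 normalisation are assumptions made for illustration and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_features(text_features, visual_features):
    """Concatenate the two modality vectors after normalising each one.

    Normalising per modality (an assumption here) keeps one modality from
    dominating the fused vector only because its raw values are larger.
    """
    return l2_normalize(text_features) + l2_normalize(visual_features)


# Toy feature vectors standing in for real text and image descriptors.
text_features = [3.0, 4.0]
visual_features = [0.0, 2.0, 0.0]

fused = fuse_features(text_features, visual_features)
print(len(fused), fused)
```

The fused vector would then be passed to any standard classifier; which classifier the authors used is not stated in the abstract.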
Building a Sentiment Corpus of Tweets in Brazilian Portuguese
The large amount of data available in social media, forums and websites
motivates research in several areas of Natural Language Processing, such as
sentiment analysis. The popularity of the area due to its subjective and
semantic characteristics motivates research on novel methods and approaches for
classification. Hence, there is a high demand for datasets in different domains
and different languages. This paper introduces TweetSentBR, a sentiment corpus
for Brazilian Portuguese with 15,000 manually annotated sentences from the TV show
domain. The sentences were labeled in three classes (positive, neutral and
negative) by seven annotators, following guidelines from the literature to ensure
the reliability of the annotation. We also ran baseline experiments on polarity
classification using three machine learning methods, reaching 80.99%
F-Measure and 82.06% accuracy in binary classification, and 59.85% F-Measure
and 64.62% accuracy in three-point classification.
Comment: Accepted for publication in 11th International Conference on Language
Resources and Evaluation (LREC 2018
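The abstract above reports both accuracy and F-Measure. As a rough illustration of how the two can diverge on a three-class task, here is a small sketch on made-up labels; macro-averaging is an assumption, since the abstract does not say which averaging was used.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_label(y_true, y_pred, label):
    """F1 score for one class, treating it as the positive class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Invented labels, not data from TweetSentBR.
y_true = ["pos", "neg", "neu", "pos", "neg", "neu"]
y_pred = ["pos", "neg", "pos", "pos", "neu", "neu"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(round(acc, 4), round(f1, 4))
```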
Unlocking the Pragmatics of Emoji: Evaluation of the Integration of Pragmatic Markers for Sarcasm Detection
Emojis have become an integral element of online communication, serving as a powerful, under-utilised resource for enhancing pragmatic understanding in NLP. Previous works have highlighted their potential for improving more complex tasks, such as the identification of figurative literary devices including sarcasm, owing to their role in conveying tone within text. However, the present state of the art neither considers emoji nor adequately addresses sarcastic markers such as sentiment incongruence. This work aims to integrate these concepts to generate more robust solutions for sarcasm detection, leveraging enhanced pragmatic features from both emoji and text tokens. This was achieved by establishing methodologies for sentiment feature extraction from emojis and by an in-depth statistical evaluation of the features which characterise sarcastic text on Twitter. The current convention for generating training data, which implements weak labelling using hashtags or keywords, was evaluated against a human-annotated baseline; the postulated validity concerns were confirmed, as statistical evaluation found that the content features deviated significantly from the baseline, raising potential validity concerns for many prominent works on the topic to date. Organically labelled sarcastic tweets containing emojis were crowdsourced by means of a survey to ensure valid outcomes for the sarcasm detection model. Given the established importance of both semantic and sentiment information, a novel sentiment-aware attention mechanism was constructed to enhance pattern recognition, balancing the core features of sarcastic text: sentiment incongruence and context. This work establishes a framework for emoji feature extraction, a key roadblock cited in the literature for their use in NLP tasks.
The proposed sarcasm detection pipeline successfully facilitates the task using a GRU neural network with sentiment-aware attention, achieving an accuracy of 73% and showing promising indications of model robustness, as part of a framework which is easily scalable to include any emojis released in the future. Both the enhanced sentiment information used to supplement context and the consideration of emoji were found to improve outcomes for the task.
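Sentiment incongruence, named in the abstract above as a core marker of sarcasm, can be illustrated with a toy feature that compares the polarity of the words with the polarity of the emoji. This is only a sketch: the two lexicons and their scores are hypothetical, and the code does not reproduce the paper's feature extraction or its attention mechanism.

```python
# Hypothetical polarity scores in [-1, 1]; not taken from any published lexicon.
WORD_POLARITY = {"love": 0.8, "great": 0.7, "hate": -0.8, "delayed": -0.6}
EMOJI_POLARITY = {"😍": 0.9, "😊": 0.7, "🙄": -0.6, "😡": -0.9}


def mean_polarity(tokens, lexicon):
    """Average score of the tokens found in the lexicon; 0.0 if none match."""
    scores = [lexicon[t] for t in tokens if t in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


def incongruence_features(tokens):
    """Compare text polarity with emoji polarity for one tokenised post."""
    text_score = mean_polarity(tokens, WORD_POLARITY)
    emoji_score = mean_polarity(tokens, EMOJI_POLARITY)
    return {
        "text": text_score,
        "emoji": emoji_score,
        # Opposite signs suggest the emoji contradicts the literal wording.
        "incongruent": text_score * emoji_score < 0,
        "gap": abs(text_score - emoji_score),
    }


sarcastic = incongruence_features(["i", "love", "mondays", "🙄"])
sincere = incongruence_features(["i", "love", "this", "😍"])
print(sarcastic["incongruent"], sincere["incongruent"])
```

In a full model such a signal would be one input among many; here it only shows why emoji polarity can carry information that the words alone do not.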
SentimentGPT: Exploiting GPT for Advanced Sentiment Analysis and its Departure from Current Machine Learning
This study presents a thorough examination of various Generative Pretrained
Transformer (GPT) methodologies in sentiment analysis, specifically in the
context of Task 4 on the SemEval 2017 dataset. Three primary strategies are
employed: 1) prompt engineering using the advanced GPT-3.5 Turbo, 2)
fine-tuning GPT models, and 3) an inventive approach to embedding
classification. The research yields detailed comparative insights among these
strategies and individual GPT models, revealing their unique strengths and
potential limitations. Additionally, the study compares these GPT-based
methodologies with other current, high-performing models previously used with
the same dataset. The results illustrate the significant superiority of the GPT
approaches in terms of predictive performance, with a gain of more than 22% in F1-score
compared to the state of the art. Further, the paper sheds light on common
challenges in sentiment analysis tasks, such as understanding context and
detecting sarcasm. It underscores the enhanced capabilities of the GPT models
to effectively handle these complexities. Taken together, these findings
highlight the promising potential of GPT models in sentiment analysis, setting
the stage for future research in this field. The code can be found at
https://github.com/DSAatUSU/SentimentGP
Building Towards Automated Cyberbullying Detection: A Comparative Analysis
The increased use of social media between digitally anonymous users, sharing their thoughts and opinions, can facilitate participation and collaboration. However, the same anonymity that gives users freedom of speech and allows them to act without being judged by others can also encourage cyberbullying and hate speech. Predators can hide their identity and reach a wide audience anytime and anywhere. Given the detrimental effects of cyberbullying, there is a growing need for cyberbullying detection approaches. In this survey paper, a comparative analysis of automated cyberbullying detection techniques is presented from different perspectives, including data annotation, data pre-processing and feature engineering. In addition, the importance of emojis in expressing emotions, as well as their influence on sentiment classification and text comprehension, leads us to discuss the role of incorporating emojis in the process of cyberbullying detection and their influence on detection performance. Furthermore, the different domains for using Self-Supervised Learning (SSL) as an annotation technique for cyberbullying detection are explored.