
    Detecting Sarcasm in Multimodal Social Platforms

    Sarcasm is a peculiar form of sentiment expression, where the surface sentiment differs from the implied sentiment. The detection of sarcasm in social media platforms has been applied in the past mainly to textual utterances where lexical indicators (such as interjections and intensifiers), linguistic markers, and contextual information (such as user profiles, or past conversations) were used to detect the sarcastic tone. However, modern social media platforms allow users to create multimodal messages where audiovisual content is integrated with the text, making the analysis of a single mode in isolation incomplete. In our work, we first study the relationship between the textual and visual aspects in multimodal posts from three major social media platforms, i.e., Instagram, Tumblr and Twitter, and we run a crowdsourcing task to quantify the extent to which images are perceived as necessary by human annotators. Moreover, we propose two different computational frameworks to detect sarcasm that integrate the textual and visual modalities. The first approach exploits visual semantics trained on an external dataset, and concatenates the semantic features with state-of-the-art textual features. The second method adapts a visual neural network initialized with parameters trained on ImageNet to multimodal sarcastic posts. Results show the positive effect of combining modalities for the detection of sarcasm across platforms and methods. Comment: 10 pages, 3 figures, final version published in the Proceedings of ACM Multimedia 201
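
The feature concatenation described in the first approach can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy vectors, and the per-modality L2 normalisation step are all assumptions made here for illustration.

```python
from typing import List


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = sum(x * x for x in vec) ** 0.5
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def fuse_features(text_feats: List[float], visual_feats: List[float]) -> List[float]:
    """Concatenate the two modalities into one feature vector.

    Normalising each modality first is an assumption of this sketch, so that
    neither block dominates purely because of its scale.
    """
    return l2_normalize(text_feats) + l2_normalize(visual_feats)


# Hypothetical toy features for a single post.
text_feats = [3.0, 4.0]
visual_feats = [0.0, 2.0, 0.0]
fused = fuse_features(text_feats, visual_feats)
print(len(fused))
```

The fused vector would then be passed to any standard classifier; the choice of classifier is left open here.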

    Building a Sentiment Corpus of Tweets in Brazilian Portuguese

    The large amount of data available in social media, forums and websites motivates research in several areas of Natural Language Processing, such as sentiment analysis. The popularity of the area due to its subjective and semantic characteristics motivates research on novel methods and approaches for classification. Hence, there is a high demand for datasets on different domains and different languages. This paper introduces TweetSentBR, a sentiment corpus for Brazilian Portuguese comprising 15,000 manually annotated sentences from the TV show domain. The sentences were labeled in three classes (positive, neutral and negative) by seven annotators, following literature guidelines to ensure annotation reliability. We also ran baseline experiments on polarity classification using three machine learning methods, reaching 80.99% F-Measure and 82.06% accuracy in binary classification, and 59.85% F-Measure and 64.62% accuracy in three-point classification. Comment: Accepted for publication in 11th International Conference on Language Resources and Evaluation (LREC 2018
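
For readers unfamiliar with the two reported metrics, the sketch below computes accuracy and F-Measure for a binary task. The labels are made-up toy data, and the function name is hypothetical. The abstract does not say whether its F-Measure is averaged over classes, so this shows only the single-class (positive) variant.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> Dict[str, float]:
    """Return precision, recall, F-Measure (F1) and accuracy for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Toy gold labels and predictions (1 = positive, 0 = negative); not from the corpus.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```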

    Unlocking the Pragmatics of Emoji: Evaluation of the Integration of Pragmatic Markers for Sarcasm Detection

    Emojis have become an integral element of online communications, serving as a powerful, under-utilised resource for enhancing pragmatic understanding in NLP. Previous works have highlighted their potential for improving more complex tasks, such as the identification of figurative literary devices including sarcasm, due to their role in conveying tone within text. However, the present state of the art neither considers emoji nor adequately addresses sarcastic markers such as sentiment incongruence. This work aims to integrate these concepts to generate more robust solutions for sarcasm detection, leveraging enhanced pragmatic features from both emoji and text tokens. This was achieved by establishing methodologies for sentiment feature extraction from emojis and an in-depth statistical evaluation of the features which characterise sarcastic text on Twitter. The current convention for generating training data, which implements weak-labelling using hashtags or keywords, was evaluated against a human-annotated baseline; postulated validity concerns were confirmed, as statistical evaluation found that the content features deviated significantly from the baseline, highlighting potential validity concerns for many prominent works on the topic to date. Organically labelled sarcastic tweets containing emojis were crowdsourced by means of a survey to ensure valid outcomes for the sarcasm detection model. Given the established importance of both semantic and sentiment information, a novel sentiment-aware attention mechanism was constructed to enhance pattern recognition, balancing core features of sarcastic text: sentiment incongruence and context. This work establishes a framework for emoji feature extraction, a key roadblock cited in the literature for their use in NLP tasks. The proposed sarcasm detection pipeline successfully facilitates the task using a GRU neural network with sentiment-aware attention, at an accuracy of 73%, with promising indications regarding model robustness as part of a framework which is easily scalable to include any future emojis released. Both enhanced sentiment information to supplement context and consideration of the emoji were found to improve outcomes for the task.
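
The notion of sentiment incongruence between text and emoji can be made concrete with a toy feature. This is not the paper's sentiment-aware attention mechanism or its emoji feature extraction: the two lexicons, their polarity values, and the scoring rule are all hypothetical.

```python
from typing import Dict, List

# Hypothetical polarity lexicons in [-1, 1]; real systems would use learned or curated scores.
TEXT_LEXICON: Dict[str, float] = {"love": 1.0, "great": 1.0, "hate": -1.0, "awful": -1.0}
EMOJI_LEXICON: Dict[str, float] = {
    "\U0001F644": -0.5,  # face with rolling eyes
    "\U0001F60D": 1.0,   # smiling face with heart-eyes
    "\U0001F621": -1.0,  # pouting face
}


def mean_polarity(items: List[str], lexicon: Dict[str, float]) -> float:
    """Average polarity of the items found in the lexicon; 0.0 if none are found."""
    scores = [lexicon[item] for item in items if item in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


def incongruence(tokens: List[str], emojis: List[str]) -> float:
    """Gap between text and emoji polarity, counted only when their signs disagree."""
    text_score = mean_polarity(tokens, TEXT_LEXICON)
    emoji_score = mean_polarity(emojis, EMOJI_LEXICON)
    return abs(text_score - emoji_score) if text_score * emoji_score < 0 else 0.0


# Positive words paired with an eye-roll emoji: signs disagree, so the score is non-zero.
sarcastic = incongruence(["i", "love", "mondays"], ["\U0001F644"])
# Positive words paired with a positive emoji: no incongruence.
sincere = incongruence(["great", "day"], ["\U0001F60D"])
print(sarcastic, sincere)
```

A score like this could serve as one input feature alongside contextual representations; how the paper actually balances the two is not specified in the abstract.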

    SentimentGPT: Exploiting GPT for Advanced Sentiment Analysis and its Departure from Current Machine Learning

    This study presents a thorough examination of various Generative Pretrained Transformer (GPT) methodologies in sentiment analysis, specifically in the context of Task 4 on the SemEval 2017 dataset. Three primary strategies are employed: 1) prompt engineering using the advanced GPT-3.5 Turbo, 2) fine-tuning GPT models, and 3) an inventive approach to embedding classification. The research yields detailed comparative insights among these strategies and individual GPT models, revealing their unique strengths and potential limitations. Additionally, the study compares these GPT-based methodologies with other current, high-performing models previously used with the same dataset. The results show that the GPT approaches achieve substantially better predictive performance, improving F1-score by more than 22% over the state of the art. Further, the paper sheds light on common challenges in sentiment analysis tasks, such as understanding context and detecting sarcasm, and underscores the enhanced capabilities of the GPT models to handle these complexities. Taken together, these findings highlight the promising potential of GPT models in sentiment analysis, setting the stage for future research in this field. The code can be found at https://github.com/DSAatUSU/SentimentGP

    Building Towards Automated Cyberbullying Detection: A Comparative Analysis

    The increased use of social media between digitally anonymous users, sharing their thoughts and opinions, can facilitate participation and collaboration. However, the same anonymity that gives users freedom of speech and allows them to act without being judged by others can also encourage cyberbullying and hate speech. Predators can hide their identity and reach a wide audience anytime and anywhere. Given the detrimental effects of cyberbullying, there is a growing need for cyberbullying detection approaches. In this survey paper, automated cyberbullying detection techniques are compared from different perspectives, including data annotation, data pre-processing and feature engineering. In addition, the importance of emojis in expressing emotions, as well as their influence on sentiment classification and text comprehension, leads us to discuss the role of incorporating emojis in cyberbullying detection and their influence on detection performance. Furthermore, the different domains for using Self-Supervised Learning (SSL) as an annotation technique for cyberbullying detection are explored.