635 research outputs found

    LT3: sentiment analysis of figurative tweets: piece of cake #NotReally

    This paper describes our contribution to the SemEval-2015 Task 11 on sentiment analysis of figurative language in Twitter. We considered two approaches, classification and regression, to provide fine-grained sentiment scores for a set of tweets that are rich in sarcasm, irony and metaphor. To this end, we combined a variety of standard lexical and syntactic features with specific features for capturing figurative content. All experiments were done using supervised learning with LIBSVM. For both runs, our system ranked fourth among fifteen submissions.

    Satirical News Detection and Analysis using Attention Mechanism and Linguistic Features

    Satirical news is considered to be entertainment, but it is potentially deceptive and harmful. Despite the embedded genre in the article, not everyone can recognize the satirical cues, and some readers therefore believe the news to be true. We observe that satirical cues are often reflected in certain paragraphs rather than the whole document. Existing works only consider document-level features to detect the satire, which could be limited. We consider paragraph-level linguistic features to unveil the satire by incorporating a neural network and an attention mechanism. We investigate the difference between paragraph-level features and document-level features, and analyze them on a large satirical news dataset. The evaluation shows that the proposed model detects satirical news effectively and reveals what features are important at which level. Comment: EMNLP 2017, 11 page

    A Crowd-Annotated Spanish Corpus for Humor Analysis

    Computational Humor involves several tasks, such as humor recognition, humor generation, and humor scoring, for which it is useful to have human-curated data. In this work we present a corpus of 27,000 tweets written in Spanish and crowd-annotated by their humor value and funniness score, with about four annotations per tweet, tagged by 1,300 people over the Internet. It is equally divided between tweets coming from humorous and non-humorous accounts. The inter-annotator agreement, measured by Krippendorff's alpha, is 0.5710. The dataset is available for general use and can serve as a basis for humor detection and as a first step to tackle subjectivity. Comment: Camera-ready version of the paper submitted to SocialNLP 2018, with a fixed typ

    IberLEF 2021 Overview: Natural Language Processing for Iberian Languages

    [EN] IberLEF is a comparative evaluation campaign for Natural Language Processing Systems in Spanish and other Iberian languages. Its goal is to encourage the research community to organize competitive text processing, understanding and generation tasks in order to define new research challenges and set new state-of-the-art results in those languages. This paper summarizes the evaluation activities carried out in IberLEF 2021, which included twelve tasks dealing with emotions, stance and opinions, harmful information, health-related information extraction and discovery, humor and irony, and lexical acquisition. Overall, IberLEF activities were a remarkable collective effort involving 359 researchers from 22 countries in Europe, Asia and the Americas. The authors of this overview have been supported by the Spanish Government, Ministry of Science and Innovation, via research grants MISMIS (PGC2018-096212-B), MISMIS-BIAS (PGC2018-096212-B-C32) and MISMISFAKEnHATE (PGC2018-096212-B-C31); and by CONACyT-Mexico project CB-2015-01-257383 and the thematic networks program (Language Technologies Thematic Network). Gonzalo, J.; Montes-y-Gómez, M.; Rosso, P. (2021). IberLEF 2021 Overview: Natural Language Processing for Iberian Languages. CEUR Workshop. 1-15. http://hdl.handle.net/10251/19056211

    TWITTIRÒ: an Italian Twitter Corpus with a Multi-layered Annotation for Irony

    Given the difficulties that still affect the correct identification of irony within Sentiment Analysis tasks, in this paper we describe the main issues that emerged during the development of a novel resource for Italian annotated for irony. The project mainly consists in the application to the Twitter corpus TWITTIRÒ of a multi-layered scheme for the fine-grained annotation of irony, as proposed in a multilingual setting and previously applied to French and English datasets (Karoui et al. 2017). In applying the annotation to this corpus, we outline and discuss the issues and peculiarities that emerged in using the semantic scheme for Twitter messages in Italian, thus shedding some light on the future directions that can be followed from a multilingual and cross-language perspective. We present, in particular, an analysis of the annotation process and of the distribution of the labels of each layer involved in the scheme. This is supported by a discussion of the outcome of the annotation carried out by native Italian speakers in the development of the corpus, including an in-depth discussion of the inter-annotator agreement and of the sources of disagreement. The result is a novel gold standard corpus for irony detection in Italian, which enriches the set of multilingual datasets available for this challenging task and is ready to be used as a benchmark in automatic irony detection experiments and evaluation campaigns.