
    "With 1 follower I must be AWESOME :P". Exploring the role of irony markers in irony recognition

    Conversations in social media often contain the use of irony or sarcasm, when the users say the opposite of what they really mean. Irony markers are the meta-communicative clues that inform the reader that an utterance is ironic. We propose a thorough analysis of theoretically grounded irony markers in two social media platforms: Twitter and Reddit. Classification and frequency analysis show that for Twitter, typographic markers such as emoticons and emojis are the most discriminative markers to recognize ironic utterances, while for Reddit the morphological markers (e.g., interjections, tag questions) are the most discriminative. Comment: ICWSM 201

    Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold

    Sentiment analysis over Twitter offers organisations and individuals a fast and effective way to monitor the public's feelings towards them and their competitors. To assess the performance of sentiment analysis methods over Twitter, a small set of evaluation datasets has been released in the last few years. In this paper we present an overview of eight publicly available and manually annotated evaluation datasets for Twitter sentiment analysis. Based on this review, we show that a common limitation of most of these datasets, when assessing sentiment analysis at target (entity) level, is the lack of distinct sentiment annotations for the tweets and for the entities contained in them. For example, the tweet "I love iPhone, but I hate iPad" can be annotated with a mixed sentiment label, but the entity iPhone within this tweet should be annotated with a positive sentiment label. Aiming to overcome this limitation, and to complement current evaluation datasets, we present STS-Gold, a new evaluation dataset where tweets and targets (entities) are annotated individually and therefore may present different sentiment labels. This paper also provides a comparative study of the various datasets along several dimensions, including total number of tweets, vocabulary size, and sparsity. We also investigate the pair-wise correlation among these dimensions as well as their correlation with the sentiment classification performance on different datasets.