Active learning in annotating micro-blogs dealing with e-reputation
Elections unleash strong political views on Twitter, but what do people
really think about politics? Opinion and trend mining on micro blogs dealing
with politics has recently attracted researchers in several fields including
Information Retrieval and Machine Learning (ML). Since the performance of ML
and Natural Language Processing (NLP) approaches is limited by the amount and
quality of data available, one promising alternative for some tasks is the
automatic propagation of expert annotations. This paper intends to develop a
so-called active learning process for automatically annotating French language
tweets that deal with the image (i.e., representation, web reputation) of
politicians. Our main focus is on the methodology followed to build an original
annotated dataset expressing opinions about two French politicians over time. We
therefore review state-of-the-art NLP-based ML algorithms to automatically
annotate tweets using a manual initiation step as bootstrap. This paper focuses
on key issues in active learning when building a large annotated data set
from noisy data. The noise is introduced by human annotators, the abundance of
data, and the label distribution across data and entities. In turn, we show that Twitter
characteristics such as the author's name or hashtags can serve as
anchor points not only to improve automatic systems for Opinion Mining (OM) and
Topic Classification but also to reduce noise in human annotations. However, a
later thorough analysis shows that reducing noise might induce the loss of
crucial information.
Comment: Journal of Interdisciplinary Methodologies and Issues in Science -
Vol 3 - Contextualisation digitale - 201
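The active learning process outlined in this abstract (a manual bootstrap, then repeated requests for expert labels on the most informative tweets) can be illustrated with a minimal uncertainty-sampling sketch. This is not the authors' pipeline: the function names, the binary-polarity setting, and the `train` and `ask_expert` callbacks are all hypothetical placeholders chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical type: a model is any function mapping a tweet to P(positive).
Model = Callable[[str], float]


def uncertainty(prob_positive: float) -> float:
    """Return 1.0 when the model is maximally unsure (p = 0.5), 0.0 when certain."""
    return 1.0 - abs(prob_positive - 0.5) * 2.0


def select_for_annotation(pool: List[str], model: Model, k: int) -> List[str]:
    """Pick the k unlabeled tweets the current model is least sure about."""
    ranked = sorted(pool, key=lambda text: uncertainty(model(text)), reverse=True)
    return ranked[:k]


def active_learning_round(
    labeled: Dict[str, str],
    pool: List[str],
    train: Callable[[Dict[str, str]], Model],
    ask_expert: Callable[[str], str],
    k: int,
) -> Tuple[Dict[str, str], List[str]]:
    """Run one round: train on current labels, query the expert, grow the labeled set."""
    model = train(labeled)                      # assumption: caller supplies the learner
    queries = select_for_annotation(pool, model, k)
    new_labeled = dict(labeled)
    for text in queries:
        new_labeled[text] = ask_expert(text)    # assumption: a human annotator answers here
    remaining = [text for text in pool if text not in queries]
    return new_labeled, remaining
```

Calling `active_learning_round` repeatedly, starting from a small manually annotated seed, grows the labeled set while spending expert effort only on tweets where the model is unsure. Other query strategies (for example, ones that account for label distribution across entities) would replace `select_for_annotation`.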
Universal, Unsupervised (Rule-Based), Uncovered Sentiment Analysis
We present a novel unsupervised approach for multilingual sentiment analysis
driven by compositional syntax-based rules. On the one hand, we exploit some of
the main advantages of unsupervised algorithms: (1) the interpretability of
their output, in contrast with most supervised models, which behave as a black
box and (2) their robustness across different corpora and domains. On the other
hand, by introducing the concept of compositional operations and exploiting
syntactic information in the form of universal dependencies, we tackle one of
their main drawbacks: their rigidity on data that are structured differently
depending on the language concerned. Experiments show an improvement both over
existing unsupervised methods, and over state-of-the-art supervised models when
evaluating outside their corpus of origin. Experiments also show how the same
compositional operations can be shared across languages. The system is
available at http://www.grupolys.org/software/UUUSA/
Comment: 19 pages, 5 Tables, 6 Figures. This is the authors' version of a work
that was accepted for publication in Knowledge-Based System
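To make the idea of compositional operations over a dependency tree concrete, here is a minimal sketch. It is not the UUUSA system and does not reproduce its rules: the toy lexicon, the intensifier weights, the sign-flipping treatment of negation, and the `Node` class are all assumptions made for illustration, with only the use of dependency relations such as `advmod` taken from the Universal Dependencies scheme.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical toy resources; a real system would load full lexicons.
LEXICON = {"good": 2.0, "bad": -2.0, "great": 3.0}
INTENSIFIERS = {"very": 0.25, "slightly": -0.25}
NEGATORS = {"not", "never"}


@dataclass
class Node:
    """One token of a dependency tree: its word, its relation to the head, its dependents."""
    word: str
    deprel: str = "dep"
    children: List["Node"] = field(default_factory=list)


def score(node: Node) -> float:
    """Compose a polarity score bottom-up over the subtree rooted at node."""
    total = LEXICON.get(node.word, 0.0)
    boost = 0.0
    negated = False
    for child in node.children:
        if child.deprel == "advmod" and child.word in INTENSIFIERS:
            boost += INTENSIFIERS[child.word]   # intensification scales the head
        elif child.deprel == "advmod" and child.word in NEGATORS:
            negated = True                      # negation applies to the whole subtree
        else:
            total += score(child)               # other dependents just add their polarity
    total *= 1.0 + boost
    if negated:
        total = -total                          # assumption: simple sign flip
    return total
```

Because the rules are keyed on dependency relations rather than on word order, the same operations could in principle be reused for any language parsed into the same annotation scheme; only the word lists would change.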
Towards Syntactic Iberian Polarity Classification
Lexicon-based methods using syntactic rules for polarity classification rely
on parsers that are dependent on the language and on treebank guidelines. Thus,
rules are also dependent and require adaptation, especially in multilingual
scenarios. We tackle this challenge in the context of the Iberian Peninsula,
releasing the first symbolic syntax-based Iberian system with rules shared
across five official languages: Basque, Catalan, Galician, Portuguese and
Spanish. The model is made available.
Comment: 7 pages, 5 tables. Contribution to the 8th Workshop on Computational
Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA-2017)
at EMNLP 201
Semantic Sentiment Analysis of Twitter Data
The Internet and the proliferation of smart mobile devices have changed the way
information is created, shared, and spread: microblogs such as Twitter,
weblogs such as LiveJournal, social networks such as Facebook, and instant
messengers such as Skype and WhatsApp are now commonly used to share thoughts
and opinions about anything in the surrounding world. This has resulted in the
proliferation of social media content, thus creating new opportunities to study
public opinion at a scale that was never possible before. Naturally, this
abundance of data has quickly attracted business and research interest from
various fields including marketing, political science, and social studies,
among many others, which are interested in questions like these: Do people like
the new Apple Watch? Do Americans support ObamaCare? How do the Scottish feel about
Brexit? Answering these questions requires studying the sentiment of
opinions people express in social media, which has given rise to the fast
growth of the field of sentiment analysis in social media, with Twitter being
especially popular for research due to its scale, representativeness, variety
of topics discussed, as well as ease of public access to its messages. Here we
present an overview of work on sentiment analysis on Twitter.
Comment: Microblog sentiment analysis; Twitter opinion mining; In the
Encyclopedia on Social Network Analysis and Mining (ESNAM), Second edition.
201
SemEval-2016 Task 5: Aspect Based Sentiment Analysis
This paper describes the SemEval 2016 shared task on Aspect Based Sentiment Analysis (ABSA), a continuation of the respective tasks of 2014 and 2015. In its third year, the task provided 19 training and 20 testing datasets for 8 languages and 7 domains, as well as a common evaluation procedure. From these datasets, 25 were for sentence-level and 14 for text-level ABSA; the latter was introduced for the first time as a subtask in SemEval. The task attracted 245 submissions from 29 teams