Large-Scale Goodness Polarity Lexicons for Community Question Answering
We transfer a key idea from the field of sentiment analysis to a new domain:
community question answering (cQA). The cQA task we are interested in is the
following: given a question and a thread of comments, we want to re-rank the
comments so that the ones that are good answers to the question would be ranked
higher than the bad ones. We notice that good vs. bad comments use specific
vocabulary and that one can often predict the goodness/badness of a comment
even ignoring the question, based on the comment contents only. This leads us
to the idea of building a good/bad polarity lexicon as an analogy to the
positive/negative sentiment polarity lexicons, commonly used in sentiment
analysis. In particular, we use pointwise mutual information in order to build
large-scale goodness polarity lexicons in a semi-supervised manner starting
with a small number of initial seeds. The evaluation results show an
improvement of 0.7 MAP points absolute over a very strong baseline and
state-of-the-art performance on SemEval-2016 Task 3.
Comment: SIGIR '17, August 07-11, 2017, Shinjuku, Tokyo, Japan; Community
Question Answering; Goodness polarity lexicons; Sentiment Analysis
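The pointwise mutual information scoring mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes comments are already tokenized and labeled "good" or "bad", and it omits the semi-supervised bootstrapping from seed words. The function name `goodness_scores` and the add-one `smoothing` parameter are hypothetical choices made for illustration. A word's score is PMI(w, good) - PMI(w, bad), which simplifies to log2(P(w | good) / P(w | bad)).

```python
import math
from collections import Counter


def goodness_scores(comments, smoothing=1.0):
    """Score each word by PMI(w, good) - PMI(w, bad).

    comments: iterable of (tokens, label) pairs, label in {"good", "bad"}.
    Assumption: labels are given; seed-based bootstrapping is not shown.
    """
    word_label = Counter()    # counts of (word, label) occurrences
    label_totals = Counter()  # total token count per label
    vocab = set()
    for tokens, label in comments:
        for tok in tokens:
            word_label[(tok, label)] += 1
            label_totals[label] += 1
            vocab.add(tok)

    scores = {}
    for w in vocab:
        # Smoothed estimates of P(w | good) and P(w | bad).
        p_good = (word_label[(w, "good")] + smoothing) / (
            label_totals["good"] + smoothing * len(vocab)
        )
        p_bad = (word_label[(w, "bad")] + smoothing) / (
            label_totals["bad"] + smoothing * len(vocab)
        )
        # PMI(w, good) - PMI(w, bad) = log2(P(w|good) / P(w|bad)).
        scores[w] = math.log2(p_good / p_bad)
    return scores


# Toy data, invented purely for illustration.
toy_comments = [
    (["thanks", "try", "this"], "good"),
    (["try", "restart"], "good"),
    (["lol", "thanks"], "bad"),
    (["lol", "spam"], "bad"),
]
toy_scores = goodness_scores(toy_comments)
```

Words with a positive score lean toward good comments and words with a negative score lean toward bad ones; in this toy data, "try" scores positive and "lol" scores negative.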
TwiSE at SemEval-2016 Task 4: Twitter Sentiment Classification
This paper describes the participation of the team "TwiSE" in the SemEval
2016 challenge. Specifically, we participated in Task 4, namely "Sentiment
Analysis in Twitter" for which we implemented sentiment classification systems
for subtasks A, B, C and D. Our approach consists of two steps. In the first
step, we generate and validate diverse feature sets for Twitter sentiment
evaluation, inspired by the work of participants of previous editions of such
challenges. In the second step, we focus on the optimization of the evaluation
measures of the different subtasks. To this end, we examine different learning
strategies by validating them on the data provided by the task organisers. For
our final submissions we used an ensemble learning approach (stacked
generalization) for Subtask A and single linear models for the rest of the
subtasks. In the official leaderboard we were ranked 9/35, 8/19, 1/11 and 2/14
for subtasks A, B, C and D respectively. We make the code available
for research purposes at
https://github.com/balikasg/SemEval2016-Twitter_Sentiment_Evaluation.
UG18 at SemEval-2018 Task 1: Generating Additional Training Data for Predicting Emotion Intensity in Spanish
The present study describes our submission to SemEval 2018 Task 1: Affect in
Tweets. Our Spanish-only approach aimed to demonstrate that it is beneficial to
automatically generate additional training data by (i) translating training
data from other languages and (ii) applying a semi-supervised learning method.
We find strong support for both approaches, with those models outperforming our
regular models in all subtasks. However, creating a stepwise ensemble of
different models as opposed to simply averaging did not result in an increase
in performance. We placed second (EI-Reg), second (EI-Oc), fourth (V-Reg) and
fifth (V-Oc) in the four Spanish subtasks we participated in.
Comment: Accepted at SemEval 2018
Semantic Sentiment Analysis of Twitter Data
The Internet and the proliferation of smart mobile devices have changed the way
information is created, shared, and spread; for example, microblogs such as Twitter,
weblogs such as LiveJournal, social networks such as Facebook, and instant
messengers such as Skype and WhatsApp are now commonly used to share thoughts
and opinions about anything in the surrounding world. This has resulted in the
proliferation of social media content, thus creating new opportunities to study
public opinion at a scale that was never possible before. Naturally, this
abundance of data has quickly attracted business and research interest from
various fields including marketing, political science, and social studies,
among many others, which are interested in questions like these: Do people like
the new Apple Watch? Do Americans support ObamaCare? How do the Scottish feel about
Brexit? Answering these questions requires studying the sentiment of
opinions people express in social media, which has given rise to the fast
growth of the field of sentiment analysis in social media, with Twitter being
especially popular for research due to its scale, representativeness, variety
of topics discussed, as well as ease of public access to its messages. Here we
present an overview of work on sentiment analysis on Twitter.
Comment: Microblog sentiment analysis; Twitter opinion mining; In the
Encyclopedia on Social Network Analysis and Mining (ESNAM), Second edition.
201
Opinion Mining on Non-English Short Text
As the type and the number of such venues increase, automated analysis of
sentiment on textual resources has become an essential data mining task. In
this paper, we investigate the problem of mining opinions on the collection of
informal short texts. Both positive and negative sentiment strength of texts
are detected. We focus on a non-English language that has few resources for
text mining. This approach would help enhance the sentiment analysis in
languages where a list of opinionated words does not exist. We propose a new
method that projects the text into dense, low-dimensional feature vectors
according to the sentiment strength of the words. We detect the mixture of
positive and negative sentiments on a multi-variant scale. Empirical evaluation
of the proposed framework on Turkish tweets shows that our approach gets good
results for opinion mining.
- …