Classifying sentiment in microblogs: is brevity an advantage?
Microblogs, as a new textual domain, offer a unique proposition for sentiment analysis. Their short document length suggests that any sentiment they contain is compact and explicit. However, this short length, coupled with their noisy nature, can pose difficulties for standard machine learning document representations. In this work we examine the hypothesis that it is easier to classify the sentiment in these short-form documents than in longer-form documents. Surprisingly, we find classifying sentiment in microblogs easier than in blogs, and we make a number of observations pertaining to the challenge of supervised learning for sentiment analysis in microblogs.
Semantic Sentiment Analysis of Twitter Data
The Internet and the proliferation of smart mobile devices have changed the way
information is created, shared, and spread: microblogs such as Twitter,
weblogs such as LiveJournal, social networks such as Facebook, and instant
messengers such as Skype and WhatsApp are now commonly used to share thoughts
and opinions about anything in the surrounding world. This has resulted in the
proliferation of social media content, thus creating new opportunities to study
public opinion at a scale that was never possible before. Naturally, this
abundance of data has quickly attracted business and research interest from
various fields including marketing, political science, and social studies,
among many others, which are interested in questions like these: Do people like
the new Apple Watch? Do Americans support ObamaCare? How do the Scottish feel about
Brexit? Answering these questions requires studying the sentiment of
opinions people express in social media, which has given rise to the fast
growth of the field of sentiment analysis in social media, with Twitter being
especially popular for research due to its scale, representativeness, variety
of topics discussed, as well as ease of public access to its messages. Here we
present an overview of work on sentiment analysis on Twitter.
Comment: Microblog sentiment analysis; Twitter opinion mining; In the
Encyclopedia on Social Network Analysis and Mining (ESNAM), Second edition.
201
Using Word Embeddings in Twitter Election Classification
Word embeddings and convolutional neural networks (CNN)
have attracted extensive attention in various classification
tasks for Twitter, e.g. sentiment classification. However,
the effect of the configuration used to train and generate
the word embeddings on the classification performance has
not been studied in the existing literature. In this paper,
using a Twitter election classification task that aims to detect
election-related tweets, we investigate the impact of
the background dataset used to train the embedding models,
the context window size and the dimensionality of word
embeddings on the classification performance. By comparing
the classification results of two word embedding models,
which are trained using different background corpora
(e.g. Wikipedia articles and Twitter microposts), we show
that the background data type should align with the Twitter
classification dataset to achieve a better performance. Moreover,
by evaluating the results of word embedding models
trained using various context window sizes and dimensionalities,
we find that large context windows and large dimension
sizes are preferable for improving performance. Our experimental
results also show that using word embeddings and
CNN leads to statistically significant improvements over various
baselines such as random, SVM with TF-IDF and SVM
with word embeddings.
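To make the "context window size" parameter mentioned in this abstract concrete, the following is a minimal sketch of how a symmetric window determines which word pairs an embedding model sees during training. It is an illustration only, not the authors' implementation: the function name `context_pairs`, the example tweet, and the choice of a symmetric window are all assumptions introduced here.

```python
from typing import List, Tuple


def context_pairs(tokens: List[str], window: int) -> List[Tuple[str, str]]:
    """Return (target, context) pairs for a symmetric context window.

    Hypothetical helper: each token is paired with every other token
    that lies at most `window` positions to its left or right.
    """
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)                 # left edge, clipped at the start
        hi = min(len(tokens), i + window + 1)   # right edge, clipped at the end
        for j in range(lo, hi):
            if j != i:                          # a token is not its own context
                pairs.append((target, tokens[j]))
    return pairs


# Made-up example tweet, tokenised on whitespace.
tweet = "polls open today vote early".split()

small = context_pairs(tweet, window=1)  # only adjacent words are paired
large = context_pairs(tweet, window=3)  # words up to three positions apart are paired
```

With a window of 1, "polls" is paired only with "open"; with a window of 3, it is also paired with "today" and "vote". A larger window therefore exposes the model to broader co-occurrence context for each word, which is the setting the abstract reports as preferable.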