2,398 research outputs found
EmoLabel: Semi-Automatic Methodology for Emotion Annotation of Social Media Text
The exponential growth of the amount of subjective information on the Web 2.0. has caused an increasing interest from researchers willing to develop methods to extract emotion data from these new sources. One of the most important challenges in textual emotion detection is the gathering of data with emotion labels because of the subjectivity of assigning these labels. Basing on this rationale, the main objective of our research is to contribute to the resolution of this important challenge. This is tackled by proposing EmoLabel: a semi-automatic methodology based on pre-annotation, which consists of two main phases: (1) an automatic process to pre-annotate the unlabelled English sentences; and (2) a manual process of refinement where human annotators determine which is the dominant emotion. Our objective is to assess the influence of this automatic pre-annotation method on manual emotion annotation from two points of view: agreement and time needed for annotation. The evaluation performed demonstrates the benefits of pre-annotation processes since the results on annotation time show a gain of near 20% when the pre-annotation process is applied (Pre-ML) without reducing annotator performance. Moreover, the benefits of pre-annotation are higher in those contributors whose performance is low (inaccurate annotators).This research has been supported by the Spanish Government (ref. RTI2018-094653-B-C22) and the Valencian Government (grant no. PROMETEU/2018/089). It has also been funded by the FPI grant (BES-2013-065950) and the research stay grant (EEBB-I-17-12578) from the Spanish Ministry of Science and Innovation
Inferring Affective Meanings of Words from Word Embedding
Affective lexicon is one of the most important resource in affective computing for text. Manually constructed affective lexicons have limited scale and thus only have limited use in practical systems. In this work, we propose a regression-based method to automatically infer multi-dimensional affective representation of words via their word embedding based on a set of seed words. This method can make use of the rich semantic meanings obtained from word embedding to extract meanings in some specific semantic space. This is based on the assumption that different features in word embedding contribute differently to a particular affective dimension and a particular feature in word embedding contributes differently to different affective dimensions. Evaluation on various affective lexicons shows that our method outperforms the state-of-the-art methods on all the lexicons under different evaluation metrics with large margins. We also explore different regression models and conclude that the Ridge regression model, the Bayesian Ridge regression model and Support Vector Regression with linear kernel are the most suitable models. Comparing to other state-of-the-art methods, our method also has computation advantage. Experiments on a sentiment analysis task show that the lexicons extended by our method achieve better results than publicly available sentiment lexicons on eight sentiment corpora. The extended lexicons are publicly available for access
Role of sentiment classification in sentiment analysis: a survey
Through a survey of literature, the role of sentiment classification in sentiment analysis has been reviewed. The review identifies the research challenges involved in tackling sentiment classification. A total of 68 articles during 2015 – 2017 have been reviewed on six dimensions viz., sentiment classification, feature extraction, cross-lingual sentiment classification, cross-domain sentiment classification, lexica and corpora creation and multi-label sentiment classification. This study discusses the prominence and effects of sentiment classification in sentiment evaluation and a lot of further research needs to be done for productive results
Basic tasks of sentiment analysis
Subjectivity detection is the task of identifying objective and subjective
sentences. Objective sentences are those which do not exhibit any sentiment.
So, it is desired for a sentiment analysis engine to find and separate the
objective sentences for further analysis, e.g., polarity detection. In
subjective sentences, opinions can often be expressed on one or multiple
topics. Aspect extraction is a subtask of sentiment analysis that consists in
identifying opinion targets in opinionated text, i.e., in detecting the
specific aspects of a product or service the opinion holder is either praising
or complaining about
Semantic Sentiment Analysis of Twitter Data
Internet and the proliferation of smart mobile devices have changed the way
information is created, shared, and spreads, e.g., microblogs such as Twitter,
weblogs such as LiveJournal, social networks such as Facebook, and instant
messengers such as Skype and WhatsApp are now commonly used to share thoughts
and opinions about anything in the surrounding world. This has resulted in the
proliferation of social media content, thus creating new opportunities to study
public opinion at a scale that was never possible before. Naturally, this
abundance of data has quickly attracted business and research interest from
various fields including marketing, political science, and social studies,
among many others, which are interested in questions like these: Do people like
the new Apple Watch? Do Americans support ObamaCare? How do Scottish feel about
the Brexit? Answering these questions requires studying the sentiment of
opinions people express in social media, which has given rise to the fast
growth of the field of sentiment analysis in social media, with Twitter being
especially popular for research due to its scale, representativeness, variety
of topics discussed, as well as ease of public access to its messages. Here we
present an overview of work on sentiment analysis on Twitter.Comment: Microblog sentiment analysis; Twitter opinion mining; In the
Encyclopedia on Social Network Analysis and Mining (ESNAM), Second edition.
201
Sentiment analytics: Lexicons construction and analysis
With the increasing amount of text data, sentiment analysis (SA) is becoming more and more important. An automated approach is needed to parse the online reviews and comments, and analyze their sentiments. Since lexicon is the most important component in SA, enhancing the quality of lexicons will improve the efficiency and accuracy of sentiment analysis. In this research, the effect of coupling a general lexicon with a specialized lexicon (for a specific domain) and its impact on sentiment analysis was presented. Two special domains and one general domain were studied. The two special domains are the petroleum domain and the biology domain. The general domain is the social network domain. The specialized lexicon for the petroleum domain was created as part of this research. The results, as expected, show that coupling a general lexicon with a specialized lexicon improves the sentiment analysis. However, coupling a general lexicon with another general lexicon does not improve the sentiment analysis --Abstract, page iii
Building and evaluating resources for sentiment analysis in the Greek language
Sentiment lexicons and word embeddings constitute well-established sources of information for sentiment analysis in online social media. Although their effectiveness has been demonstrated in state-of-the-art sentiment analysis and related tasks in the English language, such publicly available resources are much less developed and evaluated for the Greek language. In this paper, we tackle the problems arising when analyzing text in such an under-resourced language. We present and make publicly available a rich set of such resources, ranging from a manually annotated lexicon, to semi-supervised word embedding vectors and annotated datasets for different tasks. Our experiments using different algorithms and parameters on our resources show promising results over standard baselines; on average, we achieve a 24.9% relative improvement in F-score on the cross-domain sentiment analysis task when training the same algorithms with our resources, compared to training them on more traditional feature sources, such as n-grams. Importantly, while our resources were built with the primary focus on the cross-domain sentiment analysis task, they also show promising results in related tasks, such as emotion analysis and sarcasm detection
- …