439 research outputs found
Sentiment Lexicon Adaptation with Context and Semantics for the Social Web
Sentiment analysis over social streams offers governments and organisations a fast and effective way to monitor the public's feelings towards policies, brands, business, etc. General-purpose sentiment lexicons have been used to compute sentiment from social streams, since they are simple and effective. They calculate the overall sentiment of texts by using a general collection of words with predetermined sentiment orientation and strength. However, a word's sentiment often varies with the context in which it appears, and new words might be encountered that are not covered by the lexicon, particularly in social media environments where content emerges and changes rapidly and constantly. In this paper, we propose a lexicon adaptation approach that uses contextual as well as semantic information extracted from DBpedia to update the words' weighted sentiment orientations and to add new words to the lexicon. We evaluate our approach on three different Twitter datasets, and show that enriching the lexicon with contextual and semantic information improves sentiment computation by 3.4% in average accuracy and by 2.8% in average F1 measure.
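As a rough illustration of how a general-purpose lexicon computes sentiment, the sketch below sums predetermined word scores over a text. The lexicon entries, the score range, and the function names are hypothetical and chosen only for illustration; they are not taken from the paper or from any published lexicon.

```python
import re

# Hypothetical toy lexicon: word -> prior sentiment score in [-1, 1].
TOY_LEXICON = {
    "great": 0.8,
    "good": 0.6,
    "smile": 0.5,
    "bad": -0.6,
    "problem": -0.5,
    "terrible": -0.9,
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_score(text, lexicon):
    """Sum the prior scores of known words; unknown words contribute 0."""
    return sum(lexicon.get(token, 0.0) for token in tokenize(text))


def classify(text, lexicon):
    """Map the summed score to a polarity label."""
    score = lexicon_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

With these toy scores, "a great problem" still comes out positive because "great" keeps its fixed prior, and a text made up entirely of out-of-lexicon words comes out neutral. These are the two limitations (context-insensitive scores and missing vocabulary) that the abstract sets out to address.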
A Linked Open Data Approach for Sentiment Lexicon Adaptation
Social media platforms have recently become a gold mine for organisations to monitor their reputation by extracting and analysing the sentiment of the posts generated about them, their markets, and competitors. Among the approaches to analysing sentiment from social media, those based on sentiment lexicons (sets of words with associated sentiment scores) have gained popularity since they do not rely on training data, as opposed to Machine Learning approaches. However, sentiment lexicons assign a static sentiment score to each word without taking into consideration the different contexts in which the word is used (e.g., great problem vs. great smile). Additionally, new words that may not be covered by the lexicons constantly emerge from dynamic and rapidly changing social media environments. In this paper we propose a lexicon adaptation approach that makes use of semantic relations extracted from DBpedia to better understand the various contextual scenarios in which words are used. We evaluate our approach on three different Twitter datasets and show that using semantic information to adapt the lexicon improves sentiment computation by 3.7% in average accuracy and by 2.6% in average F1 measure.
Semantic Sentiment Analysis of Microblogs
Microblogs and social media platforms are now considered among the most popular forms of online communication. Through a platform like Twitter, much information reflecting people's opinions and attitudes is published and shared among users on a daily basis. This has recently brought great opportunities to companies interested in tracking and monitoring the reputation of their brands and businesses, and to policy makers and politicians to support their assessment of public opinions about their policies or political issues.
A wide range of approaches to sentiment analysis on Twitter, and other similar microblogging platforms, have been recently built. Most of these approaches rely mainly on the presence of affect words or syntactic structures that explicitly and unambiguously reflect sentiment (e.g., "great", "terrible"). However, these approaches are semantically weak, that is, they do not account for the semantics of words when detecting their sentiment in text. This is problematic since the sentiment of words, in many cases, is associated with their semantics, either through the context in which they occur (e.g., "great" is negative in the context "pain") or through the conceptual meaning associated with the words (e.g., "Ebola" is negative when its associated semantic concept is "Virus").
This thesis investigates the role of words' semantics in sentiment analysis of microblogs, aiming mainly at addressing the above problem. In particular, Twitter is used as a case study of microblogging platforms to investigate whether capturing the sentiment of words with respect to their semantics leads to more accurate sentiment analysis models on Twitter. To this end, several approaches are proposed in this thesis for extracting and incorporating two types of word semantics for sentiment analysis: contextual semantics (i.e., semantics captured from words' co-occurrences) and conceptual semantics (i.e., semantics extracted from external knowledge sources).
Experiments are conducted with both types of semantics by assessing their impact in three popular sentiment analysis tasks on Twitter: entity-level sentiment analysis, tweet-level sentiment analysis and context-sensitive sentiment lexicon adaptation. Evaluation under each sentiment analysis task includes several sentiment lexicons and up to 9 Twitter datasets of different characteristics, as well as comparisons against several state-of-the-art sentiment analysis approaches widely used in the literature.
The findings from this body of work demonstrate the value of using semantics in sentiment analysis on Twitter. The proposed approaches, which consider words' semantics for sentiment analysis at both entity and tweet levels, surpass non-semantic approaches in most datasets.
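The idea of contextual semantics (sentiment captured from a word's co-occurrences) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's method: the function name `adapt_lexicon`, the blending weight `alpha`, whitespace tokenisation, and the use of a simple mean over co-occurring lexicon words are all hypothetical choices.

```python
from collections import defaultdict


def adapt_lexicon(lexicon, tweets, alpha=0.5):
    """Blend each word's prior score with the mean prior of the lexicon
    words it co-occurs with (hypothetical scheme, for illustration only)."""
    context_sum = defaultdict(float)
    context_count = defaultdict(int)

    for tweet in tweets:
        tokens = tweet.lower().split()
        for i, word in enumerate(tokens):
            for j, other in enumerate(tokens):
                # Only words that already have a prior can act as context.
                if i != j and other in lexicon:
                    context_sum[word] += lexicon[other]
                    context_count[word] += 1

    adapted = dict(lexicon)
    for word, count in context_count.items():
        context_score = context_sum[word] / count
        if word in lexicon:
            # Known word: shift its prior towards the contextual score.
            adapted[word] = (1 - alpha) * lexicon[word] + alpha * context_score
        else:
            # New word: no prior exists, so use the contextual score alone.
            adapted[word] = context_score
    return adapted
```

In a corpus where "great" appears mostly next to "pain", its adapted score drops below its prior, and an out-of-lexicon word such as "ebola" receives a score from the words around it. A realistic version would at least need stop-word handling and a minimum co-occurrence count before trusting the contextual score.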
Role of sentiment classification in sentiment analysis: a survey
Through a survey of the literature, the role of sentiment classification in sentiment analysis has been reviewed. The review identifies the research challenges involved in tackling sentiment classification. A total of 68 articles published during 2015-2017 have been reviewed on six dimensions, viz., sentiment classification, feature extraction, cross-lingual sentiment classification, cross-domain sentiment classification, lexica and corpora creation, and multi-label sentiment classification. This study discusses the prominence and effects of sentiment classification in sentiment evaluation, and concludes that much further research is needed to obtain productive results.
Contextual lexicon-based sentiment analysis for social media
Sentiment analysis concerns the computational study of opinions expressed in text. Social media domains provide a wealth of opinionated data, thus creating a greater need for sentiment analysis. Sentiment lexicons that capture term-sentiment association knowledge are commonly used to develop sentiment analysis systems. However, the nature of social media content calls for analysis methods and knowledge sources that are better able to adapt to changing vocabulary. Existing sentiment lexicon knowledge invariably cannot usefully handle social media vocabulary, which is typically informal and changeable yet rich in sentiment. This, in turn, has implications for the analyser's ability to effectively capture the context therein and to interpret the sentiment polarity from the lexicons. In this thesis we use SentiWordNet, a popular sentiment-rich lexicon with substantial vocabulary coverage, and explore how to adapt it for social media sentiment analysis. Firstly, the thesis identifies a set of strategies to incorporate the effect of modifiers on sentiment-bearing terms (local context). These modifiers include contextual valence shifters, non-lexical sentiment modifiers typical in social media, and discourse structures. Secondly, the thesis introduces an approach in which a domain-specific lexicon is generated using a distant supervision method and integrated with a general-purpose lexicon, using a weighted strategy, to form a hybrid (domain-adapted) lexicon. This has the dual purpose of enriching the term coverage of the general-purpose lexicon with non-standard but sentiment-rich terms and adjusting the sentiment semantics of terms. Here, we identified two term-sentiment association metrics based on Term Frequency and Inverse Document Frequency that are able to outperform the state-of-the-art Point-wise Mutual Information on social media data.
As distant supervision may not be readily applicable to some social media domains, we explore the cross-domain transferability of a hybrid lexicon. Thirdly, we introduce an approach for improving distant-supervised sentiment classification with knowledge from local context analysis, domain-adapted (hybrid) and emotion lexicons. Finally, we conduct a comprehensive evaluation of all identified approaches using six sentiment-rich social media datasets.
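For readers unfamiliar with the Point-wise Mutual Information baseline mentioned in this abstract, the sketch below builds a small domain lexicon from distantly labelled tweets. It is a minimal illustration of the baseline only, not the thesis's TF-IDF-based metrics. The emoticon sets, the add-one smoothing, and the names `distant_label` and `pmi_lexicon` are assumptions made for this example. Each word is scored as PMI(word, positive) minus PMI(word, negative), which reduces to log2 of P(word | positive) / P(word | negative).

```python
import math
from collections import Counter

# Hypothetical emoticon sets used as noisy (distant) labels.
POS_EMOTICONS = {":)", ":-)", ":D"}
NEG_EMOTICONS = {":(", ":-("}
EMOTICONS = POS_EMOTICONS | NEG_EMOTICONS


def distant_label(tweet):
    """Return 'pos' or 'neg' from emoticons, or None if absent or mixed."""
    tokens = set(tweet.split())
    has_pos = bool(tokens & POS_EMOTICONS)
    has_neg = bool(tokens & NEG_EMOTICONS)
    if has_pos and not has_neg:
        return "pos"
    if has_neg and not has_pos:
        return "neg"
    return None


def pmi_lexicon(tweets, smoothing=1.0):
    """Score each word as PMI(word, pos) - PMI(word, neg), with smoothing."""
    counts = {"pos": Counter(), "neg": Counter()}
    for tweet in tweets:
        label = distant_label(tweet)
        if label is None:
            continue  # unlabelled tweets are ignored
        words = [t.lower() for t in tweet.split() if t not in EMOTICONS]
        counts[label].update(words)

    vocab = set(counts["pos"]) | set(counts["neg"])
    pos_total = sum(counts["pos"].values()) + smoothing * len(vocab)
    neg_total = sum(counts["neg"].values()) + smoothing * len(vocab)

    lexicon = {}
    for word in vocab:
        p_word_pos = (counts["pos"][word] + smoothing) / pos_total
        p_word_neg = (counts["neg"][word] + smoothing) / neg_total
        lexicon[word] = math.log2(p_word_pos / p_word_neg)
    return lexicon
```

Words that occur mainly in positively labelled tweets receive positive scores and vice versa, so informal terms absent from a general-purpose lexicon can still be scored. The resulting scores could then be combined with general-purpose scores through a weighted strategy, as the abstract describes; the weighting itself is not specified here and is left out of the sketch.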
Three Essays on Opinion Mining of Social Media Texts
This dissertation research is a collection of three essays on opinion mining of social media texts. I explore different theoretical and methodological perspectives in this inquiry. The first essay focuses on improving lexicon-based sentiment classification. I propose a method to automatically generate a sentiment lexicon that incorporates knowledge from both the language domain and the content domain. This method learns word associations from a large unannotated corpus. These associations are used to identify new sentiment words. Using a Twitter data set containing 743,069 tweets related to the stock market, I show that the sentiment lexicons generated using the proposed method significantly outperform existing sentiment lexicons in sentiment classification. As sentiment analysis is being applied to different types of documents to solve different problems, the proposed method provides a useful tool to improve sentiment classification.
The second essay focuses on improving supervised sentiment classification. In previous work on sentiment classification, a document was typically represented as a collection of single words. This method of feature representation suffers from severe ambiguity, especially in classifying short texts, such as microblog messages. I propose the use of dependency features in sentiment classification. A dependency describes the relationship between a pair of words even when they are distant. I compare the sentiment classification performance of dependency features with a few commonly used features in different experiment settings. The results show that dependency features significantly outperform existing feature representations.
In the third essay, I examine the relationship between social media sentiment and stock returns. This is the first study to test the bidirectional effects in this relationship. Based on theories in behavioral finance research, I hypothesize that social media sentiment does not predict stock returns, but rather that stock returns predict social media sentiment. I empirically test a set of research hypotheses by applying the vector autoregression (VAR) model to a social media data set that is much larger than those used in previous studies. The hypotheses are supported by the results. The findings have significant implications for both theory and practice.
Sentiment Analysis in Social Streams
In this chapter we review and discuss the state of the art on sentiment analysis in social streams, such as web forums, micro-blogging systems, and social networks, aiming to clarify how user opinions, affective states, and intended emotional effects are extracted from user generated content, how they are modeled, and how they could be finally exploited. We explain why sentiment analysis tasks are more difficult for social streams than for other textual sources, and entail going beyond classic text-based opinion mining techniques. We show, for example, that social streams may use vocabularies and expressions that exist outside the mainstream of standard, formal languages, and may reflect complex dynamics in the opinions and sentiments expressed by individuals and communities.