General Sentiment Decomposition: opinion mining based on raw Natural Language text
The importance of person-to-person communication about a given topic (Word of Mouth) is growing day by day, especially for decision-makers. This phenomenon can be observed directly in online social networks, for example in the rise of influencers and social media managers. If more people talk about a specific product, then more people are encouraged to buy it, and vice versa. Moreover, those people usually leave a review of it. Such a review directly affects the product, and the effect grows in proportion to how trustworthy the potential new customer considers the reviewer to be. Furthermore, given the negative reporting bias, it is easy to see why customer satisfaction is of the utmost interest to a company (just as citizens' trust is to a politician).
Textual data have thus proved extremely useful, but they are as complex as language itself. For this reason, many approaches focus on producing well-performing classifiers and ignore how hard their models are to interpret. Instead, we propose a framework that produces a good sentiment classifier with a particular focus on model interpretability. After analyzing the impact of Word of Mouth on earnings and the related psychological aspects, we propose an algorithm to extract the sentiment from a Natural Language text corpus. Combining Neural Networks, which have high predictive power but are hard to interpret (and are usually regarded as black boxes), with more straightforward and informative models makes it possible not only to predict how positive (or negative) a sentence is but also to quantify its sentiment with a numeric value. Specifically, the General Sentiment Decomposition (GSD) framework that we propose is based on a combination of Threshold-based Naive Bayes (an improved version of the original algorithm), SentiWordNet (an enriched Lexical Database for Sentiment Analysis tasks), and Word Embedding features (a high-dimensional representation of words) that come directly from the use of Neural Networks.
Moreover, using the GSD framework, we derive an objective sentiment score that improves the interpretation of results in many fields. For example, it is possible to identify specific critical sectors that require intervention to improve the services offered, find the company's strengths (useful for advertising campaigns), and, if time information is present, analyze trends on macro/micro topics.
In addition, NL text data may or may not come with a sentiment label such as 'positive' or 'negative'. To support further decision-making, we apply the proposed method to labeled (Booking.com, TripAdvisor.com) and unlabeled (Twitter.com) data, analyzing the sentiment of people who discuss a particular issue. In this way, we identify the aspects that people perceive as critical in the "feedback" they publish on the web and quantify how happy (or not) they are about a specific problem. In particular, for Booking.com and TripAdvisor.com we focus on customer satisfaction, whilst for Twitter.com the main topic is climate change.
Conceptual Sentiment Analysis Model
The bag-of-words approach is widely used for sentiment analysis. It maps the terms in the reviews to term-document vectors and thus discards the syntactic structure of the sentences in the reviews. Associations among the terms and the semantic structure of sentences are not preserved either. This research work focuses on classifying sentiments by considering the syntactic and semantic structure of the sentences in the review. To improve accuracy, sentiment classifiers based on relative frequency, average frequency and term frequency-inverse document frequency were proposed. To handle terms with apostrophes, preprocessing techniques were extended. To focus on opinionated content, subjectivity extraction was performed at the phrase level. Experiments were performed on Pang & Lee’s, Kaggle’s and UCI’s datasets. Classifiers were also evaluated on UCI’s Product and Restaurant dataset. Sentiment classification accuracy improved from 67.9% for a comparable term weighting technique, DeltaTFIDF, to up to 77.2% for the proposed classifiers. Introducing the proposed concept-based approach, subjectivity extraction and the extensions to preprocessing techniques improved the accuracy to 93.9%.
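The loss of word order described in this abstract can be illustrated with a minimal sketch. The function name `bag_of_words` and the two example sentences are illustrative assumptions, not taken from the paper; the tokenizer is a plain whitespace split.

```python
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each term.

    Only term counts survive; the position of each word is discarded.
    """
    return Counter(text.lower().split())


# Two hypothetical reviews with opposite meanings but identical terms.
review_a = "the food was good not bad"
review_b = "the food was bad not good"

# Both reviews map to the same term-count vector, so a classifier that
# sees only these counts cannot tell them apart.
same_vector = bag_of_words(review_a) == bag_of_words(review_b)
print(same_vector)
```

This is the limitation that motivates looking at syntactic and semantic structure rather than term counts alone.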
Hybrid Approach for Emotion Classification of Audio Conversation Based on Text and Speech Mining
One of the greatest challenges in speech technology is estimating the speaker's emotion. Most existing approaches concentrate on either audio or text features. In this work, we propose a novel approach for emotion classification of audio conversation based on both speech and text. The novelty of this approach lies in the choice of features and the generation of a single feature vector for classification. Our main intention is to increase the accuracy of emotion classification of speech by considering both audio and text features. In this work we use standard methods such as Natural Language Processing, Support Vector Machines, WordNet Affect and SentiWordNet. The data for this work were taken from SemEval-2007 and the eNTERFACE’05 EMOTION Database.
Sentiment Classification of Online Customer Reviews and Blogs Using Sentence-level Lexical Based Semantic Orientation Method
Sentiment analysis is the process of extracting knowledge from people's opinions, appraisals and emotions toward entities, events and their attributes. These opinions strongly influence customers and ease their choices regarding online shopping, events, products and entities. With the rapid growth of online resources, a vast amount of new data in the form of customer reviews and opinions is being generated continuously. Hence, sentiment analysis methods are desirable for developing efficient and effective analysis and classification of customer reviews, blogs and comments.
The main motivation for this thesis is to develop a high-performance, domain-independent sentiment classification method. This study focuses on sentiment analysis at the sentence level using a lexical-based method for different types of data such as reviews and blogs. The proposed method is based on general lexicons, i.e. WordNet, SentiWordNet and user-defined lexical dictionaries, for sentiment orientation. The relations and glosses of these dictionaries provide a solution to the domain portability problem. The experiments are performed on various data sets such as customer reviews and blog comments. The results show that the proposed method with sentence contextual information is effective for sentiment classification. The proposed method performs better than word- and text-level corpus-based machine learning methods for semantic orientation. The results highlight that the proposed method achieves an average accuracy of 86% at the sentence level and 97% at the feedback level for customer reviews. Similarly, it achieves an average accuracy of 83% at the sentence level and 86% at the feedback level for blog comments.
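As a rough illustration of sentence-level lexicon scoring aggregated to the feedback level, here is a minimal sketch. It is not the thesis's method: `TOY_LEXICON`, `NEGATORS`, the score values, and the simple negation rule are all hypothetical stand-ins for WordNet, SentiWordNet and the user-defined dictionaries.

```python
import re

# Hypothetical toy lexicon (assumption); real scores would come from
# resources such as SentiWordNet.
TOY_LEXICON = {"good": 1.0, "great": 1.0, "clean": 0.5,
               "bad": -1.0, "dirty": -0.5, "rude": -1.0}
# Hypothetical negation words (assumption).
NEGATORS = {"not", "never", "no"}


def sentence_score(sentence):
    """Sum lexicon scores over a sentence, flipping the sign of the
    first sentiment word that follows a negator."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    score = 0.0
    negate = False
    for token in tokens:
        if token in NEGATORS:
            negate = True
            continue
        value = TOY_LEXICON.get(token, 0.0)
        if value != 0.0:
            score += -value if negate else value
            negate = False  # negation applies to one sentiment word only
    return score


def label(score):
    """Map a numeric score to a polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def feedback_label(feedback):
    """Split feedback into sentences, score each, and label the sum."""
    sentences = [s for s in re.split(r"[.!?]+", feedback) if s.strip()]
    return label(sum(sentence_score(s) for s in sentences))
```

For example, `sentence_score("The room was not clean")` is negative under this toy lexicon, while `feedback_label("The staff were great. The room was not clean.")` comes out positive because the first sentence outweighs the second.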
Role of sentiment classification in sentiment analysis: a survey
Through a survey of the literature, the role of sentiment classification in sentiment analysis has been reviewed. The review identifies the research challenges involved in tackling sentiment classification. A total of 68 articles published during 2015–2017 have been reviewed along six dimensions, viz., sentiment classification, feature extraction, cross-lingual sentiment classification, cross-domain sentiment classification, lexica and corpora creation, and multi-label sentiment classification. This study discusses the prominence and effects of sentiment classification in sentiment evaluation and concludes that considerable further research is needed for productive results.
A study on text-score disagreement in online reviews
In this paper, we focus on online reviews and employ artificial intelligence tools, taken from the cognitive computing field, to help understand the relationships between the textual part of the review and the assigned numerical score. We start from the intuitions that 1) a set of textual reviews expressing different sentiments may feature the same score (and vice versa); and 2) detecting and analyzing the mismatches between the review content and the actual score may benefit both service providers and consumers, by highlighting specific factors of satisfaction (and dissatisfaction) in texts.
To prove the intuitions, we adopt sentiment analysis techniques and we concentrate on hotel reviews, to find polarity mismatches therein. In particular, we first train a text classifier with a set of annotated hotel reviews, taken from the Booking website. Then, we analyze a large dataset, with around 160k hotel reviews collected from Tripadvisor, with the aim of detecting a polarity mismatch, indicating if the textual content of the review is in line, or not, with the associated score.
Using well-established artificial intelligence techniques and analyzing in depth the reviews featuring a mismatch between the text polarity and the score, we find that, on a scale of five stars, those reviews ranked with middle scores include a mixture of positive and negative aspects.
The approach proposed here, besides acting as a polarity detector, provides an effective selection of reviews, from an initially very large dataset, that may allow both consumers and providers to focus directly on the review subset featuring a text/score disagreement, which conveniently conveys to the user a summary of positive and negative features of the review target.
Comment: This is the accepted version of the paper. The final version will be published in the Journal of Cognitive Computation, available at Springer via http://dx.doi.org/10.1007/s12559-017-9496-
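The text/score mismatch check described in this abstract might be sketched as follows. This is a simplified illustration, not the paper's pipeline: the text polarity is assumed to come from some already-trained classifier, and the mapping from stars to polarity (1 to 2 negative, 4 to 5 positive, 3 left undefined) is an assumption made here for the example.

```python
def score_polarity(stars):
    """Map a 1-5 star score to an expected polarity.

    Assumption: 1-2 stars are negative, 4-5 are positive, and the middle
    score carries no expected polarity (returns None).
    """
    if stars <= 2:
        return "negative"
    if stars >= 4:
        return "positive"
    return None


def find_mismatches(reviews):
    """Return indices of reviews whose text polarity contradicts the score.

    `reviews` is a sequence of (text_polarity, stars) pairs, where
    text_polarity is the label predicted from the review text by a
    classifier (not shown here).
    """
    mismatches = []
    for index, (text_polarity, stars) in enumerate(reviews):
        expected = score_polarity(stars)
        if expected is not None and text_polarity != expected:
            mismatches.append(index)
    return mismatches


# Hypothetical predictions paired with star scores.
sample = [("positive", 5), ("negative", 5), ("positive", 1), ("negative", 3)]
flagged = find_mismatches(sample)
```

Here `flagged` contains the second and third reviews: a negative text with five stars and a positive text with one star. The three-star review is skipped because no polarity is expected for it under the assumed mapping.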