
    Techniques for improving the performance of unsupervised approach to sentiment analysis

    In this work, several techniques are proposed to improve the performance of an unsupervised sentiment analysis method that categorizes reviews by sentiment orientation (positive or negative). Negations can change the polarity of other terms in a sentence, so a new technique for handling negations is proposed. The position of a term in a review also matters: the same term appearing at different positions may convey a different amount of sentiment. A second technique therefore assigns weights to terms according to where they occur within a review. A third technique uses the presence of exclamation marks, whose effect is equally important when categorizing reviews. These techniques are incorporated in the first phase of the proposed method; in the second phase, sentiment orientation is determined using a cluster ensemble method. The proposed method was tested on a standard movie review dataset and achieved 91.75% accuracy, a significant improvement in accuracy over several unsupervised and supervised methods.
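
    The three first-phase ideas in this abstract (negation handling, position-dependent weights, exclamation marks) can be illustrated with a minimal lexicon-based sketch. Everything specific here is an assumption made for illustration: the toy lexicon, the fixed negation window, the linear position weight, and the exclamation multiplier are not taken from the paper, and the cluster ensemble phase is not shown.

```python
import re

# Toy lexicon and negation list: assumptions for illustration only.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "boring": -1.5}
NEGATIONS = {"not", "never", "no"}


def score_review(text, window=2, exclaim_boost=1.5):
    """Return a signed sentiment score for a review (hypothetical scheme)."""
    # Split into sentences, keeping the closing punctuation mark.
    sentences = [s for s in re.findall(r"[^.!?]+[.!?]?", text) if s.strip()]
    n = len(sentences)
    total = 0.0
    for i, sentence in enumerate(sentences):
        # Assumed position weight: grows linearly from 1.0 to 2.0,
        # so later sentences count more.
        position_weight = 1.0 + i / max(n - 1, 1)
        tokens = re.findall(r"[a-z']+", sentence.lower())
        flip_left = 0  # tokens still inside the negation window
        sentence_score = 0.0
        for tok in tokens:
            if tok in NEGATIONS:
                flip_left = window
                continue
            polarity = LEXICON.get(tok, 0.0)
            if flip_left > 0:
                polarity = -polarity  # negation flips the polarity
                flip_left -= 1
            sentence_score += polarity
        # Assumed exclamation handling: amplify the sentence score.
        if sentence.rstrip().endswith("!"):
            sentence_score *= exclaim_boost
        total += position_weight * sentence_score
    return total


print(score_review("The plot was not good. The acting was great!"))
```

    A positive score would be read as a positive review and a negative score as a negative one; in the paper the final decision is made by clustering, which this sketch does not attempt.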

    Automatically generating a sentiment lexicon for the Malay language

    This paper proposes an automated sentiment lexicon generation model specifically designed for the Malay language. Lexicon-based Sentiment Analysis (SA) models rely on a sentiment lexicon, a linguistic resource that comprises a priori information about the sentiment properties of words. A sentiment lexicon is an indispensable resource for SA tasks, as is evident from the large volume of research on sentiment lexicon generation algorithms. This is not the case for low-resource languages such as Malay, for which research in this particular area is lacking. This gap motivates the sentiment lexicon generation algorithm proposed here. WordNet Bahasa was first mapped onto the English WordNet to construct a multilingual word network. A seed set of prototypical positive and negative terms was then automatically expanded by recursively adding terms linked via WordNet’s synonymy and antonymy semantic relations. The underlying intuition is that sentiment properties are preserved when terms are added via these relations. A supervised classifier was employed for the word-polarity tagging task, with textual representations of the expanded seed set as features. Evaluation of the model against the General Inquirer lexicon as a benchmark demonstrates that it performs with reasonable accuracy. This paper aims to provide a foundation for further research in this area for the Malay language.
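
    The seed expansion step described in this abstract can be sketched as a breadth-first walk over synonym and antonym links. The toy relation tables, the English placeholder words, the depth limit, and the rule that an antonym link flips the label are assumptions for illustration; the paper works over WordNet Bahasa mapped to the English WordNet and follows expansion with a supervised classifier, neither of which is reproduced here.

```python
from collections import deque

# Hypothetical toy relations standing in for WordNet links.
SYNONYMS = {"good": ["fine", "nice"], "nice": ["pleasant"], "bad": ["poor"]}
ANTONYMS = {"good": ["bad"], "pleasant": ["unpleasant"]}


def expand_seeds(seeds, synonyms, antonyms, max_depth=3):
    """Expand a {term: polarity} seed set, with polarity +1 or -1.

    Synonym links copy the polarity; antonym links flip it (an assumption).
    Terms are labelled the first time they are reached.
    """
    labels = dict(seeds)
    queue = deque((term, 0) for term in seeds)
    while queue:
        term, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand beyond the depth limit
        polarity = labels[term]
        neighbours = [(w, polarity) for w in synonyms.get(term, [])]
        neighbours += [(w, -polarity) for w in antonyms.get(term, [])]
        for word, pol in neighbours:
            if word not in labels:
                labels[word] = pol
                queue.append((word, depth + 1))
    return labels


print(expand_seeds({"good": 1}, SYNONYMS, ANTONYMS))
```

    The depth limit is a design choice of this sketch: polarity tends to drift as the walk moves further from the seeds, so bounding the recursion keeps the expanded set closer to the prototypical terms.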

    Twitter Sentiment Mining: A Multi Domain Analysis

    Microblogging services such as Twitter provide a rich source of information about products, personalities, trends, and more. We propose a simple methodology for analyzing the sentiment of Twitter users. First, we automatically collected a Twitter corpus of positive and negative tweets. Second, we built a simple sentiment classifier using the Naive Bayes model to determine whether a tweet is positive or negative. Third, we tested the classifier against a collection of users’ opinions from five Twitter domains: news, finance, job, movies, and sport. The experimental results show that it is feasible to use a Twitter corpus alone to classify new tweets for domain-specific applications.
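
    A Naive Bayes tweet classifier of the kind mentioned in this abstract can be sketched in a few lines. This is a generic multinomial Naive Bayes with add-one smoothing and whitespace tokenization, written under the assumption that both classes appear in the training data; the training tweets are made up, and the paper's own features and corpus collection procedure are not specified here.

```python
import math
from collections import Counter


def train_nb(labeled_tweets):
    """Count words per class from (text, label) pairs; labels are 'pos'/'neg'."""
    word_counts = {"pos": Counter(), "neg": Counter()}
    doc_counts = Counter()
    for text, label in labeled_tweets:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the label with the higher log-posterior under Naive Bayes."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in ("pos", "neg"):
        # Log prior; assumes each class has at least one training tweet.
        score = math.log(doc_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words unseen in training are ignored
                # Add-one (Laplace) smoothed likelihood.
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training tweets, for illustration only.
TRAIN = [
    ("love this movie", "pos"),
    ("great match today", "pos"),
    ("hate this job", "neg"),
    ("terrible market crash", "neg"),
]
model = train_nb(TRAIN)
print(classify("love this great movie", model))
print(classify("terrible job", model))
```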

    Fine-grained sentiment analysis for measuring customer satisfaction using an extended set of fuzzy linguistic hedges

    © 2020 The Authors. Published by Atlantis Press SARL. In recent years, the boom in social media sites such as Facebook and Twitter has brought people together for the sharing of opinions, sentiments, emotions, and experiences about products, events, politics, and other topics. In particular, sentiment-based applications are growing in popularity among individuals and businesses for the making of purchase decisions. Fuzzy-based sentiment analysis aims at classifying customer sentiment at a fine-grained level. This study deals with the development of a fuzzy-based sentiment analysis by extending fuzzy hedges and rule-sets for a more efficient classification of customer sentiment and satisfaction. Prior studies have used a limited number of linguistic hedges and polarity classes in their rule-sets, resulting in the degraded efficiency of their fuzzy-based sentiment analysis systems. The proposed analysis of the current study classifies customer reviews using fuzzy linguistic hedges and an extended rule-set with seven sentiment analysis classes, namely extremely positive, very positive, positive, neutral, negative, very negative, and extremely negative. Then, a fuzzy logic system is applied to measure customer satisfaction at a fine-grained level. The experimental results demonstrate that the proposed analysis has an improved performance over the baseline works

    Tapping into sociological lexicons for sentiment polarity classification

    Sentiment Analysis, or the extraction of emotional content from text, has been a prominent research topic for a decade. Numerous annotated lexicons have been created for the identification and classification of emotions (or affect) in text. This extraction of emotional content makes emotion-aware Information Retrieval possible, which is especially important given the growing popularity of user-generated content such as blogs, tweets, and wikis. This paper introduces a new source of high-quality manual annotations that can be used for sentiment extraction. A subfield of sociology, symbolic interactionism, and more precisely Affect Control Theory (ACT), measures the emotional meanings we associate with various concepts. Research in this field produces multi-dimensional manual annotations of words much like those used in Sentiment Analysis. We compare these annotations with SentiWordNet and WordNet-Affect, lexicons produced for Sentiment Analysis, on the task of text polarity classification and show that a classifier trained on the ACT lexicon outperforms the other two.

    MoodyLyrics: A Sentiment Annotated Lyrics Dataset

    Music emotion recognition and recommendation are changing the way people find and listen to their preferred musical tracks. Emotion recognition of songs is mostly based on feature extraction and learning from available datasets. In this work we take a different approach that uses only the content words of lyrics and their valence and arousal norms in affect lexicons. We use this method to annotate each song with one of the four emotion categories of Russell's model, and also to construct MoodyLyrics, a large dataset of lyrics that will be available for public use. For evaluation we used another lyrics dataset as ground truth and achieved an accuracy of 74.25%. Our results confirm that valence is a better discriminator of mood than arousal. The results also show that music mood recognition or annotation can be achieved with good accuracy even without subjective human feedback or user tags, when these are not available.
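
    The annotation idea in this abstract, mapping the valence and arousal of lyric words to one of four categories of Russell's model, can be sketched as follows. The norm values are made up, the 1 to 9 scale with midpoint 5 is an assumption, and the quadrant names (happy, angry, sad, relaxed) and the simple averaging are illustrative choices rather than the paper's exact procedure.

```python
# Made-up (valence, arousal) norms on an assumed 1-9 scale.
TOY_NORMS = {
    "sunshine": (8.0, 6.0),
    "dance": (7.5, 7.0),
    "alone": (2.5, 3.0),
    "tears": (2.0, 4.0),
    "rage": (2.0, 8.0),
    "calm": (7.0, 2.0),
}


def annotate(lyrics, norms, midpoint=5.0):
    """Label lyrics with a quadrant of the valence-arousal plane.

    Returns None when no word of the lyrics is found in the norms.
    """
    hits = [norms[w] for w in lyrics.lower().split() if w in norms]
    if not hits:
        return None
    valence = sum(v for v, _ in hits) / len(hits)
    arousal = sum(a for _, a in hits) / len(hits)
    if valence >= midpoint:
        return "happy" if arousal >= midpoint else "relaxed"
    return "angry" if arousal >= midpoint else "sad"


print(annotate("dance in the sunshine", TOY_NORMS))
print(annotate("alone with my tears", TOY_NORMS))
```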

    Idiom–based features in sentiment analysis: cutting the Gordian knot

    In this paper we describe an automated approach to enriching sentiment analysis with idiom–based features. Specifically, we automated the development of the supporting lexico–semantic resources, which include (1) a set of rules used to identify idioms in text and (2) their sentiment polarity classifications. Our method demonstrates how idiom dictionaries, which are readily available general pedagogical resources, can be adapted automatically into purpose–specific computational resources. These resources were then used to replace their manually engineered counterparts in an existing system, which originally outperformed the baseline sentiment analysis approaches by 17 percentage points on average, taking the F–measure from the 40s into the 60s. The new, fully automated approach outperformed the baselines by 8 percentage points on average, taking the F–measure from the 40s into the 50s. Although the latter improvement is smaller than the one achieved with the manually engineered features, the automated approach has the advantage of being more general, in the sense that it can readily use an arbitrary list of idioms without the knowledge acquisition overhead previously associated with this task, thereby fully automating the original approach.
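
    The two resources named in this abstract, rules that identify idioms and a polarity for each idiom, can be pictured with a small sketch that turns idiom matches into features. The rule list, the regular expressions, and the feature names are hypothetical and are not drawn from the paper's resources, which are generated automatically from idiom dictionaries.

```python
import re

# Hypothetical idiom rules as (pattern, polarity) pairs; illustrative only.
IDIOM_RULES = [
    (re.compile(r"\bover the moon\b"), 1),
    (re.compile(r"\bon cloud nine\b"), 1),
    (re.compile(r"\bunder the weather\b"), -1),
    (re.compile(r"\bdown in the dumps\b"), -1),
]


def idiom_features(text):
    """Count positive and negative idiom matches in a text."""
    features = {"pos_idioms": 0, "neg_idioms": 0}
    lowered = text.lower()
    for pattern, polarity in IDIOM_RULES:
        hits = len(pattern.findall(lowered))
        key = "pos_idioms" if polarity > 0 else "neg_idioms"
        features[key] += hits
    return features


print(idiom_features("She was over the moon, but he felt under the weather."))
```

    Counts like these would be appended to whatever features a sentiment classifier already uses; how the original system combines them is not described in this abstract.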

    Extending a Fuzzy Polarity Propagation Method for Multi-Domain Sentiment Analysis with Word Embedding and POS Tagging

    Within multi-domain sentiment analysis, we study how different domain-dependent polarities can be learned for the same concepts. To this aim, we extend an existing approach based on the propagation of fuzzy polarities over a semantic graph capturing background linguistic knowledge to learn concept polarities with respect to various domains and their uncertainty from labeled datasets. In particular, we use POS tagging to refine the association between terms and concepts and word embedding to enhance the construction of the semantic graph. The proposed approach is then evaluated on a standard benchmark, showing that the combined use of POS tagging and word embedding improves its performance. One particularly strong point of the proposed approach is its recall, which is always very close to 100%. In addition, we observe that it exhibits good cross-domain generalization capabilities.