1,197 research outputs found

    Latent sentiment model for weakly-supervised cross-lingual sentiment classification

    In this paper, we present a novel weakly-supervised method for cross-lingual sentiment analysis. Specifically, we propose a latent sentiment model (LSM) based on latent Dirichlet allocation where sentiment labels are considered as topics. Prior information extracted from English sentiment lexicons through machine translation is incorporated into LSM model learning, where preferences on expectations of sentiment labels of those lexicon words are expressed using generalized expectation criteria. An efficient parameter estimation procedure using variational Bayes is presented. Experimental results on Chinese product reviews show that the weakly-supervised LSM model performs comparably to supervised classifiers such as support vector machines, with an average of 81% accuracy achieved over a total of 5484 review documents. Moreover, starting with a generic sentiment lexicon, the LSM model is able to extract highly domain-specific polarity words from text.

    Unsupervised and knowledge-poor approaches to sentiment analysis

    Sentiment analysis focuses upon automatic classification of a document's sentiment (and more generally extraction of opinion from text). Ways of expressing sentiment have been shown to be dependent on what a document is about (domain-dependency). This complicates supervised methods for sentiment analysis which rely on extensive use of training data or linguistic resources that are usually either domain-specific or generic. Both kinds of resources prevent classifiers from performing well across a range of domains, as this requires appropriate in-domain (domain-specific) data. This thesis presents a novel unsupervised, knowledge-poor approach to sentiment analysis aimed at creating a domain-independent and multilingual sentiment analysis system. The approach extracts domain-specific resources from documents that are to be processed, and uses them for sentiment analysis. This approach does not require any training corpora, large sets of rules or generic sentiment lexicons, which makes it domain- and language-independent but at the same time able to utilise domain- and language-specific information. The thesis describes and tests the approach, which is applied to different data, including customer reviews of various types of products, reviews of films and books, and news items; and to four languages: Chinese, English, Russian and Japanese. The approach is applied not only to binary sentiment classification, but also to three-way sentiment classification (positive, negative and neutral), subjectivity classification of documents and sentences, and to the extraction of opinion holders and opinion targets. Experimental results suggest that the approach is often a viable alternative to supervised systems, especially when applied to large document collections.

    SESS: A Self-Supervised and Syntax-Based Method for Sentiment Classification

    PACLIC 23 / City University of Hong Kong / 3-5 December 200

    The Today Tendency of Sentiment Classification

    Sentiment classification has been studied for many years because it has made crucial contributions to many different fields of everyday life, such as political activities, commodity production, and commercial activities. Many kinds of sentiment analysis approaches, such as machine learning approaches and lexicon-based approaches, have been used over those years. The current tendency of sentiment classification is as follows: (1) processing many big data sets with shorter execution times; (2) achieving high accuracy; (3) integrating flexibly and easily into many small machines or many different approaches. We will present each category in more detail.

    Emotion and polarity prediction from Twitter

    Classification of public information from microblogging and social networking services could yield interesting outcomes and insights into social and public opinion towards different services, products, and events. Microblogging and social networking data are among the most helpful and suitable indicators of public opinion. The aim of this paper is to classify tweets into their classes with supervised machine learning algorithms, using cross-validation and partitioning of the data across cities. This approach was used to collect real-time Twitter data, namely tweets mentioning the iPad and iPhone in different locations, in order to analyse and classify the data in terms of polarity (positive or negative) and emotion (anger, joy, sadness, disgust, fear, and surprise). We collected over eighty thousand tweets, which were pre-processed to generate document-level ground truth and labelled according to emotion and polarity. We also compared several approaches in order to measure the performance of the K-NN, Naïve Bayes, and SVM classifiers. We found that K-NN, Naïve Bayes, SVM, and ZeroR have reasonable accuracy rates; however, K-NN outperformed Naïve Bayes, SVM, and ZeroR in terms of both accuracy and model training time. Using cross-validation, K-NN achieved the highest accuracy rates, 96.58% and 99.94%, for the iPad and iPhone emotion data sets respectively. When partitioning the data per city, K-NN achieved the highest accuracy rates, 98.8% and 99.95%, for the iPad and iPhone emotion data sets respectively. For the polarity data sets, using both cross-validation and per-city partitioning, K-NN achieved 100% accuracy on all data sets.