757 research outputs found

    Predictive Features in Semi-Supervised Learning for Polarity Classification and the Role of Adjectives

    Proceedings of the 17th Nordic Conference of Computational Linguistics NODALIDA 2009. Editors: Kristiina Jokinen and Eckhard Bick. NEALT Proceedings Series, Vol. 4 (2009), 198-205. © 2009 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/9206

    Approaching Sentiment Analysis by Using Semi-supervised Learning of Multidimensional Classifiers

    Sentiment Analysis is defined as the computational study of opinions, sentiments and emotions expressed in text. Within this broad field, most of the work has focused on either Sentiment Polarity classification, where a text is classified as having positive or negative sentiment, or Subjectivity classification, in which a text is classified as being subjective or objective. In this paper, we instead consider a real-world problem in which the attitude of the author is characterised by three different (but related) target variables: Subjectivity, Sentiment Polarity and Will to Influence. The two previously stated problems, by contrast, have only a single variable to be predicted. For that reason, the common (uni-dimensional) approaches used in this area yield suboptimal solutions to this problem. In order to bridge this gap, we propose, for the first time, the use of the novel multi-dimensional classification paradigm in the Sentiment Analysis domain. This methodology is able to join the different target variables in the same classification task so as to take advantage of the potential statistical relations between them. In addition, in order to take advantage of the huge amount of unlabelled information available nowadays in this context, we propose the extension of the multi-dimensional classification framework to the semi-supervised domain. Experimental results for this problem show that our semi-supervised multi-dimensional approach outperforms the most common Sentiment Analysis approaches. We conclude that our approach improves the recognition rates for this problem and, by extension, could be considered for future Sentiment Analysis problems.

    Sentiment analysis of political tweets: towards an accurate classifier

    We perform a series of 3-class sentiment classification experiments on a set of 2,624 tweets produced during the run-up to the Irish General Elections in February 2011. Even though tweets that have been labelled as sarcastic have been omitted from this set, it still represents a difficult test set and the highest accuracy we achieve is 61.6% using supervised learning and a feature set consisting of subjectivity-lexicon-based scores, Twitter-specific features and the top 1,000 most discriminative words. This is superior to various naive unsupervised approaches which use subjectivity lexicons to compute an overall sentiment score for a pair.
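The naive lexicon-based baseline mentioned in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy lexicon, the function names, the whitespace tokenisation, and the choice of positive, negative and neutral as the three classes. It is not the authors' implementation.

```python
# Hypothetical toy lexicon: word -> polarity (+1 or -1).
# Real subjectivity lexicons are far larger and often carry strength values.
TOY_LEXICON = {
    "good": 1, "great": 1, "support": 1,
    "bad": -1, "terrible": -1, "fail": -1,
}

PUNCTUATION = ".,!?;:\"'"


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Sum the polarity of every token that appears in the lexicon."""
    tokens = text.lower().split()
    return sum(lexicon.get(tok.strip(PUNCTUATION), 0) for tok in tokens)


def classify(text, lexicon=TOY_LEXICON):
    """Map the overall score to one of three assumed classes."""
    score = lexicon_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in ["Great debate, good answers!", "A terrible plan"]:
        print(tweet, "->", classify(tweet))
```

A rule like this needs no training data, which is what makes it unsupervised. It also ignores negation, sarcasm and Twitter-specific cues, which is one plausible reason such baselines trail a supervised classifier.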

    Cross-domain sentiment classification using a sentiment sensitive thesaurus

    Automatic classification of sentiment is important for numerous applications such as opinion mining, opinion summarization, contextual advertising, and market analysis. However, sentiment is expressed differently in different domains, and annotating corpora for every possible domain of interest is costly. Applying a sentiment classifier trained using labeled data for a particular domain to classify sentiment of user reviews on a different domain often results in poor performance. We propose a method to overcome this problem in cross-domain sentiment classification. First, we create a sentiment sensitive distributional thesaurus using labeled data for the source domains and unlabeled data for both source and target domains. Sentiment sensitivity is achieved in the thesaurus by incorporating document level sentiment labels in the context vectors used as the basis for measuring the distributional similarity between words. Next, we use the created thesaurus to expand feature vectors at train and test time in a binary classifier. The proposed method significantly outperforms numerous baselines and returns results that are comparable with previously proposed cross-domain sentiment classification methods. We conduct an extensive empirical analysis of the proposed method on single and multi-source domain adaptation, unsupervised and supervised domain adaptation, and numerous similarity measures for creating the sentiment sensitive thesaurus.
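The feature-expansion step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the thesaurus entries, the `EXP_` prefix, the `top_k` and `weight` parameters, and the weighting scheme are all hypothetical, and the construction of the sentiment sensitive thesaurus itself is not shown.

```python
from collections import Counter

# Hypothetical thesaurus: word -> list of (related word, similarity),
# sorted by decreasing similarity. In the paper this would be built from
# labeled source-domain data and unlabeled source and target data.
TOY_THESAURUS = {
    "excellent": [("great", 0.8), ("superb", 0.6)],
    "boring": [("dull", 0.7)],
}


def expand_features(tokens, thesaurus, top_k=2, weight=0.5):
    """Return a bag-of-words vector extended with related-word features.

    Original tokens keep their counts. Each of the top_k related words is
    added under an "EXP_" key, weighted by similarity (an assumed scheme).
    """
    counts = Counter(tokens)
    expanded = dict(counts)
    for word, count in counts.items():
        for related, similarity in thesaurus.get(word, [])[:top_k]:
            key = "EXP_" + related
            expanded[key] = expanded.get(key, 0.0) + weight * similarity * count
    return expanded


if __name__ == "__main__":
    print(expand_features(["excellent", "plot"], TOY_THESAURUS))
```

Applying the same expansion to training and test documents lets a classifier trained on one domain receive signal from words that appear only in the other, provided the thesaurus links them.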

    Opinion mining with the SentiWordNet lexical resource

    Sentiment classification concerns the application of automatic methods for predicting the orientation of sentiment present in text documents. It is an important subject in opinion mining research, with applications in a number of areas including recommender and advertising systems, customer intelligence and information retrieval. SentiWordNet is a lexical resource of sentiment information for terms in the English language designed to assist in opinion mining tasks, where each term is associated with numerical scores for positive and negative sentiment information. A resource that makes term level sentiment information readily available could be of use in building more effective sentiment classification methods. This research presents the results of an experiment that applied the SentiWordNet lexical resource to the problem of automatic sentiment classification of film reviews. First, a data set of relevant features extracted from text documents using SentiWordNet was designed and implemented. The resulting feature set was then used as input for training a support vector machine classifier for predicting the sentiment orientation of the underlying film review. Several scenarios exploring variations on the parameters that generate the data set, outlier removal and feature selection were executed. The results obtained are compared to other methods documented in the literature. They were found to be in line with other experiments that propose similar approaches and use the same data set of film reviews, indicating that SentiWordNet could become an important resource for the task of sentiment classification. Considerations on future improvements are also presented, based on a detailed analysis of classification results.
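As a rough illustration of the feature-extraction step this abstract describes, the sketch below aggregates per-term positive and negative scores into a small feature dictionary. The score table, the feature names and the tokenisation are assumptions; the values are not taken from SentiWordNet, and the support vector machine training step is omitted.

```python
# Hypothetical term scores in the style of a sentiment lexicon:
# term -> (positive score, negative score). Not real SentiWordNet values.
TOY_SCORES = {
    "brilliant": (0.75, 0.0),
    "moving": (0.5, 0.125),
    "dull": (0.0, 0.625),
}

PUNCTUATION = ".,!?;:\"'"


def extract_features(text, scores=TOY_SCORES):
    """Aggregate term-level scores into document-level features."""
    pos_total = 0.0
    neg_total = 0.0
    matched = 0
    for token in text.lower().split():
        token = token.strip(PUNCTUATION)
        if token in scores:
            pos, neg = scores[token]
            pos_total += pos
            neg_total += neg
            matched += 1
    return {
        "pos_sum": pos_total,
        "neg_sum": neg_total,
        "pos_minus_neg": pos_total - neg_total,
        "matched_terms": matched,
    }


if __name__ == "__main__":
    print(extract_features("A brilliant, moving film."))
```

Feature dictionaries of this kind, one per review, could then be turned into numeric vectors and passed to any standard classifier.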

    Sentiment-Based Assessment Of Electronic Mixed-Motive Communication - A Comparison Of Approaches

    In this paper, we seek to analyse specific types of bilateral electronic communication processes, namely processes in which there is a distinction between the individual goals of the communicating parties and their joint goals. We argue that there exists a distinction between successful and unsuccessful processes. This distinction is manifest in the communication patterns used by the participants. Sentiment analysis can enable researchers to identify these distinctions automatically, based on a classification model previously trained for the exact type of communication process. This paper discusses an adaptation of sentiment-based techniques for the domain of electronic business negotiations.

    Doctor of Philosophy

    The theme of my dissertation is users' opinion learning. We propose three different studies to learn users' opinions using various approaches and to address several important research questions. Firstly, in order to discover the significant factors that induce the rating differences from user-generated reviews, we first extract possible specific influences from the review, known as aspects, and then we propose an unsupervised aspect-based sentiment learning system that assigns sentiment scores to potential aspects. Based on the sentiment scores, we adopt linear regression models to identify the aspects that lead to the rating differences. Food quality, service, dessert and drink quality, location, value, and general opinion toward the restaurants are recognized as the main influential factors that cause the Yelp rating differences among chain restaurants. Secondly, to understand the impact of time reminder designs such as a countdown clock, a progress bar indicator, and a remaining-number-of-advertisements reminder embedded in specific long and short advertisement videos, we propose a 4 by 2 between-subject experimental study with follow-up survey questions to collect users' opinions toward different temporal designs in the video. Thirdly, our study analyzes the advertisement video designs from the content level. We design the advertisement video with high and low content relevance levels with the desired video. A 2 by 2 between-subject experimental study with follow-up survey questions is proposed. Results point out that advertisement videos with high content relevance levels can lead to shorter video duration perception and less negative attitudes toward the video, but can also diminish the effectiveness of the advertisement, with users recalling fewer products and brands promoted in both longer and shorter advertisement videos.