232 research outputs found
AN APPROACH TO SENTIMENT ANALYSIS –THE CASE OF AIRLINE QUALITY RATING
Sentiment mining has been commonly associated with the analysis of a text string to determine whether a corpus is of a negative or positive opinion. Recently, sentiment mining has been extended to address problems such as distinguishing objective from subjective propositions, and determining the sources and topics of different opinions expressed in textual data sets such as web blogs, tweets, message board reviews, and news. Companies can leverage opinion polarity and sentiment topic recognition to gain a deeper understanding of the drivers and the overall scope of sentiments. These insights can advance competitive intelligence, improve customer service, attain better brand image, and enhance competitiveness. This research paper proposes a sentiment mining approach which detects sentiment polarity and sentiment topic from text. The approach includes a sentiment topic recognition model that is based on Correlated Topics Models (CTM) with Variational Expectation-Maximization (VEM) algorithm. We validate the effectiveness and efficiency of this model using airline data from Twitter. We also examine the reputation of three major airlines by computing their Airline Quality Rating (AQR) based on the output from our approach
Personalized Expert Recommendation: Models and Algorithms
Many large-scale information sharing systems including social media systems, questionanswering
sites and rating and reviewing applications have been growing rapidly, allowing
millions of human participants to generate and consume information on an unprecedented
scale. To manage the sheer growth of information generation, there comes the need to enable
personalization of information resources for users — to surface high-quality content
and feeds, to provide personally relevant suggestions, and so on. A fundamental task in
creating and supporting user-centered personalization systems is to build rich user profile
to aid recommendation for better user experience.
Therefore, in this dissertation research, we propose models and algorithms to facilitate
the creation of new crowd-powered personalized information sharing systems. Specifically,
we first give a principled framework to enable personalization of resources so that
information seekers can be matched with customized knowledgeable users based on their
previous historical actions and contextual information; We then focus on creating rich
user models that allows accurate and comprehensive modeling of user profiles for long
tail users, including discovering user’s known-for profile, user’s opinion bias and user’s
geo-topic profile. In particular, this dissertation research makes two unique contributions:
First, we introduce the problem of personalized expert recommendation and propose
the first principled framework for addressing this problem. To overcome the sparsity issue,
we investigate the use of user’s contextual information that can be exploited to build robust
models of personal expertise, study how spatial preference for personally-valuable expertise
varies across regions, across topics and based on different underlying social communities,
and integrate these different forms of preferences into a matrix factorization-based
personalized expert recommender.
Second, to support the personalized recommendation on experts, we focus on modeling
and inferring user profiles in online information sharing systems. In order to tap
the knowledge of most majority of users, we provide frameworks and algorithms to accurately
and comprehensively create user models by discovering user’s known-for profile,
user’s opinion bias and user’s geo-topic profile, with each described shortly as follows:
—We develop a probabilistic model called Bayesian Contextual Poisson Factorization
to discover what users are known for by others. Our model considers as input a small fraction
of users whose known-for profiles are already known and the vast majority of users for
whom we have little (or no) information, learns the implicit relationships between user?s
known-for profiles and their contextual signals, and finally predict known-for profiles for
those majority of users.
—We explore user’s topic-sensitive opinion bias, propose a lightweight semi-supervised
system called “BiasWatch” to semi-automatically infer the opinion bias of long-tail users,
and demonstrate how user’s opinion bias can be exploited to recommend other users with
similar opinion in social networks.
— We study how a user’s topical profile varies geo-spatially and how we can model
a user’s geo-spatial known-for profile as the last step in our dissertation for creation of
rich user profile. We propose a multi-layered Bayesian hierarchical user factorization to
overcome user heterogeneity and an enhanced model to alleviate the sparsity issue by integrating
user contexts into the two-layered hierarchical user model for better representation
of user’s geo-topic preference by others
SSentiaA: A Self-Supervised Sentiment Analyzer for Classification From Unlabeled Data
In recent years, supervised machine learning (ML) methods have realized remarkable performance gains for sentiment classification utilizing labeled data. However, labeled data are usually expensive to obtain, thus, not always achievable. When annotated data are unavailable, the unsupervised tools are exercised, which still lag behind the performance of supervised ML methods by a large margin. Therefore, in this work, we focus on improving the performance of sentiment classification from unlabeled data. We present a self-supervised hybrid methodology SSentiA (Self-supervised Sentiment Analyzer) that couples an ML classifier with a lexicon-based method for sentiment classification from unlabeled data. We first introduce LRSentiA (Lexical Rule-based Sentiment Analyzer), a lexicon-based method to predict the semantic orientation of a review along with the confidence score of prediction. Utilizing the confidence scores of LRSentiA, we generate highly accurate pseudo-labels for SSentiA that incorporates a supervised ML algorithm to improve the performance of sentiment classification for less polarized and complex reviews. We compare the performances of LRSentiA and SSSentA with the existing unsupervised, lexicon-based and self-supervised methods in multiple datasets. The LRSentiA performs similarly to the existing lexicon-based methods in both binary and 3-class sentiment analysis. By combining LRSentiA with an ML classifier, the hybrid approach SSentiA attains 10%–30% improvements in macro F1 score for both binary and 3-class sentiment analysis. The results suggest that in domains where annotated data are unavailable, SSentiA can significantly improve the performance of sentiment classification. Moreover, we demonstrate that using 30%–60% annotated training data, SSentiA delivers similar performances of the fully labeled training dataset
- …