2 research outputs found
Unifying Topic, Sentiment & Preference in an HDP-Based Rating Regression Model for Online Reviews
This paper proposes a new HDP-based rating regression model for online reviews
named Topic-Sentiment-Preference Regression Analysis (TSPRA). TSPRA combines
topics (i.e. product aspects), word sentiment and user preference as regression
factors, and is able to perform topic clustering, review rating prediction,
sentiment analysis and what we introduce as "critical aspect" analysis, all
in one framework. TSPRA extends sentiment approaches by incorporating "user
preference", a key concept in collaborative filtering (CF) models, while
differing from current CF models by decoupling "user
preference" and "sentiment" as independent factors. Our experiments on 22 Amazon
datasets show overwhelmingly better performance in rating prediction than the
state-of-the-art model FLAME (2015) in terms of error, Pearson's correlation and
number of inverted pairs. For sentiment analysis, we
compare the derived word sentiments against a public sentiment resource
SenticNet3 and our sentiment estimations clearly make more sense in the context
of online reviews. Finally, because "user preference" is decoupled from
"sentiment", TSPRA can evaluate a new concept, "critical aspects", defined as
the product aspects that users care strongly about but that are commented on
negatively in reviews. Improving such "critical aspects" could be the most
effective way to enhance user experience.
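The three evaluation measures named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the abstract does not say which error measure is used, so mean absolute error is assumed here, and an "inverted pair" is assumed to mean a pair of reviews whose predicted ratings are ordered opposite to their actual ratings. All function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted ratings (assumed error measure)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pearson_correlation(actual, predicted):
    """Pearson's correlation coefficient between actual and predicted ratings."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_a * var_p)


def inverted_pairs(actual, predicted):
    """Count review pairs whose predicted order contradicts their actual order.

    Ties in either list are not counted as inversions (an assumption).
    """
    count = 0
    for (a1, p1), (a2, p2) in combinations(zip(actual, predicted), 2):
        if (a1 - a2) * (p1 - p2) < 0:
            count += 1
    return count


actual = [1, 2, 3, 4, 5]
predicted = [1.5, 1.0, 3.0, 4.5, 4.0]
mae = mean_absolute_error(actual, predicted)
corr = pearson_correlation(actual, predicted)
inversions = inverted_pairs(actual, predicted)
```

Lower error and fewer inverted pairs indicate better predictions, while a Pearson's correlation closer to 1 indicates better agreement with the actual ratings.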
Large-Scale Joint Topic, Sentiment & User Preference Analysis for Online Reviews
This paper presents a non-trivial reconstruction of a previous joint
topic-sentiment-preference review model TSPRA with stick-breaking
representation under the framework of variational inference (VI) and stochastic
variational inference (SVI). TSPRA is a Gibbs sampling based model that jointly
infers topics, word sentiments and user preferences and has been shown to
achieve good performance, but on large data sets it can only learn from a
relatively small sample. We develop the variational models vTSPRA and svTSPRA
to reduce running time, and our new approach is capable of processing millions
of reviews. We rebuild the generative process, improve the rating regression,
derive and present the coordinate-ascent updates of the variational parameters,
and show that the time complexity of each iteration is theoretically linear in
the corpus size. Experiments on Amazon data sets show that the new approach
converges faster than TSPRA and attains better results given the same amount of
time. In addition, we adapt svTSPRA into an online algorithm, ovTSPRA, that can
monitor oscillations of sentiment and preference over time. Some interesting
fluctuations are captured and possible explanations are provided. The results
give strong visual evidence that user preference is better treated as a
factor independent of sentiment.
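The stick-breaking representation mentioned in the abstract above can be illustrated with a short sketch. This shows only the generic truncated stick-breaking construction for a single Dirichlet process; the paper works with a hierarchical model and its own variational updates, which are not reproduced here. The function name, the concentration parameter `alpha` and the truncation level are all assumptions made for illustration.

```python
import random


def stick_breaking_weights(alpha, truncation, rng=None):
    """Draw mixture weights from a truncated stick-breaking construction.

    Each step breaks off a Beta(1, alpha) fraction of the remaining stick;
    the final component takes whatever is left so the weights sum to 1.
    """
    rng = rng or random.Random()
    weights = []
    remaining = 1.0
    for _ in range(truncation - 1):
        fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    weights.append(remaining)
    return weights


# Hypothetical settings: concentration 1.0, at most 10 topics.
weights = stick_breaking_weights(alpha=1.0, truncation=10, rng=random.Random(0))
```

A truncated construction like this is a common choice in variational treatments because it yields a fixed number of weights to optimize, with a smaller `alpha` concentrating mass on the first few components.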