Comprehensive Review of Opinion Summarization
The abundance of opinions on the web has kindled the study of opinion summarization over the last few years. People have introduced various techniques and paradigms for solving this special task. This survey attempts to systematically investigate the different techniques and approaches used in opinion summarization. We provide a multi-perspective classification of the approaches used and highlight some of the key weaknesses of these approaches. This survey also covers evaluation techniques and data sets used in studying the opinion summarization problem. Finally, we provide insights into some of the challenges that are left to be addressed, as this will help set the trend for future research in this area.
Unpublished; not peer reviewed.
Attentive Aspect Modeling for Review-aware Recommendation
In recent years, many studies extract aspects from user reviews and integrate
them with ratings for improving the recommendation performance. The common
aspects mentioned in a user's reviews and a product's reviews indicate indirect
connections between the user and product. However, these aspect-based methods
suffer from two problems. First, the common aspects are usually very sparse,
which is caused by the sparsity of user-product interactions and the diversity
of individual users' vocabularies. Second, a user's interest in aspects can
differ from product to product, yet existing methods usually assume it is
static. In this paper, we propose an Attentive
Aspect-based Recommendation Model (AARM) to tackle these challenges. For the
first problem, to enrich the aspect connections between user and product,
besides common aspects, AARM also models the interactions between synonymous
and similar aspects. For the second problem, a neural attention network which
simultaneously considers user, product and aspect information is constructed to
capture a user's attention towards aspects when examining different products.
Extensive quantitative and qualitative experiments show that AARM can
effectively alleviate the two aforementioned problems and significantly
outperforms several state-of-the-art recommendation methods on the top-N
recommendation task.
Comment: Camera-ready manuscript for TOI
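As a rough illustration of the second idea in the abstract above (attention over aspects that changes with the product being examined), the following is a minimal sketch and not AARM's actual architecture. The scoring function, the two-dimensional vectors, and all names (`aspect_attention`, `phone`, `cable`) are assumptions made up for this example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aspect_attention(user, product, aspects):
    """Return attention weights over aspects for one (user, product) pair.

    Hypothetical scoring rule: sum_k user[k] * product[k] * aspect[k].
    A real model would learn this with a neural network.
    """
    scores = [
        sum(u * p * a for u, p, a in zip(user, product, vec))
        for vec in aspects.values()
    ]
    return dict(zip(aspects.keys(), softmax(scores)))

# Toy embeddings (assumed values, for illustration only).
user = [1.0, 0.2]
aspects = {"battery": [1.0, 0.0], "price": [0.0, 1.0]}
phone = [1.0, 0.5]
cable = [0.1, 2.0]

phone_weights = aspect_attention(user, phone, aspects)
cable_weights = aspect_attention(user, cable, aspects)
print(phone_weights)
print(cable_weights)
```

With these toy values, the same user weights "battery" more heavily for the phone and "price" more heavily for the cable, which is the product-dependent behaviour the abstract describes.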
Automatically detecting open academic review praise and criticism
This is an accepted manuscript of an article published by Emerald in Online Information Review on 15 June 2020.
The accepted version of the publication may differ from the final published version, accessible at https://doi.org/10.1108/OIR-11-2019-0347.
Purpose: Peer reviewer evaluations of academic papers are known to be variable in content and overall judgements but are important academic publishing safeguards. This article introduces a sentiment analysis program, PeerJudge, to detect praise and criticism in peer evaluations. It is designed to support editorial management decisions and reviewers in the scholarly publishing process and for grant funding decision workflows. The initial version of PeerJudge is tailored for reviews from F1000Research’s open peer review publishing platform.
Design/methodology/approach: PeerJudge uses a lexical sentiment analysis approach with a human-coded initial sentiment lexicon and machine learning adjustments and additions. It was built with an F1000Research development corpus and evaluated on a different F1000Research test corpus using reviewer ratings.
Findings: PeerJudge can predict F1000Research judgements from negative evaluations in reviewers’ comments more accurately than baseline approaches, although not from positive reviewer comments, which seem to be largely unrelated to reviewer decisions. Within the F1000Research mode of post-publication peer review, the absence of any detected negative comments is a reliable indicator that an article will be ‘approved’, but the presence of moderately negative comments could lead to either an approved or approved with reservations decision.
Originality/value: PeerJudge is the first transparent AI approach to peer review sentiment detection. It may be used to identify anomalous reviews with text potentially not matching judgements for individual checks or systematic bias assessments.
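The lexical approach described in the abstract above can be pictured with a small sketch. This is not PeerJudge itself: the lexicon entries, their strengths, and the function name `score_review` are hypothetical, and the machine learning adjustments mentioned in the abstract are omitted.

```python
import re

# Hypothetical mini-lexicon: term -> strength
# (positive = praise, negative = criticism). Illustrative only.
LEXICON = {
    "clear": 2, "rigorous": 3, "novel": 2,
    "unclear": -2, "flawed": -4, "unsupported": -3,
}

def score_review(text):
    """Return (strongest praise, strongest criticism) found in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    strengths = [LEXICON[t] for t in tokens if t in LEXICON]
    praise = max([s for s in strengths if s > 0], default=0)
    criticism = min([s for s in strengths if s < 0], default=0)
    return praise, criticism

review = "The method is novel and clear, but the main claim is unsupported."
praise, criticism = score_review(review)
print(praise, criticism)  # prints: 2 -3
```

Reporting praise and criticism separately, rather than as one net score, mirrors the abstract's finding that negative comments carry most of the signal about the reviewer's decision.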
Statistical Inferences for Polarity Identification in Natural Language
Information forms the basis for all human behavior, including the ubiquitous
decision-making that people constantly perform in their everyday lives. It is
thus the mission of researchers to understand how humans process information to
reach decisions. In order to facilitate this task, this work proposes a novel
method of studying the reception of granular expressions in natural language.
The approach utilizes LASSO regularization as a statistical tool to extract
decisive words from textual content and draw statistical inferences based on
the correspondence between the occurrences of words and an exogenous response
variable. Accordingly, the method immediately suggests significant implications
for social sciences and Information Systems research: everyone can now identify
text segments and word choices that are statistically relevant to authors or
readers and, based on this knowledge, test hypotheses from behavioral research.
We demonstrate the contribution of our method by examining how authors
communicate subjective information through narrative materials. This allows us
to answer the question of which words to choose when communicating negative
information. On the other hand, we show that investors trade not only upon
facts in financial disclosures but are distracted by filler words and
non-informative language. Practitioners - for example those in the fields of
investor communications or marketing - can exploit our insights to enhance
their writings based on the true perception of word choice.
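To make the word-selection idea in the abstract above concrete, here is a minimal sketch of LASSO fitted by coordinate descent on word counts. It is not the authors' implementation: the toy vocabulary, the counts, the response values, and the penalty `lam=0.1` are assumptions, and no intercept or standardization is used.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 without an intercept."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0
            norm = 0.0
            for i in range(n):
                pred = sum(X[i][k] * beta[k] for k in range(p))
                # Residual with word j's own contribution added back.
                partial_residual = y[i] - pred + X[i][j] * beta[j]
                rho += X[i][j] * partial_residual
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n) if norm else 0.0
    return beta

# Toy data (assumed): rows are documents, columns are word counts.
vocabulary = ["loss", "growth", "the"]
X = [
    [1, 0, 1],
    [0, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
y = [-1.0, 1.0, -1.0, 1.0, 0.0, 0.0]  # exogenous response, e.g. a return

beta = lasso_coordinate_descent(X, y, lam=0.1)
selected = [w for w, b in zip(vocabulary, beta) if b != 0.0]
print(selected)  # the filler word "the" is dropped
```

The L1 penalty sets the coefficient of the uninformative word exactly to zero, so the surviving words are the ones statistically tied to the response in this toy data set.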
Multinomial Inverse Regression for Text Analysis
Text data, including speeches, stories, and other document forms, are often
connected to sentiment variables that are of interest for research in
marketing, economics, and elsewhere. They are also very high dimensional and
difficult to incorporate into statistical analyses. This article introduces a
straightforward framework of sentiment-preserving dimension reduction for text
data. Multinomial inverse regression is introduced as a general tool for
simplifying predictor sets that can be represented as draws from a multinomial
distribution, and we show that logistic regression of phrase counts onto
document annotations can be used to obtain low dimension document
representations that are rich in sentiment information. To facilitate this
modeling, a novel estimation technique is developed for multinomial logistic
regression with very high-dimension response. In particular, independent
Laplace priors with unknown variance are assigned to each regression
coefficient, and we detail an efficient routine for maximization of the joint
posterior over coefficients and their prior scale. This "gamma-lasso" scheme
yields stable and effective estimation for general high-dimension logistic
regression, and we argue that it will be superior to current methods in many
settings. Guidelines for prior specification are provided, algorithm
convergence is detailed, and estimator properties are outlined from the
perspective of the literature on non-concave likelihood penalization. Related
work on sentiment analysis from statistics, econometrics, and machine learning
is surveyed and connected. Finally, the methods are applied in two detailed
examples and we provide out-of-sample prediction studies to illustrate their
effectiveness.
Comment: Published in the Journal of the American Statistical Association 108,
2013, with discussion (rejoinder is here: http://arxiv.org/abs/1304.4200).
Software is available in the textir package for
- …