Better Document-level Sentiment Analysis from RST Discourse Parsing
Discourse structure is the hidden link between surface features and
document-level properties, such as sentiment polarity. We show that the
discourse analyses produced by Rhetorical Structure Theory (RST) parsers can
improve document-level sentiment analysis, via composition of local information
up the discourse tree. First, we show that reweighting discourse units
according to their position in a dependency representation of the rhetorical
structure can yield substantial improvements on lexicon-based sentiment
analysis. Next, we present a recursive neural network over the RST structure,
which offers significant improvements over classification-based methods.
Comment: Published at Empirical Methods in Natural Language Processing (EMNLP 2015)
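The reweighting idea in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the toy `LEXICON`, the helper names, and the particular weight schedule (linear decay with depth, floored at 0.5). Each discourse unit is given as its text plus its depth in a dependency view of the RST tree, with depth 0 for the root unit.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon; a real system would use a full sentiment lexicon.
LEXICON: Dict[str, float] = {
    "great": 1.0, "good": 0.5, "bad": -0.5, "terrible": -1.0,
}


def unit_score(text: str) -> float:
    """Sum the lexicon polarity of every token in one discourse unit."""
    return sum(LEXICON.get(tok, 0.0) for tok in text.lower().split())


def depth_weight(depth: int, floor: float = 0.5, scale: float = 6.0) -> float:
    """Assumed schedule: units nearer the root count more, never below `floor`."""
    return max(floor, 1.0 - depth / scale)


def document_score(units: List[Tuple[str, int]]) -> float:
    """Combine unit scores, weighting each by its depth in the discourse tree."""
    return sum(depth_weight(depth) * unit_score(text) for text, depth in units)


# (unit text, depth in the dependency view of the RST tree)
units = [
    ("the plot is terrible", 0),
    ("although the soundtrack is good", 2),
]
print(document_score(units))
```

In this sketch the deeper, concessive unit contributes less than the root unit, so the weighted score is more negative than a flat sum of the two units would be.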
Approximation Complexity of Maximum A Posteriori Inference in Sum-Product Networks
We discuss the computational complexity of approximating maximum a posteriori
inference in sum-product networks. We first show NP-hardness in trees of height
two by a reduction from maximum independent set; this implies
non-approximability within a sublinear factor. We show that this is a tight
bound, as we can find an approximation within a linear factor in networks of
height two. We then show that, in trees of height three, it is NP-hard to
approximate the problem within a factor $2^{f(n)}$ for any sublinear function
$f$ of the size of the input $n$. Again, this bound is tight, as we prove that
the usual max-product algorithm finds (in any network) approximations within
factor $2^{c \cdot n}$ for some constant $c < 1$. Last, we present a simple
algorithm, and show that it provably produces solutions at least as good as,
and potentially much better than, the max-product algorithm. We empirically
analyze the proposed algorithm against max-product using synthetic and
realistic networks.
Comment: 18 pages
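To make the max-product baseline mentioned in the abstract above concrete, here is a small sketch of why it only approximates MAP. The tuple-based node encoding and all function names are hypothetical, and the network is assumed to be decomposable (children of a product node have disjoint variables). It shows the usual max-product heuristic only, not the improved algorithm the abstract proposes.

```python
from itertools import product as cartesian

# Hypothetical node encoding, for illustration only:
#   ("leaf", var, value)         indicator [var == value]
#   ("prod", [children])         product node
#   ("sum", [(weight, child)])   sum node


def evaluate(node, assignment):
    """Value of the network at a complete assignment."""
    kind = node[0]
    if kind == "leaf":
        return 1.0 if assignment[node[1]] == node[2] else 0.0
    if kind == "prod":
        result = 1.0
        for child in node[1]:
            result *= evaluate(child, assignment)
        return result
    return sum(w * evaluate(c, assignment) for w, c in node[1])


def max_product(node):
    """Replace each sum by a weighted max; return (bound, assignment)."""
    kind = node[0]
    if kind == "leaf":
        return 1.0, {node[1]: node[2]}
    if kind == "prod":
        value, assignment = 1.0, {}
        for child in node[1]:
            v, a = max_product(child)
            value *= v
            assignment.update(a)
        return value, assignment
    candidates = []
    for w, child in node[1]:
        v, a = max_product(child)
        candidates.append((w * v, a))
    return max(candidates, key=lambda pair: pair[0])


def brute_force_map(node, domains):
    """Exact MAP by enumeration; feasible only for tiny networks."""
    names = sorted(domains)
    best = None
    for values in cartesian(*(domains[n] for n in names)):
        assignment = dict(zip(names, values))
        value = evaluate(node, assignment)
        if best is None or value > best[0]:
            best = (value, assignment)
    return best


# Two low-weight children agree on X=1, so together they beat X=0.
spn = ("sum", [
    (0.4, ("leaf", "X", 0)),
    (0.3, ("leaf", "X", 1)),
    (0.3, ("leaf", "X", 1)),
])
mp_value, mp_assignment = max_product(spn)
exact_value, exact_assignment = brute_force_map(spn, {"X": [0, 1]})
print(mp_assignment, evaluate(spn, mp_assignment))
print(exact_assignment, exact_value)
```

In this toy network max-product follows the single heaviest sum child and selects X=0, while enumeration finds that X=1 has the higher total value, which is the kind of gap the approximation bounds in the abstract quantify.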
Deep Memory Networks for Attitude Identification
We consider the task of identifying attitudes towards a given set of entities
from text. Conventionally, this task is decomposed into two separate subtasks:
target detection that identifies whether each entity is mentioned in the text,
either explicitly or implicitly, and polarity classification that classifies
the exact sentiment towards an identified entity (the target) into positive,
negative, or neutral.
Instead, we show that attitude identification can be solved with an
end-to-end machine learning architecture, in which the two subtasks are
interleaved by a deep memory network. In this way, signals produced in target
detection provide clues for polarity classification, and conversely, the
predicted polarity provides feedback to the identification of targets.
Moreover, the treatments for the set of targets also influence each other --
the learned representations may share the same semantics for some targets but
vary for others. The proposed deep memory network, the AttNet, outperforms
methods that do not consider the interactions between the subtasks or those
among the targets, including conventional machine learning methods and the
state-of-the-art deep learning models.
Comment: Accepted to WSDM'1
Toward a Corpus of Cantonese Verbal Comments and their Classification by Multi-dimensional Analysis
The information explosion in modern days across various media calls for effective opinion mining for timely digestion of public views and appropriate follow-up actions. Current studies on sentiment analysis have primarily focused on uncovering aspects like subjectivity, sentiment and credibility from written data, while spoken data are less addressed. This paper reports on our pilot work on constructing a corpus of Cantonese verbal comments and making use of multi-dimensional analysis to characterise different opinion types therein. Preliminary findings on the dimensions identified and their association with various communicative functions are presented, with an outlook on their potential application in subjectivity analysis and opinion classification.
Opinion Holder and Target Extraction on Opinion Compounds – A Linguistic Approach
We present an approach to the new task of opinion holder and target extraction on opinion compounds. Opinion compounds (e.g. user rating or victim support) are noun compounds whose head is an opinion noun. We not only examine features known to be effective for noun compound analysis, such as paraphrases and semantic classes of heads and modifiers, but also propose novel features tailored to this new task. Among them, we examine paraphrases that jointly consider holders and targets, a verb detour in which noun heads are replaced by related verbs, a global head constraint allowing inference between different compounds, and the categorization of the sentiment view that the head conveys.
A machine-learning approach to negation and speculation detection for sentiment analysis
Recognizing negative and speculative information is highly relevant for sentiment analysis. This paper presents a machine-learning approach to automatically detect this kind of information in the review domain. The resulting system works in two steps: in the first pass, negation/speculation cues are identified, and in the second phase the full scope of these cues is determined. The system is trained and evaluated on the Simon Fraser University Review corpus, which is extensively used in opinion mining. The results show how the proposed method outstrips the baseline by as much as roughly 20% in the negation cue detection and around 13% in the scope recognition, both in terms of F1. In speculation, the performance obtained in the cue prediction phase is close to that obtained by a human rater carrying out the same task. In the scope detection, the results are also promising and represent a substantial improvement on the baseline (up by roughly 10%). A detailed error analysis is also provided. The extrinsic evaluation shows that the correct identification of cues and scopes is vital for the task of sentiment analysis.
Maite Taboada from the Natural Sciences and Engineering Research Council of Canada (Discovery Grant 261104-2008). This work was partly funded by the Spanish Ministry of Education and Science (TIN2009-14057-C03-03 Project) and the Andalusian Ministry of Economy, Innovation and Science (TIC 07629 and TIC 07684 Projects).
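The two-step structure described in the abstract above (find cues first, then determine their scope) can be sketched as follows. This is a rule-based stand-in, not the machine-learning system the abstract describes: the cue list, the toy lexicon, and the "scope runs to the next punctuation mark" rule are all assumptions chosen to keep the example short.

```python
import re
from typing import Dict, List, Set

# Hypothetical resources; the system in the abstract learns both steps from data.
NEGATION_CUES: Set[str] = {"not", "no", "never", "without"}
PUNCTUATION: Set[str] = {".", ",", ";", "!", "?"}
LEXICON: Dict[str, float] = {"good": 1.0, "great": 1.0, "bad": -1.0, "boring": -1.0}


def tokenize(text: str) -> List[str]:
    """Lowercase and split into word tokens and punctuation marks."""
    return re.findall(r"[a-z']+|[.,;!?]", text.lower())


def find_cues(tokens: List[str]) -> List[int]:
    """Step 1: indices of tokens that act as negation cues."""
    return [i for i, tok in enumerate(tokens) if tok in NEGATION_CUES]


def find_scope(tokens: List[str], cue_index: int) -> range:
    """Step 2 (assumed rule): scope runs from the cue to the next punctuation."""
    end = cue_index + 1
    while end < len(tokens) and tokens[end] not in PUNCTUATION:
        end += 1
    return range(cue_index + 1, end)


def polarity(text: str) -> float:
    """Lexicon polarity, with the sign flipped for tokens inside a negation scope."""
    tokens = tokenize(text)
    negated: Set[int] = set()
    for cue in find_cues(tokens):
        negated.update(find_scope(tokens, cue))
    total = 0.0
    for i, tok in enumerate(tokens):
        score = LEXICON.get(tok, 0.0)
        total += -score if i in negated else score
    return total


print(polarity("The film is good."))
print(polarity("The film is not good, but the acting is great."))
```

Comparing the two calls shows the extrinsic effect the abstract refers to: a sentiment score changes once cue and scope are taken into account.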