Better Document-level Sentiment Analysis from RST Discourse Parsing
Discourse structure is the hidden link between surface features and
document-level properties, such as sentiment polarity. We show that the
discourse analyses produced by Rhetorical Structure Theory (RST) parsers can
improve document-level sentiment analysis, via composition of local information
up the discourse tree. First, we show that reweighting discourse units
according to their position in a dependency representation of the rhetorical
structure can yield substantial improvements on lexicon-based sentiment
analysis. Next, we present a recursive neural network over the RST structure,
which offers significant improvements over classification-based methods.
Comment: Published at Empirical Methods in Natural Language Processing (EMNLP 2015)
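The reweighting idea in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy `LEXICON`, the `DiscourseUnit` type, the convention that the root of the discourse dependency tree has depth 1, and the linear decay in `depth_weight`.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon; a real system would use a full sentiment lexicon.
LEXICON = {"great": 1.0, "good": 0.5, "bad": -0.5, "terrible": -1.0}


@dataclass
class DiscourseUnit:
    text: str
    depth: int  # assumed convention: 1 = root of the discourse dependency tree


def lexicon_score(text: str) -> float:
    """Sum the lexicon polarity of every token in one discourse unit."""
    return sum(LEXICON.get(tok.strip(".,!?").lower(), 0.0) for tok in text.split())


def depth_weight(depth: int, decay: float = 0.5, floor: float = 0.25) -> float:
    """Assumed weighting: units deeper in the tree count less, down to a floor."""
    return max(floor, 1.0 - decay * (depth - 1))


def document_score(units: list[DiscourseUnit]) -> float:
    """Depth-weighted sum of per-unit lexicon scores."""
    return sum(depth_weight(u.depth) * lexicon_score(u.text) for u in units)


units = [
    DiscourseUnit("The plot was terrible,", depth=2),
    DiscourseUnit("but overall the film is great.", depth=1),
]
unweighted = sum(lexicon_score(u.text) for u in units)
weighted = document_score(units)
```

In this toy input the two units cancel out without weighting, while the depth-weighted score leans toward the unit nearer the root.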
Neural Discourse Structure for Text Categorization
We show that discourse structure, as defined by Rhetorical Structure Theory
and provided by an existing discourse parser, benefits text categorization. Our
approach uses a recursive neural network and a newly proposed attention
mechanism to compute a representation of the text that focuses on salient
content, from the perspective of both RST and the task. Experiments consider
variants of the approach and illustrate its strengths and weaknesses.
Comment: ACL 2017 camera-ready version
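As a rough illustration of recursive, attention-weighted composition over a discourse tree, the sketch below pools child representations at each node. It is a toy with no learned parameters and is not the model from the paper: the `Node` type, the dot-product attention scores, and the `tanh` combination are all assumptions chosen for brevity.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    vec: list  # hypothetical fixed embedding of this discourse unit
    children: list = field(default_factory=list)


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def compose(node):
    """Return a representation of the subtree rooted at `node`.

    Leaves return their own vector. Internal nodes attend over their
    children's representations (assumed dot-product scores) and combine
    the pooled result with their own vector.
    """
    if not node.children:
        return list(node.vec)
    child_reps = [compose(c) for c in node.children]
    weights = softmax([dot(node.vec, r) for r in child_reps])
    pooled = [
        sum(w * r[i] for w, r in zip(weights, child_reps))
        for i in range(len(node.vec))
    ]
    return [math.tanh(a + b) for a, b in zip(node.vec, pooled)]


root = Node([1.0, 0.0], [Node([1.0, 0.0]), Node([0.0, 1.0])])
doc_rep = compose(root)
```

Here the first child receives the larger attention weight because it aligns with the parent vector; a trained model would learn these scores rather than compute them from fixed embeddings.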
Cross-lingual RST Discourse Parsing
Discourse parsing is an integral part of understanding information flow and
argumentative structure in documents. Most previous research has focused on
inducing and evaluating models from the English RST Discourse Treebank.
However, discourse treebanks for other languages exist, including Spanish,
German, Basque, Dutch and Brazilian Portuguese. The treebanks share the same
underlying linguistic theory, but differ slightly in the way documents are
annotated. In this paper, we present (a) a new discourse parser which is
simpler than, yet competitive with, the state of the art for English
(significantly better on 2 of 3 metrics), (b) a harmonization of discourse
treebanks across languages,
enabling us to present (c) what to the best of our knowledge are the first
experiments on cross-lingual discourse parsing.
Comment: To be published in EACL 2017, 13 pages
Joint Syntacto-Discourse Parsing and the Syntacto-Discourse Treebank
Discourse parsing has long been treated as a stand-alone problem independent
from constituency or dependency parsing. Most attempts at this problem are
pipelined rather than end-to-end, sophisticated, and not self-contained: they
assume gold-standard text segmentations (Elementary Discourse Units), and use
external parsers for syntactic features. In this paper we propose the first
end-to-end discourse parser that jointly parses in both syntax and discourse
levels, as well as the first syntacto-discourse treebank by integrating the
Penn Treebank with the RST Treebank. Built upon our recent span-based
constituency parser, this joint syntacto-discourse parser requires no
preprocessing whatsoever (such as segmentation or feature extraction) and
achieves state-of-the-art end-to-end discourse parsing accuracy.
Comment: Accepted at EMNLP 201