Joint Syntacto-Discourse Parsing and the Syntacto-Discourse Treebank
Discourse parsing has long been treated as a stand-alone problem independent
from constituency or dependency parsing. Most attempts at this problem are
pipelined rather than end-to-end, sophisticated, and not self-contained: they
assume gold-standard text segmentations (Elementary Discourse Units), and use
external parsers for syntactic features. In this paper we propose the first
end-to-end discourse parser that jointly parses in both syntax and discourse
levels, as well as the first syntacto-discourse treebank by integrating the
Penn Treebank with the RST Treebank. Built upon our recent span-based
constituency parser, this joint syntacto-discourse parser requires no
preprocessing whatsoever (such as segmentation or feature extraction) and achieves
state-of-the-art end-to-end discourse parsing accuracy.
Comment: Accepted at EMNLP 201
A PDTB-Styled End-to-End Discourse Parser
We have developed a full discourse parser in the Penn Discourse Treebank
(PDTB) style. Our trained parser first identifies all discourse and
non-discourse relations, locates and labels their arguments, and then
classifies their relation types. When appropriate, the attribution spans to
these relations are also determined. We present a comprehensive evaluation from
both component-wise and error-cascading perspectives.
Comment: 15 pages, 5 figures, 7 tables
Attribution and the (Non-)Alignment of Syntactic and Discourse Arguments of Connectives
The annotations of the Penn Discourse Treebank (PDTB) include (1) discourse connectives and their arguments, and (2) attribution of each argument of each connective and of the relation it denotes. Because the PDTB covers the same text as the Penn TreeBank WSJ corpus, syntactic and discourse annotation can be compared. This has revealed significant differences between syntactic structure and discourse structure, in terms of the arguments of connectives, due in large part to attribution. We describe these differences, an algorithm for detecting them, and finally some experimental results. These results have implications for automating discourse annotation based on syntactic annotation.
Vers le FDTB : French Discourse Tree Bank
We present the first steps towards creating a discourse-annotated corpus for French: the French Discourse Treebank (FDTB), which enriches the FTB. Our methodology is based on the Penn Discourse Treebank (PDTB), but it differs on at least two theoretical points. First, our goal is to provide full coverage of a text, whereas the PDTB provides only partial coverage, which cannot be described as discourse analysis of the kind carried out in RST or SDRT, two major theories of discourse. Second, we were led to define a new hierarchy of discourse relations based on RST, SDRT and the PDTB.