1,505 research outputs found

    Joint Syntacto-Discourse Parsing and the Syntacto-Discourse Treebank

    Full text link
    Discourse parsing has long been treated as a stand-alone problem independent from constituency or dependency parsing. Most attempts at this problem are pipelined rather than end-to-end, sophisticated, and not self-contained: they assume gold-standard text segmentations (Elementary Discourse Units), and use external parsers for syntactic features. In this paper we propose the first end-to-end discourse parser that jointly parses in both syntax and discourse levels, as well as the first syntacto-discourse treebank by integrating the Penn Treebank with the RST Treebank. Built upon our recent span-based constituency parser, this joint syntacto-discourse parser requires no preprocessing whatsoever (such as segmentation or feature extraction), achieves the state-of-the-art end-to-end discourse parsing accuracy.Comment: Accepted at EMNLP 201

    A PDTB-Styled End-to-End Discourse Parser

    Full text link
    We have developed a full discourse parser in the Penn Discourse Treebank (PDTB) style. Our trained parser first identifies all discourse and non-discourse relations, locates and labels their arguments, and then classifies their relation types. When appropriate, the attribution spans to these relations are also determined. We present a comprehensive evaluation from both component-wise and error-cascading perspectives.Comment: 15 pages, 5 figures, 7 table

    Attribution and the (Non-)Alignment of Syntactic and Discourse Arguments of Connectives

    Get PDF
    The annotations of the Penn Discourse Treebank (PDTB) include (1) discourse connectives and their arguments, and (2) attribution of each argument of each connective and of the relation it denotes. Because the PDTB covers the same text as the Penn TreeBank WSJ corpus, syntactic and discourse annotation can be compared. This has revealed significant differences between syntactic structure and discourse structure, in terms of the arguments of connectives, due in large part to attribution. We describe these differences, an algorithm for detecting them, and finally some experimental results. These results have implications for automating discourse annotation based on syntactic annotation.

    Vers le FDTB : French Discourse Tree Bank

    Get PDF
    National audienceWe present the first steps towards creating an annotated corpus for discourse in French : the French Discourse Treebank enriching the FTB. Our methodology is based on the Penn Discourse Treebank (PDTB), but it differs in at least two points of a theoretical nature. First, our goal is to provide full coverage of a text, while the PDTB provides only partial coverage, which can not be described as discourse analysis such as the one made in RST or SDRT, two major theories on discourse. Second, we were led to define a new hierarchy of discourse relations which is based on RST, SDRT and PDTB.Nous présentons les premiers pas vers la création d'un corpus annoté en discours pour le français : le French Discourse TreeBank enrichissant le FTB. La méthodologie adoptée s'inspire du Penn Discourse TreeBank (PDTB) mais elle s'en distingue sur au moins deux points à caractère théorique. D'abord, notre objectif est de fournir une couverture totale d'un texte du corpus, tandis que le PDTB ne fournit qu'une couverture partielle, qui ne peut donc pas être qualifiée d'analyse discursive comme celle faite en RST ou SDRT, deux théories majeures sur le discours. Ensuite, nous avons été amenés à définir une nouvelle hiérarchie des relations de discours qui s'inspire de RST, de SDRT et du PDTB
    corecore