Joint Syntacto-Discourse Parsing and the Syntacto-Discourse Treebank
Discourse parsing has long been treated as a stand-alone problem, independent
of constituency or dependency parsing. Most attempts at this problem are
pipelined rather than end-to-end, sophisticated, and not self-contained: they
assume gold-standard text segmentations (Elementary Discourse Units) and use
external parsers for syntactic features. In this paper we propose the first
end-to-end discourse parser, which jointly parses at both the syntax and
discourse levels, as well as the first syntacto-discourse treebank, built by
integrating the Penn Treebank with the RST Treebank. Built upon our recent
span-based constituency parser, this joint syntacto-discourse parser requires
no preprocessing whatsoever (such as segmentation or feature extraction) and
achieves state-of-the-art end-to-end discourse parsing accuracy.
Comment: Accepted at EMNLP 201
The distribution of discourse relations within and across turns in spontaneous conversation
Time pressure and topic negotiation may impose constraints on how people
leverage discourse relations (DRs) in spontaneous conversational contexts. In
this work, we adapt a system of DRs for written language to spontaneous
dialogue using crowdsourced annotations from novice annotators. We then test
whether discourse relations are used differently across several types of
multi-utterance contexts. We compare the patterns of DR annotation within and
across speakers and within and across turns. Ultimately, we find that different
discourse contexts produce distinct distributions of discourse relations, with
single-turn annotations creating the most uncertainty for annotators.
Additionally, we find that the discourse relation annotations are of
sufficient quality to be predicted from embeddings of discourse units.
Comment: Proceedings of Computational Approaches to Discourse 2023, collocated
with the 2023 meeting of the Association for Computational Linguistics,
Toronto, Canada
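The idea of predicting a relation label from embeddings of the two discourse units can be sketched as follows. This is an illustrative toy, not the paper's pipeline: the embeddings are random vectors standing in for encoder output, the relation inventory is invented, and a simple nearest-centroid classifier replaces whatever model the authors used.

```python
import numpy as np

rng = np.random.default_rng(0)
relations = ["Elaboration", "Contrast", "Cause"]  # hypothetical label set
n_pairs, dim = 60, 16

# Toy embeddings for the two discourse units in each annotated pair;
# in practice these would come from a sentence encoder.
unit_a = rng.normal(size=(n_pairs, dim))
unit_b = rng.normal(size=(n_pairs, dim))
labels = np.arange(n_pairs) % len(relations)  # toy gold annotations

# Concatenating the pair's unit embeddings is a common featurization
# (an assumption here, not taken from the paper).
X = np.hstack([unit_a, unit_b])

# Nearest-centroid classifier: one mean feature vector per relation.
centroids = np.stack(
    [X[labels == k].mean(axis=0) for k in range(len(relations))]
)

def predict(x):
    """Return the relation whose centroid is closest to feature vector x."""
    dists = np.linalg.norm(centroids - x, axis=1)
    return relations[int(np.argmin(dists))]

print(predict(X[0]))  # one of the toy relation labels
```

Any classifier over such features would serve the same purpose; the point is only that usable signal for the relation label lives in the unit embeddings.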
On the Importance of Word and Sentence Representation Learning in Implicit Discourse Relation Classification
Implicit discourse relation classification is one of the most difficult tasks
in shallow discourse parsing, as predicting the relation without explicit
connectives requires language understanding at both the text-span level and
the sentence level. Previous studies mainly focus on the interactions between
the two arguments. We argue that a powerful contextualized representation
module, a bilateral multi-perspective matching module, and a global
information fusion module are all important to implicit discourse analysis. We
propose a novel model that combines these modules. Extensive experiments show
that our proposed model outperforms BERT and other state-of-the-art systems on
the PDTB dataset by around 8% and on the CoNLL 2016 datasets by around 16%. We
also analyze the effectiveness of the different modules in the implicit
discourse relation classification task and demonstrate how different levels of
representation learning affect the results.
Comment: Accepted by IJCAI 202
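The multi-perspective matching operation referenced above can be sketched in a few lines. This follows the generic BiMPM-style formulation, not necessarily the authors' exact module: each "perspective" is a learnable weight vector that re-scales both argument representations element-wise before a cosine similarity is taken, yielding one matching score per perspective. All names and sizes here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, n_perspectives = 8, 4

# Hypothetical contextualized representations of the two arguments.
arg1 = rng.normal(size=dim)
arg2 = rng.normal(size=dim)

# One weight vector per perspective; in a real model these are learned,
# here they are randomly initialized.
W = rng.normal(size=(n_perspectives, dim))

def multi_perspective_match(u, v, W):
    """Cosine similarity of u and v under each perspective's element-wise
    re-weighting, giving one score per perspective."""
    scores = []
    for w in W:
        wu, wv = w * u, w * v
        scores.append(float(wu @ wv / (np.linalg.norm(wu) * np.linalg.norm(wv))))
    return np.array(scores)

m = multi_perspective_match(arg1, arg2, W)
print(m.shape)  # (4,)
```

The resulting score vector would then be fed, alongside other features, into a downstream fusion and classification layer.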