13 research outputs found

    pyBART: Evidence-based Syntactic Transformations for IE

    Full text link
    Syntactic dependencies can be predicted with high accuracy, and are useful for both machine-learned and pattern-based information extraction tasks. However, their utility can be improved. These syntactic dependencies are designed to accurately reflect syntactic relations, and they do not make semantic relations explicit. Therefore, these representations lack many explicit connections between content words, that would be useful for downstream applications. Proposals like English Enhanced UD improve the situation by extending universal dependency trees with additional explicit arcs. However, they are not available to Python users, and are also limited in coverage. We introduce a broad-coverage, data-driven and linguistically sound set of transformations, that makes event-structure and many lexical relations explicit. We present pyBART, an easy-to-use open-source Python library for converting English UD trees either to Enhanced UD graphs or to our representation. The library can work as a standalone package or be integrated within a spaCy NLP pipeline. When evaluated in a pattern-based relation extraction scenario, our representation results in higher extraction scores than Enhanced UD, while requiring fewer patterns.Comment: Accepted ACL2020 system demonstration pape

    Paris and Stanford at EPE 2017: Downstream Evaluation of Graph-based Dependency Representations

    Get PDF
    International audienceWe describe the STANFORD-PARIS and PARIS-STANFORD submissions to the 2017 Extrinsic Parser Evaluation (EPE) Shared Task. The purpose of this shared task was to evaluate dependency graphs on three downstream tasks. Through our submissions, we evaluated the usability of several representations derived from English Universal Dependencies (UD), as well as the Stanford Dependencies (SD), Predicate Argument Structure (PAS), and DM representations. We further compared two parsing strategies: Directly parsing to graph-based dependency representations and a two-stage process of first parsing to surface syntax trees and then applying rule-based augmentations to obtain the final graphs. Overall, our systems performed very well and our submissions ranked first and third. In our analysis, we find that the two-stage parsing process leads to better downstream performance, and that enhanced UD, a graph-based representation, consistently outperforms basic UD, a strict surface syntax representation, suggesting an advantage of enriched representations for downstream tasks

    From raw text to enhanced universal dependencies:The parsing shared task at IWPT 2021

    Get PDF
    We describe the second IWPT task on end-to-end parsing from raw text to Enhanced Universal Dependencies. We provide details about the evaluation metrics and the datasets used for training and evaluation. We compare the approaches taken by participating teams and discuss the results of the shared task, also in comparison with the first edition of this task

    A corpus study of creating rule-based enhanced universal dependencies for German

    Get PDF
    In this paper, we present a first attempt at enriching German Universal Dependencies (UD) treebanks with enhanced dependencies. Similarly to the converter for English (Schuster and Manning, 2016), we develop a rule-based system for deriving enhanced dependencies from the basic layer, covering three linguistic phenomena: relative clauses, coordination, and raising/control. For quality control, we manually correct or validate a set of 196 sentences, finding that around 90% of added relations are correct. Our data analysis reveals that difficulties arise mainly due to inconsistencies in the basic layer annotations. We show that the English system is in general applicable to German data, but that adapting to the particularities of the German treebanks and language increases precision and recall by up to 10%. Comparing the application of our converter on gold standard dependencies vs. automatic parses, we find that F1 drops by around 10% in the latter setting due to error propagation. Finally, an enhanced UD parser trained on a converted treebank performs poorly when evaluated against our annotations, indicating that more work remains to be done to create gold standard enhanced German treebanks

    Overview of the IWPT 2020 Shared Task on Parsing into Enhanced Universal Dependencies

    Get PDF
    This overview introduces the task of parsing into enhanced universal dependencies, describes the datasets used for training and evaluation, and evaluation metrics. We outline various approaches and discuss the results of the shared task

    Enhanced UD Dependencies with Neutralized Diathesis Alternation

    Get PDF
    International audienceThe 2.0 release of the Universal Dependency treebanks demonstrates the effectiveness of the UD scheme to cope with very diverse languages. The next step would be to get more of syntactic analysis , and the " enhanced dependencies " sketched in the UD 2.0 guidelines is a promising attempt in that direction. In this work we propose to go further and enrich the enhanced dependency scheme along two axis: extending the cases of recovered arguments of non-finite verbs, and neutralizing syntactic alternations. Doing so leads to both richer and more uniform structures, while remaining at the syntactic level, and thus rather neutral with respect to the type of semantic representation that can be further obtained. We implemented this proposal in two UD treebanks of French, using deterministic graph-rewriting rules. Evaluation on a 200 sentence gold standard shows that deep syntactic graphs can be obtained from surface syntax annotations with a high accuracy. Among all arguments of verbs in the gold standard, 13.91% are impacted by syntactic alternation normalization, and 18.93% are additional deep edges
    corecore