
446 research outputs found

A Transition-Based Directed Acyclic Graph Parser for UCCA

Author: Abend Omri
Hershcovich Daniel
Rappoport Ari
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2017

We present the first parser for UCCA, a cross-linguistically applicable framework for semantic representation, which builds on extensive typological work and supports rapid annotation. UCCA poses a challenge for existing parsing techniques, as it exhibits reentrancy (resulting in DAG structures), discontinuous structures and non-terminal nodes corresponding to complex semantic units. To our knowledge, the conjunction of these formal properties is not supported by any existing parser. Our transition-based parser, which uses a novel transition set and features based on bidirectional LSTMs, has value not just for UCCA parsing: its ability to handle more general graph structures can inform the development of parsers for other semantic DAG structures, and in languages that frequently use discontinuous structures. Comment: 16 pages; Accepted as long paper at ACL201

arXiv.org e-Print Archive

Crossref

Une approche par boosting à la sélection de modèles pour l’analyse syntaxique statistique

Author: Bawden Rachel
Publication venue: HAL CCSD
Publication date: 26/06/2015

In this work we present our approach to model selection for statistical parsing via boosting. The method targets the inefficiency of current feature selection methods: it allows a constant feature selection time at each iteration, rather than the increasing selection time of current standard forward wrapper methods. With the aim of performing feature selection on very high dimensional data, in particular for parsing morphologically rich languages, we test the approach, which uses the multiclass AdaBoost algorithm SAMME (Zhu et al., 2006), on French data from the French Treebank, using a multilingual discriminative constituency parser (Crabbé, 2014). Current results show that the method is indeed far more efficient than a naïve method, and the performance of the models produced is promising, with F-scores comparable to carefully selected manual models. We provide some perspectives for improving on this performance in future work.

INRIA a CCSD electronic archive server

Hal-Diderot

Because Syntax does Matter: Improving Predicate-Argument Structures Parsing Using Syntactic Features

Author: Ribeyre Corentin
Seddah Djamé
Villemonte de La Clergerie Éric
Publication venue: HAL CCSD
Publication date: 01/01/2015

Parsing full-fledged predicate-argument structures in a deep syntax framework requires graphs to be predicted. Using the DeepBank (Flickinger et al., 2012) and the Predicate-Argument Structure treebank (Miyao and Tsujii, 2005) as a test field, we show how transition-based parsers, extended to handle connected graphs, benefit from the use of topologically different syntactic features such as dependencies, tree fragments, spines or syntactic paths. These features bring much-needed context to the parsing models, notably improving results on long-distance dependencies and elided coordinate structures. By confirming this positive impact on an accurate 2nd-order graph-based parser (Martins and Almeida, 2014), we establish a new state of the art on these data sets.

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot