14 research outputs found

    Frequency vs. Association for Constraint Selection in Usage-Based Construction Grammar

    A usage-based Construction Grammar (CxG) posits that slot-constraints generalize from common exemplar constructions. But what is the best model of constraint generalization? This paper evaluates competing frequency-based and association-based models across eight languages using a metric derived from the Minimum Description Length paradigm. The experiments show that association-based models produce better generalizations across all languages by a significant margin.
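To make the contrast concrete, here is a minimal sketch of the two competing scoring schemes, using ΔP as one common association measure; the counts are invented for illustration and the paper's exact models may differ:

```python
# Toy counts: how often each word fills a construction slot, and its
# overall corpus frequency. These numbers are illustrative only.
slot_counts = {"give": 40, "send": 25, "see": 5}      # word observed in the slot
word_totals = {"give": 50, "send": 100, "see": 500}   # word frequency overall
slot_total = sum(slot_counts.values())                # all slot observations
corpus_total = sum(word_totals.values())

def frequency_score(word):
    # Frequency-based model: raw count of the word in the slot.
    return slot_counts[word]

def delta_p(word):
    # Association-based model: Delta-P = P(slot | word) - P(slot | other words).
    in_slot = slot_counts[word]
    p_given_word = in_slot / word_totals[word]
    p_given_other = (slot_total - in_slot) / (corpus_total - word_totals[word])
    return p_given_word - p_given_other

for w in slot_counts:
    print(w, frequency_score(w), round(delta_p(w), 3))
```

Here "see" is frequent in the corpus but only weakly attracted to the slot, so its ΔP is negative even though its raw slot count is nonzero; this is the kind of distinction an association-based constraint model can exploit.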

    Boosting for Efficient Model Selection for Syntactic Parsing

    We present an efficient model selection method using boosting for transition-based constituency parsing. It is designed for exploring a high-dimensional search space defined by a large set of feature templates, as is typically the case when parsing morphologically rich languages. Our method removes the need to manually define heuristic constraints, which are often imposed in current state-of-the-art selection methods. Our experiments on French show that the method is more efficient and is also capable of producing compact, state-of-the-art models.
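A rough sketch of the idea, assuming a greedy round-by-round selection over feature templates; the `evaluate` function is a hypothetical stand-in for training and scoring a parser on development data, not the paper's actual procedure:

```python
# Greedy forward selection over feature templates, in the spirit of
# boosting-based model selection: each round adds the template that
# most improves the (stubbed) dev-set score.

def evaluate(templates):
    # Stub: pretend each template contributes a fixed gain.
    # Purely illustrative; a real run would train and score a parser.
    gains = {"word": 0.50, "pos": 0.30, "word+pos": 0.08, "morph": 0.15}
    return sum(gains.get(t, 0.0) for t in templates)

def select_templates(candidates, max_rounds=3, min_gain=0.01):
    selected, score = [], 0.0
    for _ in range(max_rounds):
        best_t, best_s = None, score
        for t in candidates:
            if t in selected:
                continue
            s = evaluate(selected + [t])
            if s > best_s:
                best_t, best_s = t, s
        if best_t is None or best_s - score < min_gain:
            break  # no remaining template improves the model enough
        selected.append(best_t)
        score = best_s
    return selected

print(select_templates(["word", "pos", "word+pos", "morph"]))
```

The appeal over heuristic constraints is that the stopping rule (`min_gain`) and the round budget replace hand-written rules about which template combinations are admissible.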

    Improving the Arc-Eager Model with Reverse Parsing

    A known way to improve the accuracy of dependency parsers is to combine several different parsing algorithms, in such a way that the weaknesses of each model are compensated by the strengths of the others. For example, voting-based combination schemes analyze each sentence with several parsers and construct a combined output in which the head of each node is determined by "majority vote" among them. Typically, such approaches combine very different parsing models to take advantage of the variability in the errors they make. In this paper, we show that consistent improvements in accuracy can be obtained in a much simpler way by combining a single parser with itself. In particular, we start with a greedy implementation of Nivre's pseudo-projective arc-eager algorithm, a well-known left-to-right transition-based parser, and we combine it with a "mirrored" version of the algorithm that analyzes sentences from right to left. To decide which of the two outputs to trust for the head of each node, we use simple criteria based on the length and position of dependency arcs. Experiments on several datasets from the CoNLL-X shared task and the WSJ section of the English Penn Treebank show that the combined system outperforms the baseline arc-eager parser in all cases. To test the generality of the approach, we also perform experiments with a different transition system (arc-standard) and a different search strategy (beam search), obtaining similar improvements in all these settings.
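The combination step can be sketched as follows. This is a simplified guess at one arc-length criterion, not the paper's exact rule: when the two parsers disagree on a token's head, trust the left-to-right parser for short arcs and the reversed parser for long ones (the threshold here is an illustrative choice):

```python
# Combine head predictions from a left-to-right (l2r) and a
# right-to-left (r2l) run of the same arc-eager parser.
# Heads are token indices; -1 marks the root.

def combine_heads(l2r_heads, r2l_heads, max_short_arc=3):
    combined = []
    for i, (h_l2r, h_r2l) in enumerate(zip(l2r_heads, r2l_heads)):
        if h_l2r == h_r2l:
            combined.append(h_l2r)        # parsers agree: keep the head
        elif abs(h_l2r - i) <= max_short_arc:
            combined.append(h_l2r)        # short arc: trust left-to-right
        else:
            combined.append(h_r2l)        # long arc: trust right-to-left
    return combined

l2r = [2, 2, -1, 2, 8]   # last arc is long: |8 - 4| > 3
r2l = [2, 0, -1, 2, 2]
print(combine_heads(l2r, r2l))
```

Note that a per-token vote like this does not guarantee the combined output is a well-formed tree; a full system would need a repair or maximum-spanning-tree step on top.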

    Structured Prediction for Transition-Based Constituency Parsing: Dense Models and Sparse Models

    This article presents a transition-based constituency parsing method that scores analyses with deep learning. It is compared with a more classical scoring method based on a structured perceptron. We first introduce a parser scored by a local, greedy neural network built on word embeddings. We then present its extension to a global model with beam search. Comparison with a global, beam-search parser from the perceptron family highlights the surprisingly good properties of the greedy neural model.

    Exposure and Emergence in Usage-Based Grammar: Computational Experiments in 35 Languages

    This paper uses computational experiments to explore the role of exposure in the emergence of construction grammars. While usage-based grammars are hypothesized to depend on a learner's exposure to actual language use, the mechanisms of such exposure have only been studied for a few constructions in isolation. This paper experiments with (i) the growth rate of the constructicon, (ii) the convergence rate of grammars exposed to independent registers, and (iii) the rate at which constructions are forgotten when they have not been recently observed. These experiments show that the lexicon grows more quickly than the grammar and that the growth rate of the grammar does not depend on the growth rate of the lexicon. At the same time, register-specific grammars converge onto more similar constructions as the amount of exposure increases, which means that the influence of specific registers becomes less important with greater exposure. Finally, the rate at which constructions are forgotten when they have not been recently observed mirrors the growth rate of the constructicon. This paper thus presents a computational model of usage-based grammar that includes both the emergence and the unentrenchment of constructions.