14 research outputs found

    Frequency vs. Association for Constraint Selection in Usage-Based Construction Grammar

    A usage-based Construction Grammar (CxG) posits that slot-constraints generalize from common exemplar constructions. But what is the best model of constraint generalization? This paper evaluates competing frequency-based and association-based models across eight languages using a metric derived from the Minimum Description Length paradigm. The experiments show that association-based models produce better generalizations across all languages by a significant margin.
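To make the contrast concrete, here is a minimal sketch of the two competing scoring schemes, using ΔP as one common association measure; the counts are invented for illustration and the paper's exact models may differ:

```python
# Toy counts: how often each word fills a construction slot, and its
# overall corpus frequency. These numbers are illustrative only.
slot_counts = {"give": 40, "send": 25, "see": 5}      # word observed in the slot
word_totals = {"give": 50, "send": 100, "see": 500}   # word frequency overall
slot_total = sum(slot_counts.values())                # all slot observations
corpus_total = sum(word_totals.values())

def frequency_score(word):
    # Frequency-based model: raw count of the word in the slot.
    return slot_counts[word]

def delta_p(word):
    # Association-based model: Delta-P = P(slot | word) - P(slot | other words).
    in_slot = slot_counts[word]
    p_given_word = in_slot / word_totals[word]
    p_given_other = (slot_total - in_slot) / (corpus_total - word_totals[word])
    return p_given_word - p_given_other

for w in slot_counts:
    print(w, frequency_score(w), round(delta_p(w), 3))
```

Here "see" is frequent in the corpus but only weakly attracted to the slot, so its ΔP is negative even though its raw slot count is nonzero; this is the kind of distinction an association-based constraint model can exploit.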

    Boosting for Efficient Model Selection for Syntactic Parsing

    We present an efficient model selection method using boosting for transition-based constituency parsing. It is designed for exploring a high-dimensional search space defined by a large set of feature templates, as is typically the case when parsing morphologically rich languages. Our method removes the need to manually define heuristic constraints, which are often imposed in current state-of-the-art selection methods. Our experiments on French show that the method is more efficient and is also capable of producing compact, state-of-the-art models.
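A rough sketch of the idea, assuming a greedy round-by-round selection over feature templates; the `evaluate` function is a hypothetical stand-in for training and scoring a parser on development data, not the paper's actual procedure:

```python
# Greedy forward selection over feature templates, in the spirit of
# boosting-based model selection: each round adds the template that
# most improves the (stubbed) dev-set score.

def evaluate(templates):
    # Stub: pretend each template contributes a fixed gain.
    # Purely illustrative; a real run would train and score a parser.
    gains = {"word": 0.50, "pos": 0.30, "word+pos": 0.08, "morph": 0.15}
    return sum(gains.get(t, 0.0) for t in templates)

def select_templates(candidates, max_rounds=3, min_gain=0.01):
    selected, score = [], 0.0
    for _ in range(max_rounds):
        best_t, best_s = None, score
        for t in candidates:
            if t in selected:
                continue
            s = evaluate(selected + [t])
            if s > best_s:
                best_t, best_s = t, s
        if best_t is None or best_s - score < min_gain:
            break  # no remaining template improves the model enough
        selected.append(best_t)
        score = best_s
    return selected

print(select_templates(["word", "pos", "word+pos", "morph"]))
```

The appeal over heuristic constraints is that the stopping rule (`min_gain`) and the round budget replace hand-written rules about which template combinations are admissible.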

    Improving the Arc-Eager Model with Reverse Parsing

    A known way to improve the accuracy of dependency parsers is to combine several different parsing algorithms, in such a way that the weaknesses of each model are compensated by the strengths of the others. For example, voting-based combination schemes analyze each sentence with several parsers and construct a combined output in which the head of each node is determined by "majority vote" among them. Typically, such approaches combine very different parsing models to take advantage of the variability in the errors they make. In this paper, we show that consistent improvements in accuracy can be obtained in a much simpler way by combining a single parser with itself. In particular, we start with a greedy implementation of Nivre's pseudo-projective arc-eager algorithm, a well-known left-to-right transition-based parser, and we combine it with a "mirrored" version of the algorithm that analyzes sentences from right to left. To decide which of the two outputs to trust for the head of each node, we use simple criteria based on the length and position of dependency arcs. Experiments on several datasets from the CoNLL-X shared task and the WSJ section of the English Penn Treebank show that the combined system outperforms the baseline arc-eager parser in all cases. To test the generality of the approach, we also perform experiments with a different transition system (arc-standard) and a different search strategy (beam search), obtaining similar improvements in all these settings.
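The combination step can be sketched as follows. This is a simplified guess at one arc-length criterion, not the paper's exact rule: when the two parsers disagree on a token's head, trust the left-to-right parser for short arcs and the reversed parser for long ones (the threshold here is an illustrative choice):

```python
# Combine head predictions from a left-to-right (l2r) and a
# right-to-left (r2l) run of the same arc-eager parser.
# Heads are token indices; -1 marks the root.

def combine_heads(l2r_heads, r2l_heads, max_short_arc=3):
    combined = []
    for i, (h_l2r, h_r2l) in enumerate(zip(l2r_heads, r2l_heads)):
        if h_l2r == h_r2l:
            combined.append(h_l2r)        # parsers agree: keep the head
        elif abs(h_l2r - i) <= max_short_arc:
            combined.append(h_l2r)        # short arc: trust left-to-right
        else:
            combined.append(h_r2l)        # long arc: trust right-to-left
    return combined

l2r = [2, 2, -1, 2, 8]   # last arc is long: |8 - 4| > 3
r2l = [2, 0, -1, 2, 2]
print(combine_heads(l2r, r2l))
```

Note that a per-token vote like this does not guarantee the combined output is a well-formed tree; a full system would need a repair or maximum-spanning-tree step on top.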

    Structured Prediction for Transition-Based Constituency Parsing: Dense Models and Sparse Models

    This article presents a transition-based constituency parsing method that scores analyses with deep learning. It is compared with a more classical scoring method based on a structured perceptron. We first introduce a parser scored by a local, greedy neural network built on word embeddings. We then present its extension to a global model with beam search. Comparison with a global, beam-search parser from the perceptron family highlights the surprisingly good properties of the greedy neural model.

    Exposure and Emergence in Usage-Based Grammar: Computational Experiments in 35 Languages

    This paper uses computational experiments to explore the role of exposure in the emergence of construction grammars. While usage-based grammars are hypothesized to depend on a learner's exposure to actual language use, the mechanisms of such exposure have only been studied for a few constructions in isolation. This paper experiments with (i) the growth rate of the constructicon, (ii) the convergence rate of grammars exposed to independent registers, and (iii) the rate at which constructions are forgotten when they have not been recently observed. These experiments show that the lexicon grows more quickly than the grammar and that the growth rate of the grammar does not depend on the growth rate of the lexicon. At the same time, register-specific grammars converge onto more similar constructions as the amount of exposure increases, which means that the influence of specific registers becomes less important with greater exposure. Finally, the rate at which constructions are forgotten when they have not been recently observed mirrors the growth rate of the constructicon. This paper thus presents a computational model of usage-based grammar that includes both the emergence and the unentrenchment of constructions.