655 research outputs found
Parsing as Reduction
We reduce phrase-representation parsing to dependency parsing. Our reduction
is grounded on a new intermediate representation, "head-ordered dependency
trees", shown to be isomorphic to constituent trees. By encoding order
information in the dependency labels, we show that any off-the-shelf, trainable
dependency parser can be used to produce constituents. When this parser is
non-projective, we can perform discontinuous parsing in a very natural manner.
Despite the simplicity of our approach, experiments show that the resulting
parsers are on par with strong baselines, such as the Berkeley parser for
English and the best single system in the SPMRL-2014 shared task. Results are
particularly striking for discontinuous parsing of German, where we surpass the
current state of the art by a wide margin.
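The encoding idea can be made concrete with a toy sketch. This is an illustrative scheme only, not the paper's exact label format: each non-head child of a constituent becomes a dependent of the constituent's lexical head, and the dependency label records the nonterminal together with the constituent's position on the head's spine, so the bracketing is recoverable from the labels. The tree representation and the `#order` convention are assumptions made for the example.

```python
def encode(node, arcs):
    """Toy head-ordered encoding.

    node is (position, word) for a leaf, or
    (nonterminal, head_child_index, children) for a constituent.
    Returns (lexical head position, spine length so far) and appends
    labeled arcs (dependent, head, "NT#order") to `arcs`.
    """
    if isinstance(node[0], int):                 # leaf: (position, word)
        return node[0], 0
    label, h, children = node
    results = [encode(c, arcs) for c in children]
    head_pos, spine = results[h]
    order = spine + 1                            # one level higher on the head's spine
    for i, (dep_pos, _) in enumerate(results):
        if i != h:
            arcs.append((dep_pos, head_pos, f"{label}#{order}"))
    return head_pos, order
```

For "the dog barks" with an NP headed by "dog" and an S headed by "barks", the sketch yields the arcs `(1, 2, "NP#1")` and `(2, 3, "S#1")`, from which the two brackets can be read back off.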
Crossings as a side effect of dependency lengths
The syntactic structure of sentences exhibits a striking regularity:
dependencies tend to not cross when drawn above the sentence. We investigate
two competing explanations. The traditional hypothesis is that this trend
arises from an independent principle of syntax that reduces crossings
practically to zero. An alternative to this view is the hypothesis that
crossings are a side effect of dependency lengths, i.e. sentences with shorter
dependency lengths should tend to have fewer crossings. We are able to reject
the traditional view in the majority of languages considered. The alternative
hypothesis can lead to a more parsimonious theory of language.
Comment: the discussion section has been expanded significantly; in press in Complexity (Wiley).
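Counting crossings is straightforward: two dependencies cross exactly when their endpoints strictly interleave when the arcs are drawn above the sentence. A minimal sketch, assuming the usual head-vector representation of a dependency tree:

```python
def count_crossings(heads):
    # heads[i] = 1-based position of the head of word i+1; 0 marks the root arc (excluded)
    arcs = [(min(dep, head), max(dep, head))
            for dep, head in enumerate(heads, start=1) if head != 0]
    crossings = 0
    for i in range(len(arcs)):
        for j in range(i + 1, len(arcs)):
            (a, b), (c, d) = sorted([arcs[i], arcs[j]])
            if a < c < b < d:   # endpoints interleave -> the arcs cross
                crossings += 1
    return crossings
```

For example, `count_crossings([2, 0, 4, 2])` is 0 (a projective tree), while `count_crossings([3, 4, 0, 3])` is 1, since the arcs (1, 3) and (2, 4) interleave.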
A hierarchy of mildly context-sensitive dependency grammars
The paper presents Colored Multiplanar Link Grammars (CMLG). These grammars are reducible to extended right-linear S-grammars (Wartena 2001) where the storage type S is a concatenation of c pushdowns. The number of colors available in these grammars induces a hierarchy of classes of CMLGs. By also fixing another parameter in CMLGs, namely the bound t on non-projectivity depth, we get c-Colored t-Non-projective Dependency Grammars (CNDG) that generate acyclic dependency graphs. Thus, CNDGs form a two-dimensional hierarchy of dependency grammars. A part of this hierarchy is mildly context-sensitive and non-projective.
Peer reviewed
Memory limitations are hidden in grammar
[Abstract] The ability to produce and understand an unlimited number of different sentences is a hallmark of human language. Linguists have sought to define the essence of this generative capacity using formal grammars that describe the syntactic dependencies between constituents, independent of the computational limitations of the human brain. Here, we evaluate this independence assumption by sampling sentences uniformly from the space of possible syntactic structures. We find that the average dependency distance between syntactically related words, a proxy for memory limitations, is less than expected by chance in a collection of state-of-the-art classes of dependency grammars. Our findings indicate that memory limitations have permeated grammatical descriptions, suggesting that it may be impossible to build a parsimonious theory of human linguistic productivity independent
of non-linguistic cognitive constraints.
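The memory proxy used here, mean dependency distance, is simple to compute from a head vector. A minimal sketch; treebank-specific conventions (e.g. whether punctuation or the root arc is excluded) are assumptions not settled by the abstract:

```python
def mean_dependency_distance(heads):
    # heads[i] = 1-based position of the head of word i+1; 0 marks the root arc (excluded)
    distances = [abs(head - dep)
                 for dep, head in enumerate(heads, start=1) if head != 0]
    return sum(distances) / len(distances)
```

For the head vector `[2, 0, 2, 3]` every dependent is adjacent to its head, so the mean distance is 1.0; longer-range attachments raise the average.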
Restricted Non-Projectivity: Coverage vs. Efficiency
[Abstract] In the last decade, various restricted classes of non-projective dependency trees have been proposed with the goal of achieving a good tradeoff between parsing efficiency and coverage of the syntactic structures found in natural languages. We perform an extensive study measuring the coverage of a wide range of such classes on corpora of 30 languages under two different syntactic annotation criteria. The results show that, among the currently known relaxations of projectivity, the best tradeoff between coverage and computational complexity of exact parsing is achieved by either 1-endpoint-crossing trees or MH_k trees, depending on the level of coverage desired. We also present some properties of the relation of MH_k trees to other relevant classes of trees.
Ministerio de Economía y Competitividad;
FFI2014-51978-C2-2-
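The 1-endpoint-crossing property mentioned above can be checked directly from its definition: for every arc, all arcs that cross it must share a common endpoint. A minimal sketch, assuming the head-vector representation; this is a naive quadratic checker for illustration, not an efficient parser-side test:

```python
def crosses(e1, e2):
    # two arcs cross iff their endpoints strictly interleave
    (a, b), (c, d) = sorted([tuple(sorted(e1)), tuple(sorted(e2))])
    return a < c < b < d

def is_1_endpoint_crossing(heads):
    # heads[i] = 1-based head of word i+1; 0 marks the root arc (excluded)
    arcs = [(dep, head) for dep, head in enumerate(heads, start=1) if head != 0]
    for e in arcs:
        crossers = [f for f in arcs if crosses(e, f)]
        if crossers:
            common = set(crossers[0])         # candidate shared endpoints
            for f in crossers[1:]:
                common &= set(f)
            if not common:                    # crossers of e share no endpoint
                return False
    return True
```

A projective tree is trivially 1-endpoint-crossing; the head vector `[3, 4, 0, 3]` is non-projective yet still 1-endpoint-crossing, while `[0, 5, 6, 1, 1, 1]` is not, because the arc (1, 4) is crossed by (2, 5) and (3, 6), which share no endpoint.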
Global Transition-based Non-projective Dependency Parsing
Shi, Huang, and Lee (2017) obtained state-of-the-art results for English and
Chinese dependency parsing by combining dynamic-programming implementations of
transition-based dependency parsers with a minimal set of bidirectional LSTM
features. However, their results were limited to projective parsing. In this
paper, we extend their approach to support non-projectivity by providing the
first practical implementation of the MH_4 algorithm, a mildly
non-projective dynamic-programming parser with very high coverage on
non-projective treebanks. To make MH_4 compatible with minimal transition-based
feature sets, we introduce a transition-based interpretation of it in which
parser items are mapped to sequences of transitions. We thus obtain the first
implementation of global decoding for non-projective transition-based parsing,
and demonstrate empirically that it is more effective than its projective
counterpart in parsing a number of highly non-projective languages.
Comment: Proceedings of ACL 2018. 13 pages.