Evaluating Transformer's Ability to Learn Mildly Context-Sensitive Languages

Author: Steinert-Threlkeld Shane
Wang Shunjie
Publication date: 02/09/2023

Although Transformers perform well on NLP tasks, recent studies suggest that self-attention is theoretically limited in learning even some regular and context-free languages. These findings motivated us to think about their implications for modeling natural language, which is hypothesized to be mildly context-sensitive. We test Transformers' ability to learn a variety of mildly context-sensitive languages of varying complexities, and find that they generalize well to unseen in-distribution data, but their ability to extrapolate to longer strings is worse than that of LSTMs. Our analyses show that the learned self-attention patterns and representations modeled dependency relations and demonstrated counting behavior, which may have helped the models solve the languages.

arXiv.org e-Print Archive

TuLiPA : towards a multi-formalism parsing environment for grammar engineering

Author: Dellert Johannes
Evang Kilian
Kallmeyer Laura
Lichte Timm
Maier Wolfgang
Parmentier Yannick
Publication date: 01/01/2008

In this paper, we present an open-source parsing environment (Tübingen Linguistic Parsing Architecture, TuLiPA) which uses Range Concatenation Grammar (RCG) as a pivot formalism, thus opening the way to the parsing of several mildly context-sensitive formalisms. This environment currently supports tree-based grammars (namely Tree-Adjoining Grammars (TAG) and Multi-Component Tree-Adjoining Grammars with Tree Tuples (TT-MCTAG)) and allows computation not only of syntactic structures, but also of the corresponding semantic representations. It is used for the development of a tree-based grammar for German.

Hochschulschriftenserver - Universität Frankfurt am Main

TuLiPA : towards a multi-formalism parsing environment for grammar engineering

Author: Dellert Johannes
Evang Kilian
Kallmeyer Laura
Lichte Timm
Maier Wolfgang
Parmentier Yannick
Publication date: 01/01/2008

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

Hochschulschriftenserver - Universität Frankfurt am Main

Developing a TT-MCTAG for German with an RCG-based parser

Author: Dellert Johannes
Kallmeyer Laura
Lichte Timm
Maier Wolfgang
Parmentier Yannick
Publication date: 01/01/2008

Developing linguistic resources, in particular grammars, is known to be a complex task in itself because of, among other things, redundancy and consistency issues. Furthermore, some languages can prove hard to describe because of specific characteristics, e.g. the free word order in German. In this context, we present (i) a framework for describing tree-based grammars, and (ii) an actual fragment of a core multicomponent tree-adjoining grammar with tree tuples (TT-MCTAG) for German developed using this framework. This framework combines a metagrammar compiler and a parser based on range concatenation grammar (RCG) to check, respectively, the consistency and the correctness of the grammar. The German grammar being developed within this framework already deals with a wide range of scrambling and extraction phenomena.

CiteSeerX

Hochschulschriftenserver - Universität Frankfurt am Main

A hierarchy of mildly context sensitive dependency grammar

Author: Yli-Jyrä Anssi Mikael
Publication date: 01/01/2004

The paper presents Colored Multiplanar Link Grammars (CMLG). These grammars are reducible to extended right-linear S-grammars (Wartena 2001) where the storage type S is a concatenation of c pushdowns. The number of colors available in these grammars induces a hierarchy of classes of CMLGs. By fixing also another parameter in CMLGs, namely the bound t for non-projectivity depth, we get c-Colored t-Non-projective Dependency Grammars (CNDG) that generate acyclic dependency graphs. Thus, CNDGs form a two-dimensional hierarchy of dependency grammars. A part of this hierarchy is mildly context-sensitive and non-projective. Peer reviewed.

Helsingin yliopiston digitaalinen arkisto

Parsing as Reduction

Author: Fernández-González Daniel
Martins André F. T.
Publication date: 01/01/2015

We reduce phrase-representation parsing to dependency parsing. Our reduction is grounded on a new intermediate representation, "head-ordered dependency trees", shown to be isomorphic to constituent trees. By encoding order information in the dependency labels, we show that any off-the-shelf, trainable dependency parser can be used to produce constituents. When this parser is non-projective, we can perform discontinuous parsing in a very natural manner. Despite the simplicity of our approach, experiments show that the resulting parsers are on par with strong baselines, such as the Berkeley parser for English and the best single system in the SPMRL-2014 shared task. Results are particularly striking for discontinuous parsing of German, where we surpass the current state of the art by a wide margin.

arXiv.org e-Print Archive

Crossref