142 research outputs found

Lexicalization and Grammar Development

Authors: Becker Tilman; Doran Christy; Egedi Dania; Srinivas B.
Publication date: 01/01/1994

In this paper we present a fully lexicalized grammar formalism as a particularly attractive framework for the specification of natural language grammars. We discuss in detail Feature-based, Lexicalized Tree Adjoining Grammars (FB-LTAGs), a representative of the class of lexicalized grammars. We illustrate the advantages of lexicalized grammars in various contexts of natural language processing, ranging from wide-coverage grammar development to parsing and machine translation. We also present a method for compact and efficient representation of lexicalized trees. Comment: ps file. English w/ German abstract. 10 pages

arXiv.org e-Print Archive

CiteSeerX

ScholarlyCommons@Penn

Developing a TT-MCTAG for German with an RCG-based parser

Authors: Dellert Johannes; Kallmeyer Laura; Lichte Timm; Maier Wolfgang; Parmentier Yannick
Publication date: 01/01/2008

Developing linguistic resources, in particular grammars, is known to be a complex task in itself because of, among other things, redundancy and consistency issues. Furthermore, some languages can prove hard to describe because of specific characteristics, e.g. the free word order in German. In this context, we present (i) a framework for describing tree-based grammars, and (ii) an actual fragment of a core multicomponent tree-adjoining grammar with tree tuples (TT-MCTAG) for German developed using this framework. This framework combines a metagrammar compiler and a parser based on range concatenation grammar (RCG) to check, respectively, the consistency and the correctness of the grammar. The German grammar being developed within this framework already deals with a wide range of scrambling and extraction phenomena.

CiteSeerX

Hochschulschriftenserver - Universität Frankfurt am Main

Disambiguation of Super Parts of Speech (or Supertags): Almost Parsing

Authors: Joshi Aravind K.; Srinivas B.
Publication date: 26/10/1994

In a lexicalized grammar formalism such as Lexicalized Tree-Adjoining Grammar (LTAG), each lexical item is associated with at least one elementary structure (supertag) that localizes syntactic and semantic dependencies. Thus a parser for a lexicalized grammar must search a large set of supertags to choose the right ones to combine for the parse of the sentence. We present techniques for disambiguating supertags using local information such as lexical preference and local lexical dependencies. The similarity between LTAG and Dependency grammars is exploited in the dependency model of supertag disambiguation. The performance results for various models of supertag disambiguation such as unigram, trigram and dependency-based models are presented. Comment: ps file. 8 pages

arXiv.org e-Print Archive

ScholarlyCommons@Penn

Some Novel Applications of Explanation-Based Learning to Parsing Lexicalized Tree-Adjoining Grammars

Authors: Joshi Aravind; Srinivas B.
Publication date: 01/01/1995

In this paper we present some novel applications of the Explanation-Based Learning (EBL) technique to parsing Lexicalized Tree-Adjoining Grammars. The novel aspects are (a) immediate generalization of parses in the training set, (b) generalization over recursive structures and (c) representation of generalized parses as Finite State Transducers. A highly impoverished parser called a "stapler" has also been introduced. We present experimental results using EBL for different corpora and architectures to show the effectiveness of our approach. Comment: uuencoded postscript file

arXiv.org e-Print Archive

CiteSeerX

Crossref

ScholarlyCommons@Penn

On Parsing CHILDES

Author: Laakso Aarre
Publication date: 01/01/2005

Research on child language acquisition would benefit from the availability of a large body of syntactically parsed utterances between parents and children. We consider the problem of generating such a "treebank" from the CHILDES corpus, which currently contains primarily orthographically transcribed speech tagged for lexical category.

CogPrints Cognitive Sciences Eprint Archive

Encoding Lexicalized Tree Adjoining Grammars with a Nonmonotonic Inheritance Hierarchy

Authors: Evans Roger; Gazdar Gerald; Weir David
Publication date: 01/01/1995

This paper shows how DATR, a widely used formal language for lexical knowledge representation, can be used to define an LTAG lexicon as an inheritance hierarchy with internal lexical rules. A bottom-up featural encoding is used for LTAG trees and this allows lexical rules to be implemented as covariation constraints within feature structures. Such an approach eliminates the considerable redundancy otherwise associated with an LTAG lexicon. Comment: LaTeX source, needs aclap.sty, 8 pages

arXiv.org e-Print Archive

CiteSeerX

Crossref

Sussex Research Online

Punctuation in Quoted Speech

Author: Doran Christine
Publication date: 01/01/1996

Quoted speech is often set off by punctuation marks, in particular quotation marks. Thus, it might seem that the quotation marks would be extremely useful in identifying these structures in texts. Unfortunately, the situation is not quite so clear. In this work, I will argue that quotation marks are not adequate for either identifying or constraining the syntax of quoted speech. More useful information comes from the presence of a quoting verb, which is either a verb of saying or a punctual verb, and the presence of other punctuation marks, usually commas. Using a lexicalized grammar, we can license most quoting clauses as text adjuncts. A distinction will be made not between direct and indirect quoted speech, but rather between adjunct and non-adjunct quoting clauses. Comment: 11 pages, 11 ps figures, Proceedings of SIGPARSE 96 - Punctuation in Computational Linguistics

arXiv.org e-Print Archive

CiteSeerX

TuLiPA: towards a multi-formalism parsing environment for grammar engineering

Authors: Dellert Johannes; Evang Kilian; Kallmeyer Laura; Lichte Timm; Maier Wolfgang; Parmentier Yannick
Publication date: 01/01/2008

In this paper, we present an open-source parsing environment (Tübingen Linguistic Parsing Architecture, TuLiPA) which uses Range Concatenation Grammar (RCG) as a pivot formalism, thus opening the way to the parsing of several mildly context-sensitive formalisms. This environment currently supports tree-based grammars (namely Tree-Adjoining Grammars (TAG) and Multi-Component Tree-Adjoining Grammars with Tree Tuples (TT-MCTAG)) and allows computation not only of syntactic structures, but also of the corresponding semantic representations. It is used for the development of a tree-based grammar for German.

Hochschulschriftenserver - Universität Frankfurt am Main