
    Lexicalization and Grammar Development

    In this paper we present a fully lexicalized grammar formalism as a particularly attractive framework for the specification of natural language grammars. We discuss in detail Feature-based, Lexicalized Tree Adjoining Grammars (FB-LTAGs), a representative of the class of lexicalized grammars. We illustrate the advantages of lexicalized grammars in various contexts of natural language processing, ranging from wide-coverage grammar development to parsing and machine translation. We also present a method for compact and efficient representation of lexicalized trees.
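The two tree-composition operations underlying TAG, substitution and adjunction, can be sketched on toy trees as follows (the node labels, tree shapes, and example sentence here are illustrative assumptions, not taken from the grammar described above):

```python
# Minimal sketch of TAG's two composition operations on toy trees.
# All trees below are invented for illustration; a real FB-LTAG also
# carries feature structures that constrain these operations.

class Node:
    def __init__(self, label, children=None, subst=False, foot=False):
        self.label = label
        self.children = children or []
        self.subst = subst      # substitution site (a "down-arrow" leaf)
        self.foot = foot        # foot node of an auxiliary tree

def substitute(tree, target_label, initial):
    """Replace a substitution node labelled target_label with an initial tree."""
    for i, c in enumerate(tree.children):
        if c.subst and c.label == target_label:
            tree.children[i] = initial
        else:
            substitute(c, target_label, initial)
    return tree

def adjoin(tree, target_label, aux):
    """Splice an auxiliary tree in at an internal node labelled target_label."""
    for i, c in enumerate(tree.children):
        if c.label == target_label and not c.subst:
            excised = c          # the subtree the auxiliary tree wraps around

            def plant(n):        # hang the excised subtree at aux's foot node
                for j, d in enumerate(n.children):
                    if d.foot and d.label == target_label:
                        n.children[j] = excised
                        return True
                    if plant(d):
                        return True
                return False

            plant(aux)
            tree.children[i] = aux
            return tree
        adjoin(c, target_label, aux)
    return tree

def show(n):
    return n.label if not n.children else \
        f"({n.label} {' '.join(show(c) for c in n.children)})"

# alpha: an elementary tree anchored by the verb "walks"
alpha = Node("S", [Node("NP", subst=True),
                   Node("VP", [Node("walks")])])
# beta: an auxiliary tree anchored by "slowly", with foot node VP*
beta = Node("VP", [Node("VP", foot=True), Node("slowly")])

substitute(alpha, "NP", Node("NP", [Node("John")]))
adjoin(alpha, "VP", beta)
print(show(alpha))   # (S (NP John) (VP (VP walks) slowly))
```

Lexicalization is reflected in the fact that each elementary tree is anchored by a lexical item ("walks", "slowly"), so the lexicon, not a separate rule set, drives composition.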

    Encoding Lexicalized Tree Adjoining Grammars with a Nonmonotonic Inheritance Hierarchy

    This paper shows how DATR, a widely used formal language for lexical knowledge representation, can be used to define an LTAG lexicon as an inheritance hierarchy with internal lexical rules. A bottom-up featural encoding is used for LTAG trees and this allows lexical rules to be implemented as covariation constraints within feature structures. Such an approach eliminates the considerable redundancy otherwise associated with an LTAG lexicon.
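The core idea of a nonmonotonic inheritance hierarchy, in which a more specific lexical entry overrides (defeats) values inherited from more general classes, can be sketched as follows. The class names, paths, and tree-family names below are invented for illustration; DATR's actual query language and semantics are considerably richer than this toy lookup:

```python
# Toy nonmonotonic inheritance in the spirit of DATR: each node inherits
# path/value pairs from its parent, and a local definition overrides an
# inherited one. Names like "alpha-nx0V" are hypothetical placeholders.

class DNode:
    def __init__(self, name, parent=None, local=None):
        self.name = name
        self.parent = parent
        self.local = local or {}

    def lookup(self, path):
        node = self
        while node is not None:
            if path in node.local:   # most specific definition wins
                return node.local[path]
            node = node.parent       # otherwise defer to the parent class
        raise KeyError(path)

# A small hierarchy: VERB supplies defaults, TRANSITIVE overrides the
# tree family, and individual lemmas add only their idiosyncratic data.
verb  = DNode("VERB", local={"cat": "V", "trees": ("alpha-nx0V",)})
trans = DNode("TRANSITIVE", verb, {"trees": ("alpha-nx0Vnx1",)})
die   = DNode("die", verb, {"root": "die"})
love  = DNode("love", trans, {"root": "love"})

print(love.lookup("cat"))     # "V" -- inherited all the way from VERB
print(love.lookup("trees"))   # overridden at TRANSITIVE
print(die.lookup("trees"))    # VERB's intransitive default survives
```

The redundancy saving is exactly this: each lemma states only what is unpredictable about it, while shared structure lives once, higher in the hierarchy.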

    Evaluation of LTAG parsing with supertag compaction

    One of the biggest concerns that has been raised over the feasibility of using large-scale LTAGs in NLP is the amount of redundancy within a grammar's elementary tree set. This has led to various proposals on how best to represent grammars in a way that makes them compact and easily maintained (Vijay-Shanker and Schabes, 1992; Becker, 1993; Becker, 1994; Evans, Gazdar and Weir, 1995; Candito, 1996). Unfortunately, while this work can help to make the storage of grammars more efficient, it does nothing to prevent the problem reappearing when the grammar is processed by a parser and the complete set of trees is reproduced. In this paper we are concerned with an approach that addresses this problem of computational redundancy in the trees, and evaluate its effectiveness.

    CLiFF Notes: Research In Natural Language Processing at the University of Pennsylvania

    CLiFF is the Computational Linguists' Feedback Forum. We are a group of students and faculty who gather once a week to hear a presentation and discuss work currently in progress. The 'feedback' in the group's name is important: we are interested in sharing ideas, in discussing ongoing research, and in bringing together work done by the students and faculty in Computer Science and other departments. However, there are only so many presentations which we can have in a year. We felt that it would be beneficial to have a report which would have, in one place, short descriptions of the work in Natural Language Processing at the University of Pennsylvania. This report, then, is a collection of abstracts from both faculty and graduate students in Computer Science, Psychology and Linguistics. We want to stress the close ties between these groups, as one of the things that we pride ourselves on here at Penn is the communication among different departments and the inter-departmental work. Rather than try to summarize the varied work currently underway at Penn, we suggest reading the abstracts to see how the students and faculty themselves describe their work. The report illustrates the diversity of interests among the researchers here, as well as explaining the areas of common interest. In addition, since it was our intent to put together a document that would be useful both inside and outside of the university, we hope that this report will explain to everyone some of what we are about.

    A Lexicalized Tree Adjoining Grammar for English

    This document describes a sizable grammar of English written in the TAG formalism and implemented for use with the XTAG system. This report and the grammar described herein supersede the TAG grammar described in an earlier 1995 XTAG technical report. The English grammar described in this report is based on the TAG formalism, which has been extended to include lexicalization and unification-based feature structures. The range of syntactic phenomena that can be handled is large and includes auxiliaries (including inversion), copula, raising and small clause constructions, topicalization, relative clauses, infinitives, gerunds, passives, adjuncts, it-clefts, wh-clefts, PRO constructions, noun-noun modifications, extraposition, determiner sequences, genitives, negation, noun-verb contractions, sentential adjuncts and imperatives. This technical report corresponds to the XTAG Release 8/31/98. The XTAG grammar is continuously updated with the addition of new analyses and modification of old ones, and an online version of this report can be found at the XTAG web page, http://www.cis.upenn.edu/~xtag/

    Building factorized TAGs with meta-grammars

    Highly compacted TAGs may be built by allowing subtree factorization operators within the elementary trees. While hand-crafting such trees remains possible, a better option arises from coupling them with meta-grammar descriptions. The approach has been validated by the development of FRMG, a wide-coverage French TAG of only 207 trees.

    What Does a Grammar Formalism Say About a Language?

    Over the last ten or fifteen years there has been a shift in generative linguistics away from formalisms based on a procedural interpretation of grammars towards constraint-based formalisms, formalisms that define languages by specifying a set of constraints that characterize the set of well-formed structures analyzing the strings in the language. A natural extension of this trend is to define this set of structures model-theoretically: to define it as the set of mathematical structures that satisfy some set of logical axioms. This approach raises a number of questions about the nature of linguistic theories and the role of grammar formalisms in expressing them. We argue here that the crux of what theories of syntax have to say about language lies in the abstract properties of the sets of structures they license. This is the level that is most directly connected to the empirical basis of these theories, and it is the level at which it is possible to make meaningful comparisons between the approaches. From this point of view, grammar formalisms (or formal frameworks) are primarily means of presenting these properties. Many of the apparent distinctions between formalisms, then, may well be artifacts of their presentation rather than substantive distinctions between the properties of the structures they license. The model-theoretic approach offers a way in which to abstract away from the idiosyncrasies of these presentations. Having said that, we must distinguish between the class of sets of structures licensed by a linguistic theory and the set of structures licensed by a specific instance of the theory, that is, by a grammar expressing that theory. Theories of syntax are not simply accounts of the structure of individual languages in isolation, but rather include assertions about the organization of the structure of human languages in general. These universal aspects of the theories present two challenges for the model-theoretic approach.
First, they frequently are not properties of individual structures, but are rather properties of sets of structures. Thus, in capturing these model-theoretically, one is not defining sets of structures but rather classes of sets of structures; these are not first-order properties. Secondly, the universal aspects of linguistic theories are frequently not explicit, but are consequences of the nature of the formalism that embodies the theory. In capturing these one must develop an explicit axiomatic treatment of the formalism. This is both a challenge and a powerful benefit of the approach. Such re-interpretations tend to raise a variety of issues that are often overlooked in the original formalization. In this report we examine these issues within the context of a model-theoretic reinterpretation of Generalized Phrase-Structure Grammar. While there is little current active research on GPSG, it provides an ideal laboratory for exploring these issues. First, the formalism of GPSG is expressly intended to embody a great deal of the accompanying linguistic theory. Thus it provides a variety of opportunities for examining principles expressed as restrictions on the formalism from a model-theoretic point of view. At the same time, the fact that these restrictions embody universal grammar principles provides us with a variety of opportunities to explore the way in which the linguistic theory expressed by a grammar can transcend the mathematical theory of the structures it licenses. Finally, GPSG, although defined declaratively, is a formalism with restricted generative capacity, a characteristic more typical of the earlier procedural formalisms. As such, one component of the theory it embodies is a claim about the language-theoretic complexity of natural languages. Such claims are difficult to establish for any of the constraint-based approaches to grammar.
We can show, however, that the class of sets of trees definable within the logical language we employ in reformalizing GPSG is nearly exactly the class of sets of trees definable within the basic GPSG formalism. Thus we are able to capture the language-theoretic consequences of GPSG's restricted formalism by employing a restricted logical language.

    Research in the Language, Information and Computation Laboratory of the University of Pennsylvania

    This report takes its name from the Computational Linguistics Feedback Forum (CLiFF), an informal discussion group for students and faculty. However, the scope of the research covered in this report is broader than the title might suggest; this is the yearly report of the LINC Lab, the Language, Information and Computation Laboratory of the University of Pennsylvania. It may at first be hard to see the threads that bind together the work presented here, work by faculty, graduate students and postdocs in the Computer Science and Linguistics Departments and the Institute for Research in Cognitive Science. It includes prototypical Natural Language fields such as Combinatory Categorial Grammars, Tree Adjoining Grammars, syntactic parsing and the syntax-semantics interface; but it extends to statistical methods, plan inference, instruction understanding, intonation, causal reasoning, free word order languages, geometric reasoning, medical informatics, connectionism, and language acquisition. Naturally, this introduction cannot spell out all the connections between these abstracts; we invite you to explore them on your own. In fact, with this issue it’s easier than ever to do so: this document is accessible on the “information superhighway” at http://www.cis.upenn.edu/~cliff-group/94/cliffnotes.html. In addition, you can find many of the papers referenced in the CLiFF Notes on the net; most can be obtained by following links from the authors’ abstracts in the web version of this report. The abstracts describe the researchers’ many areas of investigation, explain their shared concerns, and present some interesting work in Cognitive Science. We hope its new online format makes the CLiFF Notes a more useful and interesting guide to Computational Linguistics activity at Penn.