Word Order and Its Variations in Korean: A TAGs Approach
A grammar formalism is generally required to explain word-order variation, since such variation is a universal phenomenon of natural languages and its complex patterns cannot be realized by merely reordering the terminals. This phenomenon is especially important for nonconfigurational languages, which are relatively free in word order. We show how TAG can handle word order and its variation in Korean. We derive a new property, called Scramble-α, inherent in scrambling. The domain of scrambling can be realized within the elementary trees of TAG and can be localized under the TAG formalism. We propose a new adjoining constraint suitable for describing Korean word order. We show that TAG generates the syntactic structures and the ordering precedence at the same time. We also show that long-distance scrambling is another type of cross-serial dependency and can be analyzed with the same principle that local scrambling uses in TAG.
A Lexicalized Tree Adjoining Grammar for English
This paper presents a sizable grammar for English written in the Tree Adjoining Grammar (TAG) formalism. The grammar uses a TAG that is both lexicalized (Schabes, Abeillé, Joshi 1988) and feature-based (Vijay-Shanker, Joshi 1988). In this paper, we describe a wide range of phenomena that it covers.
A Lexicalized TAG (LTAG) is organized around a lexicon, which associates sets of elementary trees (instead of just simple categories) with the lexical items. A Lexicalized TAG consists of a finite set of trees associated with lexical items, and operations (adjunction and substitution) for composing the trees. A lexical item is called the anchor of its corresponding tree and directly determines both the tree's structure and its syntactic features. In particular, the trees define the domain of locality over which constraints are specified and these constraints are local with respect to their anchor. In this paper, the basic tree structures of the English LTAG are described, along with some relevant features. The interaction between the morphological and the syntactic components of the lexicon is also explained.
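The two composition operations named above can be illustrated with a toy sketch (illustrative data structures, not the grammar described in the paper): elementary trees are anchored by lexical items and combined by substitution at marked leaf nodes and adjunction at internal nodes.

```python
# Toy sketch of LTAG composition; Node, substitute, and adjoin are
# invented names, not the paper's implementation.

class Node:
    def __init__(self, label, children=None, subst=False, foot=False):
        self.label = label
        self.children = children or []
        self.subst = subst   # substitution node (e.g., NP-down-arrow)
        self.foot = foot     # foot node of an auxiliary tree (e.g., VP*)

def substitute(tree, label, initial):
    """Replace the first substitution node with the given label."""
    for i, child in enumerate(tree.children):
        if child.subst and child.label == label:
            tree.children[i] = initial
            return True
        if substitute(child, label, initial):
            return True
    return False

def find_foot(node):
    if node.foot:
        return node
    for child in node.children:
        foot = find_foot(child)
        if foot is not None:
            return foot
    return None

def adjoin(tree, label, aux):
    """Splice an auxiliary tree in at the first internal node with `label`
    (root adjunction is omitted for brevity)."""
    for i, child in enumerate(tree.children):
        if child.children and child.label == label:
            foot = find_foot(aux)
            foot.children = child.children  # excised subtree hangs off the foot
            foot.foot = False
            tree.children[i] = aux
            return True
        if adjoin(child, label, aux):
            return True
    return False

def terminals(node):
    if not node.children:
        return [node.label]
    return [t for c in node.children for t in terminals(c)]

# Elementary tree anchored by "sleeps" with an NP substitution slot;
# the auxiliary tree for "soundly" adjoins at VP.
s = Node("S", [Node("NP", subst=True),
               Node("VP", [Node("V", [Node("sleeps")])])])
substitute(s, "NP", Node("NP", [Node("John")]))
adjoin(s, "VP", Node("VP", [Node("VP", foot=True),
                            Node("Adv", [Node("soundly")])]))
print(" ".join(terminals(s)))  # John sleeps soundly
```

The anchor's tree localizes its argument slots as substitution nodes, while modifiers and auxiliaries enter via adjunction, which is the sense in which constraints stay local to the anchor.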
Next, the properties of the different tree structures are discussed. The use of S complements exclusively allows us to take full advantage of the treatment of unbounded dependencies originally presented in Joshi (1985) and Kroch and Joshi (1985). Structures for auxiliaries and raising verbs which use adjunction trees are also discussed. We present a representation of prepositional complements that is based on extended elementary trees. This representation avoids the need for preposition incorporation in order to account for double wh-questions (preposition stranding and pied-piping) and the pseudo-passive.
A treatment of light verb constructions is also given, similar to what Abeillé (1988c) has presented. Again, neither noun nor adjective incorporation is needed to handle double passives and to account for CNPC violations in these constructions. TAG's extended domain of locality allows us to handle, within a single level of syntactic description, phenomena that in other frameworks require either dual analyses or reanalysis.
In addition, following Abeillé and Schabes (1989), we describe how to deal with semantic non-compositionality in verb-particle combinations, light verb constructions, and idioms, without losing the internal syntactic composition of these structures.
The last sections discuss current work on PRO, case, anaphora and negation, and outline future work on copula constructions and small clauses, optional arguments, adverb movement and the nature of syntactic rules in a lexicalized framework
A Lexicalized Tree Adjoining Grammar for French: The General Framework
We present the first sizable grammar written in the Tree Adjoining Grammar (TAG) formalism. In particular we have used 'lexicalized' TAGs as described in [Schabes, Abeillé and Joshi 1988]. We present the linguistic coverage of our grammar, and explain the linguistic reasons which led us to choose the particular representations. We have shown that a wide range of linguistic phenomena can be handled within the TAG formalism with lexically specified structures only. We first state the basic structures needed for French, with a particular emphasis on TAG's extended domain of locality, which enables us to state complex subcategorization phenomena in a natural way. We motivate the choice of the head for the different structures, and we contrast the treatment of nominal arguments with that of sentential ones, which is particular to the TAG framework. We also give a detailed analysis of sentential complements, because it has led us to introduce substitution into the formalism, and because TAG makes interesting predictions in these cases. We discuss the different linguistic phenomena corresponding to adjunction and to substitution respectively. We then move on to 'light verb' constructions, in which extraction freely occurs out of the predicative NP. They are handled straightforwardly in a TAG, as opposed to the usual double analysis. We lastly give an overview of the treatment of adjuncts, and suggest a treatment of idioms which makes them fall into the same representations as 'free' structures.
Structure Unification Grammar: A Unifying Framework for Investigating Natural Language
This thesis presents Structure Unification Grammar and demonstrates its suitability as a framework for investigating natural language from a variety of perspectives. Structure Unification Grammar is a linguistic formalism which represents grammatical information as partial descriptions of phrase structure trees, and combines these descriptions by equating their phrase structure tree nodes. This process can be depicted by taking a set of transparencies which each contain a picture of a tree fragment, and overlaying them so they form a picture of a complete phrase structure tree. The nodes which overlap in the resulting picture are those which are equated. The flexibility with which information can be specified in the descriptions of trees and the generality of the combination operation allow a grammar writer or parser to specify exactly what is known where it is known. The specification of grammatical constraints is not restricted to any particular structural or informational domains. This property provides for a very perspicuous representation of grammatical information, and for the representations necessary for incremental parsing.
The perspicuity of SUG's representation is complemented by its high formal power. The formal power of SUG allows other linguistic formalisms to be expressed in it. By themselves these translations are not terribly interesting, but the perspicuity of SUG's representation often allows the central insights of the other investigations to be expressed perspicuously in SUG. Through this process it is possible to unify the insights from a diverse collection of investigations within a single framework, thus furthering our understanding of natural language as a whole. This thesis gives several examples of how insights from investigations into natural language can be captured in SUG. Since these investigations come from a variety of perspectives on natural language, these examples demonstrate that SUG can be used as a unifying framework for investigating natural language.
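The "overlaying transparencies" picture above can be sketched concretely (an invented representation, not the thesis's formalism): each fragment maps node names to partial property sets, and combining two fragments equates named node pairs and merges their properties, failing on a clash.

```python
# Toy sketch of combining partial tree descriptions by node equation.
# combine, and the fragment/property names below, are hypothetical.

def combine(frag_a, frag_b, equations):
    """Overlay frag_b on frag_a; `equations` maps b-nodes to a-nodes."""
    rename = dict(equations)
    merged = {name: dict(props) for name, props in frag_a.items()}
    for name, props in frag_b.items():
        target = rename.get(name, name)
        slot = merged.setdefault(target, {})
        for key, value in props.items():
            # parent links must be rewritten through the equations too
            mapped = rename.get(value, value) if key == "parent" else value
            if key in slot and slot[key] != mapped:
                raise ValueError(f"clash at {target}: {key}")
            slot[key] = mapped
    return merged

# The verb fragment requires an NP daughter of S; the noun fragment
# supplies one, and equating n1 with np overlays the two "transparencies".
verb = {"s": {"label": "S"},
        "np": {"label": "NP", "parent": "s"},
        "vp": {"label": "VP", "parent": "s"}}
noun = {"n1": {"label": "NP"}, "j": {"label": "John", "parent": "n1"}}
tree = combine(verb, noun, {"n1": "np"})
print(tree["j"]["parent"])  # np
```

The clash check is where the grammar-writer's flexibility comes from: a fragment may leave any property unspecified, and combination only fails when two fragments disagree on the same property of an equated node.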
CLiFF Notes: Research In Natural Language Processing at the University of Pennsylvania
The Computational Linguistics Feedback Forum (CLiFF) is a group of students and faculty who gather once a week to discuss the members' current research. As the word feedback suggests, the group's purpose is the sharing of ideas. The group also promotes interdisciplinary contacts between researchers who share an interest in Cognitive Science.
There is no single theme describing the research in Natural Language Processing at Penn. There is work done in CCG, Tree adjoining grammars, intonation, statistical methods, plan inference, instruction understanding, incremental interpretation, language acquisition, syntactic parsing, causal reasoning, free word order languages, ... and many other areas. With this in mind, rather than trying to summarize the varied work currently underway here at Penn, we suggest reading the following abstracts to see how the students and faculty themselves describe their work. Their abstracts illustrate the diversity of interests among the researchers, explain the areas of common interest, and describe some very interesting work in Cognitive Science.
This report is a collection of abstracts from both faculty and graduate students in Computer Science, Psychology and Linguistics. We pride ourselves on the close working relations between these groups, as we believe that the communication among the different departments and the ongoing inter-departmental research not only improves the quality of our work, but makes much of that work possible.
Planning multisentential English text using communicative acts
The goal of this research is to develop explanation presentation mechanisms for knowledge-based systems which enable them to define domain terminology and concepts, narrate events, elucidate plans, processes, or propositions, and argue to support a claim or advocate action. This requires the development of devices which select, structure, order, and then linguistically realize explanation content as coherent and cohesive English text.
With the goal of identifying generic explanation presentation strategies, a wide range of naturally occurring texts were analyzed with respect to their communicative structure, function, content, and intended effects on the reader. This motivated an integrated theory of communicative acts which characterizes text at the level of rhetorical acts (e.g., describe, define, narrate), illocutionary acts (e.g., inform, request), and locutionary acts (e.g., ask, command). Taken as a whole, the identified communicative acts characterize the structure, content, and intended effects of four types of text: description, narration, exposition, and argument. These text types have distinct effects, such as getting the reader to know about entities, to know about events, to understand plans, processes, or propositions, or to believe propositions or want to perform actions. In addition to identifying the communicative function and effect of text at multiple levels of abstraction, this dissertation details a tripartite theory of focus of attention (discourse focus, temporal focus, and spatial focus) which constrains the planning and linguistic realization of text.
To test the integrated theory of communicative acts and the tripartite theory of focus of attention, a text generation system, TEXPLAN (Textual EXplanation PLANner), was implemented that plans and linguistically realizes multisentential and multiparagraph explanations from knowledge-based systems. The communicative acts identified during text analysis were formalized as over sixty compositional and (in some cases) recursive plan operators in the library of a hierarchical planner. Discourse, temporal, and spatial focus models were implemented to track and use attentional information to guide the organization and realization of text. Because the plan operators distinguish between the communicative function (e.g., argue for a proposition) and the expected effect (e.g., the reader believes the proposition) of communicative acts, the system is able to construct a discourse model of the structure and function of its textual responses as well as a user model of the expected effects of its responses on the reader's knowledge, beliefs, and desires. The system uses both the discourse model and the user model to guide subsequent utterances. To test its generality, the system was interfaced to a variety of domain applications, including a neuropsychological diagnosis system, a mission planning system, and a knowledge-based mission simulator. The system produces descriptions, narrations, expositions, and arguments from these applications, thus exhibiting a broader range of rhetorical coverage than previous text generation systems.
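The key design point, plan operators that pair a communicative function with an expected effect on the reader and decompose hierarchically, can be sketched as follows (operator names, fields, and effects are invented for illustration, not TEXPLAN's actual library):

```python
# Hypothetical sketch of hierarchical text planning with operators that
# separate communicative function from expected reader effect.

OPERATORS = {
    "describe": {"effect": "reader-knows-entity",
                 "subacts": ["inform-attributes", "inform-parts"]},
    "argue":    {"effect": "reader-believes-claim",
                 "subacts": ["inform-claim", "inform-evidence"]},
}

def plan(act, user_model):
    """Expand an act top-down; record expected effects in the user model."""
    op = OPERATORS.get(act)
    if op is None:                   # primitive act: realize directly
        return [act]
    user_model.add(op["effect"])     # model what the reader should gain
    steps = []
    for sub in op["subacts"]:
        steps.extend(plan(sub, user_model))
    return steps

effects = set()
print(plan("argue", effects))  # ['inform-claim', 'inform-evidence']
```

Because effects accumulate in `user_model` separately from the plan's step sequence, the planner can consult what the reader is expected to know or believe when choosing subsequent utterances, which is the role the discourse and user models play above.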
Instance-based natural language generation
In recent years, ranking approaches to Natural Language Generation have become increasingly popular. They abandon the idea of generation as a deterministic decision-making process in favour of approaches that combine overgeneration with ranking at some stage in processing.
In this thesis, we investigate the use of instance-based ranking methods for surface realization in Natural Language Generation. Our approach to instance-based Natural Language Generation employs two basic components: a rule system that generates a number of realization candidates from a meaning representation, and an instance-based ranker that scores the candidates according to their similarity to examples taken from a training corpus. The instance-based ranker uses information retrieval methods to rank output candidates.
Our approach is corpus-based in that it uses a treebank (a subset of the Penn Treebank II containing management succession texts) in combination with manual semantic markup to automatically produce a generation grammar. Furthermore, the corpus is also used by the instance-based ranker. The semantic annotation of a test portion of the compiled subcorpus serves as input to the generator.
In this thesis, we develop an efficient search technique, based on the A* algorithm, for identifying the optimal candidate; detail the annotation scheme and grammar construction algorithm; and show how a Rete-based production system can be used for efficient candidate generation. Furthermore, we examine the output of the generator and discuss issues like input coverage (completeness), fluency, and faithfulness that are relevant to surface generation in general.
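The overgenerate-and-rank pipeline can be sketched in miniature (a toy token-overlap cosine standing in for the thesis's information retrieval machinery, with invented example strings): each realization candidate is scored by its best similarity to any instance in the training corpus, and the highest-scoring candidate wins.

```python
# Toy sketch of instance-based ranking over realization candidates.
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity over bag-of-words token counts."""
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (sqrt(sum(v * v for v in va.values()))
            * sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

def rank(candidates, corpus):
    """Sort candidates by their best similarity to any corpus instance."""
    return sorted(candidates,
                  key=lambda c: max(cosine(c, ex) for ex in corpus),
                  reverse=True)

corpus = ["the chairman resigned yesterday"]
candidates = ["the chairman did resign yesterday",
              "chairman resignation occurred",
              "the chairman resigned yesterday"]
print(rank(candidates, corpus)[0])  # the chairman resigned yesterday
```

In the thesis's setting the candidate set comes from the rule system and can be large, which is why an A*-style search for the best-scoring candidate, rather than exhaustive ranking as here, matters for efficiency.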
ETAG, A Formal Model of Competence Knowledge for User Interface Design
Promotor: J.C. van Vliet; Copromotors: M.J. Tauber and G.C. van der Veer