Lexicalization and Grammar Development
In this paper we present a fully lexicalized grammar formalism as a
particularly attractive framework for the specification of natural language
grammars. We discuss in detail Feature-based, Lexicalized Tree Adjoining
Grammars (FB-LTAGs), a representative of the class of lexicalized grammars. We
illustrate the advantages of lexicalized grammars in various contexts of
natural language processing, ranging from wide-coverage grammar development to
parsing and machine translation. We also present a method for compact and
efficient representation of lexicalized trees. Comment: ps file. English w/ German abstract. 10 pages
A Processing Model for Free Word Order Languages
Like many verb-final languages, German displays considerable word-order
freedom: there is no syntactic constraint on the ordering of the nominal
arguments of a verb, as long as the verb remains in final position. This effect
is referred to as ``scrambling'', and is interpreted in transformational
frameworks as leftward movement of the arguments. Furthermore, arguments from
an embedded clause may move out of their clause; this effect is referred to as
``long-distance scrambling''. While scrambling has recently received
considerable attention in the syntactic literature, the status of long-distance
scrambling has only rarely been addressed. The reason for this is the
problematic status of the data: not only is long-distance scrambling highly
dependent on pragmatic context, it also is strongly subject to degradation due
to processing constraints. As in the case of center-embedding, it is not
immediately clear whether to assume that observed unacceptability of highly
complex sentences is due to grammatical restrictions, or whether we should
assume that the competence grammar does not place any restrictions on
scrambling (and that, therefore, all such sentences are in fact grammatical),
and the unacceptability of some (or most) of the grammatically possible word
orders is due to processing limitations. In this paper, we will argue for the
second view by presenting a processing model for German. Comment: 23 pages, uuencoded compressed ps file. In Perspectives on
Sentence Processing, C. Clifton, Jr., L. Frazier and K. Rayner, editors.
Lawrence Erlbaum Associates, 199
XMG: eXtending MetaGrammars to MCTAG
In this paper, we introduce an extension of the XMG system (eXtensible MetaGrammar) that allows the description of Multi-Component Tree Adjoining Grammars. In particular, we introduce the XMG formalism and its implementation, and show how the latter makes it possible to extend the system relatively easily to different target formalisms, thus opening the way towards multi-formalism.
Encoding Lexicalized Tree Adjoining Grammars with a Nonmonotonic Inheritance Hierarchy
This paper shows how DATR, a widely used formal language for lexical
knowledge representation, can be used to define an LTAG lexicon as an
inheritance hierarchy with internal lexical rules. A bottom-up featural
encoding is used for LTAG trees and this allows lexical rules to be implemented
as covariation constraints within feature structures. Such an approach
eliminates the considerable redundancy otherwise associated with an LTAG
lexicon. Comment: LaTeX source, needs aclap.sty, 8 pages
Supertagged phrase-based statistical machine translation
Until quite recently, extending Phrase-based Statistical Machine Translation (PBSMT) with syntactic structure caused system performance to deteriorate. In this work we show that incorporating lexical syntactic descriptions in the form of supertags can yield significantly better PBSMT systems. We describe a novel PBSMT model that integrates
supertags into the target language model and the target side of the translation model. Two kinds of supertags are employed: those from Lexicalized Tree-Adjoining Grammar
and Combinatory Categorial Grammar. Despite the differences between these two approaches, the supertaggers give similar improvements. In addition to supertagging, we also explore the utility of a surface global grammaticality measure based on combinatory operators. We perform various experiments on the Arabic to English NIST 2005 test set addressing issues such as sparseness, scalability and the utility of system subcomponents. Our best result (0.4688 BLEU) improves by 6.1% relative to a state-of-the-art
PBSMT model, which compares very favourably with the leading systems on the NIST 2005 task.
Factoring Predicate Argument and Scope Semantics: Underspecified Semantics with LTAG
In this paper we propose a compositional semantics for lexicalized tree-adjoining grammar (LTAG). Tree-local multicomponent derivations allow separation of the semantic contribution of a lexical item into one component contributing to the predicate-argument structure and a second component contributing to scope semantics. Based on this idea, a syntax-semantics interface is presented where the compositional semantics depends only on the derivation structure. It is shown that the derivation structure (and indirectly the locality of derivations) allows an appropriate amount of underspecification. This is illustrated by investigating underspecified representations for quantifier scope ambiguities and related phenomena such as adjunct scope and island constraints.
A Lexicalized Tree-Adjoining Grammar for Vietnamese
In this paper, we present vnLTAG, the first sizable grammar built for Vietnamese using LTAG, developed over the past two years. This grammar aims at modelling written language and is general enough to be both application- and domain-independent. It can be used for the morpho-syntactic tagging and syntactic parsing of Vietnamese texts, as well as text generation. We then present a robust parsing scheme using vnLTAG and a parser for the grammar. We finish with an evaluation using a test suite.
Korean to English Translation Using Synchronous TAGs
It is often argued that accurate machine translation requires reference to
contextual knowledge for the correct treatment of linguistic phenomena such as
dropped arguments and accurate lexical selection. One of the historical
arguments in favor of the interlingua approach has been that, since it revolves
around a deep semantic representation, it is better able to handle the types of
linguistic phenomena that are seen as requiring a knowledge-based approach. In
this paper we present an alternative approach, exemplified by a prototype
system for machine translation of English and Korean which is implemented in
Synchronous TAGs. This approach is essentially transfer based, and uses
semantic feature unification for accurate lexical selection of polysemous
verbs. The same semantic features, when combined with a discourse model which
stores previously mentioned entities, can also be used for the recovery of
topicalized arguments. In this paper we concentrate on the translation of
Korean to English. Comment: ps file. 8 pages