Efficient deep processing of Japanese
We present a broad-coverage Japanese grammar written in the HPSG formalism with MRS semantics. The grammar is built for use in real-world applications, so robustness and performance play an important role. It is connected to a POS tagging and word segmentation tool. The grammar is being developed in a multilingual context, which requires MRS structures that are easily comparable across languages.
An integrated architecture for shallow and deep processing
We present an architecture for integrating shallow and deep NLP components, aimed at the flexible combination of different language technologies for a range of practical current and future applications. In particular, we describe the integration of a high-level HPSG parsing system with different high-performance shallow components, ranging from named entity recognition to chunk parsing and shallow clause recognition. The NLP components enrich a representation of natural language text with layers of new XML meta-information using a single shared data structure, called the text chart. We describe details of the integration methods, and show how information extraction and language checking applications for real-world German text benefit from a deep grammatical analysis.
Extraction in Dutch with Lexical Rules
Unbounded dependencies are often modelled by "traces" (and "gap
threading") in unification-based grammars. Pollard and Sag, however, suggest
an analysis of extraction based on lexical rules, which excludes the notion of
traces (P&S 1994, Chapter 9). In parsing, it suggests a trade of indeterminism
for lexical ambiguity. This paper provides a short introduction to this
approach to extraction with lexical rules, and illustrates the linguistic power
of the approach by applying it to particularly idiosyncratic Dutch extraction
data.
Comment: Extension of KONVENS94 publication, 10 pages, PostScript
Head-initial constructions in Japanese
Japanese is often taken to be strictly head-final in its syntax. In our work on a broad-coverage, precision implemented HPSG for Japanese, we have found that while this is generally true, there are nonetheless a few minor exceptions to the broad trend. In this paper, we describe the grammar engineering project and present the exceptions we have found. We conclude that this kind of phenomenon motivates, on the one hand, the HPSG type-hierarchical approach, which allows for the statement of both broad generalizations and exceptions to those generalizations, and, on the other hand, the use of grammar engineering as a means of testing linguistic hypotheses.
Morphological Productivity in the Lexicon
In this paper we outline a lexical organization for Turkish that makes use of
lexical rules for inflections, derivations, and lexical category changes to
control the proliferation of lexical entries. Lexical rules handle changes in
grammatical roles, enforce type constraints, and control the mapping of
subcategorization frames in valency-changing operations. A lexical inheritance
hierarchy facilitates the enforcement of type constraints. Semantic
compositions in inflections and derivations are constrained by the properties
of the terms and predicates.
The design has been tested as part of an HPSG grammar for Turkish. In terms of
performance, run-time execution of the rules seems to be a far better
alternative than pre-compilation. The latter causes exponential growth in the
lexicon due to intensive use of inflections and derivations in Turkish.
Comment: 10 pages LaTeX, {lingmacros,avm,psfig}.sty, 1 figure, 1 bibtex file
An Abstract Machine for Unification Grammars
This work describes the design and implementation of an abstract machine,
Amalia, for the linguistic formalism ALE, which is based on typed feature
structures. This formalism is one of the most widely accepted in computational
linguistics and has been used for designing grammars in various linguistic
theories, most notably HPSG. Amalia is composed of data structures and a set of
instructions, augmented by a compiler from the grammatical formalism to the
abstract instructions, and a (portable) interpreter of the abstract
instructions. The effect of each instruction is defined using a low-level
language that can be executed on ordinary hardware.
The advantages of the abstract machine approach are twofold. From a
theoretical point of view, the abstract machine gives a well-defined
operational semantics to the grammatical formalism. This ensures that grammars
specified using our system are endowed with a well-defined meaning. It makes it
possible, for example, to formally verify the correctness of a compiler for HPSG, given
an independent definition. From a practical point of view, Amalia is the first
system that employs a direct compilation scheme for unification grammars that
are based on typed feature structures. The use of Amalia results in much
improved performance over existing systems.
In order to test the machine on a realistic application, we have developed a
small-scale, HPSG-based grammar for a fragment of the Hebrew language, using
Amalia as the development platform. This is the first application of HPSG to a
Semitic language.
Comment: Doctoral Thesis, 96 pages, many PostScript figures, uses pstricks,
pst-node, psfig, fullname and a macros file
Encoding Lexicalized Tree Adjoining Grammars with a Nonmonotonic Inheritance Hierarchy
This paper shows how DATR, a widely used formal language for lexical
knowledge representation, can be used to define an LTAG lexicon as an
inheritance hierarchy with internal lexical rules. A bottom-up featural
encoding is used for LTAG trees and this allows lexical rules to be implemented
as covariation constraints within feature structures. Such an approach
eliminates the considerable redundancy otherwise associated with an LTAG
lexicon.
Comment: LaTeX source, needs aclap.sty, 8 pages