Lexical Adaptation of Link Grammar to the Biomedical Sublanguage: a Comparative Evaluation of Three Approaches
We study the adaptation of Link Grammar Parser to the biomedical sublanguage
with a focus on domain terms not found in a general parser lexicon. Using two
biomedical corpora, we implement and evaluate three approaches to addressing
unknown words: automatic lexicon expansion, the use of morphological clues, and
disambiguation using a part-of-speech tagger. We evaluate each approach
separately for its effect on parsing performance and consider combinations of
these approaches. In addition to a 45% increase in parsing efficiency, we find
that the best approach, incorporating information from a domain part-of-speech
tagger, offers a statistically significant 10% relative decrease in error. The
adapted parser is available under an open-source license at
http://www.it.utu.fi/biolg
Efficient Normal-Form Parsing for Combinatory Categorial Grammar
Under categorial grammars that have powerful rules like composition, a simple
n-word sentence can have exponentially many parses. Generating all parses is
inefficient and obscures whatever true semantic ambiguities are in the input.
This paper addresses the problem for a fairly general form of Combinatory
Categorial Grammar, by means of an efficient, correct, and easy to implement
normal-form parsing technique. The parser is proved to find exactly one parse
in each semantic equivalence class of allowable parses; that is, spurious
ambiguity (as carefully defined) is shown to be both safely and completely
eliminated.
Comment: 8 pages, LaTeX packaged with three .sty files, also uses cgloss4e.st
Apportioning Development Effort in a Probabilistic LR Parsing System through Evaluation
We describe an implemented system for robust domain-independent syntactic
parsing of English, using a unification-based grammar of part-of-speech and
punctuation labels coupled with a probabilistic LR parser. We present
evaluations of the system's performance along several different dimensions;
these enable us to assess the contribution that each individual part is making
to the success of the system as a whole, and thus prioritise the effort to be
devoted to its further enhancement. Currently, the system is able to parse
around 80% of sentences in a substantial corpus of general text containing a
number of distinct genres. On a random sample of 250 such sentences the system
has a mean crossing bracket rate of 0.71 and recall and precision of 83% and
84% respectively when evaluated against manually-disambiguated analyses.
Comment: 10 pages, 1 Postscript figure. To Appear in Proceedings of the
Conference on Empirical Methods in Natural Language Processing, University of
Pennsylvania, May 199
Uniform Representations for Syntax-Semantics Arbitration
Psychological investigations have led to considerable insight into the
working of the human language comprehension system. In this article, we look at
a set of principles derived from psychological findings to argue for a
particular organization of linguistic knowledge along with a particular
processing strategy and present a computational model of sentence processing
based on those principles. Many studies have shown that human sentence
comprehension is an incremental and interactive process in which semantic and
other higher-level information interacts with syntactic information to make
informed commitments as early as possible at a local ambiguity. Early
commitments may be made by using top-down guidance from knowledge of different
types, each of which must be applicable independently of others. Further
evidence from studies of error recovery and delayed decisions points toward an
arbitration mechanism for combining syntactic and semantic information in
resolving ambiguities. In order to account for all of the above, we propose
that all types of linguistic knowledge must be represented in a common form but
must be separable so that they can be applied independently of each other and
integrated at processing time by the arbitrator. We present such a uniform
representation and a computational model called COMPERE based on the
representation and the processing strategy.
Comment: 7 pages, uses cogsci94.sty macr
Parsing with parallelism: a spreading-activation model of inference processing during text understanding
The past decade of research in Natural Language Processing has universally recognized that, since natural language input is almost always ambiguous with respect to its pragmatic implications, its syntactic parse, and even its lexical analysis (i.e., choice of correct word-sense for an ambiguous word), processing natural language input requires decisions about word meanings, syntactic structure, and pragmatic inferences. The lexical, syntactic, and pragmatic levels of inferencing are not as disparate as they have often been treated in both psychological and artificial intelligence research. In fact, these three levels of analysis interact to form a joint interpretation of text. ATLAST (A Three-level Language Analysis SysTem) is an implemented integration of human language understanding at the lexical, the syntactic, and the pragmatic levels. For psychological validity, ATLAST is based on results of experiments with human subjects. The ATLAST model uses a new architecture which was developed to incorporate three features: spreading activation memory, two-stage syntax, and parallel processing of syntax and semantics. It is also a new framework within which to interpret and tackle unsolved problems through implementation and experimentation
The processing of ambiguous sentences by first and second language learners of English
This study compares the way English-speaking children and adult second language learners of English resolve relative clause attachment ambiguities in sentences such as "The dean liked the secretary of the professor who was reading a letter." Two groups of advanced L2 learners of English with Greek or German as their L1 participated in a set of off-line and on-line tasks. While the participants' disambiguation preferences were influenced by lexical-semantic properties of the preposition linking the two potential antecedent NPs (of vs. with), there was no evidence that they were applying any structure-based ambiguity resolution strategies of the type that have been claimed to influence sentence processing in monolingual adults. These findings differ markedly from those obtained from 6- to 7-year-old monolingual English children in a parallel auditory study (Felser, Marinis, & Clahsen, submitted) in that the children's attachment preferences were not affected by the type of preposition at all. We argue that whereas children primarily rely on structure-based parsing principles during processing, adult L2 learners are guided mainly by non-structural informatio