An Efficient Distribution of Labor in a Two Stage Robust Interpretation Process
Although Minimum Distance Parsing (MDP) offers a theoretically attractive
solution to the problem of extragrammaticality, it is often computationally
infeasible in large scale practical applications. In this paper we present an
alternative approach where the labor is distributed between a more restrictive
partial parser and a repair module. Though two stage approaches have grown in
popularity in recent years because of their efficiency, they have done so at
the cost of requiring hand coded repair heuristics. In contrast, our two stage
approach does not require any hand coded knowledge sources dedicated to repair,
thus making it possible to achieve a similar run time advantage over MDP
without losing the quality of domain independence.
Comment: 9 pages, 1 Postscript figure, uses aclap.sty and psfig.tex, In
Proceedings of EMNLP 199
GEMINI: A Natural Language System for Spoken-Language Understanding
Gemini is a natural language understanding system developed for spoken
language applications. The paper describes the architecture of Gemini, paying
particular attention to resolving the tension between robustness and
overgeneration. Gemini features a broad-coverage unification-based grammar of
English, fully interleaved syntactic and semantic processing in an all-paths,
bottom-up parser, and an utterance-level parser to find interpretations of
sentences that might not be analyzable as complete sentences. Gemini also
includes novel components for recognizing and correcting grammatical
disfluencies, and for doing parse preferences. This paper presents a
component-by-component view of Gemini, providing detailed relevant measurements
of size, efficiency, and performance.
Comment: 8 pages, postscrip
Pseudo-Syntactic Language Modeling for Disfluent Speech Recognition
Language models for speech recognition are generally trained on text corpora. Since these corpora do not contain the disfluencies found in natural speech, there is a train/test mismatch when these models are applied to conversational speech. In this work we investigate a language model (LM) designed to model these disfluencies as a syntactic process. By modeling self-corrections we obtain an improvement over our baseline syntactic model. We also obtain a 30% relative reduction in perplexity from the best performing standard N-gram model when we interpolate it with our syntactically derived models.
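To illustrate what interpolating two language models and measuring a relative perplexity reduction means, here is a minimal Python sketch. It assumes simple linear interpolation with a fixed weight; the abstract does not say which interpolation scheme was used, and the function names, the weight, and the per-token probabilities are all hypothetical values chosen for illustration, not results from the paper.

```python
import math

def interpolate(p_ngram, p_syntactic, lam):
    """Linearly mix two probabilities for the same token (assumed scheme)."""
    return lam * p_ngram + (1.0 - lam) * p_syntactic

def perplexity(token_probs):
    """Perplexity is exp of the average negative log probability per token."""
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

# Hypothetical per-token probabilities assigned by each model to one utterance.
ngram_probs = [0.10, 0.02, 0.30, 0.05]
syntactic_probs = [0.05, 0.20, 0.10, 0.25]

# Mix the two models token by token with an illustrative weight of 0.5.
mixed_probs = [interpolate(a, b, 0.5) for a, b in zip(ngram_probs, syntactic_probs)]

baseline_ppl = perplexity(ngram_probs)
mixed_ppl = perplexity(mixed_probs)

# Relative reduction in perplexity with respect to the N-gram baseline.
relative_reduction = 1.0 - mixed_ppl / baseline_ppl
print(f"baseline={baseline_ppl:.2f} mixed={mixed_ppl:.2f} "
      f"relative reduction={relative_reduction:.1%}")
```

In practice the interpolation weight would be tuned on held-out data rather than fixed by hand.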
The acquisition of adjunct control: grammar and processing
This dissertation uses children’s acquisition of adjunct control as a case study
to investigate grammatical and performance accounts of language acquisition. In
previous research, children have consistently exhibited non-adultlike behavior for
sentences with adjunct control. To explain children’s behavior, several different
grammatical accounts have been proposed, but evidence for these accounts has been
inconclusive. In this dissertation, I take two approaches to account for children’s errors.
First, I spell out the predictions of previous grammatical accounts, and test these
predictions after accounting for some methodological concerns that might have
influenced children’s behavior in previous studies. While I reproduce the non-adultlike
behavior observed in previous studies, the predictions of previous grammatical
accounts are not borne out, suggesting that extragrammatical factors are needed to
explain children’s behavior.
Next, I consider the role of two different types of extragrammatical factors in
predicting children’s non-adultlike behavior. With a new task designed to address the
task demands in previous studies, children exhibit significantly higher accuracy than
with previous tasks. This suggests that children’s behavior has been influenced by task-
specific processing factors. In addition to the task, I also test the predictions of a
similarity-based interference account, which links children’s errors to the same
memory mechanisms involved in sentence processing difficulties observed in adults.
These predictions are borne out, supporting a more continuous developmental
trajectory as children’s processing mechanisms become more resistant to interference.
Finally, I consider how children’s errors might influence their acquisition of
adjunct control, given the distribution in the linguistic input. I discuss the results of a
corpus analysis, including the possibility that adjunct control could be learned from the
input. The kinds of information that could be useful to a learner become much more
limited, however, after considering the processing limitations that would interfere with
the representations available to the learner.
A classification of ellipsis based on a corpus of information seeking dialogues
The standard classification of ellipsis has determined the way it is handled in natural language understanding (NLU) systems. This work provides a novel classification of ellipsis based on the analysis of ellipsis usage rather than forms in a corpus of information seeking dialogues. The aim is to demonstrate that pragmatic analysis is necessary for the interpretation of ellipsis. The context, in terms of the dialogue participants' belief states, determines interpretation and in turn the interpretation of subsequent utterances. The dialogues produced in an NLU system using this classification are presented.
The Radical Unacceptability Hypothesis: Accounting for Unacceptability without Universal Constraints
The Radical Unacceptability Hypothesis (RUH) has been proposed as a way of explaining the unacceptability of extraction from islands and frozen structures. This hypothesis explicitly assumes a distinction between unacceptability due to violations of local well-formedness conditions (conditions on constituency, constituent order, and morphological form) and unacceptability due to extra-grammatical factors. We explore the RUH with respect to classical islands, and extend it to a broader range of phenomena, including freezing, A′ chain interactions, zero-relative clauses, topic islands, weak crossover, extraction from subjects and parasitic gaps, and sensitivity to information structure. The picture that emerges is consistent with the RUH, and suggests more generally that the unacceptability of extraction from otherwise well-formed configurations reflects non-syntactic factors, not principles of grammar.
Peer Reviewed