37 research outputs found

    An Efficient Distribution of Labor in a Two Stage Robust Interpretation Process

    Although Minimum Distance Parsing (MDP) offers a theoretically attractive solution to the problem of extragrammaticality, it is often computationally infeasible in large scale practical applications. In this paper we present an alternative approach where the labor is distributed between a more restrictive partial parser and a repair module. Though two stage approaches have grown in popularity in recent years because of their efficiency, they have done so at the cost of requiring hand coded repair heuristics. In contrast, our two stage approach does not require any hand coded knowledge sources dedicated to repair, thus making it possible to achieve a similar run time advantage over MDP without losing the quality of domain independence. Comment: 9 pages, 1 Postscript figure, uses aclap.sty and psfig.tex, In Proceedings of EMNLP 199

    GEMINI: A Natural Language System for Spoken-Language Understanding

    Gemini is a natural language understanding system developed for spoken language applications. The paper describes the architecture of Gemini, paying particular attention to resolving the tension between robustness and overgeneration. Gemini features a broad-coverage unification-based grammar of English, fully interleaved syntactic and semantic processing in an all-paths, bottom-up parser, and an utterance-level parser to find interpretations of sentences that might not be analyzable as complete sentences. Gemini also includes novel components for recognizing and correcting grammatical disfluencies, and for doing parse preferences. This paper presents a component-by-component view of Gemini, providing detailed relevant measurements of size, efficiency, and performance. Comment: 8 pages, postscript

    Highlighting Utterances in Chinese Spoken Discourse

    Pseudo-Syntactic Language Modeling for Disfluent Speech Recognition

    Language models for speech recognition are generally trained on text corpora. Since these corpora do not contain the disfluencies found in natural speech, there is a train/test mismatch when these models are applied to conversational speech. In this work we investigate a language model (LM) designed to model these disfluencies as a syntactic process. By modeling self-corrections we obtain an improvement over our baseline syntactic model. We also obtain a 30% relative reduction in perplexity from the best performing standard N-gram model when we interpolate it with our syntactically derived models.

    The acquisition of adjunct control: grammar and processing

    This dissertation uses children’s acquisition of adjunct control as a case study to investigate grammatical and performance accounts of language acquisition. In previous research, children have consistently exhibited non-adultlike behavior for sentences with adjunct control. To explain children’s behavior, several different grammatical accounts have been proposed, but evidence for these accounts has been inconclusive. In this dissertation, I take two approaches to account for children’s errors. First, I spell out the predictions of previous grammatical accounts, and test these predictions after accounting for some methodological concerns that might have influenced children’s behavior in previous studies. While I reproduce the non-adultlike behavior observed in previous studies, the predictions of previous grammatical accounts are not borne out, suggesting that extragrammatical factors are needed to explain children’s behavior. Next, I consider the role of two different types of extragrammatical factors in predicting children’s non-adultlike behavior. With a new task designed to address the task demands in previous studies, children exhibit significantly higher accuracy than with previous tasks. This suggests that children’s behavior has been influenced by task-specific processing factors. In addition to the task, I also test the predictions of a similarity-based interference account, which links children’s errors to the same memory mechanisms involved in sentence processing difficulties observed in adults. These predictions are borne out, supporting a more continuous developmental trajectory as children’s processing mechanisms become more resistant to interference. Finally, I consider how children’s errors might influence their acquisition of adjunct control, given the distribution in the linguistic input. I discuss the results of a corpus analysis, including the possibility that adjunct control could be learned from the input. The kinds of information that could be useful to a learner become much more limited, however, after considering the processing limitations that would interfere with the representations available to the learner.

    A classification of ellipsis based on a corpus of information seeking dialogues

    The standard classification of ellipsis has determined the way it is handled in natural language understanding (NLU) systems. This work provides a novel classification of ellipsis based on the analysis of ellipsis usage rather than forms in a corpus of information seeking dialogues. The aim is to demonstrate that pragmatic analysis is necessary for the interpretation of ellipsis. The context, in terms of the dialogue participants' belief states, determines interpretation and in turn the interpretation of subsequent utterances. The dialogues produced in an NLU system using this classification are presented.

    The Radical Unacceptability Hypothesis: Accounting for Unacceptability without Universal Constraints

    The Radical Unacceptability Hypothesis (RUH) has been proposed as a way of explaining the unacceptability of extraction from islands and frozen structures. This hypothesis explicitly assumes a distinction between unacceptability due to violations of local well-formedness conditions (conditions on constituency, constituent order, and morphological form) and unacceptability due to extra-grammatical factors. We explore the RUH with respect to classical islands, and extend it to a broader range of phenomena, including freezing, A′ chain interactions, zero-relative clauses, topic islands, weak crossover, extraction from subjects and parasitic gaps, and sensitivity to information structure. The picture that emerges is consistent with the RUH, and suggests more generally that the unacceptability of extraction from otherwise well-formed configurations reflects non-syntactic factors, not principles of grammar. Peer Reviewed