
    A psycholinguistically motivated version of TAG

    We propose a psycholinguistically motivated version of TAG which is designed to model key properties of human sentence processing, viz., incrementality, connectedness, and prediction. We use findings from human experiments to motivate an incremental grammar formalism that makes it possible to build fully connected structures on a word-by-word basis. A key idea of the approach is to explicitly model the prediction of upcoming material and the subsequent verification and integration processes. We also propose a linking theory that links the predictions of our formalism to experimental data such as reading times, and illustrate how it can capture psycholinguistic results on the processing of either... or structures and relative clauses.
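    To make the prediction/verification idea concrete, here is a minimal toy sketch (in Python, and emphatically not PLTAG itself: the flat categories and tiny lexicon are invented for illustration, whereas PLTAG predicts and verifies nodes of a connected tree). Some words trigger predictions of upcoming material, later words verify those predictions, and the lag between prediction and verification is the kind of quantity a linking theory can relate to reading times.

```python
# Toy illustration only: word-by-word processing in which some words
# predict upcoming categories and later words verify those predictions.
# Real PLTAG predicts tree nodes in a fully connected parse, not flat
# tags; this lexicon is invented for the example.
TOY_LEXICON = {
    # word: (its own category, categories it predicts downstream)
    "either": ("either", ["or"]),  # "either" predicts an upcoming "or"
    "or":     ("or",     []),
    "the":    ("DET",    ["N"]),   # a determiner predicts a noun
    "cat":    ("N",      []),
    "dog":    ("N",      []),
    "slept":  ("V",      []),
}

def process(sentence):
    pending = {}  # predicted category -> position where it was predicted
    for i, word in enumerate(sentence.split()):
        cat, predicts = TOY_LEXICON[word]
        if cat in pending:
            # Verification: the longer a prediction has been pending,
            # the costlier its verification (cf. memory decay).
            lag = i - pending.pop(cat)
            print(f"{word!r}: verifies predicted {cat} after {lag} word(s)")
        else:
            print(f"{word!r}: integrated as {cat}")
        for p in predicts:
            pending[p] = i  # prediction step

process("either the cat or the dog slept")
```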

    Broad-coverage model of prediction in human sentence processing

    The aim of this thesis is to design and implement a cognitively plausible theory of sentence processing which incorporates a mechanism for modeling a prediction and verification process in human language understanding, and to evaluate the validity of this model on specific psycholinguistic phenomena as well as on broad-coverage, naturally occurring text. Modeling prediction is a timely and relevant contribution to the field because recent experimental evidence suggests that humans predict upcoming structure or lexemes during sentence processing. However, none of the current sentence processing theories capture prediction explicitly. This thesis proposes a novel model of incremental sentence processing that offers an explicit prediction and verification mechanism. In evaluating the proposed model, this thesis also makes a methodological contribution. The design and evaluation of current sentence processing theories are usually based exclusively on experimental results from individual psycholinguistic experiments on specific linguistic structures. However, a theory of language processing in humans should not only work in an experimentally designed environment, but should also have explanatory power for naturally occurring language. This thesis first shows that the Dundee corpus, an eye-tracking corpus of newspaper text, constitutes a valuable additional resource for testing sentence processing theories. I demonstrate that a benchmark processing effect (the subject/object relative clause asymmetry) can be detected in this data set (Chapter 4). I then evaluate two existing theories of sentence processing, Surprisal and Dependency Locality Theory (DLT), on the full Dundee corpus. This constitutes the first broad-coverage comparison of sentence processing theories on naturalistic text. I find that both theories can explain some of the variance in the eye-movement data, and that they capture different aspects of sentence processing (Chapter 5). In Chapter 6, I propose a new theory of sentence processing, which explicitly models prediction and verification processes, and aims to unify the complementary aspects of Surprisal and DLT. The proposed theory implements key cognitive concepts such as incrementality, full connectedness, and memory decay. The underlying grammar formalism is a strictly incremental version of Tree-adjoining Grammar (TAG), Psycholinguistically motivated TAG (PLTAG), which is introduced in Chapter 7. I then describe how the Penn Treebank can be converted into PLTAG format and define an incremental, fully connected broad-coverage parsing algorithm with an associated probability model for PLTAG. Evaluation of the PLTAG model shows that it achieves the broad coverage required for testing a psycholinguistic theory on naturalistic data. On the standardized Penn Treebank test set, it approaches the performance of incremental TAG parsers without prediction (Chapter 8). Chapter 9 evaluates the psycholinguistic aspects of the proposed theory by testing it both on a selection of established sentence processing phenomena and on the Dundee eye-tracking corpus. The proposed theory can account for a larger range of psycholinguistic case studies than previous theories, and is a significant positive predictor of reading times on broad-coverage text. I show that it can explain a larger proportion of the variance in reading times than either DLT integration cost or Surprisal.
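    As a rough, self-contained illustration of the broad-coverage evaluation logic (all numbers below are invented; the thesis uses real model probabilities, the Dundee corpus, and far more careful regression modelling), Surprisal assigns each word the cost -log2 P(word | preceding context), and the test is whether that cost predicts reading times:

```python
import math

def surprisal(prob):
    """Surprisal in bits: -log2 P(word | preceding context)."""
    return -math.log2(prob)

# Invented (conditional probability, first-pass reading time in ms) pairs.
data = [(0.20, 210.0), (0.05, 265.0), (0.50, 180.0), (0.01, 330.0)]
xs = [surprisal(p) for p, _ in data]
ys = [rt for _, rt in data]

# Ordinary least squares with one predictor: rt = a + b * surprisal.
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
    sum((x - mx) ** 2 for x in xs)
a = my - b * mx
print(f"reading_time ~ {a:.1f} + {b:.1f} * surprisal_bits")
```

    A positive, significant slope b is what "significant positive predictor of reading times" amounts to in this setting, with DLT integration cost entering the same kind of regression as a competing or complementary predictor.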

    Semantic Role Labeling Improves Incremental Parsing


    Research in the Language, Information and Computation Laboratory of the University of Pennsylvania

    This report takes its name from the Computational Linguistics Feedback Forum (CLiFF), an informal discussion group for students and faculty. However, the scope of the research covered in this report is broader than the title might suggest; this is the yearly report of the LINC Lab, the Language, Information and Computation Laboratory of the University of Pennsylvania. It may at first be hard to see the threads that bind together the work presented here, work by faculty, graduate students and postdocs in the Computer Science and Linguistics Departments, and the Institute for Research in Cognitive Science. It includes prototypical Natural Language fields such as Combinatory Categorial Grammars, Tree Adjoining Grammars, syntactic parsing and the syntax-semantics interface; but it extends to statistical methods, plan inference, instruction understanding, intonation, causal reasoning, free word order languages, geometric reasoning, medical informatics, connectionism, and language acquisition. Naturally, this introduction cannot spell out all the connections between these abstracts; we invite you to explore them on your own. In fact, with this issue it’s easier than ever to do so: this document is accessible on the “information superhighway”. Just call up http://www.cis.upenn.edu/~cliff-group/94/cliffnotes.html. In addition, you can find many of the papers referenced in the CLiFF Notes on the net. Most can be obtained by following links from the authors’ abstracts in the web version of this report. The abstracts describe the researchers’ many areas of investigation, explain their shared concerns, and present some interesting work in Cognitive Science. We hope its new online format makes the CLiFF Notes a more useful and interesting guide to Computational Linguistics activity at Penn.

    Modelling Incremental Self-Repair Processing in Dialogue

    Self-repairs, where speakers repeat themselves, reformulate or restart what they are saying, are pervasive in human dialogue. These phenomena provide a window into real-time human language processing. For explanatory adequacy, a model of dialogue must include mechanisms that account for them. Artificial dialogue agents also need this capability for more natural interaction with human users. This thesis investigates the structure of self-repair and its function in the incremental construction of meaning in interaction. A corpus study shows how the range of self-repairs seen in dialogue cannot be accounted for by looking at surface form alone. More particularly, it analyses a string-alignment approach and shows how it is insufficient, provides requirements for a suitable model of incremental context, and gives an ontology of self-repair function. An information-theoretic model is developed which addresses these issues, along with a system that automatically detects self-repairs and edit terms on transcripts incrementally with minimal latency, achieving state-of-the-art results. Additionally, it is shown to have practical use in the psychiatric domain. The thesis goes on to present a dialogue model to interpret and generate repaired utterances incrementally. When processing repaired rather than fluent utterances, it achieves the same degree of incremental interpretation and incremental representation. Practical implementation methods are presented for an existing dialogue system. Finally, a more pragmatically oriented approach is presented to model self-repairs in a psycholinguistically plausible way. This is achieved through extending the dialogue model to include a probabilistic semantic framework to perform incremental inference in a reference resolution domain. The thesis concludes that at least as fine-grained a model of context as word-by-word is required for realistic models of self-repair, and that context must include linguistic action sequences and information update effects. The way dialogue participants process self-repairs to make inferences in real time, rather than filter out their disfluency effects, has been modelled formally and in practical systems.

    Funding: Engineering and Physical Sciences Research Council (EPSRC) Doctoral Training Account (DTA) scholarship from the School of Electronic Engineering and Computer Science at Queen Mary University of London.
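    As a concrete illustration of the surface string-alignment baseline that the corpus study argues is insufficient (the window size, edit-term list, and example utterance below are all invented), a detector that flags edit terms and recent-word repeats word-by-word catches the restart repair but also misfires on fluent repetition:

```python
# Toy baseline: flag edit terms and "rough copy" repeats incrementally.
EDIT_TERMS = {"uh", "um", "er"}  # illustrative list

def detect(words, window=6):
    for i, w in enumerate(words):
        recent = words[max(0, i - window):i]
        if w in EDIT_TERMS:
            yield i, w, "edit term"
        elif w in recent:  # crude surface-alignment repeat test
            yield i, w, "repeat"

utterance = "i want to uh i want to go to the shop".split()
for i, w, kind in detect(utterance):
    print(f"word {i} ({w!r}): {kind}")
# The restart ("i want to ... i want to") is flagged, but so is the
# fluent second "to" in "to go to": a false positive that shows why
# surface form alone cannot delimit self-repairs.
```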