59 research outputs found

    Japanese Discourse and the Process of Centering

    This paper has two aims: (1) to generalize a computational account of discourse processing called CENTERING and apply it to discourse processing in Japanese, and (2) to provide some insights on the effect of syntactic factors in Japanese on discourse interpretation. We argue that while discourse interpretation is an inferential process, syntactic cues constrain this process, and we demonstrate this argument with respect to the interpretation of ZEROS, unexpressed arguments of the verb, in Japanese. The syntactic cues in Japanese discourse that we investigate are the morphological markers for grammatical TOPIC, the post-position wa, as well as those for grammatical functions such as SUBJECT, ga, OBJECT, o and OBJECT2, ni. In addition, we investigate the role of speakers' EMPATHY, which is the perspective from which an event is described. This is morphologically indicated through the use of verbal compounding, i.e. the auxiliary use of verbs such as kureta, kita. Our results are based on a survey of native speakers' interpretations of short discourses, consisting of minimal pairs varied by one of the above factors. We demonstrate that these syntactic cues do indeed affect the interpretation of ZEROS, but that having previously been the TOPIC and being realized as a ZERO also contribute to an entity being interpreted as the TOPIC. We propose a new notion of TOPIC AMBIGUITY, and show that CENTERING provides constraints on when a ZERO can be interpreted as the TOPIC.

    Entity Coherence in Comparable Learner Corpora: Seeking Pedagogical Insights


    Third person pronoun forms in Estonian in the light of centering theory

    This paper explains the distinctions between the Estonian 3rd person overt pronoun and the zero person marker in spoken narratives. As both forms express the most salient entities in discourse, the saliency criterion cannot distinguish them. Centering Theory is used to explore whether the overt pronoun and zero have different effects on discourse coherence, i.e. whether there is a difference between the transition types relating to zero and those signaling the overt pronoun. Additionally, factors affecting the choice of pronominal forms, such as grammatical role, case and clause type, are studied to supplement results from the Centering analysis. It is hypothesized that the use of the zero form connects to the CONTINUE transition, while the overt pronoun combines with other Centering-based transition types as well. Furthermore, results show that the zero form is more restricted in its usage contexts and signals mainly nominative subjects in main clauses, while the overt form can appear more widely in different linguistic environments.

    Events, States and Times

    This monograph investigates the temporal interpretation of narrative discourse in two parts. The theme of the first part is narrative progression. It begins with a case study of the adverb ‘now’ and its interaction with the meaning of tense. The case study motivates an ontological distinction between events, states and times and proposes that ‘now’ seeks a prominent state that holds throughout the time described by the tense. Building on prior research, prominence is shown to be influenced by principles of discourse coherence, and two coherence principles, NARRATION and RESULT, are given a formally explicit characterization. The key innovation is a new method for testing the definitional adequacy of NARRATION and RESULT, namely by an abductive argument. This contribution opens a new way of thinking about how eventive and stative descriptions contribute to the perceived narrative progression in a discourse. The theme of the second part of the monograph is the semantics and pragmatics of tense. A key innovation is that the present and past tenses are treated as scalar alternatives, a view that is motivated by adopting a particular hypothesis concerning stative predication. The proposed analysis accounts for tense in both matrix clauses and in complements of propositional attitudes, where the notorious double access reading arises. This reading is explored as part of a corpus study that provides a glimpse of how tense semantics interacts with Gricean principles and at-issueness. Several cross-linguistic predictions of the analysis are considered, including their consequences for the Sequence of Tense phenomenon and the Upper Limit Constraint. Finally, a hypothesis is provided about how tense meanings compose with temporal adverbs and verb phrases. Two influential analyses of viewpoint aspect are then compared in light of the hypothesis.

    Towards Entity Status

    Discourse entities are an important construct in computational linguistics. They introduce an additional level of representation between referring expressions and that which they refer to: the level of mental representation. In this thesis, I first explore some semiotic and communication-theoretic aspects of discourse entities. Then, I develop the concept of "entity status". Entity status is a meta-variable that collects two dimensions of information about a discourse entity: structural information about the role that an entity plays in a discourse, and management information about how the entity is created, accessed, and updated. Finally, the concept is applied to two case studies: the first one focusses on the choice of referring expressions in radio news, while the second looks at the conditions under which a discourse entity can be mentioned as a pronoun.

    Entity Coherence for Descriptive Text Structuring

    Institute for Communicating and Collaborative Systems
    Although entity coherence, i.e. the coherence that arises from certain patterns of references to entities, is of attested importance for characterising a descriptive text structure, whether and how current formal models of entity coherence such as Centering Theory can be used for the purposes of natural language generation remains unclear. This thesis investigates this issue and sets out to explore which of the many formulations of Centering best suits text structuring. In doing this, we assume text structuring to be a search task where different orderings of propositions are evaluated according to scores assigned by a metric. The main question behind this study is how to choose a metric of entity coherence among many alternatives as the only guidance to the text structuring component of a system that produces descriptions of objects. Different ways of defining metrics of entity coherence using Centering’s notions are discussed and a general corpus-based methodology is introduced to identify which of these metrics constitute the most promising candidates for search-based text structuring before the actual generation of the descriptive structure takes place. The performance of a large set of metrics is estimated empirically in a series of computational experiments using two kinds of data: (i) a reliably annotated corpus representing the genre of interest and (ii) data derived from an existing natural language generation system and ordered according to the instructions of a domain expert. A final experiment supplements our main methodology by automatically evaluating the best scoring orderings of some of the best performing metrics in comparison to an upper bound defined by orderings produced by multiple experts on additional application-specific data and a lower bound defined by a random baseline.
    The main findings are summarised as follows: In general, the simplest metric of entity coherence constitutes a very robust baseline for both datasets. However, when the metrics are modified according to an additional constraint on entity coherence, then the baseline is beaten in domain (ii). The employed modification is supported by the subsidiary evaluation which renders all employed metrics superior to the random baseline and helps identify the metric which overall constitutes the most suitable candidate (among the ones investigated) for search-based descriptive text structuring in domain (ii). This thesis provides substantial insight into the role of entity coherence as a descriptive text structuring constraint. Viewing Centering from an NLG perspective raises a series of interesting challenges that the thesis identifies and attempts to investigate to a certain extent. The general evaluation methodology and the results of the empirical studies are useful for any subsequent attempt to generate a descriptive text structure in the context of an application that makes use of the notion of entity coherence as modelled by Centering.

    CLiFF Notes: Research In Natural Language Processing at the University of Pennsylvania

    The Computational Linguistics Feedback Forum (CLiFF) is a group of students and faculty who gather once a week to discuss the members' current research. As the word feedback suggests, the group's purpose is the sharing of ideas. The group also promotes interdisciplinary contacts between researchers who share an interest in Cognitive Science. There is no single theme describing the research in Natural Language Processing at Penn. There is work done in CCG, tree adjoining grammars, intonation, statistical methods, plan inference, instruction understanding, incremental interpretation, language acquisition, syntactic parsing, causal reasoning, free word order languages, ... and many other areas. With this in mind, rather than trying to summarize the varied work currently underway here at Penn, we suggest reading the following abstracts to see how the students and faculty themselves describe their work. Their abstracts illustrate the diversity of interests among the researchers, explain the areas of common interest, and describe some very interesting work in Cognitive Science. This report is a collection of abstracts from both faculty and graduate students in Computer Science, Psychology and Linguistics. We pride ourselves on the close working relations between these groups, as we believe that the communication among the different departments and the ongoing inter-departmental research not only improves the quality of our work, but makes much of that work possible.