240 research outputs found

    Long Distance Pronominalisation and Global Focus

    Our corpus of descriptive text contains a significant number of long-distance pronominal references (8.4% of the total). In order to account for how these pronouns are interpreted, we re-examine Grosz and Sidner’s theory of the attentional state, and in particular the use of the global focus to supplement centering theory. Our corpus evidence concerning these long-distance pronominal references, as well as studies of the use of descriptions, proper names and ambiguous uses of pronouns, leads us to conclude that a discourse focus stack mechanism of the type proposed by Sidner is essential to account for the use of these referring expressions. We suggest revising the Grosz & Sidner framework by allowing for the possibility that an entity in a focus space may have special status.

    Towards Entity Status

    Discourse entities are an important construct in computational linguistics. They introduce an additional level of representation between referring expressions and that which they refer to: the level of mental representation. In this thesis, I first explore some semiotic and communication-theoretic aspects of discourse entities. Then, I develop the concept of "entity status". Entity status is a meta-variable that combines two dimensions of information about a discourse entity: structural information about the role that an entity plays in a discourse, and management information about how the entity is created, accessed, and updated. Finally, the concept is applied to two case studies: the first one focusses on the choice of referring expressions in radio news, while the second looks at the conditions under which a discourse entity can be mentioned as a pronoun.

    Coreference in dialogue

    Since the early days of discourse analysis coreference has always been considered a major factor in the formation of texts and dialogues. The repetition of nominal elements and the anaphoric use of pronouns in successive sentences is a fundamental cohesive pattern which ties sentences together and contributes to the coherence of sequences. "La coherence transphrastique trouve dans la pronominalisation un des procedes les plus efficaces" (Stati 1990, 160). The basic structural pattern on which linguists focused their interest in the early 1970s is captured by the following examples: (1) A man entered the house. After closing the door, the man sat down. He was tired. (2) Peter The man entered the house. He was tired. He ..

    Processing at the syntax-discourse interface in second language acquisition

    The Interface Hypothesis (Sorace and Filiaci, 2006) conjectures that adult second language learners (L2 learners) who have reached near-native levels of proficiency in their second language exhibit difficulties at the interface between syntax and other cognitive domains, most notably at the syntax-discourse interface. However, research in this area was limited, in that the data were offline, and thus unable to provide evidence for the nature of the deficit shown by L2 learners. This thesis presents online data which address the question of the underlying nature of the difficulties observed in L2 learners at the syntax-discourse interface. This thesis has extended work on the syntax-discourse interface in L2 learners by investigating the acquisition of two phenomena at the syntax-discourse interface in German: the role of word order and pronominalization with respect to information structure (Experiments 1-3), and the antecedent preferences of anaphoric demonstrative pronouns (the der, die, das series, homophonous with the definite article) and personal pronouns (the er, sie, es series) (Experiments 4-8). Crucially, this work has used an on-line methodology, the visual-world paradigm, which allows an insight into the incremental interpretation of interface phenomena in real-time processing. The data from these experiments show that L2 learners have difficulty integrating different sources of information efficiently in real-time comprehension, supporting the Interface Hypothesis. However, the nature of the processing difficulties which L2 learners demonstrate in on-line processing was not determined by these studies, resulting in the question: are L2 learners’ difficulties a result of a limitation of processing resources, or of the inability to deploy those resources effectively? A novel dual-task experiment (Experiment 9), in which native speakers of German were placed under processing load, simulated the results previously obtained for L2 learners.
It is concluded that syntactic dependencies were constrained by resource limitation, whereas discourse based dependencies were constrained by processing resource allocation

    Acquisition of motion events in L2 Spanish by German, French and Italian speakers

    This article explores the second language acquisition of motion events, with particular regard to cross-linguistic influence between first and second languages. Oral narratives in Spanish as a second language by native speakers of French, German and Italian are compared, together with narratives by native Spanish speakers. Previous analysis on the expression of motion events in these languages showed that Romance languages do not always follow the same pattern; for example, Italian tends to express the component of Path more frequently than French and Spanish. The results of the present study highlight evidence of intra-typological differences, even between languages that are genetically very close. These differences seem to lead speakers to produce cases of conceptual transfer into their second language, Spanish, even when their first language is another Romance language

    Quantitative register analysis across languages


    Language as an instrument of thought

    I show that there are good arguments, and evidence to boot, that support the language as an instrument of thought hypothesis. The underlying mechanisms of language, comprising expressions structured hierarchically and recursively, provide a perspective (in the form of a conceptual structure) on the world, for it is only via language that certain perspectives are available to us and to our thought processes. These mechanisms provide us with a uniquely human way of thinking and talking about the world that is different to the sort of thinking we share with other animals. If the primary function of language were communication, then one would expect the underlying mechanisms of language to be structured in a way that favours successful communication. I show that not only is this not the case, but that the underlying mechanisms of language are in fact structured in a way that maximises computational efficiency, even if it means causing communicative problems. Moreover, I discuss comparative, neuropathological, developmental, and neuroscientific evidence that supports the claim that language is an instrument of thought.

    Entity Coherence for Descriptive Text Structuring

    Institute for Communicating and Collaborative Systems
    Although entity coherence, i.e. the coherence that arises from certain patterns of references to entities, is of attested importance for characterising a descriptive text structure, whether and how current formal models of entity coherence such as Centering Theory can be used for the purposes of natural language generation remains unclear. This thesis investigates this issue and sets out to explore which of the many formulations of Centering best suits text structuring. In doing this, we assume text structuring to be a search task where different orderings of propositions are evaluated according to scores assigned by a metric. The main question behind this study is how to choose a metric of entity coherence among many alternatives as the only guidance to the text structuring component of a system that produces descriptions of objects. Different ways of defining metrics of entity coherence using Centering’s notions are discussed and a general corpus-based methodology is introduced to identify which of these metrics constitute the most promising candidates for search-based text structuring before the actual generation of the descriptive structure takes place. The performance of a large set of metrics is estimated empirically in a series of computational experiments using two kinds of data: (i) a reliably annotated corpus representing the genre of interest and (ii) data derived from an existing natural language generation system and ordered according to the instructions of a domain expert. A final experiment supplements our main methodology by automatically evaluating the best scoring orderings of some of the best performing metrics in comparison to an upper bound defined by orderings produced by multiple experts on additional application-specific data and a lower bound defined by a random baseline.
The main findings are summarised as follows: In general, the simplest metric of entity coherence constitutes a very robust baseline for both datasets. However, when the metrics are modified according to an additional constraint on entity coherence, then the baseline is beaten in domain (ii). The employed modification is supported by the subsidiary evaluation which renders all employed metrics superior to the random baseline and helps identify the metric which overall constitutes the most suitable candidate (among the ones investigated) for search-based descriptive text structuring in domain (ii). This thesis provides substantial insight into the role of entity coherence as a descriptive text structuring constraint. Viewing Centering from an NLG perspective raises a series of interesting challenges that the thesis identifies and attempts to investigate to a certain extent. The general evaluation methodology and the results of the empirical studies are useful for any subsequent attempt to generate a descriptive text structure in the context of an application that makes use of the notion of entity coherence as modelled by Centering
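The search-based view of text structuring described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the thesis's own implementation: propositions are modelled as sets of entity names, the hypothetical metric `count_nocb` simply counts adjacent propositions that share no entity (a stand-in for a "simplest" entity coherence metric), and the search is a brute-force pass over all permutations.

```python
from itertools import permutations


def count_nocb(ordering):
    """Count adjacent pairs of propositions that share no entity.

    Each proposition is a frozenset of entity names (an assumed
    representation); a lower count means a more coherent ordering.
    """
    return sum(1 for a, b in zip(ordering, ordering[1:]) if not (a & b))


def best_orderings(propositions):
    """Score every permutation and return (best score, best orderings).

    Exhaustive search is only feasible for a handful of propositions;
    it is used here purely to keep the sketch short.
    """
    scored = [(count_nocb(p), p) for p in permutations(propositions)]
    best = min(score for score, _ in scored)
    return best, [p for score, p in scored if score == best]
```

With three hypothetical propositions mentioning {amphora, museum}, {amphora, painter} and {painter, athens}, the orderings that keep an entity in common between neighbours receive the lowest score, while any ordering that places the first and third propositions side by side is penalised once.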

    A discourse structural approach to anaphora in Chinese.

    SIGLE. Available from British Library Document Supply Centre (BLDSC), DSC:DX186069. GB (United Kingdom)