240 research outputs found
Long Distance Pronominalisation and Global Focus
Our corpus of descriptive text contains a significant number of long-distance pronominal references (8.4% of the total). In order to account for how these pronouns are interpreted, we re-examine Grosz and Sidner's theory of the attentional state, and in particular the use of the global focus to supplement centering theory. Our corpus evidence concerning these long-distance pronominal references, as well as studies of the use of descriptions, proper names and ambiguous uses of pronouns, leads us to conclude that a discourse focus stack mechanism of the type proposed by Sidner is essential to account for the use of these referring expressions. We suggest revising the Grosz & Sidner framework by allowing for the possibility that an entity in a focus space may have special status
Towards Entity Status
Discourse entities are an important construct in computational linguistics. They introduce an additional level of representation between referring expressions and that which they refer to: the level of mental representation. In this thesis, I first explore some semiotic and communication theoretic aspects of discourse entities. Then, I develop the concept of "entity status". Entity status is a meta-variable that combines two dimensions of information about a discourse entity: structural information about the role that an entity plays in a discourse, and management information about how the entity is created, accessed, and updated. Finally, the concept is applied to two case studies: the first one focusses on the choice of referring expressions in radio news, while the second looks at the conditions under which a discourse entity can be mentioned as a pronoun
Coreference in dialogue
Since the early days of discourse analysis, coreference has always been considered a major factor in the formation of texts and dialogues. The repetition of nominal elements and the anaphoric use of pronouns in successive sentences is a fundamental cohesive pattern which ties sentences together and contributes to the coherence of sequences. "Transphrastic coherence finds in pronominalisation one of its most effective devices" (Stati 1990, 160). The basic structural pattern on which linguists focused their interest in the early 1970s is captured by the following examples: (1) A man entered the house. After closing the door, the man sat down. He was tired. (2) Peter / The man entered the house. He was tired. He ..
Processing at the syntax-discourse interface in second language acquisition
The Interface Hypothesis (Sorace and Filiaci, 2006) conjectures that adult second
language learners (L2 learners) who have reached near-native levels of
proficiency in their second language exhibit difficulties at the interface between
syntax and other cognitive domains, most notably at the syntax-discourse
interface. However, research in this area was limited, in that the data were offline,
and thus unable to provide evidence for the nature of the deficit shown
by L2 learners. This thesis presents online data which address the question of
the underlying nature of the difficulties observed in L2 learners at the syntax-discourse
interface.
This thesis has extended work on the syntax-discourse interface in L2 learners
by investigating the acquisition of two phenomena at the syntax-discourse interface
in German: the role of word order and pronominalization with respect
to information structure (Experiments 1-3), and the antecedent preferences
of anaphoric demonstrative (the der, die, das series homophonous with the
definite article) and personal pronouns (the er, sie, es series) (Experiments 4-
8). Crucially, this work has used an on-line methodology, the visual-world
paradigm, which allows an insight into the incremental interpretation of interface
phenomena in real-time processing. The data from these experiments
show that L2 learners have difficulty integrating different sources of information
in real-time comprehension efficiently, supporting the Interface Hypothesis.
However, the nature of the processing difficulties which L2 learners demonstrate
in on-line processing was not determined by these studies, resulting in
the question: are L2 learners' difficulties a result of a limitation of processing resources, or the inability to deploy those resources effectively? A novel dual-task
experiment (Experiment 9), in which native speakers of German were
placed under processing load, simulated the results previously obtained for
L2 learners. It is concluded that syntactic dependencies were constrained by
resource limitation, whereas discourse based dependencies were constrained
by processing resource allocation
Acquisition of motion events in L2 Spanish by German, French and Italian speakers
This article explores the second language acquisition of motion events, with particular regard to cross-linguistic influence between first and second languages. Oral narratives in Spanish as a second language by native speakers of French, German and Italian are compared, together with narratives by native Spanish speakers. Previous analysis on the expression of motion events in these languages showed that Romance languages do not always follow the same pattern; for example, Italian tends to express the component of Path more frequently than French and Spanish. The results of the present study highlight evidence of intra-typological differences, even between languages that are genetically very close. These differences seem to lead speakers to produce cases of conceptual transfer into their second language, Spanish, even when their first language is another Romance language
Language as an instrument of thought
I show that there are good arguments and evidence to boot that support the language as an instrument of thought hypothesis. The underlying mechanisms of language, comprising expressions structured hierarchically and recursively, provide a perspective (in the form of a conceptual structure) on the world, for it is only via language that certain perspectives are available to us and to our thought processes. These mechanisms provide us with a uniquely human way of thinking and talking about the world that is different to the sort of thinking we share with other animals. If the primary function of language were communication then one would expect that the underlying mechanisms of language would be structured in a way that favours successful communication. I show that not only is this not the case, but that the underlying mechanisms of language are in fact structured in a way to maximise computational efficiency, even if it means causing communicative problems. Moreover, I discuss comparative, neuropathological, developmental, and neuroscientific evidence that supports the claim that language is an instrument of thought
Entity Coherence for Descriptive Text Structuring
Institute for Communicating and Collaborative Systems
Although entity coherence, i.e. the coherence that arises from certain patterns of references to
entities, is of attested importance for characterising a descriptive text structure, whether and how current formal models of entity coherence such as Centering Theory can be used for the purposes of natural language generation remains unclear. This thesis investigates this issue and sets out to explore which of the many formulations of Centering best suits text structuring. In doing this, we assume text
structuring to be a search task where different orderings of propositions are evaluated according to scores assigned by a metric.
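The search formulation described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: propositions are modelled as sets of entity names, the hypothetical metric `count_no_shared` simply counts adjacent pairs of propositions that share no entity (lower is better), and the search is an exhaustive enumeration of orderings, which is only feasible for very small inputs.

```python
from itertools import permutations


def count_no_shared(ordering):
    """Hypothetical metric: number of adjacent proposition pairs
    that have no entity in common (lower means more coherent)."""
    return sum(1 for a, b in zip(ordering, ordering[1:]) if not (a & b))


def best_orderings(propositions, metric=count_no_shared):
    """Score every ordering of the propositions and return the best
    score together with all orderings that achieve it."""
    scored = [(metric(p), p) for p in permutations(propositions)]
    best = min(score for score, _ in scored)
    return best, [p for score, p in scored if score == best]


# Illustrative data only: each proposition is the set of entities it mentions.
facts = [
    frozenset({"vase", "museum"}),
    frozenset({"vase", "painter"}),
    frozenset({"painter", "athens"}),
]

score, best = best_orderings(facts)
for ordering in best:
    print(score, [sorted(p) for p in ordering])
```

With these three propositions, only the orderings in which each proposition shares an entity with its neighbour reach the lowest score. A different metric can be swapped in through the `metric` parameter without changing the search.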
The main question behind this study is how to choose a metric of entity coherence among many
alternatives as the only guidance to the text structuring component of a system that produces descriptions of objects. Different ways of defining metrics of entity coherence using Centering's notions are discussed and a general corpus-based methodology is introduced to identify which of these metrics constitute the most promising candidates for search-based text structuring before the actual generation
of the descriptive structure takes place.
The performance of a large set of metrics is estimated empirically in a series of computational
experiments using two kinds of data: (i) a reliably annotated corpus representing the genre of interest and (ii) data derived from an existing natural language generation system and ordered according to the instructions of a domain expert.
A final experiment supplements our main methodology by automatically evaluating the best scoring orderings of some of the best performing metrics in comparison to an upper bound defined by orderings produced by multiple experts on additional application-specific data and a lower bound defined by a random baseline.
The main findings are summarised as follows: In general, the simplest metric of entity coherence
constitutes a very robust baseline for both datasets. However, when the metrics are modified
according to an additional constraint on entity coherence, then the baseline is beaten in domain (ii).
The employed modification is supported by the subsidiary evaluation which renders all employed
metrics superior to the random baseline and helps identify the metric which overall constitutes the
most suitable candidate (among the ones investigated) for search-based descriptive text structuring in
domain (ii).
This thesis provides substantial insight into the role of entity coherence as a descriptive text structuring
constraint. Viewing Centering from an NLG perspective raises a series of interesting challenges
that the thesis identifies and attempts to investigate to a certain extent. The general evaluation methodology
and the results of the empirical studies are useful for any subsequent attempt to generate a descriptive
text structure in the context of an application that makes use of the notion of entity coherence
as modelled by Centering
A discourse structural approach to anaphora in Chinese.
SIGLE. Available from British Library Document Supply Centre, DSC:DX186069 / BLDSC - British Library Document Supply Centre. GB, United Kingdom