618 research outputs found
An Incremental Model of Anaphora and Reference Resolution Based on Resource Situations
Notwithstanding conclusive psychological and corpus evidence that at least some aspects of anaphoric and referential interpretation take place incrementally, and the existence of some computational models of incremental reference resolution, many aspects of the linguistics of incremental reference interpretation still have to be better understood. We propose a model of incremental reference interpretation based on Loebner’s theory of definiteness and on the theory of anaphoric accessibility via resource situations developed in Situation Semantics, and show how this model can account for a variety of psychological results about incremental reference interpretation
Introduction to the Special Issue on Incremental Processing in Dialogue
A brief introduction to the topics discussed in the special issue, and to the individual papers
Introduction to the Special Issue
A brief introduction to the topics discussed in the special issue, and to the individual papers
Resolving pronominal anaphora using commonsense knowledge
Coreference resolution is the task of resolving all expressions in a text that refer to the same entity. Such expressions are often used in writing and speech as shortcuts to avoid repetition. The most frequent form of coreference is the anaphor. To resolve anaphora not only grammatical and syntactical strategies are required, but also semantic approaches should be taken into consideration. This dissertation presents a framework for automatically resolving pronominal anaphora by integrating recent findings from the field of linguistics with new semantic features. Commonsense knowledge is the routine knowledge people have of the everyday world. Because such knowledge is widely used it is frequently omitted from social communications such as texts. It is understandable that without this knowledge computers will have difficulty making sense of textual information. In this dissertation a new set of computational and linguistic features are used in a supervised learning approach to resolve the pronominal anaphora in document. Commonsense knowledge sources such as ConceptNet and WordNet are used and similarity measures are extracted to uncover the elaborative information embedded in the words that can help in the process of anaphora resolution. The anaphoric system is tested on 350 Wall Street Journal articles from the BBN corpus. When compared with other systems available such as BART (Versley et al. 2008) and Charniak and Elsner 2009, our system performed better and also resolved a much wider range of anaphora. We were able to achieve a 92% F-measure on the BBN corpus and an average of 85% F-measure when tested on other genres of documents such as children stories and short stories selected from the web
Centering, Anaphora Resolution, and Discourse Structure
Centering was formulated as a model of the relationship between attentional
state, the form of referring expressions, and the coherence of an utterance
within a discourse segment (Grosz, Joshi and Weinstein, 1986; Grosz, Joshi and
Weinstein, 1995). In this chapter, I argue that the restriction of centering to
operating within a discourse segment should be abandoned in order to integrate
centering with a model of global discourse structure. The within-segment
restriction causes three problems. The first problem is that centers are often
continued over discourse segment boundaries with pronominal referring
expressions whose form is identical to those that occur within a discourse
segment. The second problem is that recent work has shown that listeners
perceive segment boundaries at various levels of granularity. If centering
models a universal processing phenomenon, it is implausible that each listener
is using a different centering algorithm.The third issue is that even for
utterances within a discourse segment, there are strong contrasts between
utterances whose adjacent utterance within a segment is hierarchically recent
and those whose adjacent utterance within a segment is linearly recent. This
chapter argues that these problems can be eliminated by replacing Grosz and
Sidner's stack model of attentional state with an alternate model, the cache
model. I show how the cache model is easily integrated with the centering
algorithm, and provide several types of data from naturally occurring
discourses that support the proposed integrated model. Future work should
provide additional support for these claims with an examination of a larger
corpus of naturally occurring discourses.Comment: 35 pages, uses elsart12, lingmacros, named, psfi
Semantic Ambiguity and Perceived Ambiguity
I explore some of the issues that arise when trying to establish a connection
between the underspecification hypothesis pursued in the NLP literature and
work on ambiguity in semantics and in the psychological literature. A theory of
underspecification is developed `from the first principles', i.e., starting
from a definition of what it means for a sentence to be semantically ambiguous
and from what we know about the way humans deal with ambiguity. An
underspecified language is specified as the translation language of a grammar
covering sentences that display three classes of semantic ambiguity: lexical
ambiguity, scopal ambiguity, and referential ambiguity. The expressions of this
language denote sets of senses. A formalization of defeasible reasoning with
underspecified representations is presented, based on Default Logic. Some
issues to be confronted by such a formalization are discussed.Comment: Latex, 47 pages. Uses tree-dvips.sty, lingmacros.sty, fullname.st
Gestures Indicating Dialogue Structure
Rieser H. Gestures Indicating Dialogue Structure. In: Proceedings of SEMdial 2011, 15th Workshop on the Semantics and Pragmatics of Dialogue. In Press
Inter-Coder Agreement for Computational Linguistics
This article is a survey of methods for measuring agreement among corpus annotators. It exposes the mathematics and underlying assumptions of agreement coefficients, covering Krippendorff's alpha as well as Scott's pi and Cohen's kappa; discusses the use of coefficients in several annotation tasks; and argues that weighted, alpha-like coefficients, traditionally less used than kappa-like measures in computational linguistics, may be more appropriate for many corpus annotation tasks—but that their use makes the interpretation of the value of the coefficient even harder. </jats:p
- …