Between anaphora and deixis...the resolution of the demonstrative noun-phrase ‘that N’
Three experiments examined the hypothesis that the demonstrative noun phrase (NP) that N, as an anadeictic expression, preferentially refers to the less salient referent in a discourse representation when used anaphorically, whereas the anaphoric pronoun he or she preferentially refers to the highly focused referent. The findings, from a sentence completion task and two reading time experiments that used gender to create ambiguous and unambiguous coreference, reveal that the demonstrative NP specifically orients processing toward a less salient referent when there is no gender cue discriminating between different possible referents. These findings show the importance of taking into account the discourse function of the anaphor itself and its influence on the process of searching for the referent.
A Corpus-Based Investigation of Definite Description Use
We present the results of a study of definite description use in written
texts aimed at assessing the feasibility of annotating corpora with information
about definite description interpretation. We ran two experiments, in which
subjects were asked to classify the uses of definite descriptions in a corpus
of 33 newspaper articles, containing a total of 1412 definite descriptions. We
measured the agreement among annotators about the classes assigned to definite
descriptions, as well as the agreement about the antecedent assigned to those
definites that the annotators classified as being related to an antecedent in
the text. The most interesting result of this study from a corpus annotation
perspective was the rather low agreement (K=0.63) that we obtained using
versions of Hawkins' and Prince's classification schemes; better results
(K=0.76) were obtained using the simplified scheme proposed by Fraurud that
includes only two classes, first-mention and subsequent-mention. The agreement
about antecedents was also not complete. These findings raise questions
concerning the strategy of evaluating systems for definite description
interpretation by comparing their results with a standardized annotation. From
a linguistic point of view, the most interesting observations were the great
number of discourse-new definites in our corpus (in one of our experiments,
about 50% of the definites in the collection were classified as discourse-new,
30% as anaphoric, and 18% as associative/bridging) and the presence of
definites which did not seem to require a complete disambiguation.
Comment: 47 pages, uses fullname.sty and palatino.st
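As a rough illustration of the agreement figures quoted in this abstract (K=0.63 and K=0.76), the sketch below computes a kappa statistic for two annotators. The abstract does not say which variant of K was used, so the two-annotator Cohen formulation, the function name `cohen_kappa`, and the toy labels are all assumptions made for illustration; they are not the study's data or method.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement expected from each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels in a two-class scheme (first vs. subsequent mention).
annotator_1 = ["first", "first", "subsequent", "subsequent", "first",
               "subsequent", "first", "first", "subsequent", "first"]
annotator_2 = ["first", "first", "subsequent", "first", "first",
               "subsequent", "first", "subsequent", "subsequent", "first"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

Kappa discounts the agreement that two annotators would reach by chance, which is why a scheme with fewer, easier classes can score higher than a finer-grained one even when raw agreement looks similar.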
Enhancement and suppression effects resulting from information structuring in sentences
Information structuring through the use of cleft sentences increases the processing efficiency of references to elements within the scope of focus. Furthermore, there is evidence that putting certain types of emphasis on individual words not only enhances their subsequent processing, but also protects these words from becoming suppressed in the wake of subsequent information, suggesting mechanisms of enhancement and suppression. In Experiment 1, we showed that clefted constructions facilitate the integration of subsequent sentences that make reference to elements within the scope of focus, and that they decrease processing efficiency for references to elements outside the scope of focus. In Experiment 2, using an auditory text-change-detection paradigm, we showed that focus has similar effects on the strength of memory representations. These results add to the evidence for enhancement and suppression as mechanisms of sentence processing and clarify that the effects occur within sentences having a marked focus structure.
A Mention-Ranking Model for Abstract Anaphora Resolution
Resolving abstract anaphora is an important but difficult task for text
understanding. Yet, with recent advances in representation learning this task
becomes a more tangible aim. A central property of abstract anaphora is that it
establishes a relation between the anaphor embedded in the anaphoric sentence
and its (typically non-nominal) antecedent. We propose a mention-ranking model
that learns how abstract anaphors relate to their antecedents with an
LSTM-Siamese Net. We overcome the lack of training data by generating
artificial anaphoric sentence--antecedent pairs. Our model outperforms
state-of-the-art results on shell noun resolution. We also report first
benchmark results on an abstract anaphora subset of the ARRAU corpus. This
corpus presents a greater challenge due to a mixture of nominal and pronominal
anaphors and a greater range of confounders. We found model variants that
outperform the baselines for nominal anaphors, without training on individual
anaphor data, but still lag behind for pronominal anaphors. Our model selects
syntactically plausible candidates and -- if disregarding syntax --
discriminates candidates using deeper features.
Comment: In Proceedings of the 2017 Conference on Empirical Methods in Natural
Language Processing (EMNLP). Copenhagen, Denmark
Codeco: A Grammar Notation for Controlled Natural Language in Predictive Editors
Existing grammar frameworks do not work out particularly well for controlled
natural languages (CNL), especially if they are to be used in predictive
editors. I introduce in this paper a new grammar notation, called Codeco, which
is designed specifically for CNLs and predictive editors. Two different parsers
have been implemented and a large subset of Attempto Controlled English (ACE)
has been represented in Codeco. The results show that Codeco is practical,
adequate and efficient.
Temporal expression normalisation in natural language texts
Automatic annotation of temporal expressions is a research challenge of great
interest in the field of information extraction. In this report, I describe a
novel rule-based architecture, built on top of a pre-existing system, which is
able to normalise temporal expressions detected in English texts. Gold standard
temporally-annotated resources are limited in size and this makes research
difficult. The proposed system outperforms the state-of-the-art systems on
the TempEval-2 Shared Task (value attribute) and achieves substantially
better results with respect to the pre-existing system on top of which it has
been developed. I will also introduce a new free corpus consisting of 2822
unique annotated temporal expressions. Both the corpus and the system are
freely available on-line.
Comment: 7 pages, 1 figure, 5 tables