
    Between anaphora and deixis...the resolution of the demonstrative noun-phrase ‘that N’

    Three experiments examined the hypothesis that the demonstrative noun phrase (NP) that N, as an anadeictic expression, preferentially refers to the less salient referent in a discourse representation when used anaphorically, whereas the anaphoric pronoun he or she preferentially refers to the highly focused referent. The findings, from a sentence completion task and two reading time experiments that used gender to create ambiguous and unambiguous coreference, reveal that the demonstrative NP specifically orients processing toward a less salient referent when there is no gender cue discriminating between the possible referents. These findings show the importance of taking into account the discourse function of the anaphor itself and its influence on the search for the referent.

    A Corpus-Based Investigation of Definite Description Use

    We present the results of a study of definite description use in written texts aimed at assessing the feasibility of annotating corpora with information about definite description interpretation. We ran two experiments, in which subjects were asked to classify the uses of definite descriptions in a corpus of 33 newspaper articles, containing a total of 1412 definite descriptions. We measured the agreement among annotators about the classes assigned to definite descriptions, as well as the agreement about the antecedent assigned to those definites that the annotators classified as being related to an antecedent in the text. The most interesting result of this study from a corpus annotation perspective was the rather low agreement (K=0.63) that we obtained using versions of Hawkins' and Prince's classification schemes; better results (K=0.76) were obtained using the simplified scheme proposed by Fraurud that includes only two classes, first-mention and subsequent-mention. The agreement about antecedents was also not complete. These findings raise questions concerning the strategy of evaluating systems for definite description interpretation by comparing their results with a standardized annotation. From a linguistic point of view, the most interesting observations were the great number of discourse-new definites in our corpus (in one of our experiments, about 50% of the definites in the collection were classified as discourse-new, 30% as anaphoric, and 18% as associative/bridging) and the presence of definites which did not seem to require a complete disambiguation. Comment: 47 pages, uses fullname.sty and palatino.sty
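The K values quoted in this abstract are chance-corrected agreement scores. The abstract does not say which variant of the kappa statistic was used, so the following is only a minimal sketch of the two-annotator form (Cohen's kappa), where K = (observed agreement - expected agreement) / (1 - expected agreement). The function name `cohen_kappa` and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Illustrative only: the study may have used a different
    (multi-annotator) kappa variant.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Proportion of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data using a two-class scheme (first vs. subsequent mention).
annotator_1 = ["first", "first", "subsequent", "subsequent"]
annotator_2 = ["first", "subsequent", "subsequent", "subsequent"]
print(cohen_kappa(annotator_1, annotator_2))
```

Raw percentage agreement on the toy data is 75%, but half of that is expected by chance given the label distributions, which is why a chance-corrected score is reported instead of plain agreement.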

    Enhancement and suppression effects resulting from information structuring in sentences

    Information structuring through the use of cleft sentences increases the processing efficiency of references to elements within the scope of focus. Furthermore, there is evidence that putting certain types of emphasis on individual words not only enhances their subsequent processing, but also protects these words from becoming suppressed in the wake of subsequent information, suggesting mechanisms of enhancement and suppression. In Experiment 1, we showed that clefted constructions facilitate the integration of subsequent sentences that make reference to elements within the scope of focus, and that they decrease processing efficiency for references to elements outside the scope of focus. In Experiment 2, using an auditory text-change-detection paradigm, we showed that focus has similar effects on the strength of memory representations. These results add to the evidence for enhancement and suppression as mechanisms of sentence processing and clarify that the effects occur within sentences having a marked focus structure.

    A Mention-Ranking Model for Abstract Anaphora Resolution

    Resolving abstract anaphora is an important but difficult task for text understanding. Yet, with recent advances in representation learning, this task becomes a more tangible aim. A central property of abstract anaphora is that it establishes a relation between the anaphor embedded in the anaphoric sentence and its (typically non-nominal) antecedent. We propose a mention-ranking model that learns how abstract anaphors relate to their antecedents with an LSTM-Siamese Net. We overcome the lack of training data by generating artificial anaphoric sentence-antecedent pairs. Our model outperforms state-of-the-art results on shell noun resolution. We also report first benchmark results on an abstract anaphora subset of the ARRAU corpus. This corpus presents a greater challenge due to a mixture of nominal and pronominal anaphors and a greater range of confounders. We found model variants that outperform the baselines for nominal anaphors, without training on individual anaphor data, but still lag behind for pronominal anaphors. Our model selects syntactically plausible candidates and, if disregarding syntax, discriminates candidates using deeper features. Comment: In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP). Copenhagen, Denmark

    Codeco: A Grammar Notation for Controlled Natural Language in Predictive Editors

    Existing grammar frameworks do not work particularly well for controlled natural languages (CNL), especially if they are to be used in predictive editors. In this paper, I introduce a new grammar notation, called Codeco, which is designed specifically for CNLs and predictive editors. Two different parsers have been implemented and a large subset of Attempto Controlled English (ACE) has been represented in Codeco. The results show that Codeco is practical, adequate and efficient.

    Temporal expression normalisation in natural language texts

    Automatic annotation of temporal expressions is a research challenge of great interest in the field of information extraction. In this report, I describe a novel rule-based architecture, built on top of a pre-existing system, which is able to normalise temporal expressions detected in English texts. Gold standard temporally-annotated resources are limited in size and this makes research difficult. The proposed system outperforms the state-of-the-art systems on the TempEval-2 Shared Task (value attribute) and achieves substantially better results than the pre-existing system on top of which it has been developed. I will also introduce a new free corpus consisting of 2822 unique annotated temporal expressions. Both the corpus and the system are freely available on-line. Comment: 7 pages, 1 figure, 5 tables
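To make "normalising" a temporal expression concrete, here is a minimal rule-based sketch that maps a few relative English expressions to an ISO 8601 date string, the kind of value a normaliser would assign. It is not the system described in the abstract: the function name `normalise`, the three rules, and the reference date are all assumptions chosen for illustration.

```python
import re
from datetime import date, timedelta

# Hypothetical rule table: fixed expressions and their day offsets.
FIXED_OFFSETS = {"today": 0, "tomorrow": 1, "yesterday": -1}


def normalise(expression, reference):
    """Return an ISO date string for a supported expression, else None.

    `reference` is the date the text was written (a datetime.date),
    against which relative expressions are resolved.
    """
    text = expression.strip().lower()
    if text in FIXED_OFFSETS:
        return (reference + timedelta(days=FIXED_OFFSETS[text])).isoformat()
    # Rule for expressions such as "3 days ago" or "1 day ago".
    match = re.fullmatch(r"(\d+) days? ago", text)
    if match:
        return (reference - timedelta(days=int(match.group(1)))).isoformat()
    # No rule applies; a real system would have many more rules.
    return None


written_on = date(2010, 3, 1)  # assumed document creation date
for expr in ["tomorrow", "3 days ago", "next spring"]:
    print(expr, "->", normalise(expr, written_on))
```

The sketch returns `None` for anything outside its three rules, which keeps unsupported expressions visible instead of guessing a value.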