10 research outputs found

    An Empirical Approach to Temporal Reference Resolution

    This paper presents the results of an empirical investigation of temporal reference resolution in scheduling dialogs. The algorithm adopted is primarily a linear-recency-based approach that does not include a model of global focus. A fully automatic system has been developed and evaluated on unseen test data with good results. Specifically, the paper presents the results of an intercoder reliability study, a model of temporal reference resolution that supports linear recency and has very good coverage, the results of the system evaluated on unseen test data, and a detailed analysis of the dialogs assessing the viability of the approach. Comment: 13 pages, latex using aclap.st

    Inferring Strategies for Sentence Ordering in Multidocument News Summarization

    The problem of organizing information for multidocument summarization so that the generated summary is coherent has received relatively little attention. While sentence ordering for single-document summarization can be determined from the ordering of sentences in the input article, this is not the case for multidocument summarization, where summary sentences may be drawn from different input articles. In this paper, we propose a methodology for studying the properties of ordering information in the news genre and describe experiments done on a corpus of multiple acceptable orderings we developed for the task. Based on these experiments, we implemented a strategy for ordering information that combines constraints from the chronological order of events and topical relatedness. Evaluation of our augmented algorithm shows a significant improvement of the ordering over two baseline strategies.

    Learning Sentence-internal Temporal Relations

    In this paper we propose a data-intensive approach for inferring sentence-internal temporal relations. Temporal inference is relevant for practical NLP applications which either extract or synthesize temporal information (e.g., summarisation, question answering). Our method bypasses the need for manual coding by exploiting the presence of markers like "after", which overtly signal a temporal relation. We first show that models trained on main and subordinate clauses connected with a temporal marker achieve good performance on a pseudo-disambiguation task simulating temporal inference (during testing the temporal marker is treated as unseen and the models must select the right marker from a set of possible candidates). Secondly, we assess whether the proposed approach holds promise for the semi-automatic creation of temporal annotations. Specifically, we use a model trained on noisy and approximate data (i.e., main and subordinate clauses) to predict intra-sentential relations present in TimeBank, a corpus annotated with rich temporal information. Our experiments compare and contrast several probabilistic models differing in their feature space, linguistic assumptions and data requirements. We evaluate performance against gold standard corpora and also against human subjects.

    What Determines Inter-Coder Agreement in Manual Annotations? A Meta-Analytic Investigation

    Recent discussions of annotator agreement have mostly centered around its calculation and interpretation, and the correct choice of indices. Although these discussions are important, they only consider the "back-end" of the story, namely, what to do once the data are collected. Just as important, in our opinion, is to know how agreement is reached in the first place and what factors influence coder agreement as part of the annotation process or setting, as this knowledge can provide concrete guidelines for the planning and set-up of annotation projects. To investigate whether there are factors that consistently impact annotator agreement, we conducted a meta-analytic investigation of annotation studies reporting agreement percentages. Our meta-analysis synthesized factors reported in 96 annotation studies from three domains (word-sense disambiguation, prosodic transcriptions, and phonetic transcriptions) and was based on a total of 346 agreement indices. Our analysis identified seven factors that influence reported agreement values: annotation domain, number of categories in a coding scheme, number of annotators in a project, whether annotators received training, the intensity of annotator training, the annotation purpose, and the method used for the calculation of percentage agreements. Based on our results we develop practical recommendations for the assessment, interpretation, calculation, and reporting of coder agreement. We also briefly discuss theoretical implications for the concept of annotation quality.

    Temporal processing of news: annotation of temporal expressions, verbal events and temporal relations

    The ability to capture the temporal dimension of a natural language text is essential to many natural language processing applications, such as Question Answering, Automatic Summarisation, and Information Retrieval. Temporal processing is a field of Computational Linguistics which aims to access this dimension and derive a precise temporal representation of a natural language text by extracting time expressions, events and temporal relations, and then representing them according to a chosen knowledge framework. This thesis focuses on the investigation and understanding of the different ways time is expressed in natural language, on the implementation of a temporal processing system in accordance with the results of this investigation, on the evaluation of the system, and on the extensive analysis of the errors and challenges that appear during system development. The ultimate goal of this research is to develop the ability to automatically annotate temporal expressions, verbal events and temporal relations in a natural language text. Temporal expression annotation involves two stages: temporal expression identification, concerned with determining the textual extent of a temporal expression, and temporal expression normalisation, which finds the value that the temporal expression designates and represents it using an annotation standard. The research presented in this thesis approaches these tasks with a knowledge-based methodology that tackles temporal expressions according to their semantic classification. Several knowledge sources and normalisation models are experimented with to allow an analysis of their impact on system performance. The annotation of events expressed using either finite or non-finite verbs is addressed with a method that overcomes the drawback of existing methods which associate an event with the class that is most frequently assigned to it in a corpus and are limited in coverage by the small number of events present in the corpus. This limitation is overcome in this research by annotating each WordNet verb with an event class that best characterises that verb. This thesis also describes an original methodology for the identification of temporal relations that hold among events and temporal expressions. The method relies on sentence-level syntactic trees and a propagation of temporal relations between syntactic constituents, by analysing syntactic and lexical properties of the constituents and of the relations between them. The detailed evaluation and error analysis of the methods proposed for solving different temporal processing tasks form an important part of this research. Various corpora widely used by researchers studying different temporal phenomena are employed in the evaluation, thus enabling comparison with state of the art in the field. The detailed error analysis targeting each temporal processing task helps identify not only problems of the implemented methods, but also reliability problems of the annotated resources, and encourages potential reexaminations of some temporal processing tasks. EThOS - Electronic Theses Online Service, GB, United Kingdo

    Time, events and temporal relations: an empirical model for temporal processing of Italian texts

    The aim of this work is the elaboration of a computational model for the identification of temporal relations in text/discourse, to be used as a component in more complex systems for Open-Domain Question Answering, Information Extraction and Summarization. More specifically, the thesis concentrates on the relationships between the various elements which signal temporal relations in Italian texts/discourses, on their roles and on how they can be exploited. Time is a pervasive element of human life. It is the primary element thanks to which we are able to observe, describe and reason about what surrounds us and the world. Failing to correctly identify the temporal ordering of what is narrated and/or described may result in poor comprehension, which can lead to a misunderstanding. Normally, texts/discourses present situations standing in a particular temporal ordering. Whether these situations precede one another, overlap, or are included one within the other is inferred during the general process of reading and understanding. Nevertheless, to perform this seemingly easy task, we take into account a set of complex information involving different linguistic entities and sources of knowledge. A wide variety of devices is used in natural languages to convey temporal information. Verb tense, temporal prepositions, subordinate conjunctions and adjectival phrases are some of the most obvious. Nevertheless, even these obvious devices have different degrees of temporal transparency, which may not be as obvious as it appears on a quick and superficial analysis. One of the main shortcomings of previous research on temporal relations is that it concentrated only on a particular discourse segment, namely narrative discourse, disregarding the fact that a text/discourse is composed of different types of discourse segments and relations. A good theory or framework for temporal analysis must take all of them into account. In this work, we have concentrated on the elaboration of a framework which could be applied to all text/discourse segments, without paying too much attention to their type, since we claim that temporal relations can be recovered in every kind of discourse segment and not only in narrative ones. The model we propose is obtained by combining theoretical assumptions and empirical data, collected by means of two tests submitted to a total of 35 subjects with different backgrounds. The main results we have obtained from these empirical studies are: (i) a general evaluation of the difficulty of the task of recovering temporal relations; (ii) information on the level of granularity of temporal relations; (iii) a saliency-based order of application of the linguistic devices used to express the temporal relations between two eventualities; (iv) the proposal of tense temporal polysemy, as a device to identify the set of preferences which can assign unique values to possibly multiple temporal relations. On the basis of the empirical data, we propose to enlarge the set of classical fine-grained interval relations (Allen, 1983) by also including coarse-grained temporal relations (Freksa, 1992). Moreover, there could be cases in which we are not able to state reliably whether a temporal relation exists or what the particular relation between two entities is. To overcome this issue we have adopted the proposal by Mani (2007), which allows the system to have differentiated levels of temporal representation on the basis of the temporal granularity associated with each discourse segment. The lack of an annotated corpus for eventualities, temporal expressions and temporal relations in Italian represents the biggest shortcoming of this work, and it has prevented the implementation of the model and its evaluation. Nevertheless, we have been able to conduct a series of experiments for the validation of procedures for the further realization of a working prototype. In addition to this, we have been able to implement and validate a working prototype for the spotting of temporal expressions in texts/discourses.

    An Empirical Approach to Temporal Reference Resolution

    Scheduling dialogs, during which people negotiate the times of appointments, are common in everyday life. This paper reports the results of an in-depth empirical investigation of resolving explicit temporal references in scheduling dialogs. There are four phases of this work: data annotation and evaluation, model development, system implementation and evaluation, and model evaluation and analysis. The system and model were developed primarily on one set of data, and then applied later to a much more complex data set, to assess the generalizability of the model for the task being performed. Many different types of empirical methods are applied to pinpoint the strengths and weaknesses of the approach. Detailed annotation instructions were developed and an intercoder reliability study was performed, showing that naive annotators can reliably perform the targeted annotations. A fully automatic system has been developed and evaluated on unseen test data, with good results on both data sets...