63 research outputs found

    Agile Corpus Annotation in Practice: An Overview of Manual and Automatic Annotation of CVs

    This paper describes work testing agile data annotation by moving away from the traditional, linear phases of corpus creation towards iterative ones and by recognizing the potential for sources of error occurring throughout the annotation process. JRC.DG.G.2-Global security and crisis managemen

    JRC's Participation in the Guided Summarization Task at TAC 2010

    In this paper we describe our participation in the Guided Summarization Task at the Text Analysis Conference 2010 (TAC'10). The goal of the task was to encourage a deeper semantic analysis of the source documents instead of relying only on document word frequencies to select important concepts. We used the output of our event extraction system and automatic learning of semantically-related terms to capture the required aspects of each particular article category. We submitted two runs: the first uses information extraction tools in combination with co-occurrence of features, the second uses only co-occurrence information. In the following sections we describe our runs and discuss the results attained. JRC.DG.G.2-Global security and crisis managemen

    Wrapping up a Summary: from Representation to Generation

    The main focus of this work is to investigate robust ways of generating summaries from summary representations without resorting to simple sentence extraction, aiming at more human-like summaries. This is motivated by empirical evidence from TAC 2009 data showing that human summaries contain, on average, more and shorter sentences than system summaries. We report encouraging preliminary results comparable to those attained by participating systems at TAC 2009. JRC.DG.G.2-Global security and crisis managemen

    Two uses of anaphora resolution in summarization.

    We propose a new method for using anaphoric information in Latent Semantic Analysis (LSA) and discuss its application to developing an LSA-based summarizer. The summarizer performs significantly better than a system that does not use anaphoric information, and better by the ROUGE measure than all but one of the single-document summarizers participating in DUC-2002. Anaphoric information is automatically extracted using a new release of our own anaphora resolution system, GUITAR, which incorporates proper noun resolution. Our summarizer also includes a new approach for automatically identifying the dimensionality reduction of a document on the basis of the desired summarization percentage. Anaphoric information is also used to check the coherence of the summary produced by our summarizer, by means of a reference checker module which identifies anaphoric resolution errors caused by sentence extraction.