
    Annotating patient clinical records with syntactic chunks and named entities: the Harvey corpus

    The free-text notes typed by physicians during patient consultations contain valuable information for the study of disease and treatment. These notes are difficult for existing natural language analysis tools to process, since they are highly telegraphic (omitting many words) and contain many spelling mistakes, inconsistencies in punctuation, and non-standard word order. To support information extraction and classification tasks over such text, we describe a de-identified corpus of free-text notes, a shallow syntactic and named entity annotation scheme for this kind of text, and an approach to training domain specialists with no linguistic background to annotate the text. Finally, we present a statistical chunking system for such clinical text with a stable learning rate and good accuracy, indicating that the manual annotation is consistent and that the annotation scheme is tractable for machine learning.
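Shallow syntactic chunking of this kind is commonly represented with BIO tags (B = begin chunk, I = inside chunk). The sketch below illustrates the general technique on an invented telegraphic note; the tokens, tags, and chunk types are illustrative assumptions, not the actual Harvey corpus annotation scheme.

```python
# Illustrative BIO-style chunk representation for a telegraphic clinical note.
# Tokens, tags, and chunk types are made-up examples, not Harvey corpus data.

note = ["pt", "c/o", "chest", "pain", "2/52"]
tags = ["B-NP", "B-VP", "B-NP", "I-NP", "B-NP"]  # B = begin, I = inside a chunk

def bio_to_chunks(tokens, tags):
    """Group tokens into (chunk_type, phrase) pairs from BIO tags."""
    chunks, current, ctype = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                chunks.append((ctype, " ".join(current)))
            current, ctype = [tok], tag[2:]
        elif tag.startswith("I-") and current:
            current.append(tok)
    if current:
        chunks.append((ctype, " ".join(current)))
    return chunks

print(bio_to_chunks(note, tags))
# → [('NP', 'pt'), ('VP', 'c/o'), ('NP', 'chest pain'), ('NP', '2/52')]
```

A statistical chunker is then trained to predict the tag sequence from the tokens, which is why consistent manual annotation matters for the learning rate.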

    brat: a Web-based Tool for NLP-Assisted Text Annotation

    We introduce the brat rapid annotation tool (BRAT), an intuitive web-based tool for text annotation supported by Natural Language Processing (NLP) technology. BRAT has been developed for rich structured annotation across a variety of NLP tasks and aims to support manual curation efforts and increase annotator productivity using NLP techniques. We discuss several case studies of real-world annotation projects using pre-release versions of BRAT and present an evaluation of annotation assisted by semantic class disambiguation on a multicategory entity mention annotation task, showing a 15% decrease in total annotation time. BRAT is available under an open-source license from

    RDF(S)/XML Linguistic Annotation of Semantic Web Pages

    Although, with the Semantic Web initiative, much research on the semantic annotation of web pages has already been done by AI researchers, linguistic text annotation, including semantic annotation, was originally developed in Corpus Linguistics, and its results have been somewhat neglected by AI. ...

    Quantifying Orphaned Annotations in Hypothes.is

    Web annotation has been receiving increased attention recently with the organization of the Open Annotation Collaboration and new tools for open annotation, such as Hypothes.is. We investigate the prevalence of orphaned annotations, where neither the live Web page nor an archived copy of the Web page contains the text that had previously been annotated in the Hypothes.is annotation system (our dataset contained 20,953 highlighted text annotations). We found that about 22% of highlighted text annotations can no longer be attached to their live Web pages. Unfortunately, only about 12% of these detached annotations can be reattached using the holdings of current public web archives, leaving the remaining 88% of them orphaned. For those annotations that are still attached, 53% are in danger of becoming orphans if the live Web page changes. This points to the need for archiving the target of an annotation at the time the annotation is created.
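The nesting of the reported percentages (12% and 88% apply to the detached subset, not to the whole corpus) is easy to misread. The breakdown above can be sketched as arithmetic; the counts below are derived from the paper's rounded percentages, so they are approximate, not exact figures from the study.

```python
# Approximate breakdown of the Hypothes.is annotation dataset, derived
# from the rounded percentages reported in the abstract.

total = 20_953                          # highlighted text annotations examined
detached = round(total * 0.22)          # no longer attach to the live page
reattachable = round(detached * 0.12)   # recoverable via public web archives
orphaned = detached - reattachable      # permanently orphaned (~88% of detached)
attached = total - detached
at_risk = round(attached * 0.53)        # attached, but vulnerable to page changes

print(f"detached: {detached}, reattachable: {reattachable}, orphaned: {orphaned}")
print(f"still attached: {attached}, of which at risk: {at_risk}")
```

Note that the orphaned count is roughly 19% of the whole corpus (88% of the 22% that detached), not 88% of all annotations.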

    Two sides of the same coin: assessing translation quality in two steps through adequacy and acceptability error analysis

    We propose facilitating the error annotation task of translation quality assessment by introducing an annotation process consisting of two separate steps, similar to those required by EN 15038, the European Standard for translation companies: an error analysis for errors relating to acceptability (where the target text as a whole is taken into account, as well as the target text in context), and one for errors relating to adequacy (where source segments are compared to target segments). We present a fine-grained error taxonomy suitable for a diagnostic and comparative analysis of machine-translated texts, post-edited texts, and human translations. Categories missing in existing metrics have been added, such as lexical issues, coherence issues, and text-type-specific issues.
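The two-step separation above can be modeled as a small data structure that keeps acceptability errors (judged on the target text alone) apart from adequacy errors (judged against the source). This is a minimal sketch, not the paper's taxonomy: the acceptability categories come from the abstract, while the adequacy categories are generic assumptions.

```python
# Sketch of a two-step error annotation record. Acceptability categories
# follow the abstract; the adequacy categories are assumed examples.

from dataclasses import dataclass

ACCEPTABILITY = {"lexical", "coherence", "text_type"}   # target-text-only check
ADEQUACY = {"omission", "addition", "mistranslation"}   # source-vs-target check

@dataclass
class ErrorAnnotation:
    segment_id: int
    step: str       # "acceptability" or "adequacy"
    category: str
    span: str       # the annotated target-text span

    def __post_init__(self):
        allowed = ACCEPTABILITY if self.step == "acceptability" else ADEQUACY
        if self.category not in allowed:
            raise ValueError(f"{self.category!r} not valid for step {self.step!r}")

# Each segment can be annotated twice, once per step:
a1 = ErrorAnnotation(7, "acceptability", "coherence", "however the report")
a2 = ErrorAnnotation(7, "adequacy", "omission", "the report")
print(a1, a2, sep="\n")
```

Validating the category against the step enforces that annotators do not mix the two passes, which is the point of the two-step design.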

    Observing professionals taking notes on screen

    In this study, 38 participants wrote a piece of advice based on reading and annotating information from an extensive Web site. Half of the participants took notes in a separate window; the other half used an advanced annotation tool. In-text annotations were used far more than separate notes. The frequency with which features of the note-taking tool were used depended on the phase of the process. The association between process phase and the use of features is less clear for the annotation tool. Requirements are formulated for the design of annotation tools.