brat: a Web-based Tool for NLP-Assisted Text Annotation
We introduce the brat rapid annotation tool (BRAT), an intuitive web-based tool for text annotation supported by Natural Language Processing (NLP) technology. BRAT has been developed for rich structured annotation for a variety of NLP tasks and aims to support manual curation efforts and increase annotator productivity using NLP techniques. We discuss several case studies of real-world annotation projects using pre-release versions of BRAT and present an evaluation of annotation assisted by semantic class disambiguation on a multicategory entity mention annotation task, showing a 15% decrease in total annotation time. BRAT is available under an open-source license from
From phrase structure to dependencies, and back
Transforming constituent-based annotation into dependency-based annotation has been shown to work for different treebanks and annotation schemes (e.g. Lin (1995) has transformed the Penn treebank, and Kübler and Telljohann (2002) the Tübinger Baumbank des Deutschen (TüBa-D/Z)). These ventures are usually triggered by the conflict between theory-neutral annotation, which targets most needs of a wider audience, and theory-specific annotation, which provides more fine-grained information for a smaller audience. As a compromise, it has been pointed out that treebanks can be designed to support more than one theory from the start (Nivre, 2003). We argue that information can also be added to an existing annotation scheme so that it supports additional theory-specific annotations. We also argue that such a transformation is useful for improving and extending the original annotation scheme with respect to both ambiguous annotation and annotation errors. We show this by analysing problems that arise when generating dependency information from the constituent-based TüBa-D/Z
*-DCC: A platform to collect, annotate, and explore a large variety of sequencing experiments.
Background: Over the past few years the variety of experimental designs and protocols for sequencing experiments increased greatly. To ensure the wide usability of the produced data beyond an individual project, rich and systematic annotation of the underlying experiments is crucial. Findings: We first developed an annotation structure that captures the overall experimental design as well as the relevant details of the steps from the biological sample to the library preparation, the sequencing procedure, and the sequencing and processed files. Through various design features, such as controlled vocabularies and different field requirements, we ensured a high annotation quality, comparability, and ease of annotation. The structure can be easily adapted to a large variety of species. We then implemented the annotation strategy in a user-hosted web platform with data import, query, and export functionality. Conclusions: We present here an annotation structure and user-hosted platform for sequencing experiment data, suitable for lab-internal documentation, collaborations, and large-scale annotation efforts
A unified representation for morphological, syntactic, semantic, and referential annotations
This paper reports on the SYN-RA (SYNtax-based Reference Annotation) project, an on-going project of annotating German newspaper texts with referential relations. The project has developed an inventory of anaphoric and coreference relations for German in the context of a unified, XML-based annotation scheme for combining morphological, syntactic, semantic, and anaphoric information. The paper discusses how this unified annotation scheme relates to other formats currently discussed in the literature, in particular the annotation graph model of Bird and Liberman (2001) and the pie-in-the-sky scheme for semantic annotation
A participatory action research study on handwritten annotation feedback and its impact on staff and students
Annotation was introduced to a United Kingdom (UK) School of Nursing following an institutional audit within a UK University. Handwritten annotation (writing in the margins of student assignments) was introduced to the grading procedure to enhance the quality of student feedback and learning. Once in practice, annotation could be examined and an action research study facilitated the process. Post-qualifying essay scripts were examined for styles of annotation to identify its strengths and weaknesses. Five staff participated in action research to examine staff perceptions of annotation. Findings showed that words or telegraphic signs that stand alone in the margins of a student essay can be seen as abstract signs to the novitiate reader and need contextualising. If there is a negative tone in the markers’ annotation it can be detected by the student and interpreted as unhelpful or disparaging. There are a number of ways of improving annotation, and good practice guidelines are offered in the conclusion to this paper
Information structure
The guidelines for Information Structure include instructions for the annotation of Information Status (or ‘givenness’), Topic, and Focus, building upon a basic syntactic annotation of nominal phrases and sentences. A procedure for the annotation of these features is proposed
Annotation graphs as a framework for multidimensional linguistic data analysis
In recent work we have presented a formal framework for linguistic annotation
based on labeled acyclic digraphs. These 'annotation graphs' offer a simple yet
powerful method for representing complex annotation structures incorporating
hierarchy and overlap. Here, we motivate and illustrate our approach using
discourse-level annotations of text and speech data drawn from the CALLHOME,
COCONUT, MUC-7, DAMSL and TRAINS annotation schemes. With the help of domain
specialists, we have constructed a hybrid multi-level annotation for a fragment
of the Boston University Radio Speech Corpus which includes the following
levels: segment, word, breath, ToBI, Tilt, Treebank, coreference and named
entity. We show how annotation graphs can represent hybrid multi-level
structures which derive from a diverse set of file formats. We also show how
the approach facilitates substantive comparison of multiple annotations of a
single signal based on different theoretical models. The discussion shows how
annotation graphs open the door to wide-ranging integration of tools, formats
and corpora.
Comment: 10 pages, 10 figures, Towards Standards and Tools for Discourse
Tagging, Proceedings of the Workshop, pp. 1-10. Association for Computational
Linguistics
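To make the idea of a labeled acyclic digraph with hierarchy and overlap more concrete, here is a minimal Python sketch. It is not the authors' implementation: the class names `Arc` and `AnnotationGraph`, the tier names, the time offsets, and the rule that every node carries an offset and every arc runs forward in time are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Arc:
    start: int  # id of the source node
    end: int    # id of the target node
    tier: str   # annotation level, e.g. "word" or "entity" (hypothetical names)
    label: str  # the annotation itself


@dataclass
class AnnotationGraph:
    offsets: dict = field(default_factory=dict)  # node id -> time offset in seconds
    arcs: list = field(default_factory=list)

    def add_node(self, node_id, offset):
        self.offsets[node_id] = offset

    def add_arc(self, start, end, tier, label):
        # Simplifying assumption: arcs must run forward in time,
        # which is enough to rule out cycles in this sketch.
        if self.offsets[start] >= self.offsets[end]:
            raise ValueError("arc must run forward in time")
        arc = Arc(start, end, tier, label)
        self.arcs.append(arc)
        return arc

    def tier(self, name):
        # All arcs belonging to one annotation level, in insertion order.
        return [a for a in self.arcs if a.tier == name]

    def span(self, arc):
        return self.offsets[arc.start], self.offsets[arc.end]

    def overlapping(self, arc):
        # Arcs on other tiers whose time spans intersect the given arc's span.
        lo, hi = self.span(arc)
        return [
            other for other in self.arcs
            if other.tier != arc.tier
            and self.span(other)[0] < hi
            and lo < self.span(other)[1]
        ]


def build_example():
    # Invented offsets; three word arcs plus one entity arc spanning two words.
    g = AnnotationGraph()
    for node_id, offset in enumerate([0.0, 0.4, 1.0, 1.3]):
        g.add_node(node_id, offset)
    g.add_arc(0, 1, "word", "Boston")
    g.add_arc(1, 2, "word", "University")
    g.add_arc(2, 3, "word", "Radio")
    g.add_arc(0, 2, "entity", "ORG")
    return g


if __name__ == "__main__":
    graph = build_example()
    entity = graph.tier("entity")[0]
    print([a.label for a in graph.overlapping(entity)])
```

Because every tier shares the same set of nodes, an annotation on one level (here the entity arc) can be compared with annotations on another level simply by comparing the offsets of the nodes they connect.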
Annotation Graphs and Servers and Multi-Modal Resources: Infrastructure for Interdisciplinary Education, Research and Development
Annotation graphs and annotation servers offer infrastructure to support the
analysis of human language resources in the form of time-series data such as
text, audio and video. This paper outlines areas of common need among empirical
linguists and computational linguists. After reviewing examples of data and
tools used or under development for each of several areas, it proposes a common
framework for future tool development, data annotation and resource sharing
based upon annotation graphs and servers.
Comment: 8 pages, 6 figures