Estonian Dependency Treebank and its annotation scheme

Abstract

In this article, we present the Estonian Dependency Treebank, an ongoing corpus annotation project. Once finished, the treebank will contain approximately 400,000 words. The treebank annotation consists of three layers: morphology, syntactic functions and dependency relations. For each layer, we give an overview of the labels and the annotation scheme.

As for the actual treebank creation, each text is annotated by two independent annotators, plus a super-annotator whose task is to resolve the discrepancies between them. The article also gives a short overview of the most frequent sources of disagreement between the annotators.

    Last time updated on 04/05/2024