Marrying Universal Dependencies and Universal Morphology
The Universal Dependencies (UD) and Universal Morphology (UniMorph) projects
each present schemata for annotating the morphosyntactic details of language.
Each project also provides corpora of annotated text in many languages: UD at
the token level and UniMorph at the type level. As each corpus is built by
different annotators, language-specific decisions hinder the goal of universal
schemata. With compatibility of tags, each project's annotations could be used
to validate the other's. Additionally, the availability of both type- and
token-level resources would be a boon to tasks such as parsing and homograph
disambiguation. To ease this interoperability, we present a deterministic
mapping from Universal Dependencies v2 features into the UniMorph schema. We
validate our approach by lookup in the UniMorph corpora and find a
macro-average of 64.13% recall. We also note incompatibilities due to paucity
of data on either side. Finally, we present a critical evaluation of the
foundations, strengths, and weaknesses of the two annotation projects.

Comment: UDW1
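The deterministic mapping the abstract describes can be sketched as a lookup from UD v2 attribute-value pairs to UniMorph labels. This is a minimal illustration with a toy table; the actual mapping in the paper covers far more features and languages, and the `ud_to_unimorph` function name is an assumption, not the authors' code.

```python
# Toy sketch of a deterministic UD v2 -> UniMorph feature mapping.
# The table below is illustrative; the real mapping is much larger.
UD_TO_UNIMORPH = {
    ("Number", "Sing"): "SG",
    ("Number", "Plur"): "PL",
    ("Tense", "Past"): "PST",
    ("Tense", "Pres"): "PRS",
    ("Person", "3"): "3",
    ("Mood", "Ind"): "IND",
}

def ud_to_unimorph(ud_feats: str) -> str:
    """Map a UD feature string (e.g. 'Number=Sing|Tense=Past')
    to a semicolon-joined UniMorph tag bundle."""
    tags = []
    for pair in ud_feats.split("|"):
        if not pair:
            continue
        key, value = pair.split("=")
        tag = UD_TO_UNIMORPH.get((key, value))
        if tag is not None:  # features missing from the table are dropped
            tags.append(tag)
    return ";".join(tags)

print(ud_to_unimorph("Number=Sing|Tense=Past"))  # SG;PST
```

Validation by lookup, as in the paper, would then check whether the converted bundle matches an entry for the same lemma in the UniMorph corpora, with recall measured over matched tokens.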
The SIGMORPHON 2019 Shared Task: Morphological Analysis in Context and Cross-Lingual Transfer for Inflection
The SIGMORPHON 2019 shared task on cross-lingual transfer and contextual
analysis in morphology examined transfer learning of inflection between 100
language pairs, as well as contextual lemmatization and morphosyntactic
description in 66 languages. The first task evolves past years' inflection
tasks by examining transfer of morphological inflection knowledge from a
high-resource language to a low-resource language. This year also presents a
new second challenge on lemmatization and morphological feature analysis in
context. All submissions featured a neural component and built on either this
year's strong baselines or highly ranked systems from previous years' shared
tasks. Every participating team improved in accuracy over the baselines for the
inflection task (though not Levenshtein distance), and every team in the
contextual analysis task improved on both state-of-the-art neural and
non-neural baselines.

Comment: Presented at SIGMORPHON 201
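The two inflection metrics the abstract contrasts, exact-match accuracy and Levenshtein distance, can be sketched as below. This is an illustrative implementation, not the shared task's official scorer, which may differ in tie-breaking and normalization details.

```python
# Sketch of the two inflection-task metrics: exact-match accuracy
# and mean Levenshtein (edit) distance between prediction and gold.
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, O(len(a)*len(b))."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def evaluate(preds, golds):
    """Return (exact-match accuracy, mean edit distance)."""
    n = len(golds)
    acc = sum(p == g for p, g in zip(preds, golds)) / n
    dist = sum(levenshtein(p, g) for p, g in zip(preds, golds)) / n
    return acc, dist

acc, dist = evaluate(["walked", "runned"], ["walked", "ran"])
print(acc, dist)  # 0.5 2.0
```

A system can therefore beat the baselines on accuracy while losing on edit distance: being exactly right more often does not guarantee that its wrong guesses are close to the gold forms.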