4 research outputs found
Improving Named Entity Recognition by Jointly Learning to Disambiguate Morphological Tags
Previous studies have shown that linguistic features of a word, such as
possession, the genitive, or other grammatical cases, can be employed in the
word representations of a named entity recognition (NER) tagger to improve
performance for morphologically rich languages. However, these taggers require
external morphological disambiguation (MD) tools to function, and such tools
are hard to obtain or non-existent for many languages. In this work, we propose a model
which alleviates the need for such disambiguators by jointly learning NER and
MD taggers in languages for which one can provide a list of candidate
morphological analyses. We show that this can be done independently of the
morphological annotation schemes, which differ among languages. Our experiments
employing three different model architectures that join these two tasks show
that joint learning improves NER performance. Furthermore, the morphological
disambiguator's performance is shown to be competitive.
Comment: COLING 2018 (accepted)
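The abstract does not specify the architecture, so the following is only a minimal PyTorch sketch of the general idea: an NER head and an MD head trained jointly over a shared encoder with a summed loss. All dimensions, tag inventories, and names are illustrative assumptions rather than the authors' code, and the MD head here scores a flat tag set instead of restricting predictions to each token's candidate analyses as the paper describes.

```python
# Minimal sketch (PyTorch): joint NER + morphological disambiguation (MD)
# over a shared BiLSTM encoder. All sizes and tag counts are illustrative.
import torch
import torch.nn as nn

class JointNerMd(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=100, hidden=128,
                 n_ner_tags=9, n_md_tags=50):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden, bidirectional=True,
                               batch_first=True)
        self.ner_head = nn.Linear(2 * hidden, n_ner_tags)  # NER tag scores
        self.md_head = nn.Linear(2 * hidden, n_md_tags)    # morphological tag scores

    def forward(self, token_ids):
        h, _ = self.encoder(self.emb(token_ids))
        return self.ner_head(h), self.md_head(h)

model = JointNerMd()
loss_fn = nn.CrossEntropyLoss()
tokens = torch.randint(0, 10000, (2, 12))   # toy batch: 2 sentences, 12 tokens
ner_gold = torch.randint(0, 9, (2, 12))
md_gold = torch.randint(0, 50, (2, 12))
ner_logits, md_logits = model(tokens)
# Joint objective: the sum of the two per-task losses.
loss = (loss_fn(ner_logits.flatten(0, 1), ner_gold.flatten())
        + loss_fn(md_logits.flatten(0, 1), md_gold.flatten()))
loss.backward()
```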
Multilingual is not enough: BERT for Finnish
Deep learning-based language models pretrained on large unannotated text
corpora have been demonstrated to allow efficient transfer learning for natural
language processing, with recent approaches such as the transformer-based BERT
model advancing the state of the art across a variety of tasks. While most work
on these models has focused on high-resource languages, in particular English,
a number of recent efforts have introduced multilingual models that can be
fine-tuned to address tasks in a large number of different languages. However,
we still lack a thorough understanding of the capabilities of these models, in
particular for lower-resourced languages. In this paper, we focus on Finnish
and thoroughly evaluate the multilingual BERT model on a range of tasks,
comparing it with a new Finnish BERT model trained from scratch. The new
language-specific model is shown to systematically and clearly outperform the
multilingual one. While the multilingual model largely fails to reach the
performance of previously proposed methods, the custom Finnish BERT model
establishes new state-of-the-art results on all corpora for all reference
tasks: part-of-speech tagging, named entity recognition, and dependency
parsing. We release the model and all related resources created for this study
with open licenses at https://turkunlp.org/finbert.
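As a usage note (not part of the paper), both the released Finnish model and the multilingual baseline can be loaded for a token-level task such as NER via the Hugging Face transformers library. The sketch below assumes the published hub identifiers and uses an arbitrary placeholder label count.

```python
# Sketch: loading the language-specific and multilingual BERT models for a
# token-classification task (e.g., NER). num_labels=9 is an arbitrary
# placeholder; the classification head is freshly initialized and would
# still need fine-tuning on labeled data.
from transformers import AutoTokenizer, AutoModelForTokenClassification

for name in ["TurkuNLP/bert-base-finnish-cased-v1",  # language-specific FinBERT
             "bert-base-multilingual-cased"]:         # multilingual baseline
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForTokenClassification.from_pretrained(name, num_labels=9)
    inputs = tokenizer("Helsinki on Suomen pääkaupunki.", return_tensors="pt")
    logits = model(**inputs).logits  # one score vector per subword token
```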
CMU-01 at the SIGMORPHON 2019 Shared Task on Crosslinguality and Context in Morphology
This paper presents the submission by the CMU-01 team to the SIGMORPHON 2019
task 2 of Morphological Analysis and Lemmatization in Context. This task
requires us to produce the lemma and morpho-syntactic description of each token
in a sequence, for 107 treebanks. We approach this task with a hierarchical
neural conditional random field (CRF) model which predicts each coarse-grained
feature (e.g., POS, Case) independently. However, most treebanks are
under-resourced, thus making it challenging to train deep neural models for
them. Hence, we propose a multi-lingual transfer training regime where we
transfer from multiple related languages that share similar typology.
Comment: In Proceedings of the ACL-SIGMORPHON 2019 Shared Task:
Crosslinguality and Context in Morphology
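A minimal sketch of the factorization the abstract describes, with a plain per-feature softmax head standing in for the paper's hierarchical CRF; the encoder, dimensions, and feature inventories below are invented for illustration only.

```python
# Sketch: independent classifier heads over a shared encoder, one per
# coarse-grained morphological feature. A softmax head stands in for the
# paper's hierarchical CRF; label counts are invented.
import torch
import torch.nn as nn

FEATURES = {"POS": 17, "Case": 8, "Number": 3}  # illustrative inventories

class FactoredTagger(nn.Module):
    def __init__(self, vocab_size=8000, emb_dim=64, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden, bidirectional=True,
                               batch_first=True)
        # One independent head per coarse-grained feature.
        self.heads = nn.ModuleDict(
            {f: nn.Linear(2 * hidden, n) for f, n in FEATURES.items()})

    def forward(self, token_ids):
        h, _ = self.encoder(self.emb(token_ids))
        return {f: head(h) for f, head in self.heads.items()}

tagger = FactoredTagger()
out = tagger(torch.randint(0, 8000, (1, 6)))
print({f: t.shape for f, t in out.items()})  # per-feature score tensors
```

Under the multilingual transfer regime the abstract mentions, the same shared encoder would first be trained on data pooled from typologically related languages before being adapted to the low-resource target.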
Hierarchical Multi Task Learning with Subword Contextual Embeddings for Languages with Rich Morphology
Morphological information is important for many sequence labeling tasks in
Natural Language Processing (NLP). Yet, existing approaches rely heavily on
manual annotations or external software to capture this information. In this
study, we propose using subword contextual embeddings to capture the
morphological information for languages with rich morphology. In addition, we
incorporate these embeddings in a hierarchical multi-task setting which, to
the best of our knowledge, has not been employed before. Evaluated on Dependency Parsing
(DEP) and Named Entity Recognition (NER) tasks, which are shown to benefit
greatly from morphological information, our final model outperforms previous
state-of-the-art models on both tasks for the Turkish language. Moreover, we
show a net improvement of 18.86% and 4.61% F-1 over the previously proposed
multi-task learner in the same setting for the DEP and the NER tasks,
respectively. Empirical results for five different MTL settings show that
incorporating subword contextual embeddings brings significant improvements for
both tasks. In addition, we observe that multi-task learning consistently
improves the performance of the DEP component.
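A minimal sketch of one such hierarchy, assuming subword contextual embeddings are pooled into word vectors, NER is supervised at a lower encoder layer, and a pair scorer for unlabeled dependency arcs sits on top. The layer assignment and all sizes are assumptions, not the authors' configuration.

```python
# Sketch: hierarchical multi-task model over subword-pooled word vectors.
# The low-level task (NER) reads the first encoder layer; the high-level
# task (DEP, reduced here to unlabeled head selection) reads the second.
import torch
import torch.nn as nn

class HierarchicalMtl(nn.Module):
    def __init__(self, word_dim=768, hidden=128, n_ner_tags=9):
        super().__init__()
        self.lstm1 = nn.LSTM(word_dim, hidden, bidirectional=True,
                             batch_first=True)
        self.ner_head = nn.Linear(2 * hidden, n_ner_tags)  # lower-level task
        self.lstm2 = nn.LSTM(2 * hidden, hidden, bidirectional=True,
                             batch_first=True)
        self.arc_scorer = nn.Bilinear(2 * hidden, 2 * hidden, 1)  # DEP arcs

    def forward(self, word_vecs):
        h1, _ = self.lstm1(word_vecs)   # shared lower layer
        ner_logits = self.ner_head(h1)
        h2, _ = self.lstm2(h1)          # upper layer for the higher-level task
        n = h2.size(1)
        # Score every (dependent, head) word pair for unlabeled attachment.
        dep = h2.unsqueeze(2).expand(-1, n, n, -1).contiguous()
        head = h2.unsqueeze(1).expand(-1, n, n, -1).contiguous()
        arc_scores = self.arc_scorer(dep, head).squeeze(-1)
        return ner_logits, arc_scores

# Subword contextual embeddings (e.g., from a pretrained transformer) would
# be mean-pooled into one vector per word; random tensors stand in here.
words = torch.randn(1, 5, 768)
ner_logits, arc_scores = HierarchicalMtl()(words)
```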