Neural Cross-Lingual Named Entity Recognition with Minimal Resources
For languages with no annotated resources, unsupervised transfer of natural
language processing models such as named-entity recognition (NER) from
resource-rich languages would be an appealing capability. However, differences
in words and word order across languages make it a challenging problem. To
improve mapping of lexical items across languages, we propose a method that
finds translations based on bilingual word embeddings. To improve robustness to
word order differences, we propose to use self-attention, which allows for a
degree of flexibility with respect to word order. We demonstrate that these
methods achieve state-of-the-art or competitive NER performance on commonly
tested languages under a cross-lingual setting, with much lower resource
requirements than past approaches. We also evaluate the challenges of applying
these methods to Uyghur, a low-resource language.
Comment: EMNLP 2018 long paper
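The lexical-mapping step can be pictured as a nearest-neighbour lookup in a
shared bilingual embedding space (the self-attention component addresses word
order separately). The sketch below is a toy illustration, not the paper's
implementation: it assumes the two embedding spaces have already been aligned
(e.g. by a method such as MUSE or VecMap), and all vectors and vocabularies
are invented.

    import numpy as np

    def nearest_translation(word, src_vecs, tgt_matrix, tgt_vocab):
        # Cosine nearest neighbour: normalize both sides, then the largest
        # dot product identifies the closest target-language word.
        v = src_vecs[word]
        v = v / np.linalg.norm(v)
        m = tgt_matrix / np.linalg.norm(tgt_matrix, axis=1, keepdims=True)
        return tgt_vocab[int(np.argmax(m @ v))]

    rng = np.random.default_rng(0)
    src_vecs = {"dog": rng.normal(size=8), "cat": rng.normal(size=8)}
    tgt_vocab = ["perro", "gato"]
    # Toy "aligned" target vectors: small perturbations of the source ones.
    tgt_matrix = np.stack([src_vecs["dog"], src_vecs["cat"]]) + 0.01
    print(nearest_translation("dog", src_vecs, tgt_matrix, tgt_vocab))  # perro
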
Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT
Pretrained contextual representation models (Peters et al., 2018; Devlin et
al., 2018) have pushed forward the state-of-the-art on many NLP tasks. A new
release of BERT (Devlin, 2018) includes a model simultaneously pretrained on
104 languages with impressive performance for zero-shot cross-lingual transfer
on a natural language inference task. This paper explores the broader
cross-lingual potential of mBERT (multilingual BERT) as a zero-shot language
transfer model on 5 NLP tasks covering a total of 39 languages from various
language families: NLI, document classification, NER, POS tagging, and
dependency parsing. We compare mBERT with the best published methods for
zero-shot cross-lingual transfer and find mBERT competitive on each task.
Additionally, we investigate the most effective strategy for utilizing mBERT in
this manner, determine to what extent mBERT generalizes away from language
specific features, and measure factors that influence cross-lingual transfer.
Comment: EMNLP 2019 Camera Ready
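The zero-shot recipe being evaluated is: fine-tune one multilingual checkpoint
on labeled data in a single source language, then apply it unchanged to other
languages. The snippet below is a minimal sketch using the Hugging Face
transformers library; the fine-tuning loop is elided, the tag set is a toy
one, and with an untrained classification head the printed tags are
meaningless until the model has actually been fine-tuned.

    import torch
    from transformers import AutoTokenizer, AutoModelForTokenClassification

    labels = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]  # toy NER tag set
    name = "bert-base-multilingual-cased"  # the 104-language BERT release
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForTokenClassification.from_pretrained(
        name, num_labels=len(labels))

    # ... fine-tune here on source-language (e.g. English) NER data ...

    # Zero-shot transfer: tag a sentence in a language unseen in training.
    enc = tokenizer("Pablo vive en Madrid", return_tensors="pt")
    with torch.no_grad():
        pred_ids = model(**enc).logits.argmax(-1)[0].tolist()
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    print(list(zip(tokens, (labels[i] for i in pred_ids))))
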
Enhanced Meta-Learning for Cross-lingual Named Entity Recognition with Minimal Resources
For languages with no annotated resources, transferring knowledge from
rich-resource languages is an effective solution for named entity recognition
(NER). While existing methods directly apply a model trained on the source
language to the target language, in this paper we propose to fine-tune the learned model
with a few similar examples given a test case, which could benefit the
prediction by leveraging the structural and semantic information conveyed in
such similar examples. To this end, we present a meta-learning algorithm to
find a good model-parameter initialization that can quickly adapt to the given
test case, and propose constructing multiple pseudo-NER tasks for meta-training
by computing sentence similarities. To further improve the model's
generalization ability across different languages, we introduce a masking
scheme and augment the loss function with an additional maximum term during
meta-training. We conduct extensive experiments on cross-lingual named entity
recognition with minimal resources over five target languages. The results show
that our approach significantly outperforms existing state-of-the-art methods
across the board.
Comment: Accepted by AAAI 2020. Code is available at
https://github.com/microsoft/vert-papers/tree/master/papers/Meta-Cros
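The test-time procedure described above amounts to a short inner-loop
fine-tune on retrieved neighbours before predicting, in the spirit of MAML.
The sketch below guesses at the shape of that loop and is not the released
code: meta_model, loss_fn, the support batches, and the hyperparameters are
all hypothetical placeholders.

    import copy
    import torch

    def adapt_then_predict(meta_model, loss_fn, support_batches, test_batch,
                           inner_lr=1e-5, inner_steps=3):
        # Adapt a copy, so the meta-learned initialization stays intact
        # for the next test case.
        model = copy.deepcopy(meta_model)
        opt = torch.optim.SGD(model.parameters(), lr=inner_lr)
        model.train()
        for _ in range(inner_steps):
            for batch in support_batches:  # sentences retrieved by similarity
                opt.zero_grad()
                loss_fn(model, batch).backward()  # token-level NER loss
                opt.step()
        model.eval()
        with torch.no_grad():
            return model(**test_batch)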