For languages with no annotated resources, unsupervised transfer of natural
language processing models such as named-entity recognition (NER) from
resource-rich languages would be an appealing capability. However, differences
in words and word order across languages make it a challenging problem. To
improve mapping of lexical items across languages, we propose a method that
finds translations based on bilingual word embeddings. To improve robustness to
word order differences, we propose to use self-attention, which allows for a
degree of flexibility with respect to word order. We demonstrate that these
methods achieve state-of-the-art or competitive NER performance on commonly
tested languages under a cross-lingual setting, with much lower resource
requirements than past approaches. We also evaluate the challenges of applying
these methods to Uyghur, a low-resource language.

Comment: EMNLP 2018 long paper
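The word-level translation step described above can be illustrated with a minimal nearest-neighbour lookup in a shared bilingual embedding space. This is a toy sketch, not the paper's implementation: the vocabularies, the `translate` helper, and the hand-set vectors below are all hypothetical stand-ins for trained bilingual embeddings.

```python
import numpy as np

# Hypothetical toy bilingual embeddings: source-language and target-language
# words projected into one shared space. In practice these would come from
# trained bilingual word embeddings; the 2-d vectors here are illustrative.
src_vocab = ["hund", "katze"]
tgt_vocab = ["dog", "cat", "house"]
src_emb = np.array([[0.9, 0.1], [0.1, 0.9]])
tgt_emb = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])

def translate(word):
    """Return the target word whose embedding has the highest cosine
    similarity to the source word's embedding (nearest neighbour)."""
    v = src_emb[src_vocab.index(word)]
    v = v / np.linalg.norm(v)
    # Normalize target embeddings row-wise so dot products are cosines.
    t = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    return tgt_vocab[int(np.argmax(t @ v))]
```

With the toy vectors above, `translate("hund")` returns `"dog"`: the source vector's nearest neighbour under cosine similarity is selected, so no parallel dictionary is needed at translation time.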