Progress of the PRINCIPLE project: promoting MT for Croatian, Icelandic, Irish and Norwegian
This paper updates the progress made on the PRINCIPLE project, a 2-year action funded by the European Commission under the Connecting Europe Facility (CEF) programme. PRINCIPLE focuses on collecting high-quality language resources for Croatian, Icelandic, Irish and Norwegian, which have been identified as low-resource languages, especially for building effective machine translation (MT) systems. We report initial achievements of the project and ongoing activities aimed at promoting the uptake of neural MT for the low-resource languages of the project.
Model Transfer for Tagging Low-resource Languages using a Bilingual Dictionary
Cross-lingual model transfer is a compelling and popular method for
predicting annotations in a low-resource language, whereby parallel corpora
provide a bridge to a high-resource language and its associated annotated
corpora. However, parallel data is not readily available for many languages,
limiting the applicability of these approaches. We address these drawbacks in
our framework which takes advantage of cross-lingual word embeddings trained
solely on a high coverage bilingual dictionary. We propose a novel neural
network model for joint training from both sources of data based on
cross-lingual word embeddings, and show substantial empirical improvements over
baseline techniques. We also propose several active learning heuristics, which
result in improvements over competitive benchmark methods. Comment: 5 pages with 2 pages reference. Accepted to appear in ACL 201