
Cross-Lingual Word Embeddings for Morphologically Rich Languages

Authors: Gosse Bouma, Gerardus van Noord, Ahmet Üstün
Publication date: 02/09/2019

Cross-lingual word embedding models learn a shared vector space for two or more languages so that words with similar meaning are represented by similar vectors regardless of their language. Although the existing models achieve high performance on pairs of morphologically simple languages, they perform very poorly on morphologically rich languages such as Turkish and Finnish. In this paper, we propose a morpheme-based model in order to increase the performance of cross-lingual word embeddings on morphologically rich languages. Our model includes a simple extension which enables us to exploit morphemes for cross-lingual mapping. We applied our model to the Turkish-Finnish language pair on the bilingual word translation task. Results show that our model outperforms the baseline models by 2% in the nearest neighbour ranking.
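
The bilingual word translation task mentioned in the abstract can be illustrated with a minimal sketch of nearest-neighbour retrieval in a shared vector space. This is not the paper's morpheme-based model: the vectors below are hand-made toy values (not real embeddings) that are simply assumed to be already mapped into one space, and the names `cosine` and `translate` are hypothetical helpers introduced for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def translate(word, source_vectors, target_vectors, k=1):
    """Return the k target-language words whose vectors are closest to `word`."""
    query = source_vectors[word]
    ranked = sorted(
        target_vectors,
        key=lambda t: cosine(query, target_vectors[t]),
        reverse=True,
    )
    return ranked[:k]


# Toy vectors, assumed to live in one shared cross-lingual space.
turkish = {"ev": [0.9, 0.1, 0.0], "su": [0.0, 0.2, 0.9]}
finnish = {
    "talo": [0.8, 0.2, 0.1],
    "vesi": [0.1, 0.1, 0.95],
    "kirja": [0.1, 0.9, 0.1],
}

print(translate("ev", turkish, finnish))  # nearest Finnish neighbour of "ev"
print(translate("su", turkish, finnish))  # nearest Finnish neighbour of "su"
```

In an evaluation of this kind, a translation would typically count as correct when the reference word appears among the top-ranked neighbours; the exact ranking metric used in the paper is not stated in this record.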

Crossref

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen
