Language classification from bilingual word embedding graphs
We study the role of the second language in bilingual word embeddings in
monolingual semantic evaluation tasks. We find strongly and weakly positive
correlations between downstream task performance and second language
similarity to the target language. Additionally, we show how bilingual word
embeddings can be employed for the task of semantic language classification and
that joint semantic spaces vary in meaningful ways across second languages. Our
results support the hypothesis that semantic language similarity is influenced
by both structural similarity and geography/contact.
Comment: To be published at Coling 201
Embedding Web-based Statistical Translation Models in Cross-Language Information Retrieval
Although more and more language pairs are covered by machine translation
services, there are still many pairs that lack translation resources.
Cross-language information retrieval (CLIR) is an application which needs
translation functionality of a relatively low level of sophistication since
current models for information retrieval (IR) are still based on a
bag-of-words representation. The Web provides a vast resource for the automatic construction
of parallel corpora which can be used to train statistical translation models
automatically. The resulting translation models can be embedded in several ways
in a retrieval model. In this paper, we investigate the problem of
automatically mining parallel texts from the Web and different ways of
integrating the translation models within the retrieval process. Our
experiments on standard test collections for CLIR show that the Web-based
translation models can surpass commercial MT systems in CLIR tasks. These
results open the perspective of constructing a fully automatic query
translation device for CLIR at a very low cost.
Comment: 37 page
Cross-linguistic transfer in bilinguals reading in two alphabetic orthographies: The grain size accommodation hypothesis
Published online: 12 April 2017
Reading acquisition is one of the most complex and demanding learning processes faced by children in their first years of schooling. If reading acquisition is challenging in one language, what happens when reading is acquired simultaneously in two languages? What is the impact of bilingualism on the development of literacy? We review behavioral and neuroimaging evidence from alphabetic writing systems suggesting that early bilingualism modulates reading development. In particular, we show that cross-linguistic variations and cross-linguistic transfer affect bilingual reading strategies as well as their cognitive underpinnings. We stress that the impact of bilingualism on literacy acquisition depends on the specific combination of languages learned and does not manifest itself similarly across bilingual populations. We argue that these differences can be explained by variations, due to orthographic depth, in the grain sizes used to perform reading and reading-related tasks. Overall, we propose novel hypotheses to shed light on the behavioral and neural variability observed in reading skills among bilinguals.
This work was supported by the European Commission (BILITERACY-SH4, ERC-2011-ADG) and the Ministry of Economy and Competitiveness, Madrid, Spain (Grant Nos. PSI2015-65338-P to M.L., PSI2015-67353-R to M.C., and SEV-2015-0490 to the Basque Center on Brain and Language Cognition)