1,952 research outputs found

    Language classification from bilingual word embedding graphs

    We study the role of the second language in bilingual word embeddings in monolingual semantic evaluation tasks. We find strongly and weakly positive correlations between downstream task performance and second language similarity to the target language. Additionally, we show how bilingual word embeddings can be employed for the task of semantic language classification and that joint semantic spaces vary in meaningful ways across second languages. Our results support the hypothesis that semantic language similarity is influenced by both structural similarity and geography/contact.
    Comment: To be published at Coling 201

    Embedding Web-based Statistical Translation Models in Cross-Language Information Retrieval

    Although more and more language pairs are covered by machine translation services, there are still many pairs that lack translation resources. Cross-language information retrieval (CLIR) is an application which needs translation functionality of a relatively low level of sophistication since current models for information retrieval (IR) are still based on a bag-of-words. The Web provides a vast resource for the automatic construction of parallel corpora which can be used to train statistical translation models automatically. The resulting translation models can be embedded in several ways in a retrieval model. In this paper, we will investigate the problem of automatically mining parallel texts from the Web and different ways of integrating the translation models within the retrieval process. Our experiments on standard test collections for CLIR show that the Web-based translation models can surpass commercial MT systems in CLIR tasks. These results open the perspective of constructing a fully automatic query translation device for CLIR at a very low cost.
    Comment: 37 page

    Cross-linguistic transfer in bilinguals reading in two alphabetic orthographies: The grain size accommodation hypothesis

    Published online: 12 April 2017
    Reading acquisition is one of the most complex and demanding learning processes faced by children in their first years of schooling. If reading acquisition is challenging in one language, how is it when reading is acquired simultaneously in two languages? What is the impact of bilingualism on the development of literacy? We review behavioral and neuroimaging evidence from alphabetic writing systems suggesting that early bilingualism modulates reading development. Particularly, we show that cross-linguistic variations and cross-linguistic transfer affect bilingual reading strategies as well as their cognitive underpinnings. We stress the fact that the impact of bilingualism on literacy acquisition depends on the specific combination of languages learned and does not manifest itself similarly across bilingual populations. We argue that these differences can be explained by variations due to orthographic depth in the grain sizes used to perform reading and reading-related tasks. Overall, we propose novel hypotheses to shed light on the behavioral and neural variability observed in reading skills among bilinguals.
    This work was supported by the European Commission (BILITERACY- SH4, ERC-2011-ADG) and the Ministry of Economy and Competitiveness, Madrid, Spain (Grant Nos. PSI20153653383P to M.L., PSI20153673533R to M.C., and SEV3201530490 to the Basque Center on Brain and Language Cognition).