
    Chinese–Spanish neural machine translation enhanced with character and word bitmap fonts

    Recently, machine translation systems based on neural networks have reached state-of-the-art results for some language pairs (e.g., German–English). In this paper, we investigate the performance of neural machine translation for Chinese–Spanish, which is a challenging language pair. Given that the meaning of a Chinese word can be related to its graphical representation, this work aims to enhance neural machine translation by using as input a combination of words or characters and their corresponding bitmap fonts. Interpreting every word or character as a bitmap font generates more informed vector representations. The best results are obtained when using words plus their bitmap fonts, with an improvement over a competitive neural MT baseline system of almost six BLEU points and five METEOR points, and the system is ranked coherently better in the human evaluation. Peer reviewed. Postprint (published version).

    Second-generation adolescents’ competencies and the role of integration policies

    Immigration into the OECD countries has increased sharply since the middle of the 1980s, even if not at a constant rate. Integration policies are a fundamental tool to help the newly arrived integrate and assimilate with the native population. While the literature on immigrants’ level of integration is very rich for settlement countries (USA, Canada, Australia and New Zealand) and for the few European countries that have a long tradition of immigration (Germany, UK, France), very little is yet known about other European economies that have only recently become destination countries. Indeed, the limited availability of data has made it difficult to carry out comparative analyses of the integration process of immigrants in most EU countries, particularly for the second generation. This research aims to fill this gap by analysing the role of socio-economic background in the educational outcomes of immigrants. Furthermore, we demonstrate how the effect of socio-economic background is more or less pronounced in different EU countries that adopt different integration policies and have different education systems. In this work, we concentrate on second-generation adolescents and compare their performance with that of native adolescents and that of first-generation adolescents. The chosen indicator is the reading score obtained in the 2012 PISA test by each student (native, first-generation and second-generation immigrant). We compare the results obtained for each of the EU15 member states and for the settlement countries. The results, in line with the prevalent literature, show a strong impact of socio-economic background on immigrant adolescents’ performance. The effect is weaker in those countries where integration policies target disadvantaged children from an early age.

    IMAGINE Final Report


    Multimedia search without visual analysis: the value of linguistic and contextual information

    This paper addresses the focus of this special issue by analyzing the potential contribution of linguistic content and other non-image aspects to the processing of audiovisual data. It summarizes the various ways in which linguistic content analysis contributes to enhancing the semantic annotation of multimedia content and, as a consequence, to improving the effectiveness of conceptual media access tools. A number of techniques are presented, including the time-alignment of textual resources, audio and speech processing, content reduction and reasoning tools, and the exploitation of surface features.

    Language contact and language decay. Socio-political and linguistic perspectives

    The present linguistic situation in Malta is a reflection of historical and political permutations of the past. The simultaneous presence of two languages in Malta, generally described as a bilingual situation but in fact including a number of features that can be defined more appropriately through diglossia, gives rise to a context in which language contact is extremely frequent: this occurs through both inter- and intrasentential code-switching as well as through the constant integration of foreign terms, mainly from Italian and English, into Maltese. Language policies in Malta are frequently caught in the midst of these dynamic diachronic and synchronic linguistic processes and often operate on two fronts: on the one hand, internal changes inherent to the Maltese language must be taken into consideration; on the other hand, language use, characterized by the presence of both English and Maltese, must also be accounted for. Peer-reviewed.

    Supporting asylum seeker and refugee children within the education system in England


    Visual world studies of conversational perspective taking: similar findings, diverging interpretations

    Visual-world eyetracking greatly expanded the potential for insight into how listeners access and use common ground during situated language comprehension. Researchers are asking questions about perspective taking of an increasingly nuanced and sophisticated nature, a clear indicator of progress. But this research has the potential not only to improve our understanding of conversational perspective taking: grappling with problems of data interpretation in such a complex domain also has the unique potential to drive visual world researchers to a deeper understanding of how best to map visual world data onto psycholinguistic theory. Past reviews of visual world studies on perspective taking have largely taken the diverging findings of the various studies at face value, and attributed these apparently different findings to differences in the extent to which the paradigms used by different labs afford collaborative interaction. I will argue against this interactional affordances explanation, on two counts. First, it implies that interactivity affects the overall ability to form common ground, and thus provides no straightforward explanation of why, within a single noninteractive study, common ground can have very large effects on some aspects of processing (referential anticipation) while having negligible effects on others (lexical processing). Second, and more importantly, the explanation accepts the divergence in published findings at face value. However, a closer look at several key studies shows that the divergences are more likely to reflect inconsistent practices of analysis and interpretation that have been applied to an underlying body of data that is, in fact, surprisingly consistent. The diverging interpretations, I will argue, are the result of differences in the handling of anticipatory baseline effects (ABEs) in the analysis of visual world data.
    ABEs arise in perspective-taking studies because listeners have earlier access to constraining information about who knows what than they have to referential speech, and thus can already show biases in visual attention even before the processing of any referential speech has begun. To be sure, these ABEs clearly indicate early access to common ground; however, access does not imply integration, since it is possible that this information is not used later to modulate the processing of incoming speech. Failing to account for these biases using statistical or experimental controls leads to over-optimistic assessments of listeners’ ability to integrate this information with incoming speech. I will show that several key studies with varying degrees of interactional affordances all show similar temporal profiles of common ground use during the interpretive process: early anticipatory effects, followed by bottom-up effects of lexical processing that are not modulated by common ground, followed (optionally) by further late effects that are likely to be post-lexical. Furthermore, this temporal profile for common ground radically differs from the profile of contextual effects related to verb semantics. Together, these findings are consistent with the proposal that lexical processes are encapsulated from common ground, but cannot be straightforwardly accounted for by probabilistic constraint-based approaches.