15,184 research outputs found

    Co-activation in the bilingual lexicon: Evidence from Chinese-English bilinguals

    Investigation of the bilingual mental lexicon suggests that one of its defining characteristics is integration. Words across both languages are subject to parallel co-activation during language processing. An auditory stimulus typing task was used to assess connectivity on the basis of both morphology and phonology. English loanwords in Chinese and transparent English noun-noun compounds with Chinese translation equivalents with corresponding compound structure (corresponding compounds) were used as the critical stimuli. Accent was also manipulated to determine whether or not phonological cues may influence the degree of cross-linguistic co-activation. Results suggest cross-linguistic co-activation on the basis of phonological overlap in different-script bilinguals but only weakly support morphological integration in Chinese-English bilinguals. Accent led to greater co-activation of phonologically similar loanword pairs. Results are discussed in terms of inhibitory control, language acquisition, and the structure of the bilingual lexicon.

    Automatic Discovery of Non-Compositional Compounds in Parallel Data

    Automatic segmentation of text into minimal content-bearing units is an unsolved problem even for languages like English. Spaces between words offer an easy first approximation, but this approximation is not good enough for machine translation (MT), where many word sequences are not translated word-for-word. This paper presents an efficient automatic method for discovering sequences of words that are translated as a unit. The method proceeds by comparing pairs of statistical translation models induced from parallel texts in two languages. It can discover hundreds of non-compositional compounds on each iteration, and constructs longer compounds out of shorter ones. Objective evaluation on a simple machine translation task has shown the method's potential to improve the quality of MT output. The method makes few assumptions about the data, so it can be applied to parallel data other than parallel texts, such as word spellings and pronunciations. Comment: 12 pages; uses natbib.sty, here.st

    Chengyu in Chinese Language Teaching: A preliminary analysis of Italian learners’ data

    Chengyu, also known as Chinese four-character idioms, are a type of traditional Chinese idiom, mostly consisting of four characters. They commonly derive from classic Chinese literary sources, including those of the three great philosophical and religious traditions that influenced the entire East Asian cultural sphere: Confucianism, Daoism and Buddhism. Chengyu, therefore, possess a wide range of cultural references, and, from Chinese, spread to the languages of the other countries of the sinosphere, such as Japan and Korea. Although many scholars have emphasized the importance of the acquisition of chengyu, not much attention has been paid to chengyu learning in Chinese Language Teaching research so far. As a preliminary attempt to address this gap, this paper reports the results of two small-scale, exploratory experiments, aimed at investigating Italian learners’ general knowledge of chengyu and their main interpretation strategies, as well as comparing the effectiveness of direct and indirect instruction in chengyu teaching. The experiments involved participants from Bachelor's and Master's programs of Roma Tre University. The results show a predominant effect of negative transfer from Italian, as well as a better performance of the participants who received indirect instruction.

    UmobiTalk: Ubiquitous Mobile Speech Based Learning Language Translator for Sesotho Language

    Published thesis. The need to conserve under-resourced languages is becoming more urgent as some of them are becoming extinct; natural language processing can be used to redress this. Currently, most initiatives around language processing technologies focus on western languages such as English and French, yet resources for such languages are already available. Sesotho is one of the under-resourced Bantu languages; it is mostly spoken in the Free State province of South Africa and in Lesotho. Like other parts of South Africa, the Free State has experienced a high number of migrants and non-Sesotho speakers from neighboring provinces and countries; such people face serious language barriers, especially in the informal settlements, where everyone tends to speak only Sesotho. Non-Sesotho speakers here refers to groups such as Xhosas, Zulus, Coloureds, Whites and others, for whom Sesotho is not a native language. As a solution to this, we developed a parallel corpus that has English as the source and Sesotho as the target language and packaged it in UmobiTalk, a ubiquitous mobile speech-based learning translator. UmobiTalk is a mobile-based tool for learning Sesotho for English speakers. The development of this tool was based on the combination of automatic speech recognition, machine translation and speech synthesis.

    Neurocognitive Informatics Manifesto.

    Informatics studies all aspects of the structure of natural and artificial information systems. Theoretical and abstract approaches to information have made great advances, but human information processing is still unmatched in many areas, including information management, representation and understanding. Neurocognitive informatics is a new, emerging field that should help to improve the matching of artificial and natural systems, and inspire better computational algorithms to solve problems that are still beyond the reach of machines. In this position paper, examples of neurocognitive inspirations and promising directions in this area are given.

    PROCESSING OF COMPOUND WORDS BY ADULT KOREAN-ENGLISH BILINGUALS

    The purpose of this dissertation study is to investigate how Korean-English bilinguals process compound words in both English and Korean. The major research question is: when Korean-English bilinguals process Korean or English compound words, what information is used to segment compound words into their constituents and, in particular, does morphological information play a role independent of form and semantic information? Four masked priming experiments were conducted with adult Korean-English bilinguals. Compound words (e.g., bedroom, deadline) and monomorphemic words with a compound-like structure (e.g., hammock) served as targets and were preceded by brief masked primes corresponding to a constituent of the target stimulus (e.g., bed, room, dead, and mock). In Experiments 1 and 2, within-language prime-target pairs (Korean-Korean for Experiment 1 and English-English for Experiment 2) co-varying morphological decomposability, semantic relatedness, and form relatedness were presented. In Experiments 3 and 4, cross-language prime-target pairs (Korean-English for Experiment 3 and English-Korean for Experiment 4) varying morphological decomposability, semantic relatedness, and phonological form relatedness were presented. In Experiment 1, results showed that morphological information plays a role independent of the form information when Korean-English bilinguals decompose compound words into their individual constituent morphemes in their L1 (Korean). In Experiment 2, however, there was no significant priming effect in any condition, indicating that morphological decomposition is not relied upon in their L2 (English) processing. In Experiment 3, morphological information plays a role independent of the semantic factor in the early stage of cross-language activation, at the prime duration of 36 ms. However, morphological decomposition is constrained by semantic transparency in the later stage of cross-language activation, at the prime durations of 48 ms and 100 ms. There was no significant priming effect at the two short prime durations (both 36 ms and 48 ms). However, there was a marginally significant priming effect in the +M+S-P condition at the longest prime duration (100 ms) in Experiment 3. Based on the pattern of these results, it seems that at the earlier stage of processing, phonological relatedness was important for morphological processing. In Experiment 4, there were no significant priming effects in any condition at any of the prime durations. These findings together point to a clear asymmetry in masked cross-language priming between the L1-L2 and L2-L1 directions.

    Getting Past the Language Gap: Innovations in Machine Translation

    In this chapter, we review state-of-the-art machine translation systems and discuss innovative methods for machine translation, highlighting the most promising techniques and applications. Machine translation (MT) has benefited from a revitalization in the last 10 years or so, after a period of relatively slow activity. In 2005 the field received a jumpstart when a powerful complete experimental package for building MT systems from scratch became freely available as a result of the unified efforts of the MOSES international consortium. Around the same time, hierarchical methods had been introduced by Chinese researchers, which allowed the introduction and use of syntactic information in translation modeling. Furthermore, the advances in the related field of computational linguistics, making off-the-shelf taggers and parsers readily available, helped give MT an additional boost. Yet there is still more progress to be made. For example, MT will be enhanced greatly when both syntax and semantics are on board: this still presents a major challenge, though many advanced research groups are currently pursuing ways to meet this challenge head-on. The next generation of MT will consist of a collection of hybrid systems. It also augurs well for the mobile environment, as we look forward to more advanced and improved technologies that enable the working of Speech-To-Speech machine translation on hand-held devices, i.e. speech recognition and speech synthesis. We review all of these developments and point out in the final section some of the most promising research avenues for the future of MT.

    English as the Language of Trade, Finance, and Technology in APEC: an East Asia Perspective

    The use of the English language for cross-border communications is important in many areas of trade ranging from tourism to trade in financial services. English will increase the capacity of people to communicate and exchange ideas and goods across borders. However, the increasing involvement in trade, tourism, and international relations among APEC member countries where English is not spoken as the first language poses some problems and barriers in achieving aspired regional cooperation. Efforts have been made by governments to encourage the internalization of English as a second language. This article documents ongoing efforts to adopt English as the official language of trade, finance, and technology in APEC member countries and to improve English fluency in selected East Asian countries. It is an interesting case study on the adoption of a common technology (i.e., English as the medium of communication) as an explicit policy to enhance both global integration and country competitiveness. Keywords: capacity building, English language, language barrier, language skill, language education, English economy, trade language