1,335 research outputs found

    Mostly-Unsupervised Statistical Segmentation of Japanese Kanji Sequences

    Given the lack of word delimiters in written Japanese, word segmentation is generally considered a crucial first step in processing Japanese texts. Typical Japanese segmentation algorithms rely either on a lexicon and syntactic analysis or on pre-segmented data; but these are labor-intensive, and the lexico-syntactic techniques are vulnerable to the unknown word problem. In contrast, we introduce a novel, more robust statistical method utilizing unsegmented training data. Despite its simplicity, the algorithm yields performance on long kanji sequences comparable to and sometimes surpassing that of state-of-the-art morphological analyzers over a variety of error metrics. The algorithm also outperforms another mostly-unsupervised statistical algorithm previously proposed for Chinese. Additionally, we present a two-level annotation scheme for Japanese to incorporate multiple segmentation granularities, and introduce two novel evaluation metrics, both based on the notion of a compatible bracket, that can account for multiple granularities simultaneously. Comment: 22 pages. To appear in Natural Language Engineering.

    THE EFFECTS OF ONLINE KATAKANA WORD RECOGNITION TRAINING AMONG NOVICE LEARNERS OF JAPANESE AS A FOREIGN LANGUAGE

    Because word recognition processes differ depending on orthographic systems, second language learners with different orthographic backgrounds need to acquire new word recognition strategies suited to the orthography of their second language. Japanese is a multi-script language, and one of its scripts, katakana, is mainly used to transcribe Western loanwords. Due to the sound alternations resulting from the process of borrowing, learners of Japanese often experience difficulties in reading and writing katakana loanwords. This study therefore investigates the effectiveness of online katakana word recognition training among novice learners of Japanese. Thirty-one students from a first-semester Japanese course at a large research university in the Midwest were randomly divided into three groups and assigned different four-week online training programs, completed outside of class and designed to establish sound-letter correspondences for katakana. The first experimental group (Scrambler Group) put randomly scrambled letters in the right order to form a target katakana loanword while listening to the vocalized word, while the second experimental group (Reading Group) practiced the same set of words solely by enunciating them and listening to the model reading. The participants took a pre-test and a post-test so that any improvement resulting from the training could be observed. The test was composed of two tasks: naming katakana words and providing their English meanings. The number of correct answers was counted, and the response time a participant needed to process each word was measured. The test included both words practiced in the training and unpracticed words, in order to test whether the training effects transferred to the processing of unpracticed words.

    Cortical Responses to Familiar and Novel Orthographic Systems

    The Visual Word Form Area (VWFA) is a portion of the occipitotemporal cortex that has been shown to respond specifically to visually presented words, which has led to its being implicated as a significant region in the process of reading. The VWFA seems to display a great deal of plasticity: the ability to read has been proposed to rest on a functional reorganization of this area as a person learns to read and becomes attuned to language-specific word-formation regularities. How familiarity with an orthographic system modulates the N170 ERP response originating in the VWFA is still largely uncertain. Previous research by Maurer et al. (2008) demonstrated a left-lateralization for familiar orthographies that is absent for novel orthographies, which tend to show either no lateralization in this response or a slight right-lateralization. Building on Maurer et al. (2008), we conducted a study that followed their approach but adjusted their methodology and stimuli in several ways. First, a single experiment was designed that included all three language conditions of interest: English, Japanese Hiragana, and a non-linguistic symbol set. Second, trials were randomized across all three conditions rather than presented in block format. This allowed the language conditions to be compared directly within the same experimental context and with the same participants. In addition, our study included tighter controls for word length, bigram frequency, character size and spacing to further ensure the validity of our data. Our results confirm the left-lateralization observed for familiar language conditions, but also demonstrate an amplitude modulation of the N170 response by familiarity, in which novel orthographies elicit a more negative response than familiar orthographies in the N170 time window. This pattern was reversed in subsequent time windows as lexical processes were engaged, prompting a much more negative response for familiar orthographic conditions than for novel ones. This indicates that the amplitude of the N170 response is directly affected by experience with orthographic systems.

    The time course of brain activity in reading identical cognates: An ERP study of Chinese - Japanese bilinguals

    Previous studies suggest that bilinguals' lexical access is language non-selective, especially for orthographically identical translation equivalents across languages (i.e., identical cognates). The present study investigated how such words (e.g., meaning "school" in both Chinese and Japanese) are processed in the (late) Chinese-Japanese bilingual brain. Using an L2-Japanese lexical decision task, both behavioral and electrophysiological data were collected. Reaction times (RTs), as well as the N400 component, showed that cognates are more easily recognized than non-cognates. Additionally, an early component (i.e., the N250), potentially reflecting activation at the word-form level, was also found. Cognates elicited a more positive N250 than non-cognates in the frontal region, indicating that the cognate facilitation effect occurred at an early stage of word formation for languages with logographic scripts.

    Mora or more? The phonological unit of Japanese word production in the Stroop color naming task

    In English, Dutch, and other European languages, it is well established that the fundamental phonological unit in word production is the phoneme; in contrast, recent studies have shown that in Chinese it is the (atonal) syllable and in Japanese the mora. The present study investigated whether this cross-language variation in the size of the unit of word production is due to the type of script used in the language (i.e., alphabetic, morphosyllabic, or moraic). Capitalizing on the multiscriptal nature of Japanese, and using the Stroop color naming task, we show that overlap in the initial mora between the color name and the written distractor facilitates color naming independent of script type. These results confirm the mora as the phonological unit of word production in Japanese, and establish the Stroop color naming task as a useful tool for investigating the fundamental (or "proximate") phonological unit used in speech production.

    (Dis)connections between specific language impairment and dyslexia in Chinese

    Poster Session: no. 26P.40. Specific language impairment (SLI) and dyslexia describe language-learning impairments that occur in the absence of a sensory, cognitive, or psychosocial impairment. SLI is primarily defined by an impairment in oral language, and dyslexia by a deficit in the reading of written words. SLI and dyslexia co-occur in school-age children learning English, with rates ranging from 17% to 75%. For children learning Chinese, SLI and dyslexia also co-occur. Wong et al. (2010) first reported on the presence of dyslexia in a clinical sample of 6- to 11-year-old school-age children with SLI. The study compared the reading-related cognitive skills of children with SLI and dyslexia (SLI-D) with 2 groups of children …

    Uncovering the myth of learning to read Chinese characters: phonetic, semantic, and orthographic strategies used by Chinese as foreign language learners

    Oral Session - 6A: Lexical modeling: no. 6A.3. Chinese is considered to be one of the most challenging orthographies for non-native speakers to learn, the character in particular. The Chinese character is the basic reading unit, bringing together sound, form and meaning. The predominant type of Chinese character is the semantic-phonetic compound, which is composed of phonetic and semantic radicals that give clues to the sound and the meaning, respectively. Over the last two decades, psycholinguistic research has made significant progress in specifying the roles of phonetic and semantic radicals in character processing among native Chinese speakers …

    Word processing in languages using non-alphabetic scripts: The cases of Japanese and Chinese

    This thesis investigates the processing of words written in Japanese kanji and Chinese hànzì, i.e. logographic scripts. Special attention is given to the fact that the majority of Japanese kanji have multiple pronunciations (generally depending on the combination a kanji forms with other characters). First, using masked priming, it is established that multiple pronunciations are activated when a Japanese kanji is presented. In subsequent experiments using word naming with context pictures, it is concluded that both Chinese hànzì and Japanese kanji are read aloud via a direct route from orthography to phonology. However, only Japanese kanji become susceptible to semantic or phonological context effects, as a result of the cost of processing multiple pronunciations. Finally, zooming in on the size of the articulatory planning unit in Japanese, it is concluded that the mora, rather than the phoneme or the syllable, is the phonological unit that best fits the observed data pattern.

    Second language acquisition of Japanese orthography
