40 research outputs found

    On writing syllabaries: Three episodes of transfer

    Published or submitted for publication; peer reviewed.

    The myth of normal reading

    We argue that the educational and psychological sciences must embrace the diversity of reading rather than chase the phantom of normal reading behavior. We critically discuss the research practice of asking participants in experiments to read “normally”. We then draw attention to the large cross-cultural and linguistic diversity around the world and consider the enormous diversity of reading situations and goals. Finally, we observe that people bring a huge diversity of brains and experiences to the reading task. This leads to certain implications. First, there are important lessons for how to conduct psycholinguistic experiments. Second, we need to move beyond Anglo-centric reading research and produce models of reading that reflect the large cross-cultural diversity of languages and types of writing systems. Third, we must acknowledge that there are multiple ways of reading and reasons for reading, and none of them is normal or better or a “gold standard”. Finally, we must stop stigmatizing individuals who read differently and for different reasons, and there should be increased focus on teaching the ability to extract information relevant to the person’s goals. What is important is not how well people decode written language and how fast people read but what people comprehend given their own stated goals.

    The Development of Writing and Preliterate Societies

    Thesis advisor: Michael J. Connolly. This paper explores the question of script choice for a preliterate society deciding to write their language down for the first time through an exposition on types of writing systems and a brief history of a few writing systems throughout the world. Societies sometimes invented new scripts, sometimes adapted existing ones, and other times used a combination of both these techniques. Based on the covered scripts ranging from Mesopotamia to Asia to Europe to the Americas, I identify factors that influence the script decision including neighboring scripts, access to technology, and the circumstances of their introduction to writing. Much of the world uses the Roman alphabet, and I present the argument that almost all preliterate societies beginning to write will choose to use a version of the Roman alphabet. However, the alphabet does not fit all languages equally well, and the paper closes out with an investigation into some of these inadequacies and how languages might resolve these issues. Thesis (BA), Boston College, 2015. Submitted to: Boston College. College of Arts and Sciences. Discipline: Departmental Honors. Discipline: Slavic and Eastern Languages and Literatures.

    Investigating Multilingual, Multi-script Support in Lucene/Solr Library Applications

    Yale has developed over many years a highly structured, high-quality multilingual catalog of bibliographic data. Almost 50% of the collection represents non-English materials in over 650 languages, and includes many different non-Roman scripts. Faculty, students, researchers, and staff would like to make full use of this original script content for resource discovery. While the underlying textual data are in place, effective indexing, retrieval and display functionality for the non-Roman script content is not available within our bibliographic discovery applications, Orbis and Yufind. Opportunities now exist in the Unicode, Lucene/Solr computing environment to bridge the functionality gap and achieve internationalization of the Yale Library catalog. While most parts of this study focus on the Yale environment, in the absence of other such studies it is hoped that the findings will be of interest to a much larger community. Arcadia Foundation

    A Large Multi-Target Dataset of Common Bengali Handwritten Graphemes

    Latin has historically led the state of the art in handwritten optical character recognition (OCR) research. Adapting existing systems from Latin to alpha-syllabary languages is particularly challenging due to a sharp contrast between their orthographies. The segmentation of graphical constituents corresponding to characters becomes significantly harder due to a cursive writing system and frequent use of diacritics in the alpha-syllabary family of languages. We propose a labeling scheme based on graphemes (linguistic segments of word formation) that makes segmentation inside alpha-syllabary words linear and present the first dataset of Bengali handwritten graphemes that are commonly used in an everyday context. The dataset contains 411k curated samples of 1295 unique commonly used Bengali graphemes. Additionally, the test set contains 900 uncommon Bengali graphemes for out-of-dictionary performance evaluation. The dataset is open-sourced as a part of a public Handwritten Grapheme Classification Challenge on Kaggle to benchmark vision algorithms for multi-target grapheme classification. The unique graphemes present in this dataset are selected based on commonality in the Google Bengali ASR corpus. From competition proceedings, we see that deep-learning methods can generalize to a large span of out-of-dictionary graphemes which are absent during training. Dataset and starter codes at www.kaggle.com/c/bengaliai-cv19. Comment: 15 pages, 12 figures, 6 tables, submitted to CVPR-2

    English Word and Pseudoword Spellings and Phonological Awareness: Detailed Comparisons From Three L1 Writing Systems

    Spelling is a fundamental literacy skill facilitating word recognition and thus higher level reading abilities via its support for efficient text processing (Adams, 1990; Joshi et al., 2008; Perfetti and Stafura, 2014). However, relatively little work examines second language (L2) spelling in adults, and even less work examines learners from different first language (L1) writing systems. This is despite the fact that the influence of L1 writing system on L2 literacy skills is well documented (Hudson, 2007; Koda and Zehler, 2008; Grabe, 2009). To address this shortcoming, this study collected data on real word spelling, pseudoword spelling, and phonological awareness (elision) abilities from 70 participants (23 native speakers; 47 ELLs with alphabetic, abjad, and morphosyllabic L1s). Analyses compared performance on real word and pseudoword spelling between L1 English speakers and ELLs, and additionally among the non-native speaker L1 groups (categorized into alphabet, abjad, and morphosyllabary groups). Similar comparisons were made across groups for performance on phonological awareness. Further, correlations were calculated between phonological awareness and real word spelling and between phonological awareness and pseudoword spelling, separately for L1 English speakers and the various ESL groups. Spelling accuracy on real words and pseudowords as well as phonological awareness skill differed between native speakers and ESL speakers, and also varied by the ESL speakers’ L1 writing system. Theoretically interesting patterns emerged in the spelling data. For example, the morphosyllabic L1 speakers had strong real word spelling (better than the other ESL groups) but greatly decreased pseudoword accuracy (a drop of 59% in accuracy). Although alphabetic L1 speakers had low spelling accuracy in terms of strict scoring, they had lower rates of errors per item, highlighting the importance of scoring approach for shaping the conclusions that are drawn. 
    Error rates also revealed vowels to be more problematic than consonants, particularly in abjad L1 speakers. The results demonstrate that L2 spelling abilities, phonological awareness, and the relationships among them vary by L1 writing system, and that differing approaches to scoring and analysis may lead to varying conclusions.