601 research outputs found

A Unified Multilingual Handwriting Recognition System using multigrams sub-lexical units

Author: Paquet Thierry
Soullard Yann
Swaileh Wassim
Publication venue: 'Elsevier BV'
Publication date: 28/08/2018

We address the design of a unified multilingual system for handwriting recognition. Most multilingual systems rest on specialized models, each trained on a single language, one of which is selected at test time. While some recognition systems are based on a unified optical model, dealing with a unified language model remains a major issue, as traditional language models are generally trained on corpora composed of large word lexicons per language. Here, we bring a solution by considering language models based on sub-lexical units, called multigrams. Dealing with multigrams strongly reduces the lexicon size and thus decreases the language model complexity. This makes possible the design of an end-to-end unified multilingual recognition system where both a single optical model and a single language model are trained on all the languages. We discuss the impact of the language unification on each model and show that our system reaches the performance of state-of-the-art methods with a strong reduction in complexity. Comment: preprint.

arXiv.org e-Print Archive

HAL - Normandie Université

The abc of the b and c in Spanish: inconsistent and context dependent letter errors and the development of orthographic knowledge in primary school children

Author: Acha Morcillo Joana
Rodríguez Nuria
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/03/2022

This study presents the results of a cross-sectional reading and spelling assessment conducted among 118 Spanish children in 3rd, 4th and 5th grade. The first aim was to explore whether children's use of orthographic knowledge was modulated by lexical variables (word frequency and orthographic neighborhood) or sublexical variables (context-dependent, inconsistent or neutral letters), as well as the developmental pathway of such knowledge in both tasks. The second aim was to provide insight into the types of errors committed by children in order to detect the words and structures that pose the most difficulty. Data showed that children rely on sublexical processes more than on lexical ones in reading and writing. Persistent errors in context-dependent and inconsistent letters were evident even in 5th grade, and writing involved greater difficulty in all grades. The presence of other types of errors such as substitutions, omissions or lexicalizations was negligible. Finally, an item analysis revealed that errors were located in low-frequency syllables, particularly in the first position. Data point to specific and persistent difficulties in context-dependent and inconsistent letters that may hinder the consolidation of accurate orthographic word representations in Spanish. The authors acknowledge the invaluable help of all the children and families who willingly took part in this study. We confirm that the study has been conducted under the guidelines of the ethical committee of the University of the Basque Country UPV/EHU, project approval reference M10_2017_158, and has been partially supported by Grant Number PSI2017-86210-P.

Archivo Digital para la Docencia y la Investigación

How does literacy affect speech processing? Not by enhancing cortical responses to speech, but by promoting connectivity of acoustic-phonetic and graphomotor cortices

Author: Guleria A.
Hervais-Adelman A.
Huettig F.
Kumar U.
Mishra R.
Singh J.
Tripathi V.
Publication venue: 'Society for Neuroscience'
Publication date: 01/01/2022

Previous research suggests that literacy, specifically learning alphabetic letter-to-phoneme mappings, modifies online speech processing, and enhances brain responses, as indexed by the blood-oxygenation level dependent signal (BOLD), to speech in auditory areas associated with phonological processing (Dehaene et al., 2010). However, alphabets are not the only orthographic systems in use in the world, and hundreds of millions of individuals speak languages that are not written using alphabets. In order to make claims that literacy per se has broad and general consequences for brain responses to speech, one must seek confirmatory evidence from non-alphabetic literacy. To this end, we conducted a longitudinal fMRI study in India probing the effect of literacy in Devanagari, an abugida, on functional connectivity and cerebral responses to speech in 91 variously literate Hindi-speaking male and female human participants. Twenty-two completely illiterate participants underwent six months of reading and writing training. Devanagari literacy increases functional connectivity between acoustic-phonetic and graphomotor brain areas, but we find no evidence that literacy changes brain responses to speech, either in cross-sectional or longitudinal analyses. These findings show that a dramatic reconfiguration of the neurofunctional substrates of online speech processing may not be a universal result of learning to read, and suggest that the influence of writing on speech processing should also be investigated.

MPG.PuRe

Drawing, Handwriting Processing Analysis: New Advances and Challenges

Author: Anquetil Eric
Prevost Lionel
Rémi Céline
Publication venue: HAL CCSD
Publication date: 21/06/2015

International audience. Drawing and handwriting are communicational skills that are fundamental in geopolitical, ideological and technological evolutions of all time. Drawing and handwriting are still useful in defining innovative applications in numerous fields. In this regard, researchers have to solve new problems, such as how drawing and handwriting can become an efficient way to command various connected objects, or how to validate graphomotor skills as evident and objective sources of data useful in the study of human beings, their capabilities and their limits from birth to decline.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

English speakers' common orthographic errors in Arabic as L2 writing system : an analytical case study

Author: Hisham Saleh A
Publication venue: Newcastle University
Publication date: 01/01/2015

PhD Thesis. Research involving the Arabic Writing System (WS) is quite limited, and research on writing errors in L2WS Arabic against a particular L1WS seems to be relatively neglected. This study attempts to identify, describe, and explain common orthographic errors in Arabic writing amongst English-speaking learners. First, it outlines the Arabic Writing System’s (AWS) characteristics and available empirical studies of L2WS Arabic. This study embraced the Error Analysis approach, utilising a mixed-method design that deployed quantitative and qualitative tools (writing tests, questionnaire, and interview). The data were collected from several institutions around the UK, which collectively accounted for 82 questionnaire responses, 120 different writing samples from 44 intermediate learners, and six teacher interviews. The hypotheses for this research were: a) English-speaking learners of Arabic make common orthographic errors similar to those of Arabic native speakers; b) English-speaking learners share several common orthographic errors with other learners of Arabic as a second/foreign language (AFL); and c) English-speaking learners of Arabic produce their own common orthographic errors which are specifically related to the differences between the two WSs. The results confirmed all three hypotheses. Specifically, English-speaking learners of L2WS Arabic commonly made six error types: letter ductus (letter shape), orthography (spelling), phonology, letter dots, allographemes (i.e. letterform), and direction. Gemination and L1WS transfer error rates were not found to be major. Another important result showed that five letter groups in addition to two letters are particularly challenging to English-speaking learners. Study results indicated that error causes were likely to be from one of four factors: script confusion, orthographic difficulties, phonological realisation, and teaching/learning strategies.
These results are generalizable as the data were collected from several institutions in different parts of the UK. Suggestions and implications as well as recommendations for further research are outlined accordingly in the conclusion chapter.

Newcastle University eTheses

Codes of Modernity: Infrastructures of Language and Chinese Scripts in an Age of Global Information Revolution

Author: Kuzuoglu Ulug
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2018

This dissertation explores the global history of Chinese script reforms—the effort to phoneticize Chinese language and/or simplify the writing system—from its inception in the 1890s to its demise in the 1980s. These reforms took place at the intersection of industrialization, colonialism, and new information technologies, such as alphabet-based telegraphy and breakthroughs in printing technologies. As these social and technological transformations put unprecedented pressure on knowledge management and the use of mental and clerical labor, many Chinese intellectuals claimed that learning Chinese characters consumed too much time and mental energy. Chinese script reforms, this dissertation argues, were an effort to increase speed in producing, transmitting, and accessing information, and thus meet the demands of the industrializing knowledge economy. The industrializing knowledge economy that this dissertation explores was built on and sustained by a psychological understanding of the human subject as a knowledge machine, and it was part of a global moment in which the optimization of labor in knowledge production was a key concern for all modernizing economies. While Chinese intellectuals were inventing new signs of inscription, American behavioral psychologists, Soviet psycho-economists, and Central Asian and Ottoman technicians were all experimenting with new scripts in order to increase mental efficiency and productivity. This dissertation reveals the intimate connections between the Chinese and non-Chinese script engineering projects that were taking place synchronically across the world. The chapters of this work demonstrate for the first time, for instance, that the simplification of Chinese characters in the 1920s and 1930s was intimately connected to the discipline of behavioral psychology in the US. 
The first generation of Chinese psychologists employed the American psychologists’ methods to track eye movements, count word-frequencies, and statistically analyze the speed of reading, writing, and memorizing in order to simplify and “rationalize” the Chinese writing system in an effort to discipline and optimize mental labor. Other chapters explore the issue of mental and clerical optimization by finding the origins of the Chinese Latin Alphabet (CLA), the mother of pinyin, in hitherto unknown Eurasian connections. The CLA, the pages of this work show, was the product of a transnational exchange that involved Ottoman and Transcaucasian typographers as well as Russian engineers and Chinese communists who sought efficiency in knowledge production through inventing new scripts. Situating the Chinese script reforms at this global intersection of psychology, economy, and linguistics, this dissertation examines the global connections and forces that turned the human subject into a knowledge worker who was cognitively managed through education, literacy, propaganda, and other measures of organizing information, all of which had the script at the center. The search for efficiency and productivity, the core values of industrialism, lay at the heart of script reforms in China, but this search was inseparable from linguistic orders and political ambitions. Even if writing, transmitting, and learning a phonetic script could theoretically be easier and more efficient than the Chinese characters, the alphabet opened a veritable Pandora’s Box around the issue of selection: given the complex linguistic landscape in China, which speech was a phonetic script supposed to represent? There were myriad languages spoken throughout the empire and the subsequent nation-state, most of which were mutually incomprehensible.
Mandarin as spoken in Beijing was different from that spoken in the south, and “topolects” or regional languages such as Min or Cantonese were to Mandarin what Romanian is to English. As a linguistic life-or-death issue, phonetic scripts stood for the infrastructural possibilities and limitations in the representation of speeches. Some scripts, such as Lao Naixuan’s phonetic script composed of more than a hundred signs, were capable of representing multiple Mandarin and non-Mandarin speeches; whereas others, such as the Phonetic Symbols, which have only thirty-seven syllabic signs, represented only one speech, i.e., Mandarin. Using Mandarin-oriented scripts to transcribe non-Mandarin speeches was like writing English with fifteen letters, hence the acrimonious disputes that fill the pages of this dissertation. Succinctly put, it was at the level of script invention that Chinese and non-Chinese actors engineered different infrastructures not only for laboring minds but also for the social world of Chinese languages. The history of information technologies and knowledge economy in China was thus inseparable from the world of speech and language, as each script offered a new potential to reassemble the written matter and the speaking mind in a different way. “Codes of Modernity” thus conceptualizes the script itself as an infrastructural medium. A script was not merely a passive carrier of information, but an existential artifact. Building on an expanding literature on infrastructures, it endorses the observation that infrastructures, technologies, and the social world around them work in a recursive loop. An infrastructure is not just the physical object that permits the flow of information, goods, ideas, and people, but a sociotechnical product that enables the experience of culture, while imposing constraints on it at the same time.
Like electricity grids, transportation systems, and sewage canals, the experience of scripts as infrastructures is the experience of thought worlds. After a long tradition of structuralism and poststructuralism that sought to understand the world through the semiotic prism of language, “Codes of Modernity” argues that it is time for an infrastructuralism that excavates the indispensable media that enable the production of language and thought.

Columbia University Academic Commons

Writing Development in Struggling Learners

Publication venue: 'Brill'
Publication date: 07/04/2022

In Writing Development in Struggling Learners, international researchers provide insights into the development of writing skills from early writing and spelling development through to composition, the reasons individuals struggle to acquire proficient writing skills, and how to help these learners. Readership: academic libraries; graduate students; post-graduate researchers; literacy researchers; educated lay persons; literacy specialists; primary/secondary educators.

Directory of Open Access Books (DOAB)

Translating Sound, Then and Now: The Palaeography and Notation of Insular Song, c.1150-1300

Author: Blickhan Samantha
Publication date: 01/01/2016

Royal Holloway - Pure

The Nature of Writing – A Theory of Grapholinguistics [book cover]

Author: Tullett Barrie
Publication venue: Fluxus Editions
Publication date: 01/11/2020

Cover illustration: Purgatory: Canto VII – The Rule of the Mountain from A Typographic Dante (2008) by Barrie Tullett (also displayed in Barrie Tullett, Typewriter Art: A Modern Anthology, London: Laurence King Publishing, 2014, p. 167). With kind permission by Barrie Tullett. The text is taken from Dante. The Divine Comedy, translated by Dorothy L. Sayers, Harmondsworth, Middlesex: The Penguin Classics, 1949. On the lower part of the illustration, one can read the concluding verses of the Canto: But now the poet was going on before; “Forward!” said he; “look how the sun doth stand Meridian-high, while on the Western shore Night sets her foot upon Morocco’s strand.”

University of Lincoln Institutional Repository

Modelling multimodal language processing

Author: Smith A.
Publication venue: Radboud University Nijmegen
Publication date: 01/01/2015

MPG.PuRe