204 research outputs found

    An Open Source Testing Tool for Evaluating Handwriting Input Methods

    This paper presents an open source tool for testing the recognition accuracy of Chinese handwriting input methods. The tool consists of two modules: a PC client and an Android mobile client. The PC client reads handwritten samples stored on the computer and transfers them one at a time to the Android client using a socket communication protocol. After the Android client receives the data, it simulates the handwriting on the screen of the client device and triggers the corresponding handwriting recognition method. The recognition accuracy is recorded by the Android client. We present the design principles and describe the implementation of the test platform. We construct several test datasets for evaluating different handwriting recognition systems, and conduct an objective and comprehensive test using six Chinese handwriting input methods with five datasets. The test results for the recognition accuracy are then compared and analyzed. Comment: 5 pages, 3 figures, 11 tables. Accepted to appear at ICDAR 201

    Open Set Chinese Character Recognition using Multi-typed Attributes

    Recognition of off-line Chinese characters is still a challenging problem, especially in historical documents: not only is the number of classes extremely large in comparison to contemporary image retrieval methods, but new unseen classes can also be expected under open learning conditions (even for CNN). Chinese character recognition with zero or a few training samples is a difficult problem and has not been studied yet. In this paper, we propose a new Chinese character recognition method based on multi-type attributes, which are derived from the pronunciation, structure and radicals of Chinese characters, and apply it to character recognition in historical books. This intermediate attribute code has a strong advantage over the common 'one-hot' class representation because it allows for understanding complex and unseen patterns symbolically using attributes. First, each character is represented by four groups of attribute types to cover a wide range of character possibilities: Pinyin label, layout structure, number of strokes, three different input methods such as Cangjie, Zhengma and Wubi, as well as a four-corner encoding method. A convolutional neural network (CNN) is trained to learn these attributes. Subsequently, characters can be easily recognized by these attributes using a distance metric and a complete lexicon that is encoded in attribute space. We evaluate the proposed method on two open data sets (printed Chinese characters for zero-shot learning and historical characters for few-shot learning) and one closed set (handwritten Chinese characters). Experimental results show a good general classification of seen classes but also a very promising generalization ability to unseen characters. Comment: 29 pages, submitted to Pattern Recognitio
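
The lexicon lookup described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the three-attribute encoding, the attribute values, the Hamming distance, and the names `LEXICON`, `hamming` and `nearest_character` are hypothetical and are not taken from the paper, which only states that "a distance metric" is used. The point is that a class never seen in training can still be returned, as long as it has an entry in the attribute lexicon.

```python
# Hypothetical attribute lexicon: character -> (pinyin, layout, stroke count).
# The values are illustrative only; the paper uses a richer attribute code.
LEXICON = {
    "明": ("ming2", "left-right", 8),
    "林": ("lin2", "left-right", 8),
    "森": ("sen1", "top-bottom", 12),  # imagine this class had no training images
}


def hamming(a, b):
    """Count the attribute positions where two codes disagree."""
    return sum(x != y for x, y in zip(a, b))


def nearest_character(predicted, lexicon):
    """Return the lexicon entry whose attribute code is closest to the prediction."""
    return min(lexicon, key=lambda ch: hamming(predicted, lexicon[ch]))


# A (hypothetical) network output with one wrong attribute still resolves
# to the unseen class, because the other attributes match its lexicon entry.
noisy_prediction = ("sen1", "top-bottom", 11)
best_match = nearest_character(noisy_prediction, LEXICON)
```

With a one-hot output layer, by contrast, a class absent from training has no output unit at all, which is the limitation the attribute code is meant to avoid.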

    Real-time Online Chinese Character Recognition

    In this project, I built a web application for recognizing handwritten Chinese characters in real time. The system determines a Chinese character while a user is drawing or writing it. The techniques and steps I use to build the recognition system include data preparation, preprocessing, feature extraction, and classification. To increase the accuracy, two different types of neural networks are used in the system: a multi-layer neural network and a convolutional neural network

    Chinese writing composition among CFL learners: a comparison between handwriting and typewriting

    Situated in the context of CFL (Chinese as a foreign language), the current study examines and compares texts produced by twelve pre-intermediate CFL learners using both pen-and-paper and the pinyin input system. The participants were also invited for interviews to investigate their attitudes towards handwriting and typewriting. Because of the ease of use of the pinyin input system, CFL learners tend to prefer it over writing by hand when composing lengthy texts. Based on the evaluations of fifteen professional CFL teachers, the typewritten texts were rated higher than the handwritten ones. According to a self-report empathy test, there was no significant correlation between an evaluator’s empathy and his/her rating for the texts, whether composed by hand or with pinyin input. Pedagogically, typewriting might better assist Chinese language learning after handwriting has been introduced and practised among non-beginner CFL learners. The empathy effect on handwriting reported in previous literature is not found in the study. The study goes beyond the factors influencing typewriting and typewritten essays, to encourage future research investigating when to introduce computer-based writing and how it would best assist in language learning

    The Influence of Technology to Hand Writing Chinese Character Ability

    Nowadays students mostly use a pinyin input method to study Chinese, which has adverse effects on their Chinese character writing skills. The study used a qualitative method in which the researcher gave two types of tasks to 58 students, some of whom used the pinyin input method and some of whom did not. According to the findings, 67.25% of the students (38 out of 58) made errors when writing by hand. These errors included 47.44% of errors related to similarities between Chinese characters, which could be categorized into three types: word formation similarities (5%), character component similarities (15.25%), and single letter similarities (27.11%). Additionally, 25% of the errors were sound equation errors, such as writing one character in place of another that is also pronounced “shì”. Furthermore, 27% of the errors were related to the formation of new Chinese characters. When the students used the pinyin input method, the percentage of errors decreased significantly to 25.86%; most of the errors made using this method were word selection errors (60%) and vowel errors (40%). The research suggests that using the pinyin input method may have a negative impact on students’ motivation to write Chinese characters, which, in turn, can affect their handwriting skills

    The Challenges of Recognizing Offline Handwritten Chinese: A Technical Review

    Offline handwritten Chinese recognition is an important research area of pattern recognition, including offline handwritten Chinese character recognition (offline HCCR) and offline handwritten Chinese text recognition (offline HCTR), which are closely related to daily life. With new deep learning techniques and the combination with other domain knowledge, offline handwritten Chinese recognition has gained breakthroughs in methods and performance in recent years. However, there have yet to be articles that provide a technical review of this field since 2016. In light of this, this paper reviews the research progress and challenges of offline handwritten Chinese recognition based on traditional techniques, deep learning methods, methods combining deep learning with traditional techniques, and knowledge from other areas from 2016 to 2022. Firstly, it introduces the research background and status of handwritten Chinese recognition, standard datasets, and evaluation metrics. Secondly, a comprehensive summary and analysis of offline HCCR and offline HCTR approaches during the last seven years is provided, along with an explanation of their concepts, specifics, and performances. Finally, the main research problems in this field over the past few years are presented. The challenges that still exist in offline handwritten Chinese recognition are discussed, aiming to inspire future research work

    Radical Recognition in Off-Line Handwritten Chinese Characters Using Non-Negative Matrix Factorization

    In the past decade, handwritten Chinese character recognition has received renewed interest with the emergence of touch screen devices. Other popular applications include on-line Chinese character dictionary look-up and visual translation in mobile phone applications. Due to the complex structure of Chinese characters, this classification task is not exactly an easy one, as it involves knowledge from mathematics, computer science, and linguistics. Given a large image database of handwritten character data, the goal of my senior project is to use Non-Negative Matrix Factorization (NMF), a recent method for finding a suitable representation (parts-based representation) of image data, to detect specific sub-components in Chinese characters. NMF has only been applied to typed (printed) Chinese characters in different fonts. This project focuses specifically on how well NMF works on handwritten characters. In addition, research in Chinese character classification has mainly been done using holistic approaches - treating each character as an inseparable unit. By using NMF, this project takes a different approach by focusing on a more specific problem in Chinese character classification: radical (sub-component) detection. Finally, a possible application of radical detection will be proposed. This interactive application can potentially help Chinese language learners better recognize characters by radicals
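
As a rough illustration of the parts-based representation mentioned in this abstract, the sketch below factorizes a tiny non-negative matrix V into W·H. It assumes the standard Lee and Seung multiplicative updates for squared error; the abstract does not say which NMF algorithm the project uses, and the toy "images", the rank, and all function names are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def squared_error(v, w, h):
    """Squared Frobenius distance between V and the reconstruction W·H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize V (pixels x images) into non-negative W (parts) and H (weights)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Toy data: 4 "pixels" x 3 "images". Image 3 contains both parts found
# separately in images 1 and 2, loosely mimicking shared sub-components.
V = [[1, 0, 1],
     [1, 0, 1],
     [0, 1, 1],
     [0, 1, 1]]
W, H = nmf(V, rank=2)
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative, which is what lets the columns of W be read as additive parts rather than signed basis vectors.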

    A Longitudinal Analysis of the Development of Mandarin Chinese in Fourth Grade Chinese Immersion

    Many studies have confirmed the benefits of dual language immersion programs. Research into reading and writing development in these programs, and particularly in Chinese immersion, is less common. In this dissertation, an attempt is made to address this gap in research by exploring the literacy development of fourth grade Chinese immersion students. Participants were 70 students, the entire fourth grade of an urban Chinese immersion school in the northeastern U.S. The school had recently made several curricular changes. They were adopting a practice of freewriting, or independent writing. In freewriting, students are encouraged to write as much as they can on a topic using all of their linguistic and meaning-making resources without regard for accuracy. They learn to write for self-expression and for readers (as opposed to writing for feedback). The school, in addition, adopted the Level Chinese reading system as part of an effort to systematize reading instruction and assessment. Lastly, they were actively considering ways to support student writing development through digital technologies. The school also administered annual year-end STAMP 4Se standardized tests of Chinese. The current studies aimed to understand effects of and relations between these curricular approaches. The first study in this dissertation aimed to understand how digital writing using Pinyin input might support development of literacy skills in Chinese immersion. In this study, the effects of a digital text messaging curriculum on freewriting were investigated. It was hypothesized that use of digital Pinyin input would facilitate connections between oral and written language by allowing learners to access vocabulary they could not yet write by hand but could type using Pinyin on an alphabetic keyboard. Students in two classes engaged in text messaging in small groups using digital Pinyin input in online chatrooms for 20 minutes, 3 times per week over an 8-week period. 
A matched group of students in other classes taught by the same teachers completed regular pencil-and-paper word work that focused on analysis of characters during the same time period. Texting with classmates using Pinyin input, when replacing multi-component word work, was negatively associated with freewriting output; that is, students who completed word work did better in freewriting after the texting intervention. Within texting groups, however, children who were successful at texting showed greater gains in freewriting abilities as compared to children with lesser success at texting. Given the importance of digital writing and online learning, the findings indicate that texting should supplement, but not replace, multi-component word work. The second study reported in this dissertation built on the first study by investigating the development of writing, reading, and proficiency in L2 Chinese across the entire school year through a focus on freewriting. Our aim was to better understand how students use Chinese and all of their meaning-making resources in writing, and the relationship between student writing, reading and proficiency. First, student freewrites, which were collected at 3 time points over the school year, were examined to understand how students deployed their linguistic and meaning-making resources in writing. Students used a combination of correct characters and words written in Pinyin, homophones, English and pictures to fulfill their meaning-making needs in the moment. Proportions of words written in correct Chinese characters increased from 63% to 81% over successive freewrites. Writing ability grew over time, as assessed by diversity of vocabulary in freewrites. Reading ability as assessed by teachers using the Level Chinese system also grew. 
Lastly, we examined relations between classroom measures of writing and reading, participation in the texting curriculum, and language proficiency as measured by end-of-year 4Se standardized assessments of Chinese in the domains of reading, writing, listening and speaking. Classroom measures of reading predicted proficiency across the four domains of reading, writing, listening and speaking, while freewriting also predicted reading and writing proficiency. Students in the texting classes had higher proficiency in speaking, suggesting that digital interaction with peers supported oral communication. Pedagogical implications of the findings will be shared and discussed