
8 research outputs found

VERIFICATION OF GRAPHEMES USING NEURAL NETWORKS IN AN HMM-BASED ONLINE KOREAN HANDWRITING RECOGNITION SYSTEM

Author: Kim J.
Kim J.H.
So S.J.
Publication venue: s.n.
Publication date: 01/01/2004

Proceedings - University of Groningen

VERIFICATION OF GRAPHEMES USING NEURAL NETWORKS IN AN HMM-BASED ONLINE KOREAN HANDWRITING RECOGNITION SYSTEM

Author: Kim J.
Kim J.H.
So S.J.
Publication venue: s.n.
Publication date: 01/01/2004

ARTS repository - University of Groningen

Information Preserving Processing of Noisy Handwritten Document Images

Author: Chen Jin
Publication venue: Lehigh Preserve

Many pre-processing techniques that normalize artifacts and clean noise induce anomalies due to discretization of the document image. Important information that could be used at later stages may be lost. A proposed composite-model framework takes into account pre-printed information, user-added data, and digitization characteristics. Its benefits are demonstrated by experiments with statistically significant results. Separating pre-printed ruling lines from user-added handwriting shows how ruling lines impact people's handwriting and how they can be exploited for identifying writers. Ruling line detection based on multi-line linear regression reduces the mean error of counting them from 0.10 to 0.03, 6.70 to 0.06, and 0.13 to 0.02, compared to an HMM-based approach on three standard test datasets, thereby reducing human correction time by 50%, 83%, and 72% on average. On 61 page images from 16 rule-form templates, the precision and recall of form cell recognition are increased by 2.7% and 3.7%, compared to a cross-matrix approach. Compensating for and exploiting ruling lines during feature extraction rather than pre-processing raises the writer identification accuracy from 61.2% to 67.7% on a 61-writer noisy Arabic dataset. Similarly, counteracting page-wise skew by subtracting it or transforming contours in a continuous coordinate system during feature extraction improves the writer identification accuracy. An implementation study of contour-hinge features reveals that utilizing the full probabilistic probability distribution function matrix improves the writer identification accuracy from 74.9% to 79.5%.

Lehigh University: Lehigh Preserve

Word based off-line handwritten Arabic classification and recognition. Design of automatic recognition system for large vocabulary offline handwritten Arabic words using machine learning approaches.

Author: AlKhateeb Jawad H.Y.
Publication venue: Department of Electronic Imaging and Media Communications
Publication date: 01/01/2010

The design of a machine which reads unconstrained words still remains an unsolved problem. For example, automatic interpretation of handwritten documents by a computer is still under research. Most systems attempt to segment words into letters and read words one character at a time. However, segmenting handwritten words is very difficult. To avoid this, words are treated as a whole. This research investigates a number of features computed from whole words for the recognition of handwritten words in particular. Arabic text classification and recognition is a complicated process compared to Latin and Chinese text recognition systems. This is due to the cursive nature of Arabic text. The work presented in this thesis is proposed for word-based recognition of handwritten Arabic scripts. This work is divided into three main stages to provide a recognition system. The first stage is pre-processing, which applies efficient methods that are essential for automatic recognition of handwritten documents. In this stage, techniques for detecting the baseline and segmenting words in handwritten Arabic text are presented. Then connected components are extracted, and distances between different components are analyzed. The statistical distribution of these distances is then obtained to determine an optimal threshold for word segmentation. The second stage is feature extraction. This stage makes use of the normalized images to extract features that are essential in recognizing the images. Various methods of feature extraction are implemented and examined. The third and final stage is classification. Various classifiers are used for classification, such as the K nearest neighbour classifier (k-NN), neural network classifier (NN), Hidden Markov models (HMMs), and the Dynamic Bayesian Network (DBN). To test this concept, the particular pattern recognition problem studied is the classification of 32492 words using the IFN/ENIT database.
The results were promising and very encouraging in terms of improved baseline detection and word segmentation for further recognition. Moreover, several feature subsets were examined and a best recognition performance of 81.5% is achieved.

Bradford Scholars

Large vocabulary off-line handwritten word recognition

Author: Koerich Alessandro L.
Publication venue: École de technologie supérieure

Considerable progress has been made in handwriting recognition technology over the last few years. Thus far, handwriting recognition systems have been limited to small-scale and very constrained applications where the number of different words that a system can recognize is the key point for its performance. The capability of dealing with large vocabularies, however, opens up many more applications. In order to translate the gains made by research into large and very-large vocabulary handwriting recognition, it is necessary to further improve the computational efficiency and the accuracy of the current recognition strategies and algorithms. In this thesis we focus on efficient and accurate large vocabulary handwriting recognition. The main challenge is to speed up the recognition process and to improve the recognition accuracy. However, these two aspects are in mutual conflict. It is relatively easy to improve recognition speed while trading away some accuracy. But it is much harder to improve the recognition speed while preserving the accuracy. First, several strategies have been investigated for improving the performance of a baseline recognition system in terms of recognition speed to deal with large and very-large vocabularies. Next, we improve the performance in terms of recognition accuracy while preserving all the original characteristics of the baseline recognition system: omniwriter, unconstrained handwriting, and dynamic lexicons. The main contributions of this thesis are novel search strategies and a novel verification approach that allow us to achieve a 120 speedup and 10% accuracy improvement over a state-of-art baseline recognition system for a very-large vocabulary recognition task (80,000 words). The improvements in speed are obtained by the following techniques: lexical tree search, standard and constrained lexicon-driven level building algorithms, fast two-level decoding algorithm, and a distributed recognition scheme.
The recognition accuracy is improved by post-processing the list of the candidate N-best-scoring word hypotheses generated by the baseline recognition system. The list also contains the segmentation of such word hypotheses into characters. A verification module based on a neural network classifier is used to generate a score for each segmented character and in the end, the scores from the baseline recognition system and the verification module are combined to optimize performance. A rejection mechanism is introduced over the combination of the baseline recognition system with the verification module to improve significantly the word recognition rate to about 95% while rejecting 30% of the word hypotheses.

Espace ÉTS

Multimedia Development of English Vocabulary Learning in Primary School

Author: Syaiful Rohim Aim
Publication venue: ICCE 2014 Organizing Committee, Japan
Publication date: 01/01/2014

In this paper, we describe a prototype of a web-based intelligent handwriting education system for autonomous learning of Bengali characters. The Bengali language is used by more than 211 million people in India and Bangladesh. Due to socio-economic limitations, not all of the population has the chance to go to school. This research project aimed to develop an intelligent Bengali handwriting education system. As an intelligent tutor, the system can automatically check handwriting errors, such as stroke production errors, stroke sequence errors, and stroke relationship errors, and immediately provide feedback so that students can correct themselves. Our proposed system can be accessed from a smartphone or iPhone, which allows students to practice their Bengali handwriting anytime and anywhere. Bengali characters are multi-stroke, with extremely long cursive shapes and with variability in stroke order and stroke direction. Due to this structural limitation, recognition speed is a crucial issue when applying traditional online handwriting recognition algorithms to Bengali language learning. In this work, we have adopted a hierarchical recognition approach to improve the recognition speed, which makes our system adaptable for web-based language learning. We applied a writing-speed-free recognition methodology together with the hierarchical recognition algorithm. This supports learning by people of all ages, especially children and older people. The experimental results showed that our proposed hierarchical recognition algorithm can provide higher accuracy than a traditional multi-stroke recognition algorithm with more writing variability.

UHAMKA Repository

Linguistics in East Asia and South East Asia

Author: Noss Richard B.
Yamagiwa Joseph K.
Yuen Ren Chao
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 27/06/2019

Directory of Open Access Books (DOAB)

Digital Research Cycles: How Attitudes Toward Content, Culture And Technology Affect Web Development.

Author: Scott Edward
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2009

It has been estimated that one third of the world's population does not have access to adequate health care. Some 1.6 billion people live in countries experiencing concentrated acquired immune deficiency syndrome (AIDS) epidemics. Many countries in Africa--and other low-income countries--are in dire need of help providing adequate health care services to their citizens. They require more hands-on care from Western health workers--and training so more African health workers can eventually care for their own citizens. But these countries also need assistance acquiring and implementing both texts--the body of medical information potentially available to them--and technology--the means by which that information can be conveyed. This dissertation looks at these issues and others from a multi-faceted approach. It combines a survey of the developers of Web sites designed for use by health workers in low-income countries and a proposal for a novel approach to communication theory, which could help improve health communication and other social marketing practices. It also includes an extensive review of literature regarding a number of topics related to these issues. To improve healthcare services in low-income countries, several things should occur. First, more health workers--and others--could visit African countries and other places to provide free, hands-on medical care, as this researcher's group did in Uganda. Such trips are ideal occasions for studying the cultural differences between mzungu (white man) and the Ugandan people. A number of useful medical texts have been written for health workers in low-income countries. Others will be published as new health information becomes available. But on what medium will they be published? Computers? Personal digital assistants? During the past 10 years the Internet became an ideal venue for conveying information.
Unfortunately, people in target countries such as Uganda encounter cultural differences when such new technologies are diffused. This dissertation looks at cultural and technological difficulties encountered by people in low-income countries who attempt to diffuse information and communication technologies (ICT). Once a technology has been successfully adopted, someone will look for ways to use it to help others. There are hundreds of sites on the Internet--built by Web developers in Western countries--that are designed for use by health workers in low-income countries. However, these Web developers also experience cultural and technological differences, based on their knowledge of and attitudes toward best practices in their field. This research includes a survey of Web developers which determined their attitudes toward best practices in their field and tested this researcher's hypothesis that there is no significant difference among the developers' attitudes toward the content on their sites, their audience's cultural needs and the various technological needs their audience has. It was found that the Web developers agree with 17 of 18 perceived best practices and that there is a significant difference between Web developers' attitudes toward their audience's technological needs and their attitudes toward quality content and the audience's cultural needs. Creation of the survey herein resulted in this researcher generating a new way of thinking about communication theory--called digital research cycles. The survey was based on a review of literature and is rooted in the belief that any successful communication of a computer-mediated message in the information age is a behavior which is influenced by the senders' and receivers' attitudes and knowledge about textual style, the audience, technology and the subject matter to which the message pertains.

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)