
    Information Preserving Processing of Noisy Handwritten Document Images

    Many pre-processing techniques that normalize artifacts and clean noise induce anomalies due to discretization of the document image. Important information that could be used at later stages may be lost. A proposed composite-model framework takes into account pre-printed information, user-added data, and digitization characteristics. Its benefits are demonstrated by experiments with statistically significant results. Separating pre-printed ruling lines from user-added handwriting shows how ruling lines impact people's handwriting and how they can be exploited for identifying writers. Ruling line detection based on multi-line linear regression reduces the mean error of counting them from 0.10 to 0.03, 6.70 to 0.06, and 0.13 to 0.02, compared to an HMM-based approach on three standard test datasets, thereby reducing human correction time by 50%, 83%, and 72% on average. On 61 page images from 16 rule-form templates, the precision and recall of form cell recognition are increased by 2.7% and 3.7%, compared to a cross-matrix approach. Compensating for and exploiting ruling lines during feature extraction rather than pre-processing raises the writer identification accuracy from 61.2% to 67.7% on a 61-writer noisy Arabic dataset. Similarly, counteracting page-wise skew by subtracting it or transforming contours in a continuous coordinate system during feature extraction improves the writer identification accuracy. An implementation study of contour-hinge features reveals that utilizing the full probabilistic probability distribution function matrix improves the writer identification accuracy from 74.9% to 79.5%.
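
The abstract does not spell out how the multi-line linear regression is formulated. As a rough illustration only, the sketch below assumes that pixels have already been grouped by candidate ruling line and fits all lines jointly with one shared slope (the page skew) and one intercept per line. The function name `fit_parallel_lines` and the input layout are hypothetical, not taken from the work itself.

```python
def fit_parallel_lines(points_by_line):
    """Least-squares fit of y = slope * x + intercept_k with one shared slope.

    points_by_line: list of lists of (x, y) pixel coordinates, one inner list
    per candidate ruling line (this grouping is an assumption of the sketch).
    Returns (slope, [intercept_0, intercept_1, ...]).
    """
    sxy = 0.0
    sxx = 0.0
    means = []
    for pts in points_by_line:
        n = len(pts)
        mean_x = sum(x for x, _ in pts) / n
        mean_y = sum(y for _, y in pts) / n
        means.append((mean_x, mean_y))
        # Accumulate within-line covariance and variance around each line's mean.
        sxy += sum((x - mean_x) * (y - mean_y) for x, y in pts)
        sxx += sum((x - mean_x) ** 2 for x, _ in pts)
    slope = sxy / sxx
    # Each line passes through its own centroid with the shared slope.
    intercepts = [mean_y - slope * mean_x for mean_x, mean_y in means]
    return slope, intercepts


# Two slightly skewed, parallel ruling lines 30 pixels apart.
lines = [
    [(0, 10), (100, 12), (200, 14)],
    [(0, 40), (100, 42), (200, 44)],
]
slope, intercepts = fit_parallel_lines(lines)
```

Pooling the slope estimate across lines is one plausible reason a joint fit could be more stable than fitting each ruling line separately, since short or partly occluded lines borrow evidence from the others.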

    Large vocabulary off-line handwritten word recognition

    Considerable progress has been made in handwriting recognition technology over the last few years. Thus far, handwriting recognition systems have been limited to small-scale and very constrained applications where the number of different words that a system can recognize is the key point for its performance. The capability of dealing with large vocabularies, however, opens up many more applications. In order to translate the gains made by research into large and very-large vocabulary handwriting recognition, it is necessary to further improve the computational efficiency and the accuracy of the current recognition strategies and algorithms. In this thesis we focus on efficient and accurate large vocabulary handwriting recognition. The main challenge is to speed up the recognition process and to improve the recognition accuracy. However, these two aspects are in mutual conflict. It is relatively easy to improve recognition speed while trading away some accuracy, but it is much harder to improve the recognition speed while preserving the accuracy. First, several strategies have been investigated for improving the performance of a baseline recognition system in terms of recognition speed to deal with large and very-large vocabularies. Next, we improve the performance in terms of recognition accuracy while preserving all the original characteristics of the baseline recognition system: omniwriter, unconstrained handwriting, and dynamic lexicons. The main contributions of this thesis are novel search strategies and a novel verification approach that allow us to achieve a 120-fold speedup and a 10% accuracy improvement over a state-of-the-art baseline recognition system for a very-large vocabulary recognition task (80,000 words). The improvements in speed are obtained by the following techniques: lexical tree search, standard and constrained lexicon-driven level building algorithms, a fast two-level decoding algorithm, and a distributed recognition scheme.
The recognition accuracy is improved by post-processing the list of the candidate N-best-scoring word hypotheses generated by the baseline recognition system. The list also contains the segmentation of such word hypotheses into characters. A verification module based on a neural network classifier is used to generate a score for each segmented character and, in the end, the scores from the baseline recognition system and the verification module are combined to optimize performance. A rejection mechanism is introduced over the combination of the baseline recognition system with the verification module to significantly improve the word recognition rate to about 95% while rejecting 30% of the word hypotheses.
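
The post-processing step described above (re-scoring an N-best list with per-character verification scores, then rejecting low-confidence results) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's method: the averaging of character scores, the linear weighting, the threshold value, and all names (`Hypothesis`, `combined_score`, `rerank_with_rejection`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hypothesis:
    word: str
    recognizer_score: float   # score from the baseline recognizer, assumed in [0, 1]
    char_scores: List[float]  # verifier score per segmented character, assumed in [0, 1]


def combined_score(h: Hypothesis, weight: float = 0.5) -> float:
    """Weighted sum of the baseline score and the mean character score (assumed rule)."""
    verification = sum(h.char_scores) / len(h.char_scores)
    return weight * h.recognizer_score + (1.0 - weight) * verification


def rerank_with_rejection(nbest: List[Hypothesis], weight: float = 0.5,
                          threshold: float = 0.6) -> Optional[str]:
    """Return the best word after re-scoring, or None if it falls below the threshold."""
    best = max(nbest, key=lambda h: combined_score(h, weight))
    if combined_score(best, weight) < threshold:
        return None  # rejected: no hypothesis is trusted enough
    return best.word


nbest = [
    Hypothesis("paris", 0.70, [0.5, 0.4, 0.6, 0.5, 0.5]),
    Hypothesis("parts", 0.65, [0.9, 0.9, 0.8, 0.9, 0.9]),
]
accepted = rerank_with_rejection(nbest)
rejected = rerank_with_rejection(nbest, threshold=0.9)
```

In this toy input the verifier overturns the baseline's first choice, and raising the threshold trades coverage for reliability, which is the same kind of trade-off the abstract reports between recognition rate and rejection rate.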

    Multimedia Development of English Vocabulary Learning in Primary School

    In this paper, we describe a prototype of a web-based intelligent handwriting education system for autonomous learning of Bengali characters. The Bengali language is used by more than 211 million people in India and Bangladesh. Due to socio-economic limitations, not everyone in this population has the chance to go to school. This research project aimed to develop an intelligent Bengali handwriting education system. As an intelligent tutor, the system can automatically check for handwriting errors, such as stroke production errors, stroke sequence errors, and stroke relationship errors, and immediately provide feedback so that students can correct themselves. Our proposed system can be accessed from a smartphone or iPhone, which allows students to practice their Bengali handwriting anytime and anywhere. Bengali characters are multi-stroke with extremely long cursive shapes, and they vary in stroke order and stroke direction. Due to this structural limitation, recognition speed is a crucial issue when applying traditional online handwriting recognition algorithms to Bengali language learning. In this work, we have adopted a hierarchical recognition approach to improve the recognition speed, which makes our system suitable for web-based language learning. We applied a writing-speed-free recognition methodology together with the hierarchical recognition algorithm. This supports learners of all ages, especially children and older people. The experimental results showed that our proposed hierarchical recognition algorithm can provide higher accuracy than a traditional multi-stroke recognition algorithm under greater writing variability.

    Digital Research Cycles: How Attitudes Toward Content, Culture And Technology Affect Web Development.

    It has been estimated that one third of the world's population does not have access to adequate health care. Some 1.6 billion people live in countries experiencing concentrated acquired immune deficiency syndrome (AIDS) epidemics. Many countries in Africa--and other low-income countries--are in dire need of help providing adequate health care services to their citizens. They require more hands-on care from Western health workers--and training so more African health workers can eventually care for their own citizens. But these countries also need assistance acquiring and implementing both texts--the body of medical information potentially available to them--and technology--the means by which that information can be conveyed. This dissertation looks at these issues and others from a multi-faceted approach. It combines a survey of the developers of Web sites designed for use by health workers in low-income countries and a proposal for a novel approach to communication theory, which could help improve health communication and other social marketing practices. It also includes an extensive review of literature regarding a number of topics related to these issues. To improve healthcare services in low-income countries, several things should occur. First, more health workers--and others--could visit African countries and other places to provide free, hands-on medical care, as this researcher's group did in Uganda. Such trips are ideal occasions for studying the cultural differences between mzungu (white man) and the Ugandan people. A number of useful medical texts have been written for health workers in low-income countries. Others will be published as new health information becomes available. But on what medium will they be published? Computers? Personal digital assistants? During the past 10 years the Internet became an ideal venue for conveying information.
Unfortunately, people in target countries such as Uganda encounter cultural differences when such new technologies are diffused. This dissertation looks at cultural and technological difficulties encountered by people in low-income countries who attempt to diffuse information and communication technologies (ICT). Once a technology has been successfully adopted, someone will look for ways to use it to help others. There are hundreds of sites on the Internet--built by Web developers in Western countries--that are designed for use by health workers in low-income countries. However, these Web developers also experience cultural and technological differences, based on their knowledge of and attitudes toward best practices in their field. This research includes a survey of Web developers which determined their attitudes toward best practices in their field and tested this researcher's hypothesis that there is no significant difference among the developers' attitudes toward the content on their sites, their audience's cultural needs and the various technological needs their audience has. It was found that the Web developers agree with 17 of 18 perceived best practices and that there is a significant difference between Web developers' attitudes toward their audience's technological needs and their attitudes toward quality content and the audience's cultural needs. Creation of the survey herein resulted in this researcher generating a new way of thinking about communication theory--called digital research cycles. The survey was based on a review of literature and is rooted in the belief that any successful communication of a computer-mediated message in the information age is a behavior which is influenced by the senders' and receivers' attitudes and knowledge about textual style, the audience, technology and the subject matter to which the message pertains.