
    Template Based Recognition of On-Line Handwriting

    Software for recognition of handwriting has been available for several decades now, and research on the subject has produced several different strategies for producing competitive recognition accuracies, especially in the case of isolated single characters. The problem of recognizing samples of handwriting with arbitrary connections between constituent characters (unconstrained handwriting) adds considerable complexity in the form of the segmentation problem. In other words, a recognition system that is not constrained to the isolated single character case needs to be able to recognize where in the sample one letter ends and another begins. In the research community, and probably also in commercial systems, the most common technique for recognizing unconstrained handwriting comprises Neural Networks for partial character matching along with Hidden Markov Modeling for combining partial results into string hypotheses. Neural Networks are often favored by the research community since the recognition functions are more or less automatically inferred from a training set of handwritten samples. From a commercial perspective, a downside to this property is the lack of control, since there is no explicit information on the types of samples that can be correctly recognized by the system. In a template based system, each style of writing a particular character is explicitly modeled, which provides some intuition regarding the types of errors (confusions) that the system is prone to make. Most template based recognition methods today only work for the isolated single character recognition problem, and extensions to unconstrained recognition are usually not straightforward. This thesis presents a step-by-step recipe for producing a template based recognition system which extends naturally to unconstrained handwriting recognition through simple graph techniques. A system based on this construction has been implemented and tested for the difficult case of unconstrained online Arabic handwriting recognition, with good results.
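
The abstract does not spell out its graph techniques, so the following is only a minimal sketch of one generic way a template matcher can be extended to unsegmented input: treat every candidate cut point as a node, every candidate character as an edge weighted by its best template distance, and pick the cheapest path. The stroke codes, the `TEMPLATES` table, and the mismatch-count cost are all hypothetical and are not taken from the thesis.

```python
import math

# Hypothetical templates: each character is modeled by a short
# sequence of stroke codes (an assumption made for illustration).
TEMPLATES = {
    "n": ("down", "arch"),
    "m": ("down", "arch", "arch"),
    "o": ("loop",),
}


def match_cost(segment):
    """Return (cost, label) of the best template for one segment.

    The cost is the number of mismatched stroke codes. Templates whose
    length differs from the segment are skipped, so the result is
    (inf, None) when no template has the right length.
    """
    best = (math.inf, None)
    for label, template in TEMPLATES.items():
        if len(template) == len(segment):
            cost = sum(a != b for a, b in zip(template, segment))
            if cost < best[0]:
                best = (cost, label)
    return best


def recognize(strokes, max_len=3):
    """Find the cheapest path through the segmentation graph.

    Node j stands for a cut after the j-th stroke. An edge (i, j)
    stands for reading strokes[i:j] as a single character.
    """
    n = len(strokes)
    # best[j] = (total cost, labels) of the cheapest path from node 0 to node j
    best = [(math.inf, [])] * (n + 1)
    best[0] = (0.0, [])
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            cost, label = match_cost(tuple(strokes[i:j]))
            if label is None or best[i][0] == math.inf:
                continue
            total = best[i][0] + cost
            if total < best[j][0]:
                best[j] = (total, best[i][1] + [label])
    total_cost, labels = best[n]
    return total_cost, "".join(labels)


if __name__ == "__main__":
    sample = ["down", "arch", "arch", "loop", "down", "arch"]
    print(recognize(sample))
```

Because segmentation and recognition are decided together by the path search, no separate step has to guess where one letter ends and the next begins.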

    Use of prior knowledge in classification of similar and structured objects

    Statistical machine learning has achieved great success in many fields in the last few decades. However, there remain classification problems on which computers still struggle to match human performance. Many such problems share the same properties: large within-class variability and complex structure in the examples, which is often true for real world objects. This does not mean that the examples lack information for classification. On the contrary, there is still a clear pattern in the examples, but it is hidden behind a many-way covariance structure such that useful information is too diluted for conventional statistical machine learners to pick up. However, if we can exploit the structural nature of the objects and concentrate information about the classification, the problem can become much easier. In this dissertation we propose a framework that uses prior knowledge about modeling the structures in the examples to concentrate information for classification. The framework is instantiated for the task of classifying pairs of similar offline handwritten Chinese characters. We empirically demonstrate that our proposed framework indeed concentrates useful information for classification and makes the classification problem easier for statistical learning. Our approach advances the state of the art both in offline handwritten character recognition and in machine learning.

    Content Recognition and Context Modeling for Document Analysis and Retrieval

    The nature and scope of available documents are changing significantly in many areas of document analysis and retrieval as complex, heterogeneous collections become accessible to virtually everyone via the web. The increasing level of diversity presents a great challenge for document image content categorization, indexing, and retrieval. Meanwhile, the processing of documents with unconstrained layouts and complex formatting often requires effective leveraging of broad contextual knowledge. In this dissertation, we first present a novel approach for document image content categorization, using a lexicon of shape features. Each lexical word corresponds to a scale and rotation invariant local shape feature that is generic enough to be detected repeatably and is segmentation free. A concise, structurally indexed shape lexicon is learned by clustering and partitioning feature types through graph cuts. Our idea finds successful application in several challenging tasks, including content recognition of diverse web images and language identification on documents composed of mixed machine printed text and handwriting. Second, we address two fundamental problems in signature-based document image retrieval. Facing continually increasing volumes of documents, detecting and recognizing unique, evidentiary visual entities (e.g., signatures and logos) provides a practical and reliable supplement to the OCR recognition of printed text. We propose a novel multi-scale framework to detect and segment signatures jointly from document images, based on the structural saliency under a signature production model. We formulate the problem of signature retrieval in the unconstrained setting of geometry-invariant deformable shape matching and demonstrate state-of-the-art performance in signature matching and verification. Third, we present a model-based approach for extracting relevant named entities from unstructured documents. In a wide range of applications that require structured information from diverse, unstructured document images, processing OCR text does not give satisfactory results due to the absence of linguistic context. Our approach enables learning of inference rules collectively based on contextual information from both page layout and text features. Finally, we demonstrate the importance of mining general web user behavior data for improving document ranking and other aspects of the web search experience. The context of web user activities reveals their preferences and intents, and we emphasize the analysis of individual user sessions for creating aggregate models. We introduce a novel algorithm for estimating web page and web site importance, and discuss its theoretical foundation based on an intentional surfer model. We demonstrate that our approach significantly improves large-scale document retrieval performance.

    Adaptive Algorithms for Automated Processing of Document Images

    Large scale document digitization projects continue to motivate interesting document understanding technologies such as script and language identification, page classification, segmentation and enhancement. Typically, however, solutions are still limited to narrow domains or regular formats such as books, forms, articles or letters and operate best on clean documents scanned in a controlled environment. More general collections of heterogeneous documents challenge the basic assumptions of state-of-the-art technology regarding quality, script, content and layout. Our work explores the use of adaptive algorithms for the automated analysis of noisy and complex document collections. We first propose, implement and evaluate an adaptive clutter detection and removal technique for complex binary documents. Our distance transform based technique aims to remove irregular and independent unwanted foreground content while leaving text content untouched. The novelty of this approach lies in its determination of the best approximation to the clutter-content boundary in the presence of text-like structures. Second, we describe a page segmentation technique called Voronoi++ for complex layouts, which builds upon the state-of-the-art method proposed by Kise [Kise1999]. Our approach does not assume structured text zones and is designed to handle multi-lingual text in both handwritten and printed form. Voronoi++ is a dynamically adaptive and contextually aware approach that considers components' separation features combined with Docstrum [O'Gorman1993] based angular and neighborhood features to form provisional zone hypotheses. These provisional zones are then verified based on the context built from local separation and high-level content features. Finally, our research proposes a generic model to segment and to recognize characters for any complex syllabic or non-syllabic script, using font-models. This concept is based on the fact that font files contain all the information necessary to render text and thus a model for how to decompose them. Instead of script-specific routines, this work is a step towards a generic character and recognition scheme for both Latin and non-Latin scripts.

    Feature Extraction Methods for Character Recognition

    Not Include

    Four cornered code based Chinese character recognition system.

    by Tham Yiu-Man.
    Thesis (M.Phil.)--Chinese University of Hong Kong, 1993.
    Includes bibliographical references.

    Abstract
    Acknowledgements
    Table of Contents
    Chapter I: Introduction
        1.1 Introduction
        1.2 Survey on Chinese Character Recognition
        1.3 Methodology Adopts in Our System
        1.4 Contributions and Organization of the Thesis
    Chapter II: Pre-processing and Stroke Extraction
        2.1 Introduction
        2.2 Thinning
            2.2.1 Introduction to Thinning
            2.2.2 Proposed Thinning Algorithm Cater for Stroke Extraction
            2.2.3 Thinning Results
        2.3 Stroke Extraction
            2.3.1 Introduction to Stroke Extraction
            2.3.2 Proposed Stroke Extraction Method
                2.3.2.1 Fork point detection
                2.3.2.2 8-connected fork point merging
                2.3.2.3 Sub-stroke extraction
                2.3.2.4 Fork point merging
                2.3.2.5 Sub-stroke connection
            2.3.3 Stroke Extraction Accuracy
            2.3.4 Corner Detection
                2.3.4.1 Introduction to Corner Detection
                2.3.4.2 Proposed Corner Detection Formulation
        2.4 Concluding Remarks
    Chapter III: Four Corner Code
        3.1 Introduction
        3.2 Deletion of Hook Strokes
        3.3 Stroke Types Selection
        3.4 Probability Formulations of Stroke Types
            3.4.1 Simple Strokes
            3.4.2 Square
            3.4.3 Cross
            3.4.4 Upper Right Corner
            3.4.5 Lower Left Corner
        3.5 Corner Segments Extraction Procedure
            3.5.1 Corner Segment Probability
            3.5.2 Corner Segment Extraction
        3.6 4C Codes Generation
        3.7 Parameters Determination
        3.8 Sensitivity Test
        3.9 Classification Rate
        3.10 Feedback by Corner Segments
        3.11 Classification Rate with Feedback by Corner Segment
        3.12 Reasons for Mis-classification
        3.13 Suggested Solution to the Mis-interpretation of Stroke Type
        3.14 Reduce Size of Candidate Set by No. of Input Segments
        3.15 Extension to Higher Order Code
        3.16 Concluding Remarks
    Chapter IV: Relaxation
        4.1 Introduction
            4.1.1 Introduction to Relaxation
            4.1.2 Formulation of Relaxation
            4.1.3 Survey on Chinese Character Recognition by using Relaxation
        4.2 Relaxation Formulations
            4.2.1 Definition of Neighbour Segments
            4.2.2 Formulation of Initial Probability Assignment
            4.2.3 Formulation of Compatibility Function
            4.2.4 Formulation of Support from Neighbours
            4.2.5 Stopping Criteria
            4.2.6 Distance Measures
            4.2.7 Parameters Determination
        4.3 Recognition Rate
        4.4 Reasons for Mis-recognition in Relaxation
        4.5 Introduction of No-label Class
            4.5.1 No-label Initial Probability
            4.5.2 No-label Compatibility Function
            4.5.3 Improvement by No-label Class
        4.6 Rate of Convergence
            4.6.1 Updating Formulae in Exponential Form
        4.7 Comparison with Yamamoto et al's Relaxation Method
            4.7.1 Formulations in Yamamoto et al's Relaxation Method
            4.7.2 Modifications in [YAMAM82]
            4.7.3 Performance Comparison with [YAMAM82]
        4.8 System Overall Recognition Rate
        4.9 Concluding Remarks
    Chapter V: Concluding Remarks
        5.1 Recapitulation and Conclusions
        5.2 Limitations in the System
        5.3 Suggestions for Further Developments
    References
    Appendix: User's Guide
        A.1 System Functions
        A.2 Platform and Compiler
        A.3 File List
        A.4 Directory
        A.5 Description of Sub-routines
        A.6 Data Structures and Header Files
        A.7 Character File charfile Structure
        A.8 Suggested Program to Implement the System

    Bayesian hierarchical modeling for the forensic evaluation of handwritten documents

    The analysis of handwritten evidence has been used widely in courts in the United States since the 1930s (Osborn, 1946). Traditional evaluations are conducted by trained forensic examiners. More recently, there has been a movement toward objective and probability-based evaluation of evidence, and a variety of governing bodies have made explicit calls for research to support the scientific underpinnings of the field (National Research Council, 2009; President's Council of Advisors on Science and Technology (US), 2016; National Institute of Standards and Technology). This body of work makes contributions to help satisfy those needs for the evaluation of handwritten documents. We develop a framework to evaluate a questioned writing sample against a finite set of genuine writing samples from known sources. Our approach is fully automated, reducing the opportunity for cognitive biases to enter the analysis pipeline through regular examiner intervention. Our methods are able to handle all writing styles together, and result in estimated probabilities of writership based on parametric modeling. We contribute open-source datasets, code, and algorithms. A document is prepared for the evaluation process by first being scanned and stored as an image file. The image is processed and the text within is decomposed into a sequence of disjoint graphical structures. The graphs serve as the smallest unit of writing we will consider, and features extracted from them are used as data for modeling. Chapter 2 describes the image processing steps and introduces a distance measure for the graphs. The distance measure is used in a K-means clustering algorithm (Forgy, 1965; Lloyd, 1982; Gan and Ng, 2017), which results in a clustering template with 40 exemplar structures. The primary feature we extract from each graph is a cluster assignment. We make this assignment by comparing each graph to the template and choosing the exemplar to which the graph is most similar in structure. The cluster assignment feature is used for a writer identification exercise using a Bayesian hierarchical model on a small set of 27 writers. In Chapter 3 we incorporate new data sources and a larger number of writers in the clustering algorithm to produce an updated template. A mixture component is added to the hierarchical model and we explore the relationship between a writer's estimated mixing parameter and their writing style. In Chapter 4 we expand the hierarchical model to include other graph-based features, in addition to cluster assignments. We incorporate an angular feature with support on the polar coordinate system into the hierarchical modeling framework using a circular probability density function. The new model is applied and tested in three applications.
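
As a rough illustration of the cluster assignment feature described above, the sketch below assigns each item to its nearest exemplar and counts the assignments per cluster, which is the kind of per-writer count vector a hierarchical model could consume. It makes several assumptions: the thesis's graph distance is not given in the abstract, so a squared Euclidean distance on plain coordinate tuples stands in for it, and the function names `assign_cluster` and `cluster_profile` are hypothetical.

```python
from collections import Counter


def squared_euclidean(a, b):
    """Placeholder distance (an assumption); the thesis defines its own
    distance measure between graphs, which is not reproduced here."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_cluster(item, exemplars, distance=squared_euclidean):
    """Return the index of the exemplar closest to the item."""
    return min(range(len(exemplars)), key=lambda k: distance(item, exemplars[k]))


def cluster_profile(items, exemplars, distance=squared_euclidean):
    """Count how many items of one document fall into each cluster."""
    counts = Counter(assign_cluster(item, exemplars, distance) for item in items)
    return [counts.get(k, 0) for k in range(len(exemplars))]


if __name__ == "__main__":
    # Three toy exemplars instead of the 40 in the template.
    exemplars = [(0, 0), (10, 0), (0, 10)]
    document = [(1, 1), (9, 1), (8, -1), (0, 7)]
    print(cluster_profile(document, exemplars))
```

Swapping in a real graph distance only requires passing a different `distance` callable; the assignment and counting logic stay the same.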