6 research outputs found

    Rejection Strategies with Multiple Classifiers for Handwritten Character Recognition

    Confidence-Scoring Post-Processing for Off-Line Handwritten-Character Recognition Verification

    We apply confidence-scoring techniques to verify the output of an off-line handwritten-character recognizer. We evaluate a variety of scoring functions, including likelihood ratios and estimated posterior probabilities of correctness, in a post-processing mode, to generate confidence scores. Using the post-processor in conjunction with a neural-net-based recognizer, on mixed-case letters, receiver-operating-characteristic (ROC) curves reveal that our post-processor is able to reject correctly 90% of recognizer errors while only falsely rejecting 18.6% of correctly-recognized letters. For isolated-digit recognition, we achieve a correct rejection rate of 95% while keeping false rejection down to 8.7%.
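The two figures quoted in this abstract (share of recognizer errors rejected, share of correct outputs falsely rejected) can be illustrated with a minimal sketch. It assumes a simple threshold rule, where an output is rejected when its confidence score falls below a cutoff; the function names, the threshold rule, and the toy data are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def rejection_rates(
    confidences: Sequence[float],
    is_correct: Sequence[bool],
    threshold: float,
) -> Tuple[float, float]:
    """Return (correct_rejection_rate, false_rejection_rate).

    Assumption: an output is rejected when its confidence < threshold.
    correct_rejection_rate = rejected errors / all recognizer errors
    false_rejection_rate   = rejected correct outputs / all correct outputs
    """
    errors_total = errors_rejected = 0
    correct_total = correct_rejected = 0
    for score, ok in zip(confidences, is_correct):
        rejected = score < threshold
        if ok:
            correct_total += 1
            correct_rejected += int(rejected)
        else:
            errors_total += 1
            errors_rejected += int(rejected)
    crr = errors_rejected / errors_total if errors_total else 0.0
    frr = correct_rejected / correct_total if correct_total else 0.0
    return crr, frr


def roc_points(
    confidences: Sequence[float], is_correct: Sequence[bool]
) -> List[Tuple[float, float, float]]:
    """Sweep the threshold over every observed score (plus infinity) and
    return (threshold, correct_rejection_rate, false_rejection_rate) triples."""
    thresholds = sorted(set(confidences)) + [float("inf")]
    return [(t, *rejection_rates(confidences, is_correct, t)) for t in thresholds]


# Toy data (made up for illustration): confidence scores and whether the
# recognizer's output was actually correct.
scores = [0.9, 0.8, 0.3, 0.2, 0.6, 0.4]
labels = [True, True, False, False, True, False]
print(rejection_rates(scores, labels, 0.5))
```

Sweeping the threshold with `roc_points` traces the trade-off that an ROC curve plots: raising the cutoff rejects more errors but also more correctly recognized characters.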

    Improving digital ink interpretation through expected type prediction and dynamic dispatch

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references (p. 67-70). Interpretation accuracy of current applications dependent on interpretation of handwritten "digital ink" can be improved by providing contextual information about an ink sample's expected type. This expected type, however, has to be known or provided a priori, and poses several challenges if unknown or ambiguous. We have developed a novel approach that uses a classic machine learning technique to predict this expected type from an ink sample. By extracting many relevant features from the ink, and performing generic dimensionality reduction, we can obtain a minimum prediction accuracy of 89% for experiments involving up to five different expected types. With this approach, we can create a "dynamic dispatch interpreter" by biasing interpretation differently according to the predicted expected types of the ink samples. When evaluated in the domain of introductory computer science, our interpreter achieves high interpretation accuracy (87%), an improvement from Microsoft's default interpreter (62%), and comparable with other previous interpreters (87-89%), which, unlike ours, require additional expected type information for each ink sample. By Kah Seng Tay. M.Eng.

    Reliable pattern recognition system with novel semi-supervised learning approach

    Over the past decade, there has been considerable progress in the design of statistical machine learning strategies, including Semi-Supervised Learning (SSL) approaches. However, researchers still have difficulties in applying most of these learning strategies when two or more classes overlap, and/or when each class has a bimodal/multimodal distribution. In this thesis, an efficient, robust, and reliable recognition system with a novel SSL scheme has been developed to overcome overlapping problems between two classes and bimodal distribution within each class. This system was based on the nature of category learning and recognition to enhance the system's performance in relevant applications. In the training procedure, besides the supervised learning strategy, the unsupervised learning approach was applied to retrieve the "extra information" that could not be obtained from the images themselves. This approach was very helpful for the classification between two confusing classes. In this SSL scheme, both the training data and the test data were utilized in the final classification. In this thesis, the design of a promising supervised learning model with advanced state-of-the-art technologies is firstly presented, and a novel rejection measurement for verification of rejected samples, namely Linear Discriminant Analysis Measurement (LDAM), is defined. Experiments on CENPARMI's Hindu-Arabic Handwritten Numeral Database, CENPARMI's Numerals Database, and NIST's Numerals Database were conducted in order to evaluate the efficiency of LDAM. Moreover, multiple verification modules, including a Writing Style Verification (WSV) module, have been developed according to four newly defined error categories. The error categorization was based on the different costs of misclassification. 
    The WSV module has been developed by the unsupervised learning approach to automatically retrieve the person's writing styles so that the rejected samples can be classified and verified accordingly. As a result, errors on CENPARMI's Hindu-Arabic Handwritten Numeral Database (24,784 training samples, 6,199 testing samples) were reduced drastically from 397 to 59, and the final recognition rate of this HAHNR reached 99.05%, a significantly higher rate compared to other experiments on the same database. When the rejection option was applied on this database, the recognition rate, error rate, and reliability were 97.89%, 0.63%, and 99.28%, respectively.