21 research outputs found

    Chinese information processing

    A survey of the field of Chinese information processing is provided. It covers the following areas: the Chinese writing system, several popular Chinese encoding schemes and code conversions, Chinese keyboard entry methods, Chinese fonts, Chinese operating systems, and basic Chinese computing techniques and applications.
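As a rough illustration of the "code conversions" the survey mentions, the sketch below re-encodes Chinese text by decoding to Unicode and encoding again. The choice of GB2312, Big5 and UTF-8, and the helper name `convert_encoding`, are assumptions made for this example; the abstract does not say which schemes or methods the survey covers.

```python
def convert_encoding(data: bytes, src: str, dst: str) -> bytes:
    """Hypothetical helper: decode bytes from `src`, then re-encode as `dst`."""
    # Unicode acts as the pivot, so no direct GB2312 <-> Big5 table is needed.
    return data.decode(src).encode(dst)


text = "中文"  # both characters exist in GB2312 and in Big5
gb_bytes = text.encode("gb2312")
big5_bytes = convert_encoding(gb_bytes, "gb2312", "big5")
utf8_bytes = convert_encoding(gb_bytes, "gb2312", "utf-8")

print(len(gb_bytes), len(big5_bytes), len(utf8_bytes))
```

With the default strict error handling, a character that has no code point in the target character set raises `UnicodeEncodeError`, so a real converter would also need a policy for unmappable characters.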

    Four cornered code based Chinese character recognition system.

    by Tham Yiu-Man. Thesis (M.Phil.), Chinese University of Hong Kong, 1993. Includes bibliographical references.

    Abstract
    Acknowledgements
    Table of Contents
    Chapter I: Introduction
        1.1 Introduction
        1.2 Survey on Chinese Character Recognition
        1.3 Methodology Adopts in Our System
        1.4 Contributions and Organization of the Thesis
    Chapter II: Pre-processing and Stroke Extraction
        2.1 Introduction
        2.2 Thinning
            2.2.1 Introduction to Thinning
            2.2.2 Proposed Thinning Algorithm Cater for Stroke Extraction
            2.2.3 Thinning Results
        2.3 Stroke Extraction
            2.3.1 Introduction to Stroke Extraction
            2.3.2 Proposed Stroke Extraction Method
                2.3.2.1 Fork point detection
                2.3.2.2 8-connected fork point merging
                2.3.2.3 Sub-stroke extraction
                2.3.2.4 Fork point merging
                2.3.2.5 Sub-stroke connection
            2.3.3 Stroke Extraction Accuracy
            2.3.4 Corner Detection
                2.3.4.1 Introduction to Corner Detection
                2.3.4.2 Proposed Corner Detection Formulation
        2.4 Concluding Remarks
    Chapter III: Four Corner Code
        3.1 Introduction
        3.2 Deletion of Hook Strokes
        3.3 Stroke Types Selection
        3.4 Probability Formulations of Stroke Types
            3.4.1 Simple Strokes
            3.4.2 Square
            3.4.3 Cross
            3.4.4 Upper Right Corner
            3.4.5 Lower Left Corner
        3.5 Corner Segments Extraction Procedure
            3.5.1 Corner Segment Probability
            3.5.2 Corner Segment Extraction
        3.6 4-C Codes Generation
        3.7 Parameters Determination
        3.8 Sensitivity Test
        3.9 Classification Rate
        3.10 Feedback by Corner Segments
        3.11 Classification Rate with Feedback by Corner Segment
        3.12 Reasons for Mis-classification
        3.13 Suggested Solution to the Mis-interpretation of Stroke Type
        3.14 Reduce Size of Candidate Set by No. of Input Segments
        3.15 Extension to Higher Order Code
        3.16 Concluding Remarks
    Chapter IV: Relaxation
        4.1 Introduction
            4.1.1 Introduction to Relaxation
            4.1.2 Formulation of Relaxation
            4.1.3 Survey on Chinese Character Recognition by using Relaxation
        4.2 Relaxation Formulations
            4.2.1 Definition of Neighbour Segments
            4.2.2 Formulation of Initial Probability Assignment
            4.2.3 Formulation of Compatibility Function
            4.2.4 Formulation of Support from Neighbours
            4.2.5 Stopping Criteria
            4.2.6 Distance Measures
            4.2.7 Parameters Determination
        4.3 Recognition Rate
        4.4 Reasons for Mis-recognition in Relaxation
        4.5 Introduction of No-label Class
            4.5.1 No-label Initial Probability
            4.5.2 No-label Compatibility Function
            4.5.3 Improvement by No-label Class
        4.6 Rate of Convergence
            4.6.1 Updating Formulae in Exponential Form
        4.7 Comparison with Yamamoto et al's Relaxation Method
            4.7.1 Formulations in Yamamoto et al's Relaxation Method
            4.7.2 Modifications in [YAMAM82]
            4.7.3 Performance Comparison with [YAMAM82]
        4.8 System Overall Recognition Rate
        4.9 Concluding Remarks
    Chapter V: Concluding Remarks
        5.1 Recapitulation and Conclusions
        5.2 Limitations in the System
        5.3 Suggestions for Further Developments
    References
    Appendix: User's Guide
        A.1 System Functions
        A.2 Platform and Compiler
        A.3 File List
        A.4 Directory
        A.5 Description of Sub-routines
        A.6 Data Structures and Header Files
        A.7 Character File charfile Structure
        A.8 Suggested Program to Implement the System

    Feature Extraction Methods for Character Recognition


    Advances in Character Recognition

    This book presents advances in character recognition. It consists of 12 chapters that cover a wide range of topics on different aspects of character recognition. Hopefully, this book will serve as a reference source for academic research, for professionals working in the character recognition field, and for all interested in the subject.

    Handwriting style classification

    This paper describes an independent handwriting style classifier that has been designed to select the best recognizer for a given style of writing. For this purpose, handwriting legibility has been defined and a method implemented that can predict it. The technique consists of two phases. In the feature-extraction phase, a set of 36 features is extracted from the image contour. In the classification phase, two nonparametric classification techniques are applied to the extracted features in order to compare their effectiveness in classifying words into legible, illegible, and middle classes. In the first method, a multiple discriminant analysis (MDA) is used to transform the space of extracted features (36 dimensions) into an optimal discriminant space for a nearest-mean-based classifier. In the second method, a probabilistic neural network (PNN) based on the Bayes strategy and nonparametric estimation of the probability density function is used. The experimental results show that the PNN method gives superior classification results when compared with the MDA method. For two-class problems, the method provides 86.5% (legible/illegible), 65.5% (legible/middle), and 90.5% (middle/illegible) correct classification. For the three-class legibility classification, the rate of correct classification is 67.33% using a PNN classifier.
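The PNN idea named in this abstract (Bayes decision over nonparametric density estimates) can be sketched in generic form as follows. This is a textbook Parzen-window version with Gaussian kernels and equal class priors, not the paper's implementation: the function name `pnn_classify`, the smoothing parameter `sigma`, and the two-dimensional toy vectors (standing in for the 36 contour features) are all assumptions made for illustration.

```python
import math


def pnn_classify(x, train, sigma=1.0):
    """Return (best_label, scores) for feature vector x.

    train maps each class label to a list of training feature vectors.
    Each class score is the mean Gaussian kernel response over that class's
    samples, i.e. a Parzen estimate of the class density up to a constant
    factor. Equal class priors are assumed.
    """
    scores = {}
    for label, samples in train.items():
        total = 0.0
        for s in samples:
            d2 = sum((a - b) ** 2 for a, b in zip(x, s))  # squared distance
            total += math.exp(-d2 / (2.0 * sigma ** 2))
        scores[label] = total / len(samples)
    return max(scores, key=scores.get), scores


# Toy data: made-up 2-D features, not taken from the paper.
train = {
    "legible": [[0.0, 0.0], [0.2, 0.1]],
    "illegible": [[3.0, 3.0], [2.8, 3.1]],
}
label, scores = pnn_classify([0.1, 0.0], train, sigma=0.5)
print(label)
```

The only free parameter in this sketch is `sigma`: small values make the decision behave like a nearest-neighbour rule, while large values smooth the class densities.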

    A novel approach to handwritten character recognition

    A number of new techniques and approaches for off-line handwritten character recognition are presented which individually make significant advancements in the field. First, an outline-based vectorization algorithm is described which gives improved accuracy in producing vector representations of the pen strokes used to draw characters. Later, vectorization and other types of preprocessing are criticized and an approach to recognition is suggested which avoids separate preprocessing stages by incorporating them into later stages. Apart from the increased speed of this approach, it allows more effective alteration of the character images since more is known about them at the later stages. It also allows the possibility of alterations being corrected if they are initially detrimental to recognition. A new feature measurement, the Radial Distance/Sector Area feature, is presented which is highly robust, tolerant to noise, distortion and style variation, and gives high accuracy results when used for training and testing in a statistical or neural classifier. A very powerful classifier is therefore obtained for recognizing correctly segmented characters. The segmentation task is explored in a simple system of integrated over-segmentation, character classification and approximate dictionary checking. This can be extended to a full system for handprinted word recognition. In addition to the advancements made by these methods, a powerful new approach to handwritten character recognition is proposed as a direction for future research. This proposal combines the ideas and techniques developed in this thesis in a hierarchical network of classifier modules to achieve context-sensitive, off-line recognition of handwritten text. A new type of "intelligent" feedback is used to direct the search to contextually sensible classifications. A powerful adaptive segmentation system is proposed which, when used as the bottom layer in the hierarchical network, allows initially incorrect segmentations to be adjusted according to the hypotheses of the higher level context modules.

    Use of prior knowledge in classification of similar and structured objects

    Statistical machine learning has achieved great success in many fields in the last few decades. However, there remain classification problems on which computers still struggle to match human performance. Many such problems share the same properties: large within-class variability and complex structure in the examples, which is often true for real-world objects. This does not mean that the examples lack information for classification. On the contrary, there is still a clear pattern in the examples, but it is hidden behind a many-way covariance structure such that useful information is too dilute for conventional statistical machine learners to pick up. However, if we can exploit the structural nature of the objects and concentrate information about the classification, the problem can become much easier. In this dissertation we propose a framework that uses prior knowledge about modeling the structures in the examples to concentrate information for classification. The framework is instantiated to the task of classifying pairs of similar offline handwritten Chinese characters. We empirically demonstrate that our proposed framework indeed concentrates useful information for classification and makes the classification problem easier for statistical learning. Our approach advances the state of the art both in offline handwritten character recognition and in machine learning.
