600 research outputs found

    Online Handwritten Chinese/Japanese Character Recognition

    Recognition of handwritten Chinese characters by combining regularization, Fisher's discriminant and distorted sample generation

    Proceedings of the 10th International Conference on Document Analysis and Recognition, 2009, p. 1026–1030.
    The problem of offline handwritten Chinese character recognition has been extensively studied by many researchers and very high recognition rates have been reported. In this paper, we propose to further boost the recognition rate by incorporating a distortion model that artificially generates a huge number of virtual training samples from existing ones. We achieve a record high recognition rate of 99.46% on the ETL-9B database. Traditionally, when the dimension of the feature vector is high and the number of training samples is not sufficient, the remedies are to (i) regularize the class covariance matrices in the discriminant functions, (ii) employ Fisher's dimension reduction technique to reduce the feature dimension, and (iii) generate a huge number of virtual training samples from existing ones. The second contribution of this paper is the investigation of the relative effectiveness of these three methods for boosting the recognition rate. © 2009 IEEE.
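Remedy (i) can be illustrated with a small sketch. The abstract does not say which regularizer the authors use, so the code below assumes a common choice: shrinking each class covariance matrix toward a scaled identity with a mixing weight `gamma`. The function names and the toy data are hypothetical.

```python
def sample_covariance(samples):
    """Unbiased sample covariance of a list of equal-length feature vectors."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for s in samples:
        diff = [s[j] - mean[j] for j in range(d)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j]
    return [[c / (n - 1) for c in row] for row in cov]


def shrink_covariance(cov, gamma):
    """Blend cov with (trace/d) * I; gamma=0 keeps cov, gamma=1 gives a scaled identity.

    This is an assumed regularizer, not necessarily the one used in the paper.
    """
    d = len(cov)
    avg_var = sum(cov[i][i] for i in range(d)) / d
    return [
        [(1 - gamma) * cov[i][j] + (gamma * avg_var if i == j else 0.0)
         for j in range(d)]
        for i in range(d)
    ]


# Toy class whose two features are perfectly correlated, so the raw
# covariance is singular and cannot be inverted in a discriminant function.
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
cov = sample_covariance(samples)
reg = shrink_covariance(cov, gamma=0.5)
```

With too few samples per class the raw covariance is often singular or badly conditioned; the shrunken matrix keeps the same trace but is invertible for any `gamma` greater than zero.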

    Online Japanese Character Recognition Using Trajectory-Based Normalization and Direction Feature Extraction

    http://www.suvisoft.com
    This paper describes an online Japanese character recognition system using advanced techniques of pattern normalization and direction feature extraction. The normalization of point coordinates and the decomposition of direction elements are performed directly on the online trajectory and are therefore computationally efficient. We compare one-dimensional and pseudo two-dimensional (pseudo 2D) normalization methods, as well as direction features from the original pattern and from the normalized pattern. In experiments on the TUAT HANDS databases, the pseudo 2D normalization methods yielded superior performance, while direction features from the original pattern and from the normalized pattern made little difference.
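As a rough illustration of decomposing direction elements directly on a pen trajectory, the sketch below splits each stroke segment into components along the two nearest of eight standard directions and accumulates them in a global histogram. The abstract does not give the exact scheme, so the eight-direction layout, the absence of spatial zoning, and the function name `direction_histogram` are all assumptions.

```python
import math

STEP = math.pi / 4  # angular spacing of the 8 assumed directions


def direction_histogram(stroke):
    """Accumulate 8-direction components over the segments of one stroke.

    stroke: list of (x, y) pen positions in drawing order.
    Each segment is written as a*d_k + b*d_(k+1), where d_k and d_(k+1)
    are the unit vectors of the two neighbouring standard directions.
    """
    hist = [0.0] * 8
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue  # repeated point, no direction information
        theta = math.atan2(dy, dx) % (2 * math.pi)
        k_raw = int(theta // STEP)
        phi = theta - k_raw * STEP  # angle past direction k, in [0, STEP)
        a = length * math.sin(STEP - phi) / math.sin(STEP)
        b = length * math.sin(phi) / math.sin(STEP)
        hist[k_raw % 8] += a
        hist[(k_raw + 1) % 8] += b
    return hist


# One horizontal segment of length 2 followed by one diagonal segment.
hist = direction_histogram([(0, 0), (2, 0), (3, 1)])
```

Because the computation needs only consecutive point pairs, no image has to be rendered, which is consistent with the efficiency argument in the abstract. A practical recognizer would typically also pool these components over spatial zones.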

    A Multi-Feature Selection Approach for Gender Identification of Handwriting based on Kernel Mutual Information

    This paper presents a new, flexible approach to predicting the gender of writers from their handwriting samples. Handwriting features such as slant, curvature, line separation, chain code, and character shapes can be extracted by different methods. As a result, the combined multi-feature sets contain irrelevant and redundant features. Conflicts among the features in these sets affect both classification accuracy and computing cost. This paper proposes an approach, named Kernel Mutual Information (KMI), that focuses on feature selection. The KMI approach can decrease redundancies and conflicts. In addition, it extracts an optimal subset of features from the writing samples produced by male and female writers. To ensure that KMI can be applied to the various features, this paper describes the handwriting segmentation and handwritten text recognition technology used. The classification is carried out using a Support Vector Machine (SVM) on two databases. The first database comes from the ICDAR 2013 competition on gender prediction, which provides samples in both Arabic and English. The other is the Registration-Document-Form (RDF) database in Chinese. The proposed and compared methods were evaluated on both databases. The results highlight the importance of feature selection for gender prediction from handwriting.
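The idea of scoring features by how much they reveal about the class label can be sketched with ordinary discrete mutual information. This is not the paper's KMI method, whose kernel formulation the abstract does not describe; it is a simpler stand-in. The feature names and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information (in nats) between a discrete feature and the labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    p_x = Counter(feature)
    p_y = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log(p_xy / ((p_x[x] / n) * (p_y[y] / n)))
    return mi


def rank_features(columns, labels):
    """Return feature names ordered from most to least informative."""
    scores = {name: mutual_information(vals, labels)
              for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: one binned feature tracks the label exactly, the other is unrelated.
labels = ["F", "F", "M", "M"]
columns = {
    "noise_bin": [0, 1, 0, 1],
    "slant_bin": [0, 0, 1, 1],
}
ranking = rank_features(columns, labels)
```

A selection step would keep only the top-ranked features before training the SVM. Ranking features one at a time ignores redundancy between them, which is one of the issues the paper's approach is said to address.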

    Accuracy improvement in Odia zip code recognition technique

    Odia is a very popular language in India, used by more than 45 million people worldwide, especially in the eastern region of India. The recognition schemes proposed for foreign languages such as Roman, Japanese, Chinese and Arabic cannot be applied directly to Odia because of the different structure of the Odia script. Hence, this report deals with the recognition of Odia numerals while accounting for varying styles of handwriting. The main purpose is to apply the recognition scheme to zip code extraction and number plate recognition. Two methods, the "gradient and curvature method" and the "box-method approach", are used to calculate the features of the preprocessed scanned document image. Features from both methods are used to train an artificial neural network on a large number of samples of each numeral. A sufficient number of test samples is used, and the results from both feature sets are compared. Principal component analysis is applied to reduce the dimension of the feature vector so as to help further processing. The box-method features of an unknown numeral are correlated with those of the standard numerals. With neural networks, the average recognition accuracies using gradient and curvature features and box-method features are found to be 93.2 and 88.1, respectively.