21 research outputs found

    Handwritten digit recognition of Indian scripts: a cascade of distances approach


    Hand Written Odia Character Recognition

    The world is moving fast towards digitization. In an age of very fast computation, information has to be digitized so that a computer can understand and process it. Optical character recognition (OCR) is a method by which a computer is made to learn, understand and interpret the languages used and written by human beings. It provides a whole new way for computers to interact with human beings in their own languages. Hence OCR has been a topic of interest for researchers all around the globe in the past decade, and the number of research papers on OCR is increasing day by day. Efficient algorithms have increased the speed and accuracy of character recognition. A substantial amount of work has been done on languages such as English and Chinese, but very few papers address Indian languages, barring a few for Hindi and Bengali. Hence our research work was directed towards the development of a novel algorithm for Odia character recognition. Odia is one of the eighteen languages recognized by the Indian Constitution. It is also one of the oldest languages and is spoken by more than 44 million people in the state of Odisha. Recognition of this language is difficult because of a number of similar-looking characters and the presence of complex characters. A novel feature extraction technique is proposed and implemented, whereby a set of 81 feature vectors is extracted to uniquely identify a particular character. Recognition is based on finding the minimum error using the Euclidean distance method. With this technique, accuracy was found to be about 70%, which is much better than many earlier techniques.
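
The minimum-distance step described in this abstract can be illustrated with a small sketch. It assumes that each class is represented by a single reference feature vector and that the predicted class is the one with the smallest Euclidean distance. The abstract does not say how the references are stored, so the `templates` dictionary, the labels and the 4-value toy vectors (standing in for the 81 extracted features) are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(sample, templates):
    """Return the label of the reference vector closest to `sample`.

    `templates` maps a class label to one reference feature vector.
    This layout is an assumption made for the sketch, not the paper's design.
    """
    return min(templates, key=lambda label: euclidean_distance(sample, templates[label]))


# Toy reference vectors; a real system would use the full extracted feature set.
templates = {
    "ka": [0.9, 0.1, 0.4, 0.7],
    "kha": [0.2, 0.8, 0.5, 0.1],
}
predicted = classify([0.8, 0.2, 0.5, 0.6], templates)
print(predicted)
```

A single template per class keeps the sketch short; averaging several training samples per class, or keeping all of them and taking the nearest, are equally plausible readings of the abstract.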

    Handwritten Character Recognition of a Vernacular Language: The Odia Script

    Optical character recognition (OCR) is the electronic or mechanical translation of images from printed, handwritten or typewritten sources into an editable version. Of late, OCR technology has been used in most industries for better management of documents. OCR makes it possible to edit the text, search for a word or phrase, store it more compactly in computer memory for future use, and process it with other applications. In India, a few organizations have designed OCR for some mainstream Indic scripts and languages, for example Devanagari, Hindi, Bangla and, to some extent, Telugu, Tamil, Gurmukhi and Odia. However, progress in Odia script recognition has been noticeably slower than for the others. Any recognition process relies on some standard database, and so far no such standard database for Odia script is available in the literature. Apart from using the existing standard databases for other Indic languages, in this thesis we have designed databases of handwritten Odia digits and characters for the simulation of the proposed schemes. Four schemes are suggested, one for the recognition of Odia digits and three for atomic Odia characters. Various issues of handwritten character recognition are examined, including feature extraction, the grouping of samples based on some characteristics, and the design of classifiers. Different features of a character, statistical as well as structural, are also studied. A character written by the same person does not always have the same shape and strokes, and this variability in individual handwriting makes character recognition quite challenging. Standard classifiers have been utilized for the recognition of the Odia character set. An array of Gabor filters has been employed for the recognition of Odia digits. Each image is divided into four blocks of equal size. Gabor filters with various scales and orientations are applied to these sub-images, keeping the other filter parameters constant. The average energy is computed for each transformed image to obtain a feature vector for each digit. A back propagation neural network (BPNN) is then employed to classify the samples, taking the feature vector as input. The proposed scheme has also been tested on standard digit databases such as MNIST and USPS. At the end of this part, an application has been designed to evaluate simple arithmetic equations. A multi-resolution scheme has been suggested to extract features from Odia atomic characters and recognize them using the back propagation neural network. It has been observed that a few Odia characters have a vertical line at the end. This helps in dividing the whole dataset into two subgroups, Group I and Group II, such that all characters in Group I have a vertical line and the rest are in Group II. This two-class classification problem is handled by a single-layer perceptron. The two-dimensional Discrete Orthogonal S-Transform (DOST) coefficients are then extracted from the images of each group, and Principal Component Analysis (PCA) is applied to find significant features. For each group, a separate BPNN classifier is utilized to recognize the character set.
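
The Gabor feature step for digits (four equal blocks, a bank of filters, average energy per filtered block) could look roughly like the sketch below. It is a plain-Python illustration under stated assumptions, not the thesis implementation: "scale" is interpreted here as the carrier wavelength, the kernel size, `sigma` and `gamma` are arbitrary fixed values, only the real part of the filter is used, and the function names are hypothetical.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma=2.0, gamma=0.5):
    """Real part of a Gabor kernel as a size x size nested list (size must be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier oscillates along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2 * sigma * sigma))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel


def average_energy(block, kernel):
    """Mean squared filter response over the positions where the kernel fits."""
    k = len(kernel)
    rows, cols = len(block), len(block[0])
    total, count = 0.0, 0
    for i in range(rows - k + 1):
        for j in range(cols - k + 1):
            response = sum(kernel[u][v] * block[i + u][j + v]
                           for u in range(k) for v in range(k))
            total += response * response
            count += 1
    return total / count if count else 0.0


def split_into_four(image):
    """Split an image (nested list) into four equally sized blocks."""
    half_rows, half_cols = len(image) // 2, len(image[0]) // 2
    return [[row[c:c + half_cols] for row in image[r:r + half_rows]]
            for r in (0, half_rows) for c in (0, half_cols)]


def gabor_features(image, wavelengths=(2.0, 4.0), orientations=4, size=5):
    """One average-energy value per (block, wavelength, orientation) triple."""
    features = []
    for block in split_into_four(image):
        for wavelength in wavelengths:
            for n in range(orientations):
                theta = n * math.pi / orientations
                features.append(average_energy(block, gabor_kernel(size, wavelength, theta)))
    return features


# Toy 16 x 16 image with vertical stripes (period of 4 pixels).
image = [[1.0 if (c // 2) % 2 == 0 else 0.0 for c in range(16)] for _ in range(16)]
features = gabor_features(image)
print(len(features))
```

With two wavelengths and four orientations the sketch yields 4 × 2 × 4 = 32 values per image; the resulting vector would then be fed to a classifier such as the BPNN mentioned in the abstract.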

    Development of Features for Recognition of Handwritten Odia Characters

    In this thesis, we propose four different schemes for the recognition of handwritten atomic Odia characters, which include forty-seven alphabets and ten numerals. Odia is the mother tongue of the state of Odisha in the Republic of India. Optical character recognition (OCR) for many languages is quite mature and industry-standard OCR systems are already available, but for the Odia language OCR is still a challenging task. Further, the features described for other languages cannot be directly utilized for Odia character recognition, for either printed or handwritten text. Thus, the prime thrust has been to propose features and utilize a classifier to achieve a significant recognition accuracy. Due to the non-availability of a handwritten Odia database for validation of the proposed schemes, we have collected samples from individuals through a digital note maker to generate a large database. The database consists of a total of 17,100 samples (150 × 2 × 57), collected from 150 individuals at two different times for 57 characters. This database has been named Odia handwritten character set version 1.0 (OHCS v1.0) and is made available at http://nitrkl.ac.in/Academic/Academic_Centers/Centre_For_Computer_Vision.aspx for the use of researchers. The first scheme divides the contour of each character into thirty segments. Taking the centroid of the character as the base point, three primary features (length, angle, and chord-to-arc ratio) are extracted from each segment. Thus, there are 30 feature values for each primary attribute and a total of 90 feature points. A back propagation neural network has been employed for recognition, and performance comparisons are made with competing schemes. The second contribution concerns feature reduction of the primary features derived in the first contribution. A fuzzy inference system has been employed to generate an aggregated feature vector of size 30 from the 90 feature points, representing the most significant features for each character. For recognition, a six-state hidden Markov model (HMM) is employed for each character, and as a consequence we have fifty-seven ergodic HMMs with six states each. An accuracy of 84.5% has been achieved on our dataset. The third contribution involves the selection of evidence, namely the most informative local shape contour features. A dedicated distance metric, far_count, is used to compute the information gain values for possible segments of different lengths extracted from the whole shape contour of a character. The segment with the highest information gain value is treated as the evidence and mapped to the corresponding class. An evidence dictionary is built from the evidence of all character classes and is used for testing. An overall testing accuracy of 88% is obtained. The final contribution deals with the development of a hybrid feature derived from the discrete wavelet transform (DWT) and the discrete cosine transform (DCT). Experimentally, it has been observed that a 3-level DWT decomposition with 72 DCT coefficients from each high-frequency component as features gives a testing accuracy of 86% with a neural classifier. The suggested features are studied in isolation, and extensive simulations have been carried out along with other existing schemes using the same dataset. Further, to study the generalization behavior of the proposed schemes, they are applied to English and Bangla handwritten datasets. Performance parameters such as recognition rate and misclassification rate are computed and compared. As we progress from one contribution to the next, each proposed scheme is compared with the earlier ones.
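
The first scheme (thirty contour segments, three features per segment, 90 values in total) can be sketched as follows. The abstract does not define the three features precisely, so this sketch makes its own assumptions: "length" is taken as the distance from the centroid to the segment's middle point, "angle" as the direction of that same vector, and "chord-to-arc ratio" as the straight-line distance between the segment's end points divided by the path length along the segment. The function name and the toy contour are hypothetical.

```python
import math


def contour_features(contour, segments=30):
    """Return 3 * segments values: all lengths, then all angles, then all ratios.

    `contour` is an ordered list of (x, y) points along a closed contour.
    The feature definitions are assumptions made for this sketch.
    """
    n = len(contour)
    if n < 2 * segments:
        raise ValueError("contour needs at least two points per segment")
    cx = sum(x for x, _ in contour) / n
    cy = sum(y for _, y in contour) / n
    lengths, angles, ratios = [], [], []
    for s in range(segments):
        start = s * n // segments
        end = (s + 1) * n // segments
        # Include the first point of the next segment so neighbours share an end point.
        pts = [contour[i % n] for i in range(start, end + 1)]
        mid = pts[len(pts) // 2]
        lengths.append(math.hypot(mid[0] - cx, mid[1] - cy))
        angles.append(math.atan2(mid[1] - cy, mid[0] - cx))
        chord = math.dist(pts[0], pts[-1])
        arc = sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))
        ratios.append(chord / arc if arc > 0 else 1.0)
    return lengths + angles + ratios


# Toy closed contour: 120 points on a unit circle.
circle = [(math.cos(2 * math.pi * k / 120), math.sin(2 * math.pi * k / 120))
          for k in range(120)]
features = contour_features(circle)
print(len(features))
```

On a circle every segment is equally far from the centroid and almost straight, so the lengths are all close to 1 and the ratios are just below 1; strongly curved segments of a real character would give smaller ratios.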

    DTW-Radon-based Shape Descriptor for Pattern Recognition

    In this paper, we present a pattern recognition method that uses dynamic programming (DP) for the alignment of Radon features. The key characteristic of the method is to use dynamic time warping (DTW) to match corresponding pairs of the Radon features for all possible projections. Thanks to DTW, we avoid compressing the feature matrix into a single vector, which would otherwise lose information. To reduce the possible number of matchings, we rely on an initial normalisation based on the pattern orientation. A comprehensive study is made using major state-of-the-art shape descriptors over several public datasets of shapes such as graphical symbols (both printed and hand-drawn), handwritten characters and footwear prints. In all tests, the method proves its generic behaviour by providing better recognition performance. Overall, we validate that our method is robust to shape deformation caused by distortion, degradation and occlusion.
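
As a rough illustration of the alignment idea, the sketch below implements classic dynamic time warping between two 1-D profiles and sums the costs over corresponding profiles. It is a minimal sketch, not the paper's method: the abstract does not specify the local cost, how projections are paired, or how the orientation normalisation is applied, so the absolute-difference cost and the helper `profile_set_distance` are assumptions, and the toy lists merely stand in for Radon projection profiles.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # repeat b[j - 1]
                                     cost[i][j - 1],      # repeat a[i - 1]
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def profile_set_distance(profiles_a, profiles_b):
    """Sum of DTW costs over corresponding profiles (hypothetical pairing rule)."""
    if len(profiles_a) != len(profiles_b):
        raise ValueError("both shapes need the same number of profiles")
    return sum(dtw_distance(p, q) for p, q in zip(profiles_a, profiles_b))


# A stretched copy of a profile aligns at zero cost; a shifted level does not.
stretched = dtw_distance([0, 1, 2, 1, 0], [0, 1, 1, 2, 1, 0])
shifted = dtw_distance([0, 0], [1, 1])
print(stretched, shifted)
```

Because the warping path may repeat elements, the two profiles can differ in length, which is what lets the whole feature matrix be compared without first compressing it into a fixed-size vector.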

    Hand-written character recognition using artificial neural network

    In today's world, advances in sophisticated scientific techniques are pushing the limits of human reach in various fields of technology. One such field is character recognition, commonly known as OCR (Optical Character Recognition). In this fast-paced world there is a strong need for the digitization of printed documents and for recording information directly in digital form, and there is still a gap in this area even today. OCR techniques and their continuous improvement are trying to fill this gap. This project is about devising an algorithm for the recognition of handwritten characters, leaving aside the types of OCR that deal with computer-printed or typewritten characters. A novel technique using an artificial neural network (ANN), including the feature extraction schemes for the characters, is proposed and implemented. The consistency of character recognition by the ANN was found to be more than 90%.

    An empirical study on writer identification and verification from intra-variable individual handwriting

    © 2013 IEEE. The handwriting of a person may vary substantially with factors such as mood, time, space, writing speed, writing medium/tool, writing topic, and so on. It becomes challenging to perform automated writer verification/identification on a particular set of handwritten patterns (e.g., speedy handwriting) of an individual, especially when the system is trained using a different set of writing patterns (e.g., normal speed) of that same person. However, it would be interesting to experimentally analyze whether there exists any implicit characteristic of individuality that is insensitive to high intra-variable handwriting. In this paper, we study some handcrafted features and auto-derived features extracted from intra-variable writing. Here, we work on writer identification/verification from highly intra-variable offline Bengali writing. To this end, we use various models, mainly based on handcrafted features with a support vector machine and on features auto-derived by a convolutional network. For experimentation, we have generated two handwritten databases from two different sets of 100 writers and enlarged the dataset by a data-augmentation technique. We have obtained some interesting results.