199 research outputs found

    Handwritten Character Recognition of South Indian Scripts: A Review

    Handwritten character recognition remains a frontier area of research in pattern recognition and image processing, and there is a large demand for OCR on handwritten documents. Although many studies have been performed on foreign scripts such as Chinese, Japanese and Arabic, only a few works can be traced for handwritten character recognition of Indian scripts, especially the South Indian scripts. This paper provides an overview of offline handwritten character recognition in South Indian scripts, namely Malayalam, Tamil, Kannada and Telugu. Comment: Paper presented at the "National Conference on Indian Language Computing", Kochi, February 19-20, 2011. 6 pages, 5 figures

    Malayalam Handwritten Character Recognition using CNN Architecture

    The process of encoding an input text image into a machine-readable format is called optical character recognition (OCR). Differences in the characteristics of each language make it difficult to develop a universal method with high accuracy for all languages: a method that produces good results for one language may not produce the same results for another. OCR for printed characters is easier than for handwritten characters because of the uniformity of printed characters. While conventional approaches have struggled to improve on existing results, Convolutional Neural Networks (CNNs) have shown drastic improvements in classification and recognition for other languages. However, there is no CNN-based OCR model for Malayalam characters. Our proposed system uses a new CNN architecture for feature extraction and a softmax layer for classification of characters. This eliminates the manual feature design used in conventional methods. The P-ARTS Kayyezhuthu dataset is used for training the CNN, and an accuracy of 99.75% is obtained on the testing dataset, while a collection of 40 real-time input images yielded an accuracy of 95%.
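    The softmax classification step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the scores and the class names (class_a, class_b, class_c) are hypothetical placeholders, and the CNN feature extractor that would produce the scores is omitted.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtract the largest score first so exp() cannot overflow.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, labels):
    """Return the most probable label and its softmax probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical scores for three placeholder character classes.
label, confidence = classify([2.0, 0.5, -1.0], ["class_a", "class_b", "class_c"])
```

    In a full network, the scores would come from the last fully connected layer, and the label with the highest probability would be reported as the recognised character.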

    Performance Measure of Scanned Tamil Land Documents using Neural Network Approach

    Recognition is the process of recovering an accurate image from a noisy or distorted one. This work implements a classification and recognition technique on scanned Tamil land documents. In the pre-processing stage, the dataset is filtered using a median filter. Segmentation is then applied to split every word image into characters, and features are extracted using a Gabor wavelet. In the post-processing stage, a neural network classifier, trained with either a supervised or an unsupervised learning method, is evaluated with a confusion matrix to measure correct and incorrect classifications. The Gabor wavelet technique is thus used for feature extraction and selection, and classification is carried out with a pattern recognition technique implemented in MATLAB. Performance is also reported through plots of the training state, regression, gradient and validation. DOI: 10.17762/ijritcc2321-8169.150519
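    The confusion-matrix evaluation described in this abstract can be sketched as follows. This is an illustration in Python rather than the MATLAB implementation the abstract refers to, and the labels and predictions are made-up placeholders, not results from the paper.

```python
def confusion_matrix(actual, predicted, labels):
    """Build a matrix where entry [i][j] counts samples of true
    class labels[i] that were predicted as labels[j]."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct classifications)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Placeholder data: three hypothetical character classes.
labels = ["a", "b", "c"]
actual = ["a", "a", "b", "b", "c", "c"]
predicted = ["a", "b", "b", "b", "c", "a"]
matrix = confusion_matrix(actual, predicted, labels)
```

    Diagonal entries count correct classifications and off-diagonal entries count incorrect ones, which is the correct/incorrect measure the abstract mentions.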

    TEXT CONTENT DEPENDENT WRITER IDENTIFICATION

    Text-content-based personal identification is vital for resolving the problem of identifying the writer of an unknown document using a set of handwritten samples from alleged known writers. Text written on a paper document is usually captured as an image by a scanner or camera for computer processing. The most challenging problem encountered in text image processing is the extraction of a robust feature vector from a set of inconsistent handwritten text images obtained from the same writer at different times. In this work, a new feature extraction method is used to produce active text features for developing an effective personal identification system. The features form a feature vector, which is fed as input into a classification algorithm based on the Support Vector Machine (SVM). An experiment was conducted to identify the writers of query handwritten texts. The results show satisfactory performance: the proposed system was able to identify the writers of query handwritten texts.

    Spectral Analysis of Projection Histogram for Enhancing Close matching character Recognition in Malayalam

    The success rates of Optical Character Recognition (OCR) systems for printed Malayalam documents are quite impressive, with state-of-the-art accuracy levels in the range of 85-95% for various. However, for real applications, further enhancement of these accuracy levels is required. One of the bottlenecks to further enhancement of the accuracy is identified as close-matching characters. In this paper, we delineate the close-matching characters in Malayalam and report the development of a specialised classifier for them. The output of a state-of-the-art OCR system is taken, and characters falling into the close-matching character set are further fed into this specialised classifier to enhance the accuracy. The classifier is based on the support vector machine algorithm and uses feature vectors derived from spectral coefficients of the projection histogram signals of close-matching characters.
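    The feature construction described in this abstract, spectral coefficients of projection histograms, might look like the sketch below. The abstract does not name the transform, so the discrete Fourier transform used here is an assumption, as are the tiny binary glyph and the choice of three coefficients per histogram. The SVM stage is omitted.

```python
import cmath


def projection_histogram(image, axis="horizontal"):
    """Count foreground pixels (value 1) per row ("horizontal")
    or per column ("vertical") of a binary image."""
    if axis == "horizontal":
        return [sum(row) for row in image]
    return [sum(col) for col in zip(*image)]


def spectral_coefficients(signal, count):
    """Magnitudes of the first `count` DFT coefficients of a 1-D signal.
    The DFT is an assumed choice of spectral transform."""
    n = len(signal)
    coeffs = []
    for k in range(count):
        total = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        coeffs.append(abs(total))
    return coeffs


# Hypothetical 4x4 binary glyph (1 = ink, 0 = background).
glyph = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [1, 0, 0, 1],
]
rows = projection_histogram(glyph, "horizontal")
cols = projection_histogram(glyph, "vertical")
# Concatenate both spectra into one feature vector for a classifier.
features = spectral_coefficients(rows, 3) + spectral_coefficients(cols, 3)
```

    Two glyphs that differ only in a small stroke give slightly different histograms, and the spectral magnitudes offer a compact way to expose that difference to the classifier.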