Handwritten Character Recognition of South Indian Scripts: A Review
Handwritten character recognition has long been a frontier area of research in
the field of pattern recognition and image processing, and there is a large
demand for OCR on handwritten documents. Although sufficient studies have been
performed on foreign scripts such as Chinese, Japanese and Arabic, only
very few works can be traced for handwritten character recognition of Indian
scripts, especially the South Indian scripts. This paper provides an
overview of offline handwritten character recognition in South Indian scripts,
namely Malayalam, Tamil, Kannada and Telugu.
Comment: Paper presented at the "National Conference on Indian Language
Computing", Kochi, February 19-20, 2011. 6 pages, 5 figures
Malayalam Handwritten Character Recognition using CNN Architecture
Optical character recognition (OCR) is the process of encoding an input text image into a machine-readable format. Differences in the characteristics of each language make it difficult to develop a universal method with high accuracy for all languages: a method that produces good results for one language may not produce the same results for another. OCR for printed characters is easier than for handwritten characters because of the uniformity of printed characters. While conventional methods have found it hard to improve on existing results, Convolutional Neural Networks (CNNs) have shown drastic improvement in the classification and recognition of other languages. However, there is no OCR model using a CNN for Malayalam characters. Our proposed system uses a new CNN architecture for feature extraction and a softmax layer for classification of characters. This eliminates the manual design of features used in conventional methods. The P-ARTS Kayyezhuthu dataset is used for training the CNN; an accuracy of 99.75% is obtained on the testing dataset, while a collection of 40 real-time input images yielded an accuracy of 95%.
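The pipeline of convolutional feature extraction followed by a softmax classifier can be illustrated with a minimal forward pass. This is only a sketch under stated assumptions, not the architecture from the paper: the two 3x3 kernels and the dense weights are hand-set for illustration rather than learned, the 6x6 stroke images are toy inputs, and all function names are hypothetical.

```python
import math

def conv2d_valid(image, kernel):
    """Slide the kernel over the image with stride 1 and no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def relu(fmap):
    """Zero out negative activations."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool2x2(fmap):
    """Keep the largest value in each non-overlapping 2x2 block."""
    return [[max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
             for c in range(0, len(fmap[0]) - 1, 2)]
            for r in range(0, len(fmap) - 1, 2)]

def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(image, kernels, weights, biases):
    """Conv -> ReLU -> pool per kernel, flatten, dense layer, softmax."""
    features = []
    for kernel in kernels:
        pooled = max_pool2x2(relu(conv2d_valid(image, kernel)))
        features.extend(v for row in pooled for v in row)
    logits = [sum(w * f for w, f in zip(ws, features)) + b
              for ws, b in zip(weights, biases)]
    probs = softmax(logits)
    return probs.index(max(probs)), probs

# Hand-set (not learned) kernels: vertical-line and horizontal-line detectors.
VERTICAL = [[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]]
HORIZONTAL = [[-1, -1, -1], [2, 2, 2], [-1, -1, -1]]
KERNELS = [VERTICAL, HORIZONTAL]

# Each kernel yields 4 pooled features; class 0 reads the vertical ones,
# class 1 the horizontal ones. These weights are illustrative assumptions.
WEIGHTS = [[1, 1, 1, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 1, 1, 1]]
BIASES = [0.0, 0.0]

# Toy 6x6 images: a vertical stroke in column 2, a horizontal stroke in row 2.
V_STROKE = [[1 if c == 2 else 0 for c in range(6)] for r in range(6)]
H_STROKE = [[1 if r == 2 else 0 for c in range(6)] for r in range(6)]

v_class, v_probs = classify(V_STROKE, KERNELS, WEIGHTS, BIASES)
h_class, h_probs = classify(H_STROKE, KERNELS, WEIGHTS, BIASES)
```

In a trained network the kernels and dense weights would be fitted by backpropagation on labelled character images. The hand-set values here only make the data flow visible.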
Performance Measure of Scanned Tamil Land Documents using Neural Network Approach
Recognition is the process of recovering an accurate image from a noisy or distorted one. This work applies a classification and recognition technique to scanned Tamil land documents. In the pre-processing stage, the dataset is filtered with a median filter. Segmentation is then applied to split every word image into characters, and features are extracted with a Gabor wavelet. In the post-processing stage, classification is performed with a neural network, using a supervised or unsupervised learning method, and the correct and incorrect classifications are measured with a confusion matrix. In summary, the Gabor wavelet technique is used for feature extraction and selection, and classification is then performed with a new pattern recognition technique implemented in MATLAB. We also report performance plots such as training state, regression, gradient and validation.
DOI: 10.17762/ijritcc2321-8169.150519
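Two of the steps named in the abstract above, median filtering and evaluation with a confusion matrix, can be sketched briefly. This is a hedged illustration only: the paper used MATLAB, whereas this sketch is in Python, and the 3x3 window, the border handling, the function names and the class labels "ka" and "pa" are all assumptions. Segmentation and Gabor feature extraction are not shown.

```python
from statistics import median_low

def median_filter3x3(image):
    """Replace each pixel by the median of its 3x3 neighbourhood.

    Border pixels use only the neighbours that exist (an assumption;
    other border policies such as padding are equally common).
    median_low keeps the result an actual pixel value from the window.
    """
    h, w = len(image), len(image[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [image[i][j]
                      for i in range(max(0, r - 1), min(h, r + 2))
                      for j in range(max(0, c - 1), min(w, c + 2))]
            row.append(median_low(window))
        out.append(row)
    return out

def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {label: k for k, label in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

def accuracy(matrix):
    """Share of samples on the diagonal (correct classifications)."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# A single bright noise pixel in a dark patch is removed by the filter.
noisy = [[0, 0, 0], [0, 255, 0], [0, 0, 0]]
cleaned = median_filter3x3(noisy)

# Hypothetical labels for four segmented characters.
cm = confusion_matrix(["ka", "ka", "pa", "pa"],
                      ["ka", "pa", "pa", "pa"],
                      classes=["ka", "pa"])
acc = accuracy(cm)
```

Off-diagonal entries of the matrix show which character pairs are confused, which a single accuracy figure hides.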
TEXT CONTENT DEPENDENT WRITER IDENTIFICATION
A text-content-based personal identification system is vital in resolving the problem of identifying an unknown document’s writer using a
set of handwritten samples from alleged known writers. Text written on a paper document is usually captured as an image by a scanner
or camera for computer processing. The most challenging problem encountered in text image processing is the extraction of a robust
feature vector from a set of varying handwritten text images obtained from the same writer at different times. In this work, a new
feature extraction method is used to produce active text features for developing an effective personal identification system.
The features form a feature vector, which is fed as input data into a classification algorithm based on Support Vector Machine
(SVM). An experiment was conducted to identify the writers of query handwritten texts. The results show satisfactory performance:
the proposed system was able to identify the writers of query handwritten texts.
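The final step, feeding a feature vector to an SVM-based classifier, can be sketched with a small linear SVM trained by sub-gradient descent on the hinge loss. This is a minimal sketch under stated assumptions: the abstract does not specify the features, the kernel or the training procedure, so the two-dimensional feature vectors, the two-writer setup, the hyperparameters and all names below are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(samples, labels, epochs=50, lr=0.1, reg=0.01):
    """Minimise hinge loss plus L2 regularisation by sub-gradient descent.

    labels must be +1 or -1. Samples are visited in a fixed order so the
    result is deterministic.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            if y * (dot(w, x) + b) < 1:
                # Inside the margin or misclassified: move towards y * x.
                w = [wi - lr * (reg * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Correct with enough margin: only the regulariser acts.
                w = [wi - lr * reg * wi for wi in w]
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if dot(w, x) + b >= 0 else -1

# Hypothetical 2-D feature vectors: +1 is writer A, -1 is writer B.
samples = [(1.0, 1.0), (1.5, 0.8), (-1.0, -1.0), (-0.8, -1.2)]
labels = [1, 1, -1, -1]
w, b = train_linear_svm(samples, labels)

query_a = predict(w, b, (1.2, 1.0))    # query text resembling writer A
query_b = predict(w, b, (-1.0, -0.9))  # query text resembling writer B
```

Identifying one writer among many would need a multi-class scheme such as one-vs-rest on top of binary classifiers like this one.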
Spectral Analysis of Projection Histogram for Enhancing Close matching character Recognition in Malayalam
The success rates of Optical Character Recognition (OCR) systems for printed
Malayalam documents are quite impressive, with state-of-the-art accuracy
levels in the range of 85-95% for various. However, for real applications,
further enhancement of these accuracy levels is required. One of the
bottlenecks in further enhancement of the accuracy is identified as close-matching
characters. In this paper, we delineate the close-matching characters in
Malayalam and report the development of a specialised classifier for these
close-matching characters. The output of a state-of-the-art OCR is taken, and
characters falling into the close-matching character set are further fed into
this specialised classifier to enhance the accuracy. The classifier is based
on the support vector machine algorithm and uses feature vectors derived from the
spectral coefficients of projection histogram signals of close-matching
characters.
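The feature described above, spectral coefficients of projection histogram signals, can be sketched as follows. This is an assumption-laden illustration: the abstract does not name the transform, so a discrete Fourier transform is assumed here, and the number of coefficients kept, the toy 4x4 glyph and the function names are hypothetical.

```python
import cmath

def projection_histograms(glyph):
    """Count ink pixels per row (horizontal) and per column (vertical)."""
    horizontal = [sum(row) for row in glyph]
    vertical = [sum(col) for col in zip(*glyph)]
    return horizontal, vertical

def dft_magnitudes(signal, n_coeffs):
    """Magnitudes of the first n_coeffs DFT coefficients of the signal."""
    n = len(signal)
    mags = []
    for k in range(n_coeffs):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        mags.append(abs(coeff))
    return mags

def spectral_features(glyph, n_coeffs=4):
    """Concatenate spectral magnitudes of both projection histograms."""
    horizontal, vertical = projection_histograms(glyph)
    return dft_magnitudes(horizontal, n_coeffs) + dft_magnitudes(vertical, n_coeffs)

# Toy binary glyph (1 = ink); real inputs would be segmented characters.
glyph = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
]
h_hist, v_hist = projection_histograms(glyph)
features = spectral_features(glyph)
```

The resulting vector would then be passed to the specialised classifier for the close-matching set. Two characters with similar outlines can still differ in how their ink is distributed along rows and columns, and that is the difference the histogram spectra are meant to expose.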