264 research outputs found

    Unicode-driven Deep Learning Handwritten Telugu-to-English Character Recognition and Translation System

    Get PDF
    Telugu language is considered as fourth most used language in India especially in the regions of Andhra Pradesh, Telangana, Karnataka etc. In international recognized countries also, Telugu is widely growing spoken language. This language comprises of different dependent and independent vowels, consonants and digits. In this aspect, the enhancement of Telugu Handwritten Character Recognition (HCR) has not been propagated. HCR is a neural network technique of converting a documented image to edited text one which can be used for many other applications. This reduces time and effort without starting over from the beginning every time. In this work, a Unicode based Handwritten Character Recognition(U-HCR) is developed for translating the handwritten Telugu characters into English language. With the use of Centre of Gravity (CG) in our model we can easily divide a compound character into individual character with the help of Unicode values. For training this model, we have used both online and offline Telugu character datasets. To extract the features in the scanned image we used convolutional neural network along with Machine Learning classifiers like Random Forest and Support Vector Machine. Stochastic Gradient Descent (SGD), Root Mean Square Propagation (RMS-P) and Adaptative Moment Estimation (ADAM)optimizers are used in this work to enhance the performance of U-HCR and to reduce the loss function value. This loss value reduction can be possible with optimizers by using CNN. In both online and offline datasets, proposed model showed promising results by maintaining the accuracies with 90.28% for SGD, 96.97% for RMS-P and 93.57% for ADAM respectively

    Character Segmentation for Telugu Image Document using Multiple Histogram Projections

    Get PDF
    TEXT line segmentation is one of the major component of document image analysis. Text line segmentation is necessary to detect all text regions in the document image. In this paper we propose an algorithm based on multiple histogram projections using morphological operators to extract features of the image. Horizontal projection is performed on the text image, and then line segments are identified by the peaks in the horizontal projection. Threshold is applied to divide the text image into segments. False lines are eliminated using another threshold. Vertical histogram projections are used for the line segments and decomposed into words using threshold and further decomposed to characters. This approach provides best performance based on the experimental results such as Detection rate DR (98%) and Recognition Accuracy RA (98%)

    Development of Comprehensive Devnagari Numeral and Character Database for Offline Handwritten Character Recognition

    Get PDF
    In handwritten character recognition, benchmark database plays an important role in evaluating the performance of various algorithms and the results obtained by various researchers. In Devnagari script, there is lack of such official benchmark. This paper focuses on the generation of offline benchmark database for Devnagari handwritten numerals and characters. The present work generated 5137 and 20305 isolated samples for numeral and character database, respectively, from 750 writers of all ages, sex, education, and profession. The offline sample images are stored in TIFF image format as it occupies less memory. Also, the data is presented in binary level so that memory requirement is further reduced. It will facilitate research on handwriting recognition of Devnagari script through free access to the researchers.Comment: 5 pages, 8 figures, journal pape
    • …
    corecore