30 research outputs found

    Advances in Character Recognition

    Get PDF
    This book presents advances in character recognition, and it consists of 12 chapters that cover wide range of topics on different aspects of character recognition. Hopefully, this book will serve as a reference source for academic research, for professionals working in the character recognition field and for all interested in the subject

    Segmentation of Nastaliq script for OCR

    Get PDF
    In this paper we have presented a novel segmentation technique for the implementation of an OCR (Optical Character Recognition) for printed Nastalique text, a calligraphic style of Urdu which uses the Arabic script for its writing.OCR for many of the world major languages have been developed and are being used but at present an OCR for Nastalique is not available and the published research on Nastalique OCR, Urdu OCR or even on any area of Urdu computing is almost non-existent, the reason being the challenges that the Nastalique style poses for its optical recognition. We used Matlab 7 for our experimentation the results are reported in this paper which are very encouraging

    Comparison of Template Matching Algorithm and Feature Extraction Algorithm in Sundanese Script Transliteration Application using Optical Character Recognition

    Get PDF
    The phenomenon that occurs in the area of West Java Province is that the people do not preserve their culture, especially regional literature, namely Sundanese script, in this digital era there is research on Sundanese script combined with applications using Feature Extraction algorithm, but there is no comparison with other algorithms and cannot recognize Sundanese numbers. Therefore, to develop the research a Sundanese script application was made with the implementation of OCR (Optical Character Recognition) using the Template Matching algorithm and the Feature Extraction algorithm that was modified with the pre-processing stages including using luminosity and thresholding algorithms, from the two algorithms compared to the accuracy and time values the process of recognizing digital writing and handwriting, the results of testing digital writing algorithm Matching algorithm has a value of 87% word recognition accuracy with 236 ms processing time and 97.6% character recognition accuracy with 227 ms processing time, Feature Extraction has 98% word recognition accuracy with 73.6 ms processing time and 100% character recognition accuracy with 66 ms processing time, for handwriting recognition in feature extraction character recognition has 83% accuracy and 75% word recognition , while template matching in character recognition has an accuracy of 70% and word recognition has an accuracy of 66%

    A Lexicon of Connected Components for Arabic Optical Text Recognition

    Get PDF
    Arabic is a cursive script that lacks the ease of character segmentation. Hence, we suggest a unit that is discrete in nature, viz. the connected component, for Arabic text recognition. A lexicon listing valid Arabic connected components is necessary to any system that is to use such unit. Here, we produce and analyze a comprehensive lexicon of connected components. A lexicon can be extracted from corpora or synthesized from morphemes. We follow both approaches and merge their results. Besides, generation of a lexicon of connected components encompasses extra tokenization and point-normalization steps to make the size of the lexicon tractable. We produce a lexicon of surface-words, reduce it into a lexicon of connected components, and finally into a lexicon of point normalized connected components. The lexicon of point normalized connected components contains 684,743 entries, showing a percent decrease of 97.17% from the word-lexicon
    corecore