
    Multi-font Numerals Recognition for Urdu Script based Languages

    Handwritten character recognition for Urdu-script-based languages is one of the most difficult tasks because of the complexity of the script. These languages have not received much attention, even though the script is used by more than one-sixth of the population. The script's complexities further complicate the recognition process. The main problems in handwritten numeral recognition are the shape similarity between handwritten numerals and the dual numeral style used for Urdu. This paper presents fuzzy rule-based, HMM, and hybrid approaches for recognizing both Urdu and Arabic numerals in an unconstrained environment, drawing on both the online and offline domains for online input. The offline domain is used mainly for preprocessing, i.e. normalization and slant normalization. The proposed system was tested and provides an accuracy of 97.1

    A fine-grained approach to scene text script identification

    This paper focuses on the problem of script identification in unconstrained scenarios. Script identification is an important prerequisite to recognition, and an indispensable condition for automatic text understanding systems designed for multi-language environments. Although widely studied for document images and handwritten documents, it remains an almost unexplored territory for scene text images. We detail a novel method for script identification in natural images that combines convolutional features and the Naive-Bayes Nearest Neighbor classifier. The proposed framework efficiently exploits the discriminative power of small stroke-parts in a fine-grained classification framework. In addition, we propose a new public benchmark dataset for the evaluation of joint text detection and script identification in natural scenes. Experiments on this new dataset demonstrate that the proposed method yields state-of-the-art results, while generalizing well to different datasets and a variable number of scripts. The evidence provided shows that multi-lingual scene text recognition in the wild is a viable proposition. Source code of the proposed method is made available online.
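The Naive-Bayes Nearest Neighbor decision rule named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper uses convolutional features of stroke-parts, whereas the 2-D descriptors, the script labels, and the function names `sq_dist` and `nbnn_classify` here are hypothetical toy stand-ins.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nbnn_classify(query_descriptors, class_descriptors):
    """Naive-Bayes Nearest Neighbor decision rule.

    For every class, sum the squared distance from each query descriptor
    to its nearest descriptor in that class, then pick the class with the
    smallest total. `class_descriptors` maps label -> list of vectors.
    """
    totals = {}
    for label, pool in class_descriptors.items():
        totals[label] = sum(
            min(sq_dist(d, p) for p in pool) for d in query_descriptors
        )
    best = min(totals, key=totals.get)
    return best, totals


# Toy descriptor pools (assumed data, standing in for stroke-part features).
pools = {
    "latin": [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1)],
    "arabic": [(1.0, 1.0), (0.9, 1.1), (1.1, 0.8)],
}

# A query image is represented by several local descriptors.
query = [(0.05, 0.1), (0.15, 0.15), (0.9, 0.9)]

label, totals = nbnn_classify(query, pools)
print(label, totals)
```

Because the rule compares each local descriptor against class pools instead of quantizing the whole image into one vector, a single outlying descriptor (the third one in the toy query) does not by itself decide the result.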

    Handwritten Script Recognition using DCT, Gabor Filter and Wavelet Features at Line Level

    In a country like India, where many scripts are in use, automatic identification of printed and handwritten scripts facilitates many important applications, including sorting document images and searching online archives of document images. This paper presents a multiple-feature-based approach for identifying the script type in a collection of handwritten documents. Eight popular Indian scripts are considered. Features are extracted using Gabor filters, the Discrete Cosine Transform, and wavelets of the Daubechies family. Experiments test the recognition accuracy of the proposed system at line level for bilingual scripts and are later extended to trilingual scripts. We obtained 100% recognition accuracy for bi-scripts at line level. Classification is done using a k-nearest neighbour classifier.
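One of the three feature families mentioned above, the Discrete Cosine Transform, can be illustrated with a small sketch. It assumes an orthonormal 2-D DCT-II and keeps only the low-frequency corner of the coefficient matrix as the feature vector; the abstract does not say which coefficients the authors keep, so the `keep` parameter, the function names, and the toy 4x4 image are assumptions for illustration only.

```python
import math


def dct_1d(v):
    """Orthonormal 1-D DCT-II of a sequence of numbers."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(img):
    """Separable 2-D DCT-II: transform every row, then every column."""
    rows = [dct_1d(r) for r in img]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    # cols is indexed [column][frequency]; transpose back to [u][v].
    return [list(r) for r in zip(*cols)]


def low_freq_features(img, keep=2):
    """Flatten the keep x keep low-frequency corner into a feature vector."""
    coeffs = dct_2d(img)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]


# Toy input (assumed): a constant 4x4 patch, whose energy sits entirely
# in the DC coefficient.
flat_patch = [[1.0] * 4 for _ in range(4)]
features = low_freq_features(flat_patch, keep=2)
print(features)
```

For real text-line images a library routine would normally replace the hand-written transform; the pure-Python version is used here only to keep the example self-contained.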

    Offline Handwritten Kannada Numerals Recognition

    Handwritten Character Recognition (HCR) is an essential capability in academic and production fields. A recognition system can be either online or offline. There is large scope for character recognition on handwritten papers. India is a multilingual, multi-script country where eighteen official scripts are accepted and over a hundred regional languages are in use. Recognition of unconstrained handwritten Indian scripts is difficult because of the presence of numerals, vowels, consonants, vowel modifiers, and compound characters. In this paper, recognition of handwritten Kannada numerals is implemented, with different wavelet features used for feature extraction. Zonal densities of the different regions of each image in the database are extracted. The database consists of 50 samples of each Kannada numeral. For classification, the K-Nearest Neighbor method is used. A recognition accuracy of 88% was achieved.
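The zonal-density and K-Nearest Neighbor steps described above can be sketched roughly as follows. The wavelet stage is omitted, and the 4x4 binary images, the labels `"left"` and `"top"`, the 2x2 zoning, and `k=3` are all hypothetical choices made for illustration; none of them come from the paper.

```python
from collections import Counter


def zonal_densities(img, zones=2):
    """Split a binary image into zones x zones blocks and return the
    fraction of foreground pixels (value 1) in each block, row by row."""
    h, w = len(img), len(img[0])
    zh, zw = h // zones, w // zones
    feats = []
    for zr in range(zones):
        for zc in range(zones):
            cells = [
                img[r][c]
                for r in range(zr * zh, (zr + 1) * zh)
                for c in range(zc * zw, (zc + 1) * zw)
            ]
            feats.append(sum(cells) / len(cells))
    return feats


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to `query`.
    `train` is a list of (feature_vector, label) pairs."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training images (assumed): a bar on the left vs. a bar on top.
left_a = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0]]
left_b = [[1, 1, 0, 0], [1, 1, 1, 0], [1, 1, 0, 0], [1, 1, 0, 0]]
top_a = [[1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
top_b = [[1, 1, 1, 1], [1, 1, 1, 1], [0, 1, 0, 0], [0, 0, 0, 0]]

train = [
    (zonal_densities(left_a), "left"),
    (zonal_densities(left_b), "left"),
    (zonal_densities(top_a), "top"),
    (zonal_densities(top_b), "top"),
]

# A slightly damaged left bar as the query.
query_img = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0]]
query_feats = zonal_densities(query_img)
predicted = knn_predict(train, query_feats, k=3)
print(query_feats, predicted)
```

Zonal densities reduce each image to a handful of numbers that tolerate small stroke variations, which is why a simple distance-based vote can work on them at all.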