10 research outputs found

    Pashto Characters Recognition Using Multi-Class Enabled Support Vector Machine

    During the last two decades, significant work has been reported on the recognition of cursive languages, especially Arabic, Urdu and Persian. Comparable work is lacking for the Pashto language because there is no standard database and little prior research, which is a major barrier for the research community. The slight differences between the shapes of Pashto characters are an additional challenge for researchers. This paper presents an efficient OCR system for handwritten Pashto characters based on a multi-class enabled support vector machine (SVM) using manifold feature extraction techniques. These feature extraction techniques include the zoning feature extractor, the discrete cosine transform (DCT), the discrete wavelet transform (DWT), Gabor filters and the histogram of oriented gradients (HoG). A hybrid feature map is developed by combining the manifold feature maps. This research work is performed by developing a medium-sized dataset of handwritten Pashto characters that contains 200 handwritten samples for each of the 44 characters in the Pashto language. Recognition results are generated for the proposed model based on the manifold and hybrid feature maps. Overall accuracy rates of 63.30%, 65.13%, 68.55%, 68.28%, 67.02% and 83% are obtained with the zoning technique, HoG, Gabor filter, DCT, DWT and hybrid feature maps, respectively. The applicability of the proposed model is also tested by comparing its results with a convolutional neural network model. The convolutional neural network-based model achieved an accuracy rate of 81.02%, lower than that of the multi-class support vector machine. The highest accuracy rate of 83% for the multi-class SVM model based on a hybrid feature map reflects the applicability of the proposed model. Qatar University [IRCC-2020-009]
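    The zoning extractor and the hybrid feature map named in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a binarized character image stored as nested lists of 0/1, uses foreground-pixel density per zone as the zoning feature, and builds the hybrid map by plain concatenation. The names `zoning_features` and `hybrid_feature_map` are hypothetical.

```python
from typing import List, Sequence


def zoning_features(image: Sequence[Sequence[int]], rows: int, cols: int) -> List[float]:
    """Return the foreground-pixel density of each zone in a rows x cols grid.

    Assumes `image` is a non-empty binary image (1 = ink, 0 = background)
    that is at least `rows` pixels high and `cols` pixels wide.
    """
    height, width = len(image), len(image[0])
    features = []
    for r in range(rows):
        # Integer zone boundaries that together cover the whole image.
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            area = (bottom - top) * (right - left)
            ink = sum(image[y][x] for y in range(top, bottom) for x in range(left, right))
            features.append(ink / area)
    return features


def hybrid_feature_map(*feature_maps: Sequence[float]) -> List[float]:
    """Concatenate several feature vectors into one (assumed combination rule)."""
    hybrid: List[float] = []
    for feature_map in feature_maps:
        hybrid.extend(feature_map)
    return hybrid


# Toy 4x4 "character" split into a 2x2 grid of zones.
glyph = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
print(zoning_features(glyph, 2, 2))  # [1.0, 0.0, 0.0, 0.75]
```

    In a full pipeline the concatenated vector would be passed to a multi-class classifier such as an SVM; the DCT, DWT, Gabor and HoG extractors are omitted here.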

    Evaluation of handwritten Urdu text by integration of MNIST dataset learning experience

    © 2019 IEEE. Learning may be enhanced when the experience gained from training on patterns of a similar nature is reused to achieve maximum accuracy. This paper presents a novel way to exploit transfer learning from similar patterns for handwritten Urdu text analysis. A network pre-trained on MNIST is employed by transferring its learned experience to Urdu Nastaliq Handwritten Dataset (UNHD) samples. A convolutional neural network is used for feature extraction. The experiments were performed using deep multidimensional long short-term memory (MDLSTM) networks. The results show strong performance across a number of experiments that are distinguished by handwriting complexity. They also show that the pre-trained network outperforms on subsequent target networks, which enables them to focus on learning particular features. The conducted experiments achieved very good accuracy on the UNHD dataset.

    Sub-sampling Approach for Unconstrained Arabic Scene Text Analysis by Implicit Segmentation based Deep Learning Classifier

    Extracting text from natural scene images is still a cumbersome task. This paper proposes a solution for cursive scene text analysis, notably the recognition of Arabic scene text appearing in unconstrained environments. A hierarchical sub-sampling technique is adapted to investigate the potential of sub-sampling the window size of a given scene text sample. A deep learning architecture is presented that takes the complexity of the Arabic script into account. The conducted experiments achieve 96.81% accuracy at the character level. A comparison of Arabic scene text with handwritten and printed data is outlined as well.

    UTRNet: High-Resolution Urdu Text Recognition In Printed Documents

    In this paper, we propose a novel approach to address the challenges of printed Urdu text recognition using high-resolution, multi-scale semantic feature extraction. Our proposed UTRNet architecture, a hybrid CNN-RNN model, demonstrates state-of-the-art performance on benchmark datasets. To address the limitations of previous works, which struggle to generalize to the intricacies of the Urdu script and suffer from a lack of sufficient annotated real-world data, we introduce UTRSet-Real, a large-scale annotated real-world dataset comprising over 11,000 lines, and UTRSet-Synth, a synthetic dataset with 20,000 lines closely resembling real-world data. We have also made corrections to the ground truth of the existing IIITH dataset, making it a more reliable resource for future research. We also provide UrduDoc, a benchmark dataset for Urdu text line detection in scanned documents. Additionally, we have developed an online tool for end-to-end Urdu OCR from printed documents by integrating UTRNet with a text detection model. Our work not only addresses the current limitations of Urdu OCR but also paves the way for future research in this area and facilitates the continued advancement of Urdu OCR technology. The project page with source code, datasets, annotations, trained models, and online tool is available at abdur75648.github.io/UTRNet. Comment: Accepted at The 17th International Conference on Document Analysis and Recognition (ICDAR 2023)

    Issues & Challenges in Urdu OCR

    Optical character recognition is a technique used to convert printed and handwritten text into an editable text format. A lot of work has been done with this technology on identifying characters of different languages in a variety of scripts. Latin scripts with isolated (non-cursive) characters, like English, are easy to recognize, and significant advances have been made in their recognition, whereas Arabic and related cursive languages like Urdu have more complicated and intermingled scripts and have received much less attention. This paper describes in detail the various scripts of the Urdu language and discusses the issues and challenges of Urdu OCR that arise from its cursive nature. These include cursiveness, a larger number of character dots, a large set of characters to recognize, more characters sharing a base shape group, placement of dots, ambiguity between characters and ligatures that differ only slightly, context-sensitive shapes, ligatures, noise, skew and fonts. This paper provides a better understanding of the problems that can arise in Urdu character recognition.

    Comparative analysis of Tesseract and Google Cloud Vision for Thai vehicle registration certificate

    Optical character recognition (OCR) is a technology for converting paper-based documents into digital form. This research studies the extraction of characters from a Thai vehicle registration certificate via the Google Cloud Vision API and Tesseract OCR, and examines the recognition performance of both OCR APIs. The 84 color image files comprised three image sizes/resolutions and five image characteristics. To compare suitable image types, the color images were also converted to greyscale and binary images. Furthermore, three pre-processing techniques (sharpening, contrast adjustment and brightness adjustment) were applied to enhance image quality before applying the two OCR APIs. The recognition performance was evaluated in terms of accuracy and readability. The results showed that the Google Cloud Vision API works well for the Thai vehicle registration certificate, with an accuracy of 84.43%, whereas Tesseract OCR showed an accuracy of 47.02%. The highest accuracy came from the color image at 1024×768 px and 300 dpi, using sharpening and brightness adjustment as pre-processing techniques. In terms of readability, the Google Cloud Vision API produced more readable output than Tesseract. The proposed conditions support the implementation of a recognition system for Thai vehicle registration certificates.

    Recognizing Handwriting Styles in a Historical Scanned Document Using Unsupervised Fuzzy Clustering

    The forensic attribution of the handwriting in a digitized document to multiple scribes is a challenging problem of high dimensionality. Unique handwriting styles may differ in a blend of several factors, including character size, stroke width, loops, ductus, slant angles, and cursive ligatures. Previous work on labeled data with hidden Markov models, support vector machines, and semi-supervised recurrent neural networks has provided moderate to high success. In this study, we successfully detect hand shifts in a historical manuscript through fuzzy soft clustering in combination with linear principal component analysis. This advance demonstrates the successful deployment of unsupervised methods for writer attribution of historical documents and forensic document analysis. Comment: 26 pages in total, 5 figures and 2 tables
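    To illustrate what "fuzzy soft clustering" means in practice, here is a minimal sketch of fuzzy c-means, one common fuzzy clustering algorithm. The abstract does not say which algorithm the authors used, so this choice is an assumption, and the principal component analysis step is omitted. The function name `fuzzy_c_means`, the fuzzifier `m = 2.0`, and the toy two-feature points are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def fuzzy_c_means(
    points: Sequence[Sequence[float]],
    centers: Sequence[Sequence[float]],
    m: float = 2.0,
    iterations: int = 50,
) -> Tuple[List[List[float]], List[List[float]]]:
    """Run fuzzy c-means from the given initial centers.

    Returns (centers, memberships); memberships[i][k] is the degree to which
    point i belongs to cluster k, and each row sums to 1.
    """
    centers = [list(c) for c in centers]
    dims = len(points[0])
    memberships: List[List[float]] = []
    for _ in range(iterations):
        # Membership update: closer centers receive higher membership.
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if 0.0 in dists:
                # The point sits exactly on one or more centers.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
            else:
                row = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(row)
            memberships.append([v / total for v in row])
        # Center update: mean of the points weighted by membership ** m.
        for k in range(len(centers)):
            weights = [row[k] ** m for row in memberships]
            total = sum(weights)
            centers[k] = [
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dims)
            ]
    return centers, memberships


# Hypothetical per-sample style features, e.g. (slant, stroke width).
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
final_centers, degrees = fuzzy_c_means(samples, centers=[(0.0, 0.0), (10.0, 10.0)])
```

    Unlike hard clustering, every sample keeps a graded membership in each cluster, which is what would let gradual or ambiguous changes of hand show up as intermediate membership values.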

    Zoning features and 2DLSTM for urdu text-line recognition

    © 2016 The Authors. Published by Elsevier B.V. Recognition of Urdu cursive script is a challenging task due to the implicit complexities associated with it. The performance of a recognition system depends heavily on the extracted features. Various feature extraction approaches have been proposed in recent years. Among them, an approach based on zoning features has proved to be efficient and popular. Such zoning features represent significant information with low complexity and high speed. In this paper, we use zoning features for the classification of Urdu Nasta'liq text lines, in combination with 2-Dimensional Long Short Term Memory (2DLSTM) networks as the learning classifier. The proposed model is evaluated on the publicly available UPTI dataset, and a character recognition rate of 93.39% is obtained.
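    The abstract reports a character recognition rate without defining it. One common definition, assumed here, is one minus the character-level edit distance between the recognized text and the ground truth, divided by the ground-truth length. The sketch below follows that assumption; the function names are hypothetical and the paper's exact metric may differ.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # delete ref_char
                current[j - 1] + 1,                   # insert hyp_char
                previous[j - 1] + substitution_cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def character_recognition_rate(reference: str, hypothesis: str) -> float:
    """Return 1 - edit_distance / len(reference), clipped at 0 (assumed definition)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return max(0.0, 1.0 - edit_distance(reference, hypothesis) / len(reference))


print(character_recognition_rate("abcd", "abxd"))  # 0.75
```

    Because Python strings are sequences of Unicode code points, the same functions apply to Urdu text lines, although combining marks are counted as separate characters.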