356 research outputs found

    Handwritten Character Recognition of South Indian Scripts: A Review

    Handwritten character recognition has long been a frontier area of research in pattern recognition and image processing, and there is a large demand for OCR on handwritten documents. Although many studies have been performed on foreign scripts such as Chinese, Japanese and Arabic, very little work can be traced on handwritten character recognition of Indian scripts, especially the South Indian scripts. This paper provides an overview of offline handwritten character recognition in South Indian scripts, namely Malayalam, Tamil, Kannada and Telugu. Comment: Paper presented at the "National Conference on Indian Language Computing", Kochi, February 19-20, 2011. 6 pages, 5 figures

    A study of feature extraction for Arabic calligraphy characters recognition

    Optical character recognition (OCR) is one of the most widely used pattern recognition systems. However, research on ancient Arabic writing recognition has suffered from a lack of interest for decades, despite the availability of thousands of historical documents. One reason for this lack of interest is the absence of a standard dataset, which is fundamental for building and evaluating an OCR system. In 2022, we published a database of ancient Arabic words as the only public dataset of characters written in Al-Mojawhar Moroccan calligraphy. Such a database needs to be studied and evaluated. In this paper, we explored the proposed database and investigated the recognition of Al-Mojawhar Arabic characters. We studied feature extraction using the most popular descriptors in Arabic OCR. The studied descriptors were paired with different machine learning classifiers to build recognition models and verify their performance. To compare learned and handcrafted features on the proposed dataset, we proposed a deep convolutional neural network for character recognition. Given the complexity of the character shapes, the results obtained were very promising, especially with the convolutional neural network model, which gave the highest accuracy score.

    A new hybrid convolutional neural network and eXtreme gradient boosting classifier for recognizing handwritten Ethiopian characters

    Handwritten character recognition has been studied extensively for many years in the field of pattern recognition. Due to its vast practical applications and financial implications, it is still an important research area. In this research, the Handwritten Ethiopian Character Recognition (HECR) dataset has been prepared to train the model. The images in the HECR dataset were organized with more than one pen color in RGB color space and have been size-normalized to 28 × 28 pixels. The dataset is a combination of scripts (Fidel in Ethiopia), numerical representations, punctuation, tonal symbols, combining symbols, and special characters. These scripts have been used to write the ancient histories, science, and arts of Ethiopia and Eritrea. In this study, a hybrid model of two super classifiers, Convolutional Neural Network (CNN) and eXtreme Gradient Boosting (XGBoost), is proposed for classification. In this integrated model, the CNN works as a trainable automatic feature extractor from the raw images, and XGBoost takes the extracted features as input for recognition and classification. The output error rates of the hybrid model and of the CNN with a fully connected layer are compared. Error rates of 0.4630 and 0.1612 are achieved in classifying the handwritten testing dataset images, respectively. Thus, XGBoost as a classifier yields a better result than the traditional fully connected layer.