103 research outputs found

    Automatic handwriter identification using advanced machine learning

    Get PDF
    Handwriter identification a challenging problem especially for forensic investigation. This topic has received significant attention from the research community and several handwriter identification systems were developed for various applications including forensic science, document analysis and investigation of the historical documents. This work is part of an investigation to develop new tools and methods for Arabic palaeography, which is is the study of handwritten material, particularly ancient manuscripts with missing writers, dates, and/or places. In particular, the main aim of this research project is to investigate and develop new techniques and algorithms for the classification and analysis of ancient handwritten documents to support palaeographic studies. Three contributions were proposed in this research. The first is concerned with the development of a text line extraction algorithm on colour and greyscale historical manuscripts. The idea uses a modified bilateral filtering approach to adaptively smooth the images while still preserving the edges through a nonlinear combination of neighboring image values. The proposed algorithm aims to compute a median and a separating seam and has been validated to deal with both greyscale and colour historical documents using different datasets. The results obtained suggest that our proposed technique yields attractive results when compared against a few similar algorithms. The second contribution proposes to deploy a combination of Oriented Basic Image features and the concept of graphemes codebook in order to improve the recognition performances. The proposed algorithm is capable to effectively extract the most distinguishing handwriter’s patterns. The idea consists of judiciously combining a multiscale feature extraction with the concept of grapheme to allow for the extraction of several discriminating features such as handwriting curvature, direction, wrinkliness and various edge-based features. The technique was validated for identifying handwriters using both Arabic and English writings captured as scanned images using the IAM dataset for English handwriting and ICFHR 2012 dataset for Arabic handwriting. The results obtained clearly demonstrate the effectiveness of the proposed method when compared against some similar techniques. The third contribution is concerned with an offline handwriter identification approach based on the convolutional neural network technology. At the first stage, the Alex-Net architecture was employed to learn image features (handwritten scripts) and the features obtained from the fully connected layers of the model. Then, a Support vector machine classifier is deployed to classify the writing styles of the various handwriters. In this way, the test scripts can be classified by the CNN training model for further classification. The proposed approach was evaluated based on Arabic Historical datasets; Islamic Heritage Project (IHP) and Qatar National Library (QNL). The obtained results demonstrated that the proposed model achieved superior performances when compared to some similar method

    ONLINE ARABIC TEXT RECOGNITION USING STATISTICAL TECHNIQUES

    Get PDF

    Recognition of Cursive Arabic Handwritten Text using Embedded Training based on HMMs

    Get PDF
    In this paper we present a system for offline recognition cursive Arabic handwritten text based on Hidden Markov Models HMMs The system is analytical without explicit segmentation used embedded training to perform and enhance the character models Extraction features preceded by baseline estimation are statistical and geometric to integrate both the peculiarities of the text and the pixel distribution characteristics in the word image These features are modelled using hidden Markov models and trained by embedded training The experiments on images of the benchmark IFN ENIT database show that the proposed system improves recognitio

    Online handwriting Arabic recognition system using k-nearest neighbors classifier and DCT features

    Get PDF
    With advances in machine learning techniques, handwriting recognition systems have gained a great deal of importance. Lately, the increasing popularity of handheld computers, digital notebooks, and smartphones give the field of online handwriting recognition more interest. In this paper, we propose an enhanced method for the recognition of Arabic handwriting words using a directions-based segmentation technique and discrete cosine transform (DCT) coefficients as structural features. The main contribution of this research was combining a total of 18 structural features which were extracted by DCT coefficients and using the k-nearest neighbors (KNN) classifier to classify the segmented characters based on the extracted features. A dataset is used to validate the proposed method consisting of 2500 words in total. The obtained average 99.10% accuracy in recognition of handwritten characters shows that the proposed approach, through its multiple phases, is efficient in separating, distinguishing, and classifying Arabic handwritten characters using the KNN classifier. The availability of an online dataset of Arabic handwriting words is the main issue in this field. However, the dataset used will be available for research via the website
    • …
    corecore