Does color modalities affect handwriting recognition? An empirical study on Persian handwritings using convolutional neural networks
Most of the methods for handwriting recognition in the literature are focused
and evaluated on Black and White (BW) image databases. In this paper we try to
answer a fundamental question in document recognition. Using Convolutional
Neural Networks (CNNs) as an eye simulator, we investigate whether the color
modalities of handwritten digits and words affect their recognition accuracy or
speed. To the best of our knowledge, this question has not yet been answered
due to the lack of handwritten databases that have all three color modalities
of handwritings. To answer this question, we selected 13,330 isolated digits
and 62,500 words from a novel Persian handwritten database, which have three
different color modalities and are unique in terms of size and variety. Our
selected datasets are divided into training, validation, and testing sets.
Afterwards, similar conventional CNN models are trained with the training
samples. While the experimental results on the testing set show that CNN on the
BW digit and word images has a higher performance compared to the other two
color modalities, in general there are no significant differences for network
accuracy in different color modalities. Also, comparisons of training times in
three color modalities show that recognition of handwritten digits and words in
BW images using a CNN is much more efficient.
Holistic Farsi handwritten word recognition using gradient features
In this paper we address the problem of recognizing Farsi handwritten words. Two types of gradient features, directional and intensity, are extracted from a sliding vertical stripe that sweeps across a word image. The feature vector extracted from each stripe is then coded using a Self-Organizing Map (SOM), and each word is modeled using a discrete Hidden Markov Model (HMM). To evaluate the performance of the proposed method, the FARSA dataset was used. The experimental results show that the proposed system, using directional gradient features, achieved a recognition rate of 69.07% and outperformed all other existing methods.
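The sliding-stripe feature extraction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the stripe width, step, number of orientation bins, central-difference gradients and magnitude weighting are all assumptions chosen for illustration, and the function name `stripe_gradient_features` is hypothetical. The SOM coding and HMM modeling stages are not shown.

```python
import math

def stripe_gradient_features(image, stripe_width=4, step=2, bins=8):
    """Return one normalized orientation histogram per vertical stripe.

    image is a list of rows of grey levels. All parameter defaults are
    illustrative assumptions, not values taken from the paper.
    """
    height, width = len(image), len(image[0])
    features = []
    # Slide a vertical stripe from left to right across the word image.
    for x0 in range(0, max(width - stripe_width, 0) + 1, step):
        hist = [0.0] * bins
        for y in range(1, height - 1):
            for x in range(max(x0, 1), min(x0 + stripe_width, width - 1)):
                # Central differences approximate the intensity gradient.
                gx = image[y][x + 1] - image[y][x - 1]
                gy = image[y + 1][x] - image[y - 1][x]
                magnitude = math.hypot(gx, gy)
                if magnitude == 0:
                    continue
                # Quantize the gradient direction into one of `bins` sectors.
                angle = math.atan2(gy, gx) % (2 * math.pi)
                hist[int(angle / (2 * math.pi) * bins) % bins] += magnitude
        total = sum(hist)
        features.append([v / total for v in hist] if total else hist)
    return features

# A toy image with a single vertical edge yields two stripes.
toy = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
stripe_vectors = stripe_gradient_features(toy)
```

Each stripe produces one feature vector, so a word image becomes a left-to-right sequence of vectors, which is the form a vector quantizer followed by a discrete HMM would consume.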
A discrete hidden Markov model for the recognition of handwritten Farsi words
Handwriting recognition systems (HRS) have been researched for more than 50 years. Designing a system to recognize specific words in a clean handwritten document is still a difficult task, and the challenge is to achieve a high recognition rate. Previously, most of the research in the handwriting recognition domain was conducted on Chinese and Latin languages, while recently more people have shown an interest in Indo-Iranian script recognition systems. In this thesis, we present an automatic handwriting recognition system for Farsi words. The system was trained, validated and tested on the CENPARMI Farsi Dataset, which was gathered during this research. CENPARMI's Farsi Dataset is unique in terms of its huge number of images (432,357 combined grayscale and binary), inclusion of all possible handwriting types (Dates, Words, Isolated Characters, Isolated Digits, Numeral Strings, Special Symbols, Documents), the variety of cursive styles, the number of writers (400) and the exclusive participation of native Farsi speakers in the gathering of data. The words were first preprocessed. Concavity and Distribution features were extracted and the codebook was calculated by the vector quantization method. A Discrete Hidden Markov Model was chosen as the classifier because of the cursive nature of the Farsi script. Finally, encouraging recognition rates of 98.76% and 96.02% were obtained for the Training and Testing sets, respectively.
End-Shape Analysis for Automatic Segmentation of Arabic Handwritten Texts
Word segmentation is an important task for many methods related to document understanding, especially word spotting and word recognition. Several word segmentation approaches have been proposed for Latin-based languages, while only a few have been introduced for Arabic texts. The fact that Arabic writing is cursive by nature and unconstrained, with no clear boundaries between words, makes the processing of Arabic handwritten text a more challenging problem.
In this thesis, the design and implementation of an End-Shape Letter (ESL) based segmentation system for Arabic handwritten text is presented. This incorporates four novel aspects: (i) removal of secondary components, (ii) baseline estimation, (iii) ESL recognition, and (iv) the creation of a new off-line CENPARMI ESL database.
Arabic texts include small connected components, also called secondary components. Removing these components can improve the performance of several systems, such as baseline estimation. Thus, a robust method to remove secondary components that takes into consideration the challenges of Arabic handwriting is introduced. The method reconstructs the image based on some criteria. The results of this method were subsequently compared with those of two other methods that used the same database. The results show that the proposed method is effective.
Baseline estimation is a challenging task for Arabic texts since they include ligatures, overlapping, and secondary components. Therefore, we propose a learning-based approach that addresses these challenges. Our method analyzes the image and extracts baseline-dependent features. Then, the baseline is estimated using a classifier.
Algorithms dealing with text segmentation usually analyze the gaps between connected components. These algorithms are based on metric calculation, threshold finding, and/or gap classification. We use two well-known metrics, bounding box and convex hull, to test the metric-based method on Arabic handwritten texts and to include this technique in our approach. To determine the threshold, an unsupervised learning approach known as the Gaussian Mixture Model is used. Our ESL-based segmentation approach extracts the final letter of a word using a rule-based technique and recognizes these letters using the implemented ESL classifier.
To demonstrate the benefit of text segmentation, a holistic word spotting system is implemented. For this system, a word recognition system is implemented. A series of experiments with different sets of features are conducted. The system shows promising results.
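The gap-threshold step mentioned in this abstract, where a Gaussian Mixture Model separates gaps between connected components, can be sketched as follows. This is only an illustration under stated assumptions: a two-component one-dimensional mixture fitted with plain EM, gap widths already measured (for example as bounding-box distances), and a threshold taken where the wider-gap component becomes more likely. The function name `gmm_gap_threshold` and the sample gap values are hypothetical, and the thesis may configure the model differently.

```python
import math

def gmm_gap_threshold(gaps, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with EM and return the
    gap width at which the 'between-word' component takes over."""
    lo, hi = min(gaps), max(gaps)
    if hi == lo:
        return float(lo)
    # Assumed initialization: one component at each extreme of the data.
    means = [float(lo), float(hi)]
    variances = [((hi - lo) / 4.0) ** 2] * 2
    weights = [0.5, 0.5]

    def log_density(x, k):
        return (math.log(weights[k])
                - 0.5 * math.log(2.0 * math.pi * variances[k])
                - (x - means[k]) ** 2 / (2.0 * variances[k]))

    def responsibility_of_first(x):
        d = log_density(x, 1) - log_density(x, 0)
        if d > 700.0:
            return 0.0
        if d < -700.0:
            return 1.0
        return 1.0 / (1.0 + math.exp(d))

    for _ in range(iterations):
        # E-step: posterior probability that each gap is a within-word gap.
        r0 = [responsibility_of_first(x) for x in gaps]
        # M-step: re-estimate weight, mean and variance of each component.
        for k, r in enumerate((r0, [1.0 - v for v in r0])):
            n = sum(r)
            if n < 1e-9:
                continue
            weights[k] = n / len(gaps)
            means[k] = sum(ri * x for ri, x in zip(r, gaps)) / n
            spread = sum(ri * (x - means[k]) ** 2 for ri, x in zip(r, gaps)) / n
            variances[k] = max(spread, 1e-6)

    # Scan between the two means for the first point where the second
    # component is at least as likely as the first.
    steps = 1000
    for i in range(steps + 1):
        x = means[0] + (means[1] - means[0]) * i / steps
        if log_density(x, 1) >= log_density(x, 0):
            return x
    return means[1]

# Hypothetical gap widths in pixels: small within-word gaps, large between-word gaps.
threshold = gmm_gap_threshold([2, 3, 2, 4, 3, 3, 2, 14, 15, 16, 15, 17])
```

Gaps wider than the returned threshold would be treated as word boundaries. Because the threshold is learned from the gaps themselves, no labelled segmentation data is needed for this step.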
Ensemble learning using multi-objective optimisation for arabic handwritten words
Arabic handwriting recognition is a dynamic and stimulating field of study within
pattern recognition. This system plays quite a significant part in today's global
environment. It is a widespread and computationally costly function due to cursive
writing, a massive number of words, and writing style. Based on the literature, the
existing features lack data supportive techniques and building geometric features.
Most ensemble learning approaches are based on the assumption of linear
combination, which is not valid due to differences in data types. Also, the existing
approaches of classifier generation do not support decision-making for selecting the
most suitable classifier, and it requires enabling multi-objective optimisation to handle
these differences in data types. In this thesis, a new type of handwriting feature is
proposed that uses Segments Interpolation (SI) to find the best-fitting line in each
window, together with a model for finding the best operating window size for SI
features. A Multi-Objective Ensemble Oriented (MOEO) approach is formulated to
control the classifier topology and provide feedback support for changing the
classifiers' topology and weights, based on an extension of the Non-dominated
Sorting Genetic Algorithm (NSGA-II). The extension is designated Random Subset
based Parents Selection (RSPS-NSGA-II) and handles both the number of neurons
and accuracy. Evaluation metrics are taken from two perspectives: classification
and multi-objective optimisation. The experimental design is based on two subsets
of the IFN/ENIT database: the first consists of 10 classes (C10) and the second
of 22 classes (C22). The features were tested with a Support Vector Machine (SVM)
and an Extreme Learning Machine (ELM). Results improved owing to the SI feature:
SI with SVM reached 88.53% for C22. RSPS for C10 at k=2 achieved 91% accuracy
with fewer neurons than NSGA-II, and for C22 at k=10 accuracy increased to 81%,
compared with 78% for NSGA-II. Future work may consider introducing more features
to the system, applying them to other languages, and integrating the system with
sequence learning for higher accuracy.
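The idea of finding a best-fitting line in each window can be illustrated with a short sketch. It is not the thesis's SI feature: it assumes fixed-width column windows over the foreground pixel coordinates and an ordinary least-squares fit, and it emits a (slope, intercept) pair per window. The function name `window_line_features` and the fallback values for empty or vertical windows are hypothetical choices.

```python
def window_line_features(points, width, window_size):
    """Fit y = slope * x + intercept to the foreground pixels of each window.

    points is a list of (x, y) foreground pixel coordinates. Windows are
    consecutive column ranges of `window_size` pixels (an assumption).
    """
    features = []
    for x0 in range(0, width, window_size):
        pts = [(x, y) for x, y in points if x0 <= x < x0 + window_size]
        n = len(pts)
        if n < 2:
            # Too few pixels to fit a line: emit a neutral feature.
            features.append((0.0, 0.0))
            continue
        mean_x = sum(x for x, _ in pts) / n
        mean_y = sum(y for _, y in pts) / n
        sxx = sum((x - mean_x) ** 2 for x, _ in pts)
        if sxx == 0:
            # All pixels in one column: slope undefined, fall back to the mean height.
            features.append((0.0, mean_y))
            continue
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
        slope = sxy / sxx
        features.append((slope, mean_y - slope * mean_x))
    return features

# Hypothetical stroke: rising in the first window, falling in the second.
stroke = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 8), (5, 6), (6, 4), (7, 2)]
line_features = window_line_features(stroke, width=8, window_size=4)
```

The window size controls how much of the stroke each line summarizes, which is why the abstract pairs the feature with a model for choosing the operating window size.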
Word based off-line handwritten Arabic classification and recognition. Design of automatic recognition system for large vocabulary offline handwritten Arabic words using machine learning approaches.
The design of a machine that reads unconstrained words remains an unsolved problem. For example, automatic interpretation of handwritten documents by a computer is still under research. Most systems attempt to segment words into letters and read words one character at a time. However, segmenting handwritten words is very difficult. To avoid this, words are treated as a whole. This research investigates a number of features computed from whole words for the recognition of handwritten words in particular. Arabic text classification and recognition is a complicated process compared to Latin and Chinese text recognition systems. This is due to the cursive nature of Arabic text.
The work presented in this thesis is proposed for word-based recognition of handwritten Arabic scripts. This work is divided into three main stages to provide a recognition system. The first stage is pre-processing, which applies efficient pre-processing methods that are essential for automatic recognition of handwritten documents. In this stage, techniques for detecting the baseline and segmenting words in handwritten Arabic text are presented. Then connected components are extracted, and distances between different components are analyzed. The statistical distribution of these distances is then obtained to determine an optimal threshold for word segmentation. The second stage is feature extraction. This stage makes use of the normalized images to extract features that are essential in recognizing the images. Various methods of feature extraction are implemented and examined. The third and final stage is classification. Various classifiers are used for classification, such as the K nearest neighbour classifier (k-NN), neural network classifier (NN), Hidden Markov models (HMMs), and the Dynamic Bayesian Network (DBN). To test this concept, the particular pattern recognition problem studied is the classification of 32492 words using the IFN/ENIT database. The results were promising and very encouraging in terms of improved baseline detection and word segmentation for further recognition. Moreover, several feature subsets were examined and a best recognition performance of 81.5% is achieved.
Novel word recognition and word spotting systems for offline Urdu handwriting
Word recognition for offline Arabic, Farsi and Urdu handwriting is a subject which has attracted much attention in the OCR field. This thesis presents the implementations of offline Urdu Handwritten Word Recognition (HWR) and an Urdu word spotting technique. This thesis first introduces the creation of several offline CENPARMI Urdu databases. These databases were necessary for offline Urdu HWR experiments. The holistic-based recognition approach was followed for the Urdu HWR system. In this system, the basic pre-processing of images was performed. In the feature extraction phase, the gradient and structural features were extracted from greyscale and binary word images, respectively. This recognition system extracted 592 feature sets and these features helped in improving the recognition results. The system was trained and tested on 57 words. Overall, we achieved a 97% accuracy rate for handwritten word recognition by using the SVM classifier. Our word spotting technique used the holistic HWR system for recognition purposes. This word spotting system consisted of two processes: the segmentation of handwritten connected components and diacritics from Urdu text lines and the word spotting algorithm. A small database of handwritten text pages was created for testing the word spotting system. This database consisted of texts from ten Urdu native speakers. The rule-based segmentation system was applied to segment (or extract) handwritten Urdu subwords or connected components from text lines. We achieved a 92% correct segmentation rate for 372 text lines. In the word spotting algorithm, the candidate words were generated from the segmented connected components. These candidate words were sent to the holistic HWR system, which extracted the features and tried to recognize each image as one of the 57 words. After classification, each image was sent to the verification/rejection phase, which helped in rejecting the maximum number of unseen (raw data) images.
Overall, we achieved a 50% word spotting precision at a 70% recall rate.