Arabic Handwritten Words Off-line Recognition based on HMMs and DBNs
In this work, we investigate the combination of PGM (Probabilistic Graphical Model) classifiers, either independent or coupled, for the recognition of Arabic handwritten words. The independent classifiers are vertical and horizontal HMMs (Hidden Markov Models) whose observable outputs are features extracted from the image columns and the image rows respectively. The coupled classifiers associate the vertical and horizontal observation streams into a single DBN (Dynamic Bayesian Network). A novel method to extract the word baseline is proposed, together with simple and easily extractable features used to construct feature vectors for the words in the vocabulary. Some of these features are statistical, based on pixel distributions and local pixel configurations. Others are structural, based on the presence of ascenders, descenders, loops and diacritic points. Experiments on handwritten Arabic words from IFN/ENIT strongly support the feasibility of the proposed approach. The recognition rates reach 90.42% with vertical and horizontal HMMs, and 85.03% and 85.21% with a first and a second DBN respectively, which outperform the results of some works based on PGMs.
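As a rough illustration of the two observation streams described in this abstract, the sketch below computes one value per image column (vertical stream) and one value per image row (horizontal stream). It assumes a binary image stored as a list of rows with 1 for ink, and it uses plain ink density as a stand-in for a statistical feature; the function names and the toy image are hypothetical, and the features actually used in the paper are richer.

```python
def column_densities(image):
    """Fraction of ink pixels in each column (one observation per column)."""
    height = len(image)
    width = len(image[0])
    return [sum(image[r][c] for r in range(height)) / height
            for c in range(width)]


def row_densities(image):
    """Fraction of ink pixels in each row (one observation per row)."""
    width = len(image[0])
    return [sum(row) / width for row in image]


# Toy 2x4 binary image, invented for illustration only.
toy = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
vertical_stream = column_densities(toy)    # would feed a vertical HMM
horizontal_stream = row_densities(toy)     # would feed a horizontal HMM
```

Each stream could then serve as the observation sequence of its own HMM, or both could be fed jointly to a coupled model.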
Word based off-line handwritten Arabic classification and recognition. Design of automatic recognition system for large vocabulary offline handwritten Arabic words using machine learning approaches.
The design of a machine that reads unconstrained words remains an unsolved problem. For example, automatic interpretation of handwritten documents by a computer is still under research. Most systems attempt to segment words into letters and read words one character at a time. However, segmenting handwritten words is very difficult, so to avoid this, words are treated as a whole. This research investigates a number of features computed from whole words for the recognition of handwritten words. Arabic text classification and recognition is a complicated process compared to Latin and Chinese text recognition. This is due to the naturally cursive nature of Arabic text.
The work presented in this thesis addresses word-based recognition of handwritten Arabic script. The recognition system is divided into three main stages. The first stage is pre-processing, which applies efficient methods that are essential for automatic recognition of handwritten documents. In this stage, techniques for detecting the baseline and segmenting words in handwritten Arabic text are presented. Connected components are then extracted, and the distances between different components are analyzed. The statistical distribution of these distances is then obtained to determine an optimal threshold for word segmentation. The second stage is feature extraction. This stage makes use of the normalized images to extract features that are essential for recognizing the images. Various methods of feature extraction are implemented and examined. The third and final stage is classification. Various classifiers are used, such as the k-nearest neighbour classifier (k-NN), the neural network classifier (NN), Hidden Markov Models (HMMs), and the Dynamic Bayesian Network (DBN). To test this concept, the particular pattern recognition problem studied is the classification of 32492 words using the IFN/ENIT database. The results were promising and very encouraging in terms of improved baseline detection and word segmentation for further recognition. Moreover, several feature subsets were examined and a best recognition performance of 81.5% is achieved.
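The gap-based word segmentation step described above can be sketched as follows. This is a minimal illustration, not the thesis's method: it assumes each connected component is reduced to a hypothetical horizontal extent `(start, end)` in pixels, and it uses the mean gap as the threshold, whereas the thesis derives an optimal threshold from the statistical distribution of the distances. The sample extents are invented.

```python
def segment_words(extents):
    """Group component extents (start, end) into words using a gap threshold."""
    if not extents:
        return []
    extents = sorted(extents)
    # Horizontal gap between each component and the next one.
    gaps = [max(0, nxt[0] - cur[1]) for cur, nxt in zip(extents, extents[1:])]
    if not gaps:
        return [extents]
    # Assumption: the mean gap stands in for the "optimal" threshold.
    threshold = sum(gaps) / len(gaps)
    words = [[extents[0]]]
    for gap, extent in zip(gaps, extents[1:]):
        if gap > threshold:
            words.append([extent])      # large gap: start a new word
        else:
            words[-1].append(extent)    # small gap: same word
    return words


# Invented extents: two tight pairs separated by one wide gap.
components = [(0, 10), (12, 20), (35, 45), (47, 60)]
words = segment_words(components)
```

With these toy extents the gaps are 2, 15 and 2 pixels, so only the middle gap exceeds the mean and the components split into two words.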
Selection of Robust Features for Coin Recognition and Counterfeit Coin Detection
Tremendous numbers of coins have been used in daily life since ancient times. Aside from being a medium of exchange for goods and services, coins are among the most collected items worldwide. Alongside the increasing number of coins in use, the number of counterfeit coins released into circulation is on the rise. Some countries have started to take different security measures to detect and eliminate counterfeit coins. However, the current measures are very expensive and ineffective, as in the case of the UK, which recently decided to replace the whole coin design and release a new coin incorporating a set of security features. The demand for a cost-effective and robust computer-aided system to classify and authenticate those coins has increased as a result.
In this thesis, the design and implementation of coin recognition and counterfeit coin detection methods are proposed. This involves studying different coin stamp features and analyzing the sets of features that can uniquely and precisely differentiate coins of different countries and reject counterfeit coins. In addition, a new character segmentation method crafted for characters from coin images is proposed in this thesis. The proposed method for character segmentation is independent of the language of those characters. The experiments were performed on different coins with various characters and languages. The results show the effectiveness of the method to extract characters from different coins. The proposed method is the first to address character segmentation from coins. Coin recognition has been investigated in several research studies and different features have been selected for that purpose. This thesis proposes a new coin recognition method that focuses on small parts of the coin (characters) instead of extracting features from the whole coin image as proposed by other researchers. The method is evaluated on coins from different countries having different complexities, sizes, and qualities. The experimental results show that the proposed method compares favorably with other methods, and requires lower computational costs.
Counterfeit coin detection is more challenging than coin recognition because the differences between genuine and counterfeit coins are much smaller. High-quality forged coins are very similar to genuine coins, yet the coin stamp features are never identical. This thesis discusses two counterfeit coin detection methods based on different features. The first method consists of an ensemble of three classifiers, where a fine-tuned convolutional neural network is used to extract features from coins to train two classifiers. The third classifier is trained on features extracted from the textual area of the coin.
The second method, by contrast, uses sets of edge-based measures. Those measures are used to track differences in the coin stamp's edges between the test coin and a set of reference coins. A binary classifier is then trained on the results of those measures. Finally, a series of experimental evaluations and tests has been performed to assess the effectiveness of the proposed methods, and they show that promising results have been achieved.
Ensemble methods for instance-based Arabic language authorship attribution
Authorship Attribution (AA) is considered a subfield of authorship analysis, and it is an important problem because the volume of anonymous information has increased with the fast growth of internet usage worldwide. In other languages such as English, Spanish and Chinese, this issue is quite well studied. However, in Arabic, the AA problem has received less attention from the research community due to the complexity and nature of Arabic sentences. The paper presents an intensive review of previous studies for the Arabic language. Based on that, this study employs the Technique for Order Preferences by Similarity to Ideal Solution (TOPSIS) method to choose the base classifier of the ensemble methods. In terms of attribution features, hundreds of stylometric features and distinct words have been extracted using several tools. Then, AdaBoost and Bagging ensemble methods have been applied to an Arabic enquiries (Fatwa) dataset. The findings showed an improvement in the effectiveness of the authorship attribution task in the Arabic language.
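To make the TOPSIS selection step concrete, here is a minimal sketch of the standard TOPSIS procedure (vector normalization, weighting, distances to the ideal and anti-ideal solutions, closeness score). The criteria, weights, and numbers are invented for illustration and are not taken from the paper; the function name `topsis` and its parameters are hypothetical, and the sketch assumes no criterion column is all zeros and that the alternatives are not all identical.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] per alternative (higher is better).

    matrix  -- one row per alternative, one column per criterion
    weights -- importance of each criterion
    benefit -- True if larger is better for that criterion, False if smaller
    """
    n_crit = len(weights)
    # Vector-normalize each criterion column, then apply its weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_plus = math.dist(row, ideal)    # distance to the ideal solution
        d_minus = math.dist(row, anti)    # distance to the anti-ideal solution
        scores.append(d_minus / (d_plus + d_minus))
    return scores


# Invented candidates: [accuracy (benefit), training time in s (cost)].
candidates = [[0.9, 1.0], [0.8, 2.0], [0.7, 3.0]]
scores = topsis(candidates, weights=[0.5, 0.5], benefit=[True, False])
best = max(range(len(scores)), key=scores.__getitem__)
```

In this toy setup the first candidate dominates on both criteria, so it coincides with the ideal solution and would be picked as the base classifier.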
Advances in Character Recognition
This book presents advances in character recognition. It consists of 12 chapters that cover a wide range of topics on different aspects of character recognition. Hopefully, this book will serve as a reference source for academic research, for professionals working in the character recognition field, and for all those interested in the subject.