Search CORE

140 research outputs found

A comprehensive survey of handwritten document benchmarks: structure, usage and evaluation

Author
Publication venue: Springer
Publication date: 24/12/2015
Field of study

Arabic Handwritten Words Off-line Recognition based on HMMs and DBNs

Author: Belaïd Abdel
Elloumi Mourad
Kacem Afef
Khémiri Akram
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/08/2015
Field of study

In this work, we investigate the combination of PGM (Probabilistic Graphical Model) classifiers, either independent or coupled, for the recognition of Arabic handwritten words. The independent classifiers are vertical and horizontal HMMs (Hidden Markov Models) whose observable outputs are features extracted from the image columns and the image rows, respectively. The coupled classifiers associate the vertical and horizontal observation streams into a single DBN (Dynamic Bayesian Network). A novel method for extracting the word baseline is proposed, together with simple, easily extractable features for constructing feature vectors for words in the vocabulary. Some of these features are statistical, based on pixel distributions and local pixel configurations. Others are structural, based on the presence of ascenders, descenders, loops and diacritic points. Experiments on handwritten Arabic words from IFN/ENIT strongly support the feasibility of the proposed approach. The recognition rate reaches 90.42% with vertical and horizontal HMMs, and 85.03% and 85.21% with a first and a second DBN, respectively, which outperform the results of some works based on PGMs.
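
The column and row observation streams mentioned in this abstract can be illustrated with a minimal sketch. It assumes a binary image stored as a list of rows (1 for ink, 0 for background) and uses plain ink density as the only feature; the function names are hypothetical and the abstract's actual feature set (local pixel configurations, ascenders, descenders, loops, diacritic points) is richer than this.

```python
def column_observations(image):
    """One observation per image column: the fraction of ink pixels in it.

    This is the kind of sequence a vertical HMM could read column by column.
    Ink density is an assumed stand-in for the paper's statistical features.
    """
    if not image:
        return []
    height = len(image)
    width = len(image[0])
    return [sum(row[x] for row in image) / height for x in range(width)]


def row_observations(image):
    """One observation per image row: the fraction of ink pixels in it.

    This is the matching sequence for a horizontal HMM reading row by row.
    """
    return [sum(row) / len(row) for row in image]


# Tiny 2x3 example image: 1 = ink, 0 = background.
example = [
    [0, 1, 1],
    [0, 1, 0],
]
columns = column_observations(example)
rows = row_observations(example)
```

A coupled model such as a DBN would consume both sequences jointly, whereas independent HMMs would each score one sequence.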

Crossref

INRIA a CCSD electronic archive server

Recommended from our members

Word based off-line handwritten Arabic classification and recognition. Design of automatic recognition system for large vocabulary offline handwritten Arabic words using machine learning approaches.

Author: AlKhateeb Jawad H.Y.
Publication venue: Department of Electronic Imaging and Media Communications
Publication date: 01/01/2010
Field of study

The design of a machine which reads unconstrained words still remains an unsolved problem. For example, automatic interpretation of handwritten documents by a computer is still under research. Most systems attempt to segment words into letters and read words one character at a time. However, segmenting handwritten words is very difficult. To avoid this, words are treated as a whole. This research investigates a number of features computed from whole words for the recognition of handwritten words in particular. Arabic text classification and recognition is a complicated process compared to Latin and Chinese text recognition systems. This is due to the cursive nature of Arabic text. The work presented in this thesis is proposed for word-based recognition of handwritten Arabic scripts. This work is divided into three main stages to provide a recognition system. The first stage is pre-processing, which applies efficient pre-processing methods that are essential for automatic recognition of handwritten documents. In this stage, techniques for detecting the baseline and segmenting words in handwritten Arabic text are presented. Then connected components are extracted, and distances between different components are analyzed. The statistical distribution of these distances is then obtained to determine an optimal threshold for word segmentation. The second stage is feature extraction. This stage makes use of the normalized images to extract features that are essential in recognizing the images. Various methods of feature extraction are implemented and examined. The third and final stage is classification. Various classifiers are used, such as the k-nearest neighbour classifier (k-NN), the neural network classifier (NN), Hidden Markov Models (HMMs), and the Dynamic Bayesian Network (DBN). To test this concept, the particular pattern recognition problem studied is the classification of 32492 words using the IFN/ENIT database.
The results were promising and very encouraging in terms of improved baseline detection and word segmentation for further recognition. Moreover, several feature subsets were examined and a best recognition performance of 81.5% was achieved.
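
The gap-threshold step described in this abstract (distances between connected components, then a threshold that separates within-word gaps from between-word gaps) can be sketched as follows. This is only an illustration under stated assumptions: components are given as hypothetical (x_start, x_end) horizontal extents on one text line, and the threshold rule (midpoint of the widest jump between distinct gap sizes) is an assumed heuristic, not the thesis's statistical method.

```python
def choose_threshold(gaps):
    """Assumed heuristic: put the threshold in the widest jump between gap sizes."""
    distinct = sorted(set(gaps))
    if len(distinct) < 2:
        # Only one gap size (or none): no evidence of a word boundary.
        return float("inf")
    jumps = [(high - low, low, high) for low, high in zip(distinct, distinct[1:])]
    _, low, high = max(jumps)
    return (low + high) / 2


def segment_words(components):
    """Group component extents into words by thresholding horizontal gaps."""
    ordered = sorted(components)
    if not ordered:
        return []
    gaps = []
    reach = ordered[0][1]  # right-most x covered so far
    for start, end in ordered[1:]:
        gaps.append(max(0, start - reach))  # overlapping components give gap 0
        reach = max(reach, end)
    threshold = choose_threshold(gaps)
    words = [[ordered[0]]]
    for component, gap in zip(ordered[1:], gaps):
        if gap > threshold:
            words.append([component])  # wide gap: start a new word
        else:
            words[-1].append(component)  # narrow gap: same word
    return words


line = [(0, 10), (12, 20), (22, 30), (50, 60), (62, 70)]
words = segment_words(line)
```

The grouping is purely geometric, so the right-to-left reading order of Arabic only affects how the resulting words are ordered afterwards.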

Bradford Scholars

A comprehensive survey of handwritten document benchmarks: structure, usage and evaluation

Author: A Bensefia
A Fischer
A Giménez
A Schlapbach
A Shivram
A-HM R
A-L Bianne-Bernard
Ahsen Raza
AK Jain
B Verma
B Zhu
C-L Liu
Chawki Djeddi
CO Freitas
D Bertolini
D-H Wang
E Kavallieratou
E Kussul
EF Can
F H-C
F Lauer
F Zamora-Martínez
GE Hinton
GX Tan
H Bunke
H El-Abed
H El-Abed
H Liu
H Yamada
I Siddiqi
Imran Siddiqi
JJ Hull
K Seo
Khurram Khurshid
L C-L
L Jin
L Xu
L Z
M Bulacu
M Liwicki
M Nakagawa
M Nakagawa
M Shi
MA Mohamed
MN Abdi
N Serrano
NB Amara
Q-F Wang
R Saabni
Raashid Hussain
S Al-Maadeed
S Gunter
SJ Smith
T-H Su
TM Ha
U Bhattacharya
UV Marti
V Frinken
Y Al-Ohali
Y Kessentini
Y LeCun
Y Shao
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Crossref

Off-line Arabic Handwriting Recognition System Using Fast Wavelet Transform

Author: E. GUMAH MOHAMED
Publication venue
Publication date: 01/01/2010
Field of study

In this research, an off-line handwriting recognition system for the Arabic alphabet is introduced. The system contains three main stages: preprocessing, segmentation and recognition. In the preprocessing stage, the Radon transform was used in the design of algorithms for page, line and word skew correction as well as for word slant correction. In the segmentation stage, a Hough transform approach was used for line extraction. For line-to-word and word-to-character segmentation, a statistical method using a mathematical representation of the binary images of lines and words was used. Unlike most current handwriting recognition systems, our system simulates the human mechanism for image recognition, where images are encoded and saved in memory as groups according to their similarity to each other. Characters are decomposed into coefficient vectors using the fast wavelet transform; the vectors that represent a character in its different possible shapes are then saved as groups, with one representative for each group. Recognition is achieved by comparing the vector of the character to be recognized with the group representatives. Experiments showed that the proposed system is able to achieve the recognition task with 90.26% accuracy. The system needs at most 3.41 seconds to recognize a single character in a text of 15 lines where each line has 10 words on average.
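
The recognition step in this abstract (wavelet coefficient vectors compared against one representative per group) can be sketched roughly as below. The abstract does not name the wavelet, the distance measure, or how representatives are chosen, so the Haar wavelet, Euclidean distance, and the function names here are all assumptions made for illustration.

```python
import math


def haar_fwt(signal):
    """Full orthonormal Haar decomposition; length must be a power of two."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    coeffs = [float(value) for value in signal]
    length = n
    while length > 1:
        half = length // 2
        # Pairwise scaled sums (approximation) and differences (detail).
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        coeffs[:length] = approx + detail
        length = half  # recurse on the approximation part only
    return coeffs


def nearest_representative(vector, representatives):
    """Return the label whose representative vector is closest (Euclidean)."""
    return min(representatives, key=lambda label: math.dist(vector, representatives[label]))


# Hypothetical group representatives built from toy 4-sample "characters".
representatives = {
    "a": haar_fwt([1, 1, 0, 0]),
    "b": haar_fwt([0, 0, 1, 1]),
}
label = nearest_representative(haar_fwt([1, 1, 0, 0.1]), representatives)
```

Comparing against one representative per group, instead of every stored sample, keeps the number of distance computations proportional to the number of groups.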

UTPedia

ONLINE ARABIC TEXT RECOGNITION USING STATISTICAL TECHNIQUES

Author
Publication venue
Publication date
Field of study