1,482 research outputs found
Component-based Segmentation of words from handwritten Arabic text
Efficient preprocessing is very essential for automatic recognition of handwritten documents. In this paper, techniques on segmenting words in handwritten Arabic text are presented. Firstly, connected components (ccs) are extracted, and distances among different components are analyzed. The statistical distribution of this distance is then obtained to determine an optimal threshold for words segmentation. Meanwhile, an improved projection based method is also employed for baseline detection. The proposed method has been successfully tested on IFN/ENIT database consisting of 26459 Arabic words handwritten by 411 different writers, and the results were promising and very encouraging in more accurate detection of the baseline and segmentation of words for further recognition
Recognition of off-line handwritten cursive text
The author presents novel algorithms to design unconstrained handwriting
recognition systems organized in three parts:
In Part One, novel algorithms are presented for processing of Arabic text prior to
recognition. Algorithms are described to convert a thinned image of a stroke to a straight
line approximation. Novel heuristic algorithms and novel theorems are presented to
determine start and end vertices of an off-line image of a stroke. A straight line
approximation of an off-line stroke is converted to a one-dimensional representation by
a novel algorithm which aims to recover the original sequence of writing. The resulting
ordering of the stroke segments is a suitable preprocessed representation for subsequent
handwriting recognition algorithms as it helps to segment the stroke. The algorithm was
tested against one data set of isolated handwritten characters and another data set of
cursive handwriting, each provided by 20 subjects, and has been 91.9% and 91.8%
successful for these two data sets, respectively.
In Part Two, an entirely novel fuzzy set-sequential machine character recognition
system is presented. Fuzzy sequential machines are defined to work as recognizers of
handwritten strokes. An algorithm to obtain a deterministic fuzzy sequential machine from
a stroke representation, that is capable of recognizing that stroke and its variants, is
presented. An algorithm is developed to merge two fuzzy machines into one machine. The
learning algorithm is a combination of many described algorithms. The system was tested
against isolated handwritten characters provided by 20 subjects resulting in 95.8%
recognition rate which is encouraging and shows that the system is highly flexible in
dealing with shape and size variations.
In Part Three, also an entirely novel text recognition system, capable of recognizing
off-line handwritten Arabic cursive text having a high variability is presented. This system
is an extension of the above recognition system. Tokens are extracted from a onedimensional
representation of a stroke. Fuzzy sequential machines are defined to work as
recognizers of tokens. It is shown how to obtain a deterministic fuzzy sequential machine
from a token representation that is capable'of recognizing that token and its variants. An
algorithm for token learning is presented. The tokens of a stroke are re-combined to
meaningful strings of tokens. Algorithms to recognize and learn token strings are
described. The. recognition stage uses algorithms of the learning stage. The process of
extracting the best set of basic shapes which represent the best set of token strings that
constitute an unknown stroke is described. A method is developed to extract lines from
pages of handwritten text, arrange main strokes of extracted lines in the same order as
they were written, and present secondary strokes to main strokes. Presented secondary
strokes are combined with basic shapes to obtain the final characters by formulating and
solving assignment problems for this purpose. Some secondary strokes which remain
unassigned are individually manipulated. The system was tested against the handwritings
of 20 subjects yielding overall subword and character recognition rates of 55.4% and
51.1%, respectively
Exploration of Various Fractional Order Derivatives in Parkinson's Disease Dysgraphia Analysis
Parkinson's disease (PD) is a common neurodegenerative disorder with a
prevalence rate estimated to 2.0% for people aged over 65 years. Cardinal motor
symptoms of PD such as rigidity and bradykinesia affect the muscles involved in
the handwriting process resulting in handwriting abnormalities called PD
dysgraphia. Nowadays, online handwritten signal (signal with temporal
information) acquired by the digitizing tablets is the most advanced approach
of graphomotor difficulties analysis. Although the basic kinematic features
were proved to effectively quantify the symptoms of PD dysgraphia, a recent
research identified that the theory of fractional calculus can be used to
improve the graphomotor difficulties analysis. Therefore, in this study, we
follow up on our previous research, and we aim to explore the utilization of
various approaches of fractional order derivative (FD) in the analysis of PD
dysgraphia. For this purpose, we used the repetitive loops task from the
Parkinson's disease handwriting database (PaHaW). Handwritten signals were
parametrized by the kinematic features employing three FD approximations:
Gr\"unwald-Letnikov's, Riemann-Liouville's, and Caputo's. Results of the
correlation analysis revealed a significant relationship between the clinical
state and the handwriting features based on the velocity. The extracted
features by Caputo's FD approximation outperformed the rest of the analyzed FD
approaches. This was also confirmed by the results of the classification
analysis, where the best model trained by Caputo's handwriting features
resulted in a balanced accuracy of 79.73% with a sensitivity of 83.78% and a
specificity of 75.68%.Comment: Print ISBN 978-3-031-19744-
Off-line Arabic Handwriting Recognition System Using Fast Wavelet Transform
In this research, off-line handwriting recognition system for Arabic alphabet is
introduced. The system contains three main stages: preprocessing, segmentation and
recognition stage. In the preprocessing stage, Radon transform was used in the design
of algorithms for page, line and word skew correction as well as for word slant
correction. In the segmentation stage, Hough transform approach was used for line
extraction. For line to words and word to characters segmentation, a statistical method
using mathematic representation of the lines and words binary image was used.
Unlike most of current handwriting recognition system, our system simulates the
human mechanism for image recognition, where images are encoded and saved in
memory as groups according to their similarity to each other. Characters are
decomposed into a coefficient vectors, using fast wavelet transform, then, vectors,
that represent a character in different possible shapes, are saved as groups with one
representative for each group. The recognition is achieved by comparing a vector of
the character to be recognized with group representatives.
Experiments showed that the proposed system is able to achieve the recognition task
with 90.26% of accuracy. The system needs only 3.41 seconds a most to recognize a
single character in a text of 15 lines where each line has 10 words on average
- …