Handwritten Character Recognition of South Indian Scripts: A Review
Handwritten character recognition has long been a frontier area of research in
the field of pattern recognition and image processing, and there is a large
demand for OCR on handwritten documents. Even though many studies have been
performed on foreign scripts such as Chinese, Japanese and Arabic, only
a few works can be traced for handwritten character recognition of Indian
scripts, especially the South Indian scripts. This paper provides an
overview of offline handwritten character recognition in South Indian scripts,
namely Malayalam, Tamil, Kannada and Telugu.
Comment: Paper presented at the "National Conference on Indian Language
Computing", Kochi, February 19-20, 2011. 6 pages, 5 figures
Off-Line Handwritten Arabic Characters Segmentation Using Slant-Tolerant Segment Features (STSF) [PJ6123. S562 2007 f rb].
The main theme of this research is the segmentation of off-line handwritten Arabic characters. A successful handwritten Arabic character recognition system improves interaction between humans and computers.
Off-line Arabic Handwriting Recognition System Using Fast Wavelet Transform
In this research, an off-line handwriting recognition system for the Arabic alphabet is
introduced. The system contains three main stages: preprocessing, segmentation and
recognition. In the preprocessing stage, the Radon transform was used in the design
of algorithms for page, line and word skew correction as well as for word slant
correction. In the segmentation stage, a Hough transform approach was used for line
extraction. For line-to-word and word-to-character segmentation, a statistical method
using a mathematical representation of the binary images of lines and words was used.
Unlike most current handwriting recognition systems, our system simulates the
human mechanism for image recognition, where images are encoded and saved in
memory as groups according to their similarity to each other. Characters are
decomposed into coefficient vectors using the fast wavelet transform; the vectors
that represent a character in its different possible shapes are then saved as groups, with one
representative for each group. Recognition is achieved by comparing the vector of
the character to be recognized with the group representatives.
Experiments showed that the proposed system is able to achieve the recognition task
with 90.26% accuracy. The system needs at most 3.41 seconds to recognize a
single character in a text of 15 lines where each line has 10 words on average.
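The recognition idea described in this abstract (wavelet coefficient vectors compared against one representative per group) can be illustrated with a minimal sketch. The abstract does not name the wavelet family or the distance measure, so the Haar wavelet, the Euclidean distance, the 1-D input and the labels `group_a` and `group_b` are all assumptions made for illustration only, not the authors' implementation.

```python
import math


def haar_fwt(signal):
    """Full orthonormal Haar decomposition of a sequence whose length is a power of two.

    Assumption: the Haar wavelet stands in for whichever wavelet the system used.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(v) for v in signal]
    details = []
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        # Coarser-level detail coefficients are placed before finer-level ones.
        details = [(a - b) / math.sqrt(2) for a, b in pairs] + details
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx + details


def nearest_group(vector, representatives):
    """Return the label of the representative closest to `vector`.

    Assumption: Euclidean distance; the abstract does not say how vectors are compared.
    """
    return min(representatives, key=lambda label: math.dist(vector, representatives[label]))


# Hypothetical group representatives built from toy 1-D profiles.
representatives = {
    "group_a": haar_fwt([0, 1, 1, 0]),
    "group_b": haar_fwt([1, 0, 0, 1]),
}
label = nearest_group(haar_fwt([0, 1, 0.9, 0]), representatives)
```

Because the transform here is orthonormal, distances between coefficient vectors equal distances between the original profiles; a practical system would more likely keep only a subset of coefficients to reduce dimensionality.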
A Bottom Up Procedure for Text Line Segmentation of Latin Script
In this paper we present a bottom-up procedure for segmentation of text lines
written or printed in the Latin script. The proposed method uses a combination
of image morphology, feature extraction and a Gaussian mixture model to perform
this task. The experimental results show the validity of the procedure.
Comment: Accepted and presented at the IEEE conference "International
Conference on Advances in Computing, Communications and Informatics (ICACCI)
2017".
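As a rough illustration of the morphological part of such a bottom-up procedure, the sketch below dilates a binary image horizontally so that neighbouring characters merge, then collects the bounding boxes of the resulting connected components as candidate text lines. The structuring element, the 4-connectivity and the function names are assumptions; the feature extraction and Gaussian mixture model steps mentioned in the abstract are not reproduced here.

```python
from collections import deque


def dilate_horizontal(image, radius):
    """Binary dilation with a 1 x (2*radius + 1) horizontal structuring element."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            lo, hi = max(0, x - radius), min(width, x + radius + 1)
            if any(image[y][lo:hi]):
                out[y][x] = 1
    return out


def component_boxes(image):
    """Bounding boxes (top, left, bottom, right) of 4-connected foreground components."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not image[y][x] or seen[y][x]:
                continue
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:  # breadth-first flood fill of one component
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < height and 0 <= nx < width and image[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return sorted(boxes)


# Toy page: two text lines, each made of two separated "characters".
page = [
    [1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1],
]
line_boxes = component_boxes(dilate_horizontal(page, radius=1))
```

The dilation radius controls how large a gap between characters is bridged; too large a value would also start merging components that belong to different lines once vertical extent is added to the structuring element.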
Recognition of off-line handwritten cursive text
The author presents novel algorithms to design unconstrained handwriting
recognition systems organized in three parts:
In Part One, novel algorithms are presented for processing of Arabic text prior to
recognition. Algorithms are described to convert a thinned image of a stroke to a straight
line approximation. Novel heuristic algorithms and novel theorems are presented to
determine start and end vertices of an off-line image of a stroke. A straight line
approximation of an off-line stroke is converted to a one-dimensional representation by
a novel algorithm which aims to recover the original sequence of writing. The resulting
ordering of the stroke segments is a suitable preprocessed representation for subsequent
handwriting recognition algorithms as it helps to segment the stroke. The algorithm was
tested against one data set of isolated handwritten characters and another data set of
cursive handwriting, each provided by 20 subjects, and has been 91.9% and 91.8%
successful for these two data sets, respectively.
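The abstract does not say how the straight line approximation of a thinned stroke is computed. As a stand-in, the sketch below uses the well-known Ramer-Douglas-Peucker simplification, which is not claimed to be the author's algorithm. It also assumes the stroke pixels are already given as an ordered list of points, whereas the thesis treats recovering that order as a separate problem.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter clamped to the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Ramer-Douglas-Peucker: keep only vertices that deviate more than `tolerance`."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, worst = max(
        ((i, point_segment_distance(points[i], first, last)) for i in range(1, len(points) - 1)),
        key=lambda item: item[1],
    )
    if worst <= tolerance:
        return [first, last]
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split vertex


# Toy L-shaped stroke with slight jitter on the horizontal part.
stroke = [(0, 0), (1, 0.1), (2, 0), (3, 0), (3, 1), (3, 2)]
polyline = simplify(stroke, tolerance=0.5)
```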
In Part Two, an entirely novel fuzzy set-sequential machine character recognition
system is presented. Fuzzy sequential machines are defined to work as recognizers of
handwritten strokes. An algorithm to obtain a deterministic fuzzy sequential machine from
a stroke representation, that is capable of recognizing that stroke and its variants, is
presented. An algorithm is developed to merge two fuzzy machines into one machine. The
learning algorithm is a combination of many described algorithms. The system was tested
against isolated handwritten characters provided by 20 subjects, resulting in a 95.8%
recognition rate, which is encouraging and shows that the system is highly flexible in
dealing with shape and size variations.
In Part Three, an entirely novel text recognition system, capable of recognizing
off-line handwritten Arabic cursive text with high variability, is presented. This system
is an extension of the above recognition system. Tokens are extracted from a one-dimensional
representation of a stroke. Fuzzy sequential machines are defined to work as
recognizers of tokens. It is shown how to obtain a deterministic fuzzy sequential machine
from a token representation that is capable of recognizing that token and its variants. An
algorithm for token learning is presented. The tokens of a stroke are re-combined to
meaningful strings of tokens. Algorithms to recognize and learn token strings are
described. The recognition stage uses algorithms of the learning stage. The process of
extracting the best set of basic shapes which represent the best set of token strings that
constitute an unknown stroke is described. A method is developed to extract lines from
pages of handwritten text, arrange main strokes of extracted lines in the same order as
they were written, and present secondary strokes to main strokes. Presented secondary
strokes are combined with basic shapes to obtain the final characters by formulating and
solving assignment problems for this purpose. Some secondary strokes which remain
unassigned are individually manipulated. The system was tested against the handwritings
of 20 subjects yielding overall subword and character recognition rates of 55.4% and
51.1%, respectively.
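The step of combining secondary strokes (such as dots) with basic shapes is described above as an assignment problem. The sketch below shows one way such a problem could be posed and solved on a toy scale. The cost (Euclidean distance between centroids), the brute-force search and the function name are assumptions for illustration; the abstract does not give the actual cost function or solver.

```python
import math
from itertools import permutations


def assign_secondary_strokes(secondary, shapes):
    """Assign each secondary stroke to a distinct basic shape at minimum total cost.

    Both arguments are lists of (x, y) centroids. Assumption: cost is the Euclidean
    distance between centroids. Brute force is only practical for a handful of
    strokes; the Hungarian algorithm would be the usual choice at scale.
    """
    if len(secondary) > len(shapes):
        raise ValueError("need at least as many shapes as secondary strokes")
    best_cost, best_choice = math.inf, ()
    for choice in permutations(range(len(shapes)), len(secondary)):
        cost = sum(math.dist(secondary[i], shapes[j]) for i, j in enumerate(choice))
        if cost < best_cost:
            best_cost, best_choice = cost, choice
    return list(best_choice), best_cost


# Hypothetical centroids: two dots above three candidate basic shapes.
dots = [(1, 2), (5, 2)]
basic_shapes = [(5, 0), (1, 0), (9, 0)]
assignment, total_cost = assign_secondary_strokes(dots, basic_shapes)
```

Here `assignment[i]` is the index of the basic shape chosen for secondary stroke `i`. Strokes that should stay unassigned, as the abstract mentions, would need an extra rule such as a maximum allowed distance.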