187 research outputs found
A deformable elastic matching model for handwritten Chinese character recognition
Conference Theme: Intelligent Systems for the 21st Century. This paper describes a deformable elastic matching approach to handwritten Chinese character recognition (HCCR). A handwritten character is regarded as a kind of deformable object with elastic properties. For the same category of character, we assume that different handwriting variations share the same topological structure but may differ in shape details. The variations between different handwritten characters are modelled by a set of stroke displacement vectors (SDV). According to the derived SDV, a model character is deformed gradually in an effort to bring it closer to an input character. Experiments show that the proposed elastic matching model can efficiently deal with local shape changes and variations between characters.
Extraction and optimization of B-spline PBD templates for recognition of connected handwritten digit strings
2001-2002 > Academic research: refereed > Publication in refereed journal. Version of Record.
Offline recognition of handwritten mathematical notation (Reconocimiento de notación matemática escrita a mano fuera de línea)
Automatic recognition of mathematical expressions is one of the problems of pattern recognition, because mathematics is a valuable source of information in many research areas. Writing mathematical expressions by hand is a means of communication used to transmit information and knowledge, with which documents containing mathematical notation can be produced easily. This process can become tedious when the expressions must be written in a typesetting language that a computer can process, such as LaTeX or MathML, among others. Systems for recognizing mathematical expressions use two different methods, namely offline and online. This thesis studies the performance of an offline system and describes the basic steps for achieving better recognition accuracy, which are divided into two main steps: recognition of the symbols of the mathematical equations, and analysis of the structure in which they are arranged. In order to convert a handwritten mathematical expression into an equivalent expression in a word-processing system, such as TeX
Matching of complex patterns by energy minimization
Two patterns are matched by putting one on top of the other and iteratively moving their individual parts until most of their corresponding parts are aligned. An energy function and a neighborhood of influence are defined for each iteration. Initially, a large neighborhood is used so that the movements result in global features being coarsely aligned. The neighborhood size is gradually reduced in successive iterations so that finer and finer details are aligned. Encouraging results have been obtained when the method is applied to match complex Chinese characters. It has been observed that computation increases with the square of the number of moving parts, which is quite favorable compared with other algorithms. The method was applied to the recognition of handwritten Chinese characters. After the iterative matching is performed, a set of similarity measures is used to measure the similarity in topological features between the input and template characters. An overall recognition rate of 96.1% is achieved. © 1998 IEEE.
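The coarse-to-fine idea in this abstract can be illustrated with a minimal sketch. It is not the authors' energy function, which the abstract does not give. As an assumption, each template point is pulled toward its nearest target point, the pulls are averaged over a neighborhood of influence, and the neighborhood radius shrinks between iterations. The names coarse_to_fine_match, radii, and step are hypothetical.

```python
import math


def nearest(p, points):
    """Return the point in `points` closest to p (Euclidean distance)."""
    return min(points, key=lambda q: math.dist(p, q))


def coarse_to_fine_match(template, target, radii=(8.0, 4.0, 2.0, 1.0), step=0.5):
    """Move template points toward target points, one pass per radius.

    Assumption: the pull on a point is the vector to its nearest target
    point, and its actual move is the average pull of all template points
    within the current radius. Large radii give near-rigid, global moves;
    small radii let individual points settle on fine details.
    """
    pts = [tuple(p) for p in template]
    for r in radii:
        pulls = []
        for p in pts:
            q = nearest(p, target)
            pulls.append((q[0] - p[0], q[1] - p[1]))
        moved = []
        for p in pts:
            # Pairwise neighborhood search: quadratic in the number of points.
            nb = [pulls[j] for j, o in enumerate(pts) if math.dist(p, o) <= r]
            dx = sum(v[0] for v in nb) / len(nb)
            dy = sum(v[1] for v in nb) / len(nb)
            moved.append((p[0] + step * dx, p[1] + step * dy))
        pts = moved
    return pts


def mean_nearest_distance(pts, target):
    """Crude alignment score: average distance to the nearest target point."""
    return sum(math.dist(p, nearest(p, target)) for p in pts) / len(pts)
```

Calling coarse_to_fine_match on a template and a shifted copy of it should lower mean_nearest_distance. A real system would follow this with the topological similarity measures the abstract mentions, which are not sketched here.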
Handwriting recognition using webcam for data entry
The Handwriting Recognition using Webcam for Data Entry project has as its
primary purpose the development of a system or algorithm robust enough to recognize
numerical handwriting. A web camera is used to capture images of handwritten
scores and question numbers on the examination score sheet in real time. The images
are then preprocessed, and the extracted features are fed into a neural network that has
already been trained on various test samples. The intended outcome of the project is a
system that recognizes handwritten numerical data with the lowest overshoot and
error. Several distinctive features are extracted from each character using a few feature
extraction methods, and three types of feature extraction modules were compared.
The first test was done with a neural network trained with only the
Character Vector Module as its feature extraction method. The result, a mere average
of 64.67% accuracy, was far below the set point for recognition accuracy.
The testing was later enhanced with another feature extraction module,
consisting of the combination of the Character Vector Module, Kirsch Edge Detection Module,
Alphabet Profile Feature Extraction Module, Modified Character Module and Image
Compression Module. The modules have distinct characteristics, which are trained using
the Back-Propagation algorithm to cluster the pattern recognition capabilities among
different samples of handwriting. Several samples of handwritten numerical data that
were not used in training were obtained at random from various people to test the program. The
second test showed far better results than the first, yielding an average
of 84.52% accuracy. As the recognition results had not reached the target of 90%, a further
feature extraction module was recommended and added
for the third test, which successfully yielded 90.67%. With the timeframe
target achieved, a robust data entry system was developed using a web camera
together with a user-friendly GUI (Graphical User Interface).
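The abstract names a Kirsch Edge Detection Module but does not say how its output was turned into features. As an illustration only, the standard Kirsch compass operator can be sketched as follows: eight 3x3 kernels, each with weight 5 on three adjacent border cells and -3 on the other five, and the edge response at a pixel is the maximum over the eight directions. The function names are hypothetical, and the image is assumed to be a list of rows of grey levels.

```python
# Border cells of a 3x3 kernel, listed clockwise from the top-left corner.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def kirsch_kernels():
    """Build the eight Kirsch compass kernels by rotating the 5-weights."""
    kernels = []
    for k in range(8):
        kern = [[-3] * 3 for _ in range(3)]
        kern[1][1] = 0
        for off in range(3):
            r, c = RING[(k + off) % 8]
            kern[r][c] = 5
        kernels.append(kern)
    return kernels


def kirsch_edges(img):
    """Return the maximum compass response per pixel (border pixels stay 0)."""
    h, w = len(img), len(img[0])
    kernels = kirsch_kernels()
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = max(
                sum(kern[i][j] * img[y - 1 + i][x - 1 + j]
                    for i in range(3) for j in range(3))
                for kern in kernels
            )
    return out
```

Each kernel sums to zero, so flat regions give no response. One plausible way to feed a network, again an assumption, would be to pool the directional responses over zones of the digit image.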
Recognition of off-line handwritten cursive text
The author presents novel algorithms to design unconstrained handwriting
recognition systems organized in three parts:
In Part One, novel algorithms are presented for processing of Arabic text prior to
recognition. Algorithms are described to convert a thinned image of a stroke to a straight
line approximation. Novel heuristic algorithms and novel theorems are presented to
determine start and end vertices of an off-line image of a stroke. A straight line
approximation of an off-line stroke is converted to a one-dimensional representation by
a novel algorithm which aims to recover the original sequence of writing. The resulting
ordering of the stroke segments is a suitable preprocessed representation for subsequent
handwriting recognition algorithms as it helps to segment the stroke. The algorithm was
tested against one data set of isolated handwritten characters and another data set of
cursive handwriting, each provided by 20 subjects, and has been 91.9% and 91.8%
successful for these two data sets, respectively.
In Part Two, an entirely novel fuzzy set-sequential machine character recognition
system is presented. Fuzzy sequential machines are defined to work as recognizers of
handwritten strokes. An algorithm to obtain a deterministic fuzzy sequential machine from
a stroke representation, that is capable of recognizing that stroke and its variants, is
presented. An algorithm is developed to merge two fuzzy machines into one machine. The
learning algorithm is a combination of many described algorithms. The system was tested
against isolated handwritten characters provided by 20 subjects resulting in 95.8%
recognition rate which is encouraging and shows that the system is highly flexible in
dealing with shape and size variations.
In Part Three, an entirely novel text recognition system, capable of recognizing
off-line handwritten Arabic cursive text of high variability, is presented. This system
is an extension of the above recognition system. Tokens are extracted from a one-dimensional
representation of a stroke. Fuzzy sequential machines are defined to work as
recognizers of tokens. It is shown how to obtain a deterministic fuzzy sequential machine
from a token representation that is capable of recognizing that token and its variants. An
algorithm for token learning is presented. The tokens of a stroke are re-combined into
meaningful strings of tokens. Algorithms to recognize and learn token strings are
described. The recognition stage uses algorithms of the learning stage. The process of
extracting the best set of basic shapes which represent the best set of token strings that
constitute an unknown stroke is described. A method is developed to extract lines from
pages of handwritten text, arrange main strokes of extracted lines in the same order as
they were written, and present secondary strokes to main strokes. Presented secondary
strokes are combined with basic shapes to obtain the final characters by formulating and
solving assignment problems for this purpose. Some secondary strokes which remain
unassigned are individually manipulated. The system was tested against the handwritings
of 20 subjects yielding overall subword and character recognition rates of 55.4% and
51.1%, respectively.
Off-line Thai handwriting recognition in legal amount
Thai handwriting in legal amounts is a challenging problem and a new field in the area of handwriting recognition research. The focus of this thesis is to implement a Thai handwriting recognition system. A preliminary data set of Thai handwriting in legal amounts is designed. The samples in the data set are characters and words of the Thai legal amounts and a set of legal amount phrases collected from a number of native Thai volunteers. In the preprocessing and recognition stages, techniques are introduced to improve the character recognition rates. The characters are divided into two smaller subgroups by their writing levels, named the body and high groups. The recognition rates of both groups are increased based on their distinguishing features. The writing level separation algorithms are implemented using the size and position of characters. Empirical experiments are set up to find the best combination of features for increasing the recognition rates. Traditional recognition systems are modified to give the cumulative top-3 ranked answers to cover the possible character classes. At the postprocessing stage, lexicon matching algorithms are implemented to match the ranked characters with the legal amount words. These matched words are joined together to form possible choices of amounts. These amounts have their syntax checked in the last stage. Several syntax violations are caused by faulty character segmentation and recognition resulting from connected or broken characters. The anomalies in handwriting caused by these characters are mainly detected by their size and shape. During the recovery process, the possible word boundary patterns can be pre-defined and used to segment the hypothesis words. These words are identified by the word recognition, and the results are joined with previously matched words to form the full amounts, which are checked against the syntax rules again.
From 154 amounts written by 10 writers, the rejection rate is 14.9 percent with the recovery processes. The recognition rate for the accepted amounts is 100 percent.
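The step of matching top-3 ranked characters against a lexicon can be sketched as below. This is an assumption about one simple way to do it, not the thesis's algorithm: a lexicon word matches when every one of its characters appears among the top-ranked candidates at the same position, and matches are ordered by the sum of the candidate ranks. The function name and the English placeholder words are hypothetical; the thesis works with Thai legal amount words.

```python
def match_lexicon(ranked, lexicon):
    """Match ranked character candidates against lexicon words.

    ranked:  one list per character position, best candidate first
             (for example the top-3 classes from a character classifier).
    lexicon: iterable of candidate words.
    Returns matching words, lowest total rank (best) first.
    """
    matches = []
    for word in lexicon:
        if len(word) != len(ranked):
            continue
        cost = 0
        for ch, candidates in zip(word, ranked):
            if ch not in candidates:
                break  # this position cannot be explained by any candidate
            cost += candidates.index(ch)
        else:
            matches.append((cost, word))
    return [word for _, word in sorted(matches)]


# Placeholder example with English words standing in for legal amount words.
ranked = [["t", "f", "l"], ["w", "e", "v"], ["o", "n", "a"]]
print(match_lexicon(ranked, ["two", "ten", "one", "six"]))  # ['two', 'ten']
```

Keeping several candidates per position is what lets a word survive when the classifier's first choice is wrong. The matched words would then go on to the syntax check described in the abstract.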