31 research outputs found
Effect of system parameters on feature extraction sets for Arabic handwritten text recognition
The purpose of this study is to analyze the effect of different parameters on a Pattern
Recognition System for Arabic Handwritten Text Recognition, and perform different
experimental tests in order to obtain the optimum values for the process.
In the first introductory section, source data material and the tools used in this
work are introduced and explained.
The thesis then focuses on the Feature Extraction Process, providing details
about the different strategies or methods that can be used in the process. In the experimental
section, the most important test results are given and the variable parameters
are individually analyzed. Finally, different combination schemes are implemented
in order to demonstrate the effectiveness of the Slanted Windows.
The results provide some support for the correct selection of parameter values
for future implementations of the system. However, the optimum parameter values
should not be treated as absolute, because the aim is to guide
researchers in future implementations of the system.
Telecommunication Engineering (Ingeniería de Telecomunicación; Telekomunikazio Ingeniaritz)
Segmentation-free Word Spotting for Handwritten Arabic Documents
In this paper we present an unsupervised, segmentation-free method for spotting and searching queries in document images of handwritten Arabic. Histograms of Oriented Gradients (HOGs) are used as the feature vectors that represent the query and the document images. We then compress the descriptors with the product quantization method. Finally, a better representation of the query is obtained by using Support Vector Machines (SVMs)
Recognition of Cursive Arabic Handwritten Text using Embedded Training based on HMMs
In this paper we present a system for offline recognition of cursive Arabic handwritten text based on Hidden Markov Models (HMMs). The system is analytical, works without explicit segmentation, and uses embedded training to train and enhance the character models. The extracted features, preceded by baseline estimation, are statistical and geometric, integrating both the peculiarities of the text and the pixel distribution characteristics of the word image. These features are modelled using hidden Markov models and trained by embedded training. Experiments on images from the benchmark IFN/ENIT database show that the proposed system improves recognition
Off line Arabic handwritten character using neural network
Character Recognition (CR) is considered one of the
most important areas in the field of pattern recognition. The ultimate
objective of an Optical Character Recognition (OCR) system is
to simulate the capability of reading; hence OCR is considered
a form of artificial intelligence. In this paper, a handwritten character
recognition system for the Arabic language is developed. The main aim
of the system is to save time and effort in Arabic OCR and, in addition,
to serve as a fast and reliable alternative to manual typing.
The system has four main stages: preprocessing,
segmentation, feature extraction, classification, and recognition.
The system is off-line and depends on image acquisition, so
after acquisition the image has to go through the main stages. A
neural network is used as the classifier. The proposed system is able
to recognize as many characters as possible with a high accuracy rate.
In addition, it focuses on characters that are similar to one another,
and it also takes into account the number of dots,
their position, and the connected components
Word based off-line handwritten Arabic classification and recognition. Design of automatic recognition system for large vocabulary offline handwritten Arabic words using machine learning approaches.
The design of a machine that reads unconstrained words remains an unsolved problem. For example, automatic interpretation of handwritten documents by a computer is still under research. Most systems attempt to segment words into letters and read words one character at a time. However, segmenting handwritten words is very difficult. To avoid this, words are treated as a whole. This research investigates a number of features computed from whole words for the recognition of handwritten words in particular. Arabic text classification and recognition is a complicated process compared to Latin and Chinese text recognition. This is due to the cursive nature of Arabic text.
The work presented in this thesis addresses word-based recognition of handwritten Arabic script. The work is divided into three main stages that together provide a recognition system. The first stage is pre-processing, which applies efficient methods that are essential for automatic recognition of handwritten documents. In this stage, techniques for detecting the baseline and segmenting words in handwritten Arabic text are presented. Connected components are then extracted, and the distances between different components are analyzed. The statistical distribution of these distances is then obtained to determine an optimal threshold for word segmentation. The second stage is feature extraction. This stage uses the normalized images to extract features that are essential for recognizing the images. Various methods of feature extraction are implemented and examined. The third and final stage is classification. Various classifiers are used, such as the k-nearest neighbour classifier (k-NN), a neural network classifier (NN), hidden Markov models (HMMs), and the Dynamic Bayesian Network (DBN). To test this concept, the particular pattern recognition problem studied is the classification of 32492 words using the IFN/ENIT database. The results were promising in terms of improved baseline detection and word segmentation for further recognition. Moreover, several feature subsets were examined, and a best recognition performance of 81.5% was achieved
Off-line Arabic Handwriting Recognition System Using Fast Wavelet Transform
In this research, an off-line handwriting recognition system for the Arabic alphabet is
introduced. The system contains three main stages: preprocessing, segmentation, and
recognition. In the preprocessing stage, the Radon transform was used in the design
of algorithms for page, line, and word skew correction as well as for word slant
correction. In the segmentation stage, a Hough transform approach was used for line
extraction. For line-to-word and word-to-character segmentation, a statistical method
using a mathematical representation of the binary images of lines and words was used.
Unlike most current handwriting recognition systems, our system simulates the
human mechanism for image recognition, where images are encoded and saved in
memory as groups according to their similarity to each other. Characters are
decomposed into coefficient vectors using the fast wavelet transform; the vectors
that represent a character in its different possible shapes are then saved as groups, with one
representative for each group. Recognition is achieved by comparing the vector of
the character to be recognized with the group representatives.
Experiments showed that the proposed system is able to achieve the recognition task
with 90.26% accuracy. The system needs at most 3.41 seconds to recognize a
single character in a text of 15 lines where each line has 10 words on average
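
The matching step described in the abstract above (wavelet coefficient vectors compared against one representative per group) could be sketched roughly as follows. This is only an illustration and not the authors' implementation: the abstract does not name the wavelet, so an orthonormal Haar transform is assumed here, and the group labels, toy 8-sample signals, and Euclidean distance are all hypothetical choices.

```python
import math


def haar_transform(signal):
    """Full 1-D orthonormal Haar decomposition (assumed wavelet).

    The input length must be a power of two.
    """
    coeffs = [float(x) for x in signal]
    n = len(coeffs)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    while n > 1:
        half = n // 2
        # Pairwise scaled sums (approximation) and differences (detail).
        averages = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2)
                    for i in range(half)]
        details = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2)
                   for i in range(half)]
        coeffs[:n] = averages + details
        n = half  # recurse on the approximation part only
    return coeffs


def nearest_representative(vector, representatives):
    """Return the label of the representative closest to `vector`."""
    return min(representatives,
               key=lambda label: math.dist(vector, representatives[label]))


# Hypothetical group representatives built from toy 8-sample profiles.
representatives = {
    "alef": haar_transform([0, 1, 1, 0, 0, 1, 1, 0]),
    "ba": haar_transform([1, 1, 0, 0, 0, 0, 1, 1]),
}

# A slightly distorted version of the first profile.
query = haar_transform([0, 1, 1, 0, 0, 1, 0.8, 0])
label = nearest_representative(query, representatives)
print(label)
```

Because the query is compared only with one representative per group rather than with every stored shape, the number of distance computations grows with the number of groups, not with the number of training samples.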