The State of the Art Recognize in Arabic Script through Combination of Online and Offline
Handwriting recognition refers to the identification of written characters.
Handwriting recognition has become an active research area in recent years
because it eases access to computers. This paper primarily discusses
on-line and off-line handwriting recognition methods for Arabic words, which are
widely used by people across the Middle East and North Africa. On-line
recognition of Arabic words is a very challenging task due to the script's
cursive nature. Because the characters in Arabic script are connected,
segmenting the script is very difficult. In this paper we introduce an Arabic
script multiple classifier system for recognizing notes written on a Starboard.
This Arabic script multiple classifier system combines one off-line and on-line
handwriting recognition systems. The Arabic script recognizers are all based on
Hidden Markov Models but vary in the way of preprocessing and normalization. To
combine the output sequences of the recognizers, we incrementally
align the word sequences using a standard string matching algorithm. With this
combination, we increased system performance by about 3% over the best
individual character recognizer. The proposed technique is also a necessary
step towards character recognition, person identification and personality
determination, where input data is processed from all perspectives.
Comment: 7 pages, 6 figures, 2 tables. arXiv admin note: text overlap with
arXiv:1110.1488 by other author
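The combination step in the abstract above (incrementally aligning the recognizers' word sequences with a string matching algorithm) can be illustrated with a minimal sketch. The abstract does not say which matching algorithm is used or how the final word is chosen once the sequences are aligned, so everything here is an assumption: Python's standard `difflib.SequenceMatcher` stands in for the string matcher, a simple majority vote picks the output word, and the names `combine`, `add_hypothesis`, `consensus` and `NULL` are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher

NULL = ""  # placeholder vote meaning "no word at this position" (assumption)

def consensus(slots):
    """Pick the most frequent vote in each slot (ties go to the earliest vote)."""
    return [Counter(slot).most_common(1)[0][0] for slot in slots]

def add_hypothesis(slots, words, n_seen):
    """Align one recognizer output against the current consensus and merge it."""
    ref = consensus(slots)
    merged = []
    matcher = SequenceMatcher(a=ref, b=words, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("equal", "replace"):
            # Pair positions one to one; any leftovers are handled below.
            n = min(i2 - i1, j2 - j1)
            for k in range(n):
                merged.append(slots[i1 + k] + [words[j1 + k]])
        else:
            n = 0
        # Words only in the existing alignment: the new recognizer votes NULL.
        for i in range(i1 + n, i2):
            merged.append(slots[i] + [NULL])
        # Words only in the new output: earlier recognizers vote NULL.
        for j in range(j1 + n, j2):
            merged.append([NULL] * n_seen + [words[j]])
    return merged

def combine(hypotheses):
    """Incrementally align all word sequences, then take a majority vote."""
    slots = [[word] for word in hypotheses[0]]
    for n_seen, words in enumerate(hypotheses[1:], start=1):
        slots = add_hypothesis(slots, words, n_seen)
    return [word for word in consensus(slots) if word != NULL]

outputs = [
    ["the", "cat", "sat"],          # hypothetical output of recognizer 1
    ["the", "cat", "sat", "down"],  # hypothetical output of recognizer 2
    ["the", "bat", "sat"],          # hypothetical output of recognizer 3
]
print(combine(outputs))
```

A real system would more plausibly weight each vote by recognizer confidence; plain voting is used here only to keep the alignment idea visible.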
Neural Computing for Online Arabic Handwriting Character Recognition using Hard Stroke Features Mining
Online Arabic cursive character recognition is still a big challenge due to
the existing complexities including Arabic cursive script styles, writing
speed, writer mood and so forth. Due to these unavoidable constraints, the
accuracy of online Arabic character recognition is still low and leaves room
for improvement. In this research, an enhanced method is proposed for detecting
the desired critical points from the vertical and horizontal direction-length
features of handwriting strokes in online Arabic script recognition. Each extracted
stroke feature divides every isolated character into meaningful patterns
known as tokens. A minimum feature set is extracted from these tokens for
classification of characters using a multilayer perceptron with a
back-propagation learning algorithm and a modified sigmoid-based
activation function. In this work, two milestones are achieved: first, attaining
a fixed number of tokens; second, minimizing the number of the most repetitive
tokens. For experiments, handwritten Arabic characters are selected from the
OHASD benchmark dataset to test and evaluate the proposed method. The proposed
method achieves an average accuracy of 98.6%, comparable to state-of-the-art
character recognition techniques.
Comment: 16 page
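As a rough illustration of the idea of splitting a stroke into tokens at direction changes, the sketch below groups consecutive pen movements by their dominant horizontal or vertical direction and accumulates their length. This is not the paper's algorithm: the abstract does not define its critical points or tokens precisely, so the function name `direction_length_tokens`, the four direction labels, and the screen-style coordinates (y growing downward) are all assumptions made for the example.

```python
def direction_length_tokens(points):
    """Group a pen stroke into (direction, length) tokens.

    A new token starts wherever the dominant direction changes, which is one
    simple way to think of a "critical point" (an assumption, see lead-in).
    """
    tokens = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # repeated sample, the pen did not move
        if abs(dx) >= abs(dy):
            direction, length = ("R" if dx > 0 else "L"), abs(dx)
        else:
            direction, length = ("D" if dy > 0 else "U"), abs(dy)
        if tokens and tokens[-1][0] == direction:
            # Same direction as the previous segment: extend the current token.
            tokens[-1] = (direction, tokens[-1][1] + length)
        else:
            tokens.append((direction, length))
    return tokens

stroke = [(0, 0), (2, 0), (5, 1), (5, 4), (3, 4)]  # hypothetical pen samples
print(direction_length_tokens(stroke))
```

The resulting token list could then be padded or truncated to a fixed length before being fed to a classifier, in the spirit of the fixed token count the abstract mentions.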
A multi-stream hmm approach to offline handwritten arabic word recognition
In this paper we present a new approach for a cursive Arabic text recognition
system. The objective is to propose an analytical methodology for offline
recognition of handwritten Arabic that allows rapid implementation. The first
part of the recognition system is the preprocessing phase, which prepares the
data. The system then extracts a set of simple statistical
features by two methods: from a window that slides along the text line
from right to left, and by the VH2D approach (which consists of projecting every
character on the abscissa, on the ordinate and on the 45° and 135° diagonals). It
then feeds the resulting feature vectors to Hidden Markov Models (HMMs) and
combines the two HMMs using a multi-stream approach.
Comment: 12 pages, 13 figures, International Journal on Natural Language
Computing (IJNLC), ISSN: 2278-1307 [Online]; 2319-4111 [Print], August 2013, Volume
2, Number
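The two feature extraction methods named in the abstract above can be sketched on a small binary image. This is a minimal sketch under stated assumptions rather than the authors' implementation: the image is a list of rows of 0/1 pixels, the "simple statistical feature" per window is taken to be ink density (the abstract does not say which statistics are used), and the function names and the mapping of the two diagonal directions to 45° and 135° are choices made for this example.

```python
def vh2d_projections(image):
    """Project a binary character image on four axes by counting ink pixels."""
    rows, cols = len(image), len(image[0])
    vertical = [0] * cols               # projection on the abscissa (per column)
    horizontal = [0] * rows             # projection on the ordinate (per row)
    diag_45 = [0] * (rows + cols - 1)   # pixels sharing r + c (rising diagonals)
    diag_135 = [0] * (rows + cols - 1)  # pixels sharing r - c (falling diagonals)
    for r, row in enumerate(image):
        for c, pixel in enumerate(row):
            if pixel:
                vertical[c] += 1
                horizontal[r] += 1
                diag_45[r + c] += 1
                diag_135[r - c + cols - 1] += 1
    return vertical, horizontal, diag_45, diag_135

def sliding_window_densities(image, width):
    """Ink density of each window, scanning the text line from right to left."""
    rows, cols = len(image), len(image[0])
    features = []
    for right in range(cols, 0, -width):
        left = max(0, right - width)
        ink = sum(image[r][c] for r in range(rows) for c in range(left, right))
        features.append(ink / (rows * (right - left)))
    return features

glyph = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]  # hypothetical 3x3 character: a single falling diagonal
print(vh2d_projections(glyph))
print(sliding_window_densities([[1, 1, 0, 0], [1, 1, 0, 0]], width=2))
```

Each method yields its own feature stream, which matches the abstract's description of two HMMs that are later combined.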
Large Vocabulary Arabic Online Handwriting Recognition System
Arabic handwriting is a consonantal and cursive writing. The analysis of
Arabic script is further complicated due to obligatory dots/strokes that are
placed above or below most letters and usually written delayed in order. Due to
ambiguities and diversities of writing styles, recognition systems are
generally based on a set of possible words called lexicon. When the lexicon is
small, recognition accuracy is more important as the recognition time is
minimal. On the other hand, recognition speed as well as the accuracy are both
critical when handling large lexicons. Arabic is rich in morphology and syntax
which makes its lexicon large. Therefore, a practical online handwriting
recognition system should be able to handle a large lexicon with reasonable
performance in terms of both accuracy and time. In this paper, we introduce a
fully-fledged Hidden Markov Model (HMM) based system for Arabic online
handwriting recognition that provides solutions for most of the difficulties
inherent in recognizing the Arabic script. A new preprocessing technique for
handling the delayed strokes is introduced. We use advanced modeling techniques
for building our recognition system from the training data to provide a more
detailed representation of the differences between the writing units, minimize
the variance between writers in the training data and obtain a better
representation of the feature space. System results are enhanced using an
additional post-processing step with a higher order language model and
cross-word HMM models. The system performance is evaluated using two different
databases covering small and large lexicons. Our system outperforms the
state-of-the-art systems for the small lexicon database. Furthermore, it shows
promising results (accuracy and time) when supporting a large lexicon, with the
possibility of adapting the models for specific writers to get even better
results.
Comment: Preprint submitted to Pattern Analysis and Applications Journa
A Study of Sindhi Related and Arabic Script Adapted languages Recognition
A large number of publications are available on Optical Character
Recognition (OCR). Significant research and many articles exist for
the Latin, Chinese and Japanese scripts. Arabic script is also one of the mature
scripts from an OCR perspective. However, the adapted languages that share the
Arabic script or its extended characters still lack OCR systems. In this
paper we present the efforts of researchers on Arabic and its related and
adapted languages. This survey is organized into sections: the
introduction is followed by the properties of the Sindhi language, and then the
OCR process techniques and methods used by various researchers are presented.
The last section is dedicated to future work and the conclusion.
Comment: 11 pages, 8 Figures, Sindh Univ. Res. Jour. (Sci. Ser.
Text Line Segmentation of Historical Documents: a Survey
There is a huge amount of historical documents in libraries and in various
National Archives that have not been exploited electronically. Although
automatic reading of complete pages remains, in most cases, a long-term
objective, tasks such as word spotting, text/image alignment, authentication
and extraction of specific fields are in use today. For all these tasks, a
major step is document segmentation into text lines. Because of the low quality
and the complexity of these documents (background noise, artifacts due to
aging, interfering lines), automatic text line segmentation remains an open
research field. The objective of this paper is to present a survey of existing
methods, developed during the last decade, and dedicated to documents of
historical interest.
Comment: 25 pages, submitted version, To appear in International Journal on
Document Analysis and Recognition, On line version available at
http://www.springerlink.com/content/k2813176280456k3
A review on handwritten character and numeral recognition for Roman, Arabic, Chinese and Indian scripts
Handwritten character recognition (HCR) has been the subject of intensive
research for almost four decades. This research has covered some
popular scripts such as Roman, Arabic, Chinese and Indian. In this paper we
present a review of HCR work on these four popular scripts. We summarize
most of the papers published from 2005 to the present and analyze the various
methods for creating a robust HCR system. We also suggest some future directions
for research on HCR.
Comment: 8 page
Large Scale Font Independent Urdu Text Recognition System
OCR algorithms have received a significant improvement in performance
recently, mainly due to the increase in the capabilities of artificial
intelligence algorithms. However, this advancement is not evenly distributed
over all languages. Urdu is among the languages which did not receive much
attention, especially in the font independent perspective. There exists no
automated system that can reliably recognize printed Urdu text in images and
videos across different fonts. To help bridge this gap, we have developed
Qaida, a large scale data set with 256 fonts, and a complete Urdu lexicon. We
have also developed a Convolutional Neural Network (CNN) based classification
model which can recognize Urdu ligatures with 84.2% accuracy. Moreover, we
demonstrate that our recognition network can not only recognize the text in the
fonts it is trained on but can also reliably recognize text in unseen (new)
fonts. To this end, this paper makes the following contributions: (i) we introduce
a large scale, multiple fonts based data set for printed Urdu text
recognition; (ii) we have designed, trained and evaluated a CNN based model for
Urdu text recognition; (iii) we experiment with incremental learning methods to
produce state-of-the-art results for Urdu text recognition. All the experimental
choices were thoroughly validated via detailed empirical analysis. We believe
that this study can serve as the basis for further improvement in the
performance of font independent Urdu OCR systems.
Online Decision Process based on Machine Learning Techniques
This paper analyses the role of the internet in marketing and its influence on
the business decision-making process. It explains how decision makers collect
a variety of information about customers through the internet and analyse this
data to enhance the processes and the overall performance of the
organization. In addition, it explains how the departments in an organization
collaborate and use this information through data warehousing. Accordingly, a business
intelligence model is proposed for web segmentation that divides potential
markets or consumers into specific groups and analyses them for better decision
making. The model further aims to promote the significance of web opportunities
in directing the web division and gathering client information. It is shown
how a marketing information system, including the analysis of customers,
equipment and procedures, helps decision makers make better decisions.
Recurrent Neural Network Method in Arabic Words Recognition System
The recognition of unconstrained handwriting continues to be a difficult task
for computers despite active research for several decades. This is because
handwritten text offers great challenges such as character and word
segmentation, character recognition, variation between handwriting styles,
different character size and no font constraints as well as the background
clarity. This paper primarily discusses online handwriting recognition
methods for Arabic words, which are widely used by people across the Middle
East and North Africa. Because the characters within Arabic words are connected,
segmenting an Arabic word is very difficult. We introduce a recurrent
neural network for online handwritten Arabic word recognition. The key
innovation is a recently introduced objective function for recurrent neural
networks known as connectionist temporal classification. The system consists of an
advanced recurrent neural network with an output layer designed for sequence
labeling, partially combined with a probabilistic language model. Experimental
results show recognition rates of about 79% on unconstrained Arabic words,
which is significantly higher than the roughly 70% achieved by a previously
developed hidden Markov model based recognition system.
Comment: 6 Pages, 5 Figures, Vol. 3, Issue 11, pages 43-4
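To make the connectionist temporal classification (CTC) output layer mentioned above more concrete, the sketch below shows the standard best-path decoding rule used with CTC-trained networks: take the highest-scoring label in each frame, collapse consecutive repeats, then drop the blank label. It covers decoding only, not the training objective, and it leaves out the language model the abstract mentions. The function name `ctc_greedy_decode`, the label set and the frame scores are hypothetical values chosen for illustration.

```python
def ctc_greedy_decode(frame_scores, labels, blank=0):
    """Best-path CTC decoding: argmax per frame, merge repeats, remove blanks."""
    best_path = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in frame_scores]
    decoded = []
    previous = None
    for index in best_path:
        # Emit a label only when it differs from the previous frame's label
        # and is not the blank; a blank between repeats keeps both copies.
        if index != previous and index != blank:
            decoded.append(labels[index])
        previous = index
    return decoded

labels = ["<blank>", "a", "b"]  # hypothetical label set; index 0 is the blank
frames = [
    [0.10, 0.80, 0.10],  # a
    [0.20, 0.70, 0.10],  # a (repeat, merged with the frame before)
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.80, 0.10],  # a (kept, because a blank separates it)
    [0.10, 0.20, 0.70],  # b
    [0.10, 0.10, 0.80],  # b (repeat, merged)
    [0.90, 0.05, 0.05],  # blank
]
print(ctc_greedy_decode(frames, labels))
```

Because the blank label separates genuine repeats, the network can label an unsegmented input sequence directly, which fits the abstract's point that segmenting connected Arabic characters is difficult.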