Neural Computing for Online Arabic Handwriting Character Recognition using Hard Stroke Features Mining
Online Arabic cursive character recognition is still a big challenge due to
the existing complexities including Arabic cursive script styles, writing
speed, writer mood and so forth. Due to these unavoidable constraints, the
accuracy of online Arabic character recognition is still low and leaves room
for improvement. In this research, an enhanced method of detecting the desired
critical points from vertical and horizontal direction-length of handwriting
stroke features of online Arabic script recognition is proposed. Each extracted
stroke feature divides every isolated character into meaningful patterns
known as tokens. A minimum feature set is extracted from these tokens for
classification of characters using a multilayer perceptron with a
back-propagation learning algorithm and modified sigmoid function-based
activation function. In this work, two milestones are achieved: first, attaining
a fixed number of tokens; second, minimizing the number of the most repetitive
tokens. For experiments, handwritten Arabic characters are selected from the
OHASD benchmark dataset to test and evaluate the proposed method. The proposed
method achieves an average accuracy of 98.6%, comparable to state-of-the-art
character recognition techniques.
Comment: 16 pages
A Framework for On-Line Devanagari Handwritten Character Recognition
The main challenge in on-line handwritten character recognition in Indian
languages is the large size of the character set, the close similarity between
different characters in the script, and the huge variation in writing style. In
this paper we propose a framework for on-line handwritten script recognition
taking cues from speech signal processing literature. The framework is based on
identifying strokes, which in turn lead to recognition of handwritten on-line
characters rather than the conventional character identification. Though the
framework is described for Devanagari script, the framework is general and can
be applied to any language.
The proposed platform consists of pre-processing, feature extraction,
recognition and post-processing, like conventional character recognition but
applied to strokes. On-line Devanagari character recognition reduces to
recognizing one of 69 primitives, and recognition of a character is performed
by recognizing a sequence of such primitives. We further show the impact of
noise removal on on-line raw data, which is usually noisy. The use of Fuzzy
Directional Features to enhance the accuracy of stroke recognition is also
described. The recognition results are compared with commonly used directional
features in literature using several classifiers.
Comment: 29 pages
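The "commonly used directional features" mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a stroke is a list of (x, y) pen points in writing order and bins each pen movement into one of eight compass directions. The function name `directional_histogram` and the choice of eight bins are illustrative assumptions; this is not the authors' Fuzzy Directional Feature.

```python
import math


def directional_histogram(stroke, bins=8):
    """Return a normalized histogram of segment directions for one stroke.

    stroke: list of (x, y) pen points in writing order (assumed input format).
    bins:   number of equal angular sectors; bin 0 is centred on the +x axis.
    """
    hist = [0.0] * bins
    width = 2 * math.pi / bins
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # repeated point: no direction to record
        angle = math.atan2(dy, dx) % (2 * math.pi)
        # Shift by half a sector so each bin is centred on its direction.
        idx = int((angle + width / 2) // width) % bins
        hist[idx] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


if __name__ == "__main__":
    # Two steps to the right, then one step up.
    print(directional_histogram([(0, 0), (1, 0), (2, 0), (2, 1)]))
```

A hard-binned histogram like this assigns each segment to exactly one direction. A fuzzy variant would presumably spread each segment's weight over neighbouring bins, but the abstract does not specify how.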
A Medial Axis Based Thinning Strategy for Character Images
Thinning of character images is a big challenge. Removal of strokes or
deformities in thinning is a difficult problem. In this paper, we have proposed
a medial axis based thinning strategy used for performing skeletonization of
printed and handwritten character images. In this method, we have used shape
characteristics of text to obtain a skeleton that is nearly the same as the true
character shape. This approach helps to preserve the local features and true
shape of the character images. The proposed algorithm produces a thin skeleton
one pixel wide. As a by-product of our thinning approach, the skeleton also gets
segmented into strokes in vector form. Hence further stroke segmentation is not
required. Experiments are done on printed English and Bengali characters, and we
obtain fewer spurious branches compared with other thinning methods, without any
post-processing.
Comment: 6 pages, 5 figures. In proceedings of the second National Conference
on Computer Vision, Pattern Recognition, Image Processing and Graphics
(NCVPRIPG), pp. 67-72, Jaipur, India, 201
Online Handwritten Devanagari Stroke Recognition Using Extended Directional Features
This paper describes a new feature set, called the extended directional
features (EDF) for use in the recognition of online handwritten strokes. We use
EDF specifically to recognize strokes that form a basis for producing
Devanagari script, which is the most widely used Indian language script. It
should be noted that stroke recognition in handwritten script is equivalent to
phoneme recognition in speech signals and is generally very poor and of the
order of 20% for singing voice. Experiments are conducted for the automatic
recognition of isolated handwritten strokes. Initially we describe the proposed
feature set, namely EDF and then show how this feature can be effectively
utilized for writer independent script recognition through stroke recognition.
Experimental results show that the extended directional feature set performs
well, with 65+% stroke-level recognition accuracy on a writer-independent
data set.
Comment: 8th International Conference on Signal Processing and Communication
Systems 15 - 17 December 2014, Gold Coast, Australia
Indic Handwritten Script Identification using Offline-Online Multimodal Deep Network
In this paper, we propose a novel approach of word-level Indic script
identification using only character-level data in training stage. The
advantages of using character level data for training have been outlined in
section I. Our method uses a multimodal deep network which takes both offline
and online modality of the data as input in order to explore the information
from both the modalities jointly for script identification task. We take
handwritten data in either modality as input and the opposite modality is
generated through intermodality conversion. Thereafter, we feed this
offline-online modality pair to our network. Hence, along with the advantage of
utilizing information from both the modalities, it can work as a single
framework for both offline and online script identification simultaneously
which alleviates the need for designing two separate script identification
modules for individual modality. One more major contribution is that we propose
a novel conditional multimodal fusion scheme to combine the information from
offline and online modality which takes into account the real origin of the
data being fed to our network and thus it combines adaptively. An exhaustive
experiment has been done on a data set consisting of English and six Indic
scripts. Our proposed framework outperforms different frameworks based
on traditional classifiers with handcrafted features, as well as deep learning
based methods, by a clear margin. Extensive experiments show that using only
character level training data can achieve state-of-the-art performance similar to
that obtained with traditional training using word level data in our framework.
Comment: Accepted in Information Fusion, Elsevier
Recurrent neural networks based Indic word-wise script identification using character-wise training
This paper presents a novel methodology of Indic handwritten script
recognition using Recurrent Neural Networks and addresses the problem of script
recognition in poor data scenarios, such as when only character level online
data is available. It is based on the hypothesis that curves of online
character data comprise sufficient information for prediction at the word
level. Online character data is used to train RNNs using BLSTM architecture
which are then used to make predictions of online word level data. These
prediction results on the test set are on par with prediction results of models
trained with online word data, while the training of the character level model
is much less data intensive and takes much less time. Performance for
binary-script models and then 5 Indic script models is reported, along with
a comparison with HMM models. The system is extended for offline data prediction.
Raw offline data lacks the temporal information available in online data and
required for prediction using models trained with online data. To overcome
this, stroke recovery is implemented and the strokes are utilized for
predicting using the online character level models. The performance on
character and word level offline data is reported.
Comment: Version accepted at ICPRS 201
Handwritten Chinese Font Generation with Collaborative Stroke Refinement
Automatic character generation is an appealing solution for new typeface
design, especially for Chinese typefaces including over 3700 most commonly-used
characters. This task has two main pain points: (i) handwritten characters are
usually associated with thin strokes carrying little information and complex
structures, which are error prone during deformation; (ii) thousands of
characters with various shapes must be synthesized based on a few manually
designed characters. To solve those issues, we propose a novel
convolutional-neural-network-based model with three main techniques:
collaborative stroke refinement, using collaborative training strategy to
recover the missing or broken strokes; online zoom-augmentation, taking the
advantage of the content-reuse phenomenon to reduce the size of training set;
and adaptive pre-deformation, standardizing and aligning the characters. The
proposed model needs only 750 paired training samples; no pre-trained network,
extra dataset resource or labels are needed. Experimental results show that the
proposed method significantly outperforms the state-of-the-art methods under
the practical restriction on handwritten font synthesis.
Comment: 8 pages(exclude reference
DeepWriting: Making Digital Ink Editable via Deep Generative Modeling
Digital ink promises to combine the flexibility and aesthetics of handwriting
and the ability to process, search and edit digital text. Character recognition
converts handwritten text into a digital representation, albeit at the cost of
losing personalized appearance due to the technical difficulties of separating
the interwoven components of content and style. In this paper, we propose a
novel generative neural network architecture that is capable of disentangling
style from content and thus making digital ink editable. Our model can
synthesize arbitrary text, while giving users control over the visual
appearance (style). This allows, for example, style transfer without changing
the content, editing of digital ink at the word level, and other application
scenarios such as spell-checking and correction of handwritten text. We
furthermore contribute a new dataset of handwritten text with fine-grained
annotations at the character level and report results from an initial user
evaluation
Handwritten Character Recognition In Malayalam Scripts- A Review
Handwritten character recognition is one of the most challenging and ongoing
areas of research in the field of pattern recognition. HCR research is mature
for foreign languages like Chinese and Japanese but the problem is much more
complex for Indian languages. The problem becomes even more complicated for
South Indian languages due to their large character sets and the presence of
vowel modifiers and compound characters. This paper provides an overview of
important contributions and advances in offline as well as online handwritten
character recognition of Malayalam scripts.
Comment: 11 pages, 4 figures, 2 tables
Stroke extraction for offline handwritten mathematical expression recognition
Offline handwritten mathematical expression recognition is often considered
much harder than its online counterpart due to the absence of temporal
information. In order to take advantage of the more mature methods for online
recognition and save resources, an oversegmentation approach is proposed to
recover strokes from textual bitmap images automatically. The proposed
algorithm first breaks down the skeleton of a binarized image into junctions
and segments; the segments are then merged to form strokes; finally, stroke
order is normalized by using recursive projection and topological sort. Good offline
accuracy was obtained in combination with ordinary online recognizers, which
are not specially designed for extracted strokes. Given a ready-made
state-of-the-art online handwritten mathematical expression recognizer, the
proposed procedure correctly recognized 58.22%, 65.65%, and 65.22% of the
offline formulas rendered from the datasets of the Competitions on Recognition
of Online Handwritten Mathematical Expressions (CROHME) in 2014, 2016, and 2019
respectively. Furthermore, given a trainable online recognition system,
retraining it with extracted strokes resulted in an offline recognizer with the
same level of accuracy. On the other hand, the speed of the entire pipeline was
fast enough to facilitate on-device recognition on mobile phones with limited
resources. To conclude, stroke extraction provides an attractive way to build
optical character recognition software.
Comment: 22 pages, 7 figures
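The topological-sort step named in the abstract above can be sketched as follows, under loud assumptions. Each recovered stroke is reduced to a hypothetical horizontal extent `(x_min, x_max)`, and a stroke is assumed to precede another when it lies entirely to its left. This precedence rule and the function name `order_strokes` are illustrative only; the abstract does not describe the authors' actual rule, and the recursive projection step is not reproduced here. The sketch uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def order_strokes(extents):
    """Return stroke ids in an order consistent with left-to-right precedence.

    extents: dict mapping stroke id -> (x_min, x_max). This is an assumed
             simplification of a stroke, not the paper's representation.
    """
    sorter = TopologicalSorter()
    for a, (a_min, _a_max) in extents.items():
        sorter.add(a)  # register the stroke even if it has no predecessors
        for b, (_b_min, b_max) in extents.items():
            if b != a and b_max < a_min:
                sorter.add(a, b)  # b lies entirely left of a, so b comes first
    return list(sorter.static_order())


if __name__ == "__main__":
    # "a" and "b" overlap horizontally; "c" lies to the right of both.
    print(order_strokes({"c": (10, 12), "a": (0, 3), "b": (2, 6)}))
```

Strokes whose extents overlap are left unconstrained relative to each other, so any order that respects the recorded precedences is a valid result.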