2,755 research outputs found
Indic Handwritten Script Identification using Offline-Online Multimodal Deep Network
In this paper, we propose a novel approach to word-level Indic script
identification using only character-level data in the training stage. The
advantages of using character-level data for training have been outlined in
section I. Our method uses a multimodal deep network which takes both the
offline and online modalities of the data as input in order to explore the
information from both modalities jointly for the script identification task. We take
handwritten data in either modality as input and the opposite modality is
generated through intermodality conversion. Thereafter, we feed this
offline-online modality pair to our network. Hence, along with the advantage of
utilizing information from both the modalities, it can work as a single
framework for both offline and online script identification simultaneously
which alleviates the need for designing two separate script identification
modules for individual modality. One more major contribution is that we propose
a novel conditional multimodal fusion scheme to combine the information from
offline and online modality which takes into account the real origin of the
data being fed to our network and thus it combines adaptively. An exhaustive
experiment has been done on a data set consisting of English and six Indic
scripts. Our proposed framework outperforms, by a clear margin, frameworks
based on traditional classifiers with handcrafted features as well as deep
learning based methods. Extensive experiments show that using only
character-level training data can achieve state-of-the-art performance similar
to that obtained with traditional training using word-level data in our framework.
Comment: Accepted in Information Fusion, Elsevier
Neural Computing for Online Arabic Handwriting Character Recognition using Hard Stroke Features Mining
Online Arabic cursive character recognition is still a big challenge due to
the existing complexities including Arabic cursive script styles, writing
speed, writer mood and so forth. Due to these unavoidable constraints, the
accuracy of online Arabic character recognition is still low and leaves room
for improvement. In this research, an enhanced method of detecting the desired
critical points from vertical and horizontal direction-length of handwriting
stroke features of online Arabic script recognition is proposed. Each extracted
stroke feature divides every isolated character into some meaningful pattern
known as tokens. A minimum feature set is extracted from these tokens for
classification of characters using a multilayer perceptron with a
back-propagation learning algorithm and modified sigmoid function-based
activation function. In this work, two milestones are achieved: first,
attaining a fixed number of tokens; second, minimizing the number of the most
repetitive tokens. For experiments, handwritten Arabic characters are selected
from the OHASD benchmark dataset to test and evaluate the proposed method. The
proposed method achieves an average accuracy of 98.6%, comparable to
state-of-the-art character recognition techniques.
Comment: 16 pages
Pay Attention to What You Read: Non-recurrent Handwritten Text-Line Recognition
The advent of recurrent neural networks for handwriting recognition marked an
important milestone reaching impressive recognition accuracies despite the
great variability that we observe across different writing styles. Sequential
architectures are a perfect fit to model text lines, not only because of the
inherent temporal aspect of text, but also to learn probability distributions
over sequences of characters and words. However, using such recurrent paradigms
comes at a cost at the training stage, since their sequential pipelines prevent
parallelization. In this work, we introduce a non-recurrent approach to
recognize handwritten text by the use of transformer models. We propose a novel
method that bypasses any recurrence. By using multi-head self-attention layers
both at the visual and textual stages, we are able to tackle character
recognition as well as to learn language-related dependencies of the character
sequences to be decoded. Our model is unconstrained to any predefined
vocabulary, being able to recognize out-of-vocabulary words, i.e. words that do
not appear in the training vocabulary. We significantly advance over prior art
and demonstrate that satisfactory recognition accuracies are yielded even in
few-shot learning scenarios.
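As a rough illustration of the self-attention operation the abstract above refers to, the sketch below implements single-head scaled dot-product attention in plain Python. It is not the authors' model: the function names, the single head, and the absence of learned projection matrices are simplifying assumptions made here. It only shows why the computation has no recurrence: each output position is computed from all inputs independently of the other outputs.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Single-head scaled dot-product attention (illustrative sketch).

    queries, keys, values: lists of equal-length float vectors.
    Returns (outputs, weights); weights[i][j] is how much position i
    attends to position j.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    # Each query is handled independently, so this loop could run in parallel.
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        outputs.append([
            sum(wj * v[c] for wj, v in zip(w, values))
            for c in range(len(values[0]))
        ])
    return outputs, weights


# Toy input: three positions with two features each.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attention_weights = self_attention(features, features, features)
```

A multi-head layer would apply several such heads to differently projected copies of the input and concatenate the results.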
Cursive Multilingual Characters Recognition Based on Hard Geometric Features
Segmentation and recognition of cursive multilingual characters in the
Arabic, Persian, and Urdu languages have attracted researchers from academia
and industry. However, despite several decades of research, multilingual
character classification accuracy is still not up to the mark. This paper
presents an automated approach for multilingual character segmentation and
recognition. The proposed methodology explores characters based on their
geometric features. However, due to uncertainty and the lack of dictionary
support, a few characters are over-segmented. To improve the performance of the
proposed methodology, a BPN is trained with a large number of segmentation
points for cursive multilingual characters. The trained BPN separates
incorrectly segmented points efficiently, rapidly improving character
recognition accuracy. For fair comparison, only a benchmark dataset is used.
Comment: 1
A Review of Research on Devnagari Character Recognition
English Character Recognition (CR) has been extensively studied in the last
half century and has progressed to a level sufficient to produce
technology-driven applications. But the same is not the case for Indian
languages, which are complicated in terms of structure and computations. Rapidly growing
computational power may enable the implementation of Indic CR methodologies.
Digital document processing is gaining popularity for application to office and
library automation, bank and postal services, publishing houses and
communication technology. Devnagari being the national language of India,
spoken by more than 500 million people, should be given special attention so
that document retrieval and analysis of rich ancient and modern Indian
literature can be effectively done. This article is intended to serve as a
guide and update for readers working in the Devnagari Optical Character
Recognition (DOCR) area. An overview of DOCR systems is presented and the
available DOCR techniques are reviewed. The current status of DOCR is discussed
and directions for future research are suggested.
Comment: 8 pages, 1 Figure, 8 Tables, Journal paper
Designing Kernel Scheme for Classifiers Fusion
In this paper, we propose a special fusion method for combining ensembles of
base classifiers utilizing new neural networks in order to improve overall
efficiency of classification. Ensembles are usually designed such that each
classifier is trained independently and decision fusion is performed as a
final procedure; in this method, we aim to make the fusion process more
adaptive and efficient. This new combiner, called Neural Network Kernel Least
Mean Square, attempts to fuse the outputs of the ensembles of
classifiers. The proposed Neural Network has some special properties such as
Kernel abilities, Least Mean Square features, easy learning over variants of
patterns and traditional neuron capabilities. Neural Network Kernel Least Mean
Square is a special neuron which is trained with Kernel Least Mean Square
properties. This new neuron is used as a classifiers combiner to fuse outputs
of base neural network classifiers. Performance of this method is analyzed and
compared with other fusion methods. The analysis shows higher performance
of our new method compared with the others.
Comment: 7 pages IEEE format, International Journal of Computer Science and
Information Security, IJCSIS November 2009, ISSN 1947 5500,
http://sites.google.com/site/ijcsis
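The abstract above does not spell out the combiner, so the following is only a minimal sketch of the standard Kernel Least Mean Square (KLMS) update used as a score-level combiner, under assumptions made here: a Gaussian kernel, targets of +1/-1, and a hypothetical class name `KLMSCombiner`. Each training sample becomes a kernel center whose coefficient is the step size times the prediction error at that moment.

```python
import math


def gaussian_kernel(a, b, width=1.0):
    """Gaussian (RBF) kernel between two equal-length score vectors."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-squared_distance / (2.0 * width ** 2))


class KLMSCombiner:
    """Plain KLMS learner over base-classifier score vectors (hypothetical name)."""

    def __init__(self, step_size=0.5, width=1.0):
        self.step_size = step_size
        self.width = width
        self.centers = []  # stored input vectors
        self.coeffs = []   # step_size * error recorded for each center

    def predict(self, scores):
        # Kernel expansion: weighted sum of kernels to all stored centers.
        return sum(
            coeff * gaussian_kernel(scores, center, self.width)
            for coeff, center in zip(self.coeffs, self.centers)
        )

    def update(self, scores, target):
        # LMS step in kernel space: add a new center scaled by the current error.
        error = target - self.predict(scores)
        self.centers.append(list(scores))
        self.coeffs.append(self.step_size * error)
        return error


# Toy data (invented for illustration): outputs of three base classifiers
# and the desired fused label.
samples = [
    ([0.9, 0.8, 0.7], 1.0),
    ([0.1, 0.2, 0.3], -1.0),
    ([0.8, 0.9, 0.6], 1.0),
    ([0.2, 0.1, 0.4], -1.0),
]
combiner = KLMSCombiner()
errors = [combiner.update(scores, target) for scores, target in samples]
```

The sign of `combiner.predict(...)` can then serve as the fused decision; the step size and kernel width are tuning parameters chosen arbitrarily here.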
A review on handwritten character and numeral recognition for Roman, Arabic, Chinese and Indian scripts
Handwritten character recognition (HCR) has been researched intensively for
almost four decades. The research has covered some popular scripts such as
Roman, Arabic, Chinese and Indian. In this paper we
present a review of HCR work on these four popular scripts. We summarize
most of the papers published from 2005 onward and analyze the various
methods for creating a robust HCR system. We also suggest some future
directions of research on HCR.
Comment: 8 pages
Selective Distillation of Weakly Annotated GTD for Vision-based Slab Identification System
This paper proposes an algorithm for recognizing slab identification numbers
in factory scenes. In the development of a deep-learning based system, manual
labeling to make ground truth data (GTD) is an important but expensive task.
Furthermore, the quality of GTD is closely related to the performance of a
supervised learning algorithm. To reduce manual work in the labeling process,
we generated weakly annotated GTD by marking only character centroids. Whereas
bounding-boxes for characters require at least a drag-and-drop operation or two
clicks to annotate a character location, the weakly annotated GTD requires a
single click to record a character location. The main contribution of this
paper is on selective distillation to improve the quality of the weakly
annotated GTD. Because manual GTD are usually generated by many people, they
may contain personal bias or human error. To address this problem, the information
in manual GTD is integrated and refined by selective distillation. In the
process of selective distillation, a fully convolutional network is trained
using the weakly annotated GTD, and its prediction maps are selectively used to
revise locations and boundaries of semantic regions of characters in the
initial GTD. The modified GTD are used in the main training stage, and a
post-processing step is conducted to retrieve text information. Experiments were
thoroughly conducted on actual industry data collected at a steelmaking factory
to demonstrate the effectiveness of the proposed method.
Comment: 10 pages, 12 figures, submitted to a journal
Multiple models of Bayesian networks applied to offline recognition of Arabic handwritten city names
In this paper we address the problem of offline Arabic handwritten word
recognition. Offline recognition of handwritten words is a difficult task due
to the high variability and uncertainty of human writing. The majority of
recent systems are constrained by the size of the lexicon they must handle and
the number of writers. In this paper, we propose an approach for multi-writer
Arabic handwritten word recognition using multiple Bayesian networks. First,
we cut the image into several blocks. For each block, we compute a vector of
descriptors. Then, we use K-means to cluster the low-level features, including
Zernike and Hu moments. Finally, we apply four variants of Bayesian network
classifiers (Naïve Bayes, Tree Augmented Naïve Bayes (TAN), Forest
Augmented Naïve Bayes (FAN) and dynamic Bayesian network (DBN)) to classify
the whole image of a Tunisian city name. The results demonstrate that FAN and
DBN achieve good recognition rates.
Comment: arXiv admin note: substantial text overlap with arXiv:1204.167
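The block-wise description step in the abstract above can be sketched as follows. This is an illustrative assumption, not the authors' pipeline: the block size, the helper names, and the choice to compute only the first Hu moment invariant (phi1 = eta20 + eta02) are made here for brevity, and the K-means and Bayesian network stages are omitted.

```python
def split_into_blocks(image, block_h, block_w):
    """Cut a 2D list of pixel intensities into non-overlapping blocks, row by row."""
    blocks = []
    for top in range(0, len(image), block_h):
        for left in range(0, len(image[0]), block_w):
            blocks.append(
                [row[left:left + block_w] for row in image[top:top + block_h]]
            )
    return blocks


def hu_phi1(block):
    """First Hu moment invariant of a block: normalized mu20 + mu02."""
    m00 = sum(sum(row) for row in block)
    if m00 == 0:
        return 0.0  # empty block: no ink, descriptor defined as zero here
    # Intensity-weighted centroid.
    cx = sum(x * v for row in block for x, v in enumerate(row)) / m00
    cy = sum(y * v for y, row in enumerate(block) for v in row) / m00
    # Second-order central moments.
    mu20 = sum((x - cx) ** 2 * v for row in block for x, v in enumerate(row))
    mu02 = sum((y - cy) ** 2 * v for y, row in enumerate(block) for v in row)
    # For order 2, the normalization exponent is 1 + (2 / 2) = 2.
    return (mu20 + mu02) / m00 ** 2


# Toy 4x4 binary "image" (invented for illustration), cut into 2x2 blocks.
image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
blocks = split_into_blocks(image, 2, 2)
descriptors = [hu_phi1(block) for block in blocks]
```

In a fuller version each block would yield a vector of several moments, and those vectors would be the inputs to the clustering step.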
Evaluating Sequence-to-Sequence Models for Handwritten Text Recognition
Encoder-decoder models have become an effective approach for sequence
learning tasks like machine translation, image captioning and speech
recognition, but have yet to show competitive results for handwritten text
recognition. To this end, we propose an attention-based sequence-to-sequence
model. It combines a convolutional neural network as a generic feature
extractor with a recurrent neural network to encode both the visual
information, as well as the temporal context between characters in the input
image, and uses a separate recurrent neural network to decode the actual
character sequence. We make experimental comparisons between various attention
mechanisms and positional encodings, in order to find an appropriate alignment
between the input and output sequence. The model can be trained end-to-end and
the optional integration of a hybrid loss allows the encoder to retain an
interpretable and usable output, if desired. We achieve competitive results on
the IAM and ICFHR2016 READ data sets compared to the state-of-the-art without
the use of a language model, and we significantly improve over any recent
sequence-to-sequence approaches.
Comment: 8 pages, 1 figure, 8 tables