A probabilistic framework for handwritten text line segmentation
We successfully combine the Expectation-Maximization algorithm and variational
approaches for parameter learning and computing inference on Markov random
fields. This is a general method that can be applied to many computer vision
tasks. In this paper, we apply it to handwritten text line segmentation. We
conduct several experiments that demonstrate that our method deals with common
issues of this task, such as complex document layout or non-Latin scripts. The
obtained results show that our method achieves state-of-the-art performance on
different benchmark datasets without any particular fine-tuning step.
Comment: 47 pages, 23 images
HMM-based Writer Identification in Music Score Documents without Staff-Line Removal
Writer identification from musical score documents is a challenging task due
to the inherent overlap of musical symbols with staff lines.
Most existing work on writer identification in
musical score documents was performed after a preprocessing stage of staff-line
removal. In this paper we propose a novel writer identification framework
for musical documents that does not remove staff lines. In our
approach, a Hidden Markov Model (HMM) is used to model the writing style of the
writers without removing staff lines. Sliding window features are extracted
from musical score lines and used to build writer-specific HMMs.
Given a query musical sheet, a writer-specific confidence for each musical line
is returned by each writer-specific model as a log-likelihood score. Next, a
page-level log-likelihood score is computed as a weighted combination of these
scores from the corresponding line images of the page. A novel Factor Analysis
based feature selection technique is applied to the sliding window features to
reduce the noise arising from staff lines, which proves effective for writer
identification. In our framework we also propose a novel score
line detection approach for musical sheets using HMMs. Experiments were
performed on the CVC-MUSCIMA dataset, and the results show that the proposed
approach is efficient for score line detection and writer identification
without removing staff lines. To give an idea of the computation time of our
method, a detailed analysis of execution time is also provided.
Comment: Expert Systems with Applications, Elsevier (2017)
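The page-level scoring step described in this abstract can be illustrated with a minimal sketch. It assumes the per-line log-likelihoods from each writer-specific model are already available. The function names, the normalization by the weight sum, and the uniform default weights are hypothetical choices for illustration; the abstract does not specify how the weights are set.

```python
from typing import Dict, List, Optional


def page_score(line_loglik: List[float],
               weights: Optional[List[float]] = None) -> float:
    """Weighted combination of per-line log-likelihoods for one writer model."""
    if not line_loglik:
        raise ValueError("need at least one line score")
    if weights is None:
        # Assumption: uniform weights when none are given.
        weights = [1.0] * len(line_loglik)
    if len(weights) != len(line_loglik):
        raise ValueError("weights and scores must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalizing by the weight sum keeps pages with different
    # numbers of lines comparable (an illustrative choice).
    return sum(w * s for w, s in zip(weights, line_loglik)) / total


def identify_writer(scores_by_writer: Dict[str, List[float]],
                    weights: Optional[List[float]] = None) -> str:
    """Return the writer whose model gives the highest page-level score."""
    return max(scores_by_writer,
               key=lambda w: page_score(scores_by_writer[w], weights))


# Hypothetical scores for a two-line page under two writer models.
scores = {"writer_A": [-10.0, -20.0], "writer_B": [-12.0, -14.0]}
best = identify_writer(scores)  # "writer_B": mean -13.0 beats -15.0
```

Since log-likelihoods are negative, the highest (least negative) page score marks the most likely writer.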
Measuring Human Perception to Improve Handwritten Document Transcription
The subtleties of human perception, as measured by vision scientists through
the use of psychophysics, are important clues to the internal workings of
visual recognition. For instance, measured reaction time can indicate whether a
visual stimulus is easy for a subject to recognize, or whether it is hard. In
this paper, we consider how to incorporate psychophysical measurements of
visual perception into the loss function of a deep neural network being trained
for a recognition task, under the assumption that such information can enforce
consistency with human behavior. As a case study to assess the viability of
this approach, we look at the problem of handwritten document transcription.
While good progress has been made towards automatically transcribing modern
handwriting, significant challenges remain in transcribing historical
documents. Here we describe a general enhancement strategy, underpinned by the
new loss formulation, which can be applied to the training regime of any deep
learning-based document transcription system. Through experimentation, reliable
performance improvement is demonstrated for the standard IAM and RIMES datasets
for three different network architectures. Further, we go on to show
feasibility for our approach on a new dataset of digitized Latin manuscripts,
originally produced by scribes in the Cloister of St. Gall in the 9th
century.
Scene Text Recognition with Sliding Convolutional Character Models
Scene text recognition has attracted great interest from the computer vision
and pattern recognition community in recent years. State-of-the-art methods use
convolutional neural networks (CNNs), recurrent neural networks with long
short-term memory (RNN-LSTM), or a combination of the two. In this paper, we
investigate the intrinsic characteristics of text recognition and, inspired by
human cognition mechanisms in reading texts, propose a scene text
recognition method with character models on convolutional feature maps. The
method simultaneously detects and recognizes characters by sliding the text
line image with character models, which are learned end-to-end on text line
images labeled with text transcripts. The character classifier outputs on the
sliding windows are normalized and decoded with a Connectionist Temporal
Classification (CTC) based algorithm. Compared to previous methods, our method
has a number of appealing properties: (1) it avoids the difficulty of character
segmentation, which hinders the performance of segmentation-based recognition
methods; (2) the model can be trained simply and efficiently because it avoids
the vanishing/exploding gradients of training RNN-LSTM based models; (3) it is
based on character models trained without a lexicon, and can recognize unknown
words; (4) the recognition process is highly parallel and enables fast
recognition. Our
experiments on several challenging English and Chinese benchmarks, including
the IIIT-5K, SVT, ICDAR03/13 and TRW15 datasets, demonstrate that the proposed
method yields superior or comparable performance to state-of-the-art methods
while the model size is relatively small.
Comment: 10 pages, 4 figures
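To make the decoding step in this abstract concrete, here is a minimal sketch of greedy (best-path) CTC decoding over per-window classifier outputs. This is a common simple CTC decoder and not necessarily the one the authors use. The function name, the choice of index 0 as the blank class, and the toy alphabet are assumptions for illustration.

```python
from typing import List, Sequence

BLANK = 0  # assumption: class index 0 is the CTC blank


def ctc_greedy_decode(frames: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy CTC decoding: take the best class per sliding window,
    collapse consecutive repeats, then drop blanks.

    frames[t][k] is the normalized score of class k at window t;
    class k >= 1 maps to alphabet[k - 1].
    """
    decoded: List[str] = []
    prev = BLANK
    for frame in frames:
        k = max(range(len(frame)), key=lambda i: frame[i])
        # Emit only when the class changes and is not the blank.
        if k != BLANK and k != prev:
            decoded.append(alphabet[k - 1])
        prev = k
    return "".join(decoded)


def one_hot(k: int, n: int = 3) -> List[float]:
    """Toy frame that puts all mass on class k (illustration only)."""
    return [1.0 if i == k else 0.0 for i in range(n)]


# Best path a a - a b b -  decodes to "aab": the blank between the
# two runs of "a" is what allows a doubled character.
frames = [one_hot(k) for k in (1, 1, 0, 1, 2, 2, 0)]
text = ctc_greedy_decode(frames, alphabet="ab")
```

Because each window is classified independently, the per-frame argmax can be computed in parallel, which is consistent with the fast recognition the abstract reports.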
The State of the Art Recognize in Arabic Script through Combination of Online and Offline
Handwriting recognition refers to the identification of written characters.
It has become an active research area in recent years because it makes
computers easier to access. This paper primarily discusses
online and offline handwriting recognition methods for Arabic words, which are
widely used by people across the Middle East and North Africa. Arabic
word online handwriting recognition is a very challenging task due to its
cursive nature. Because of a characteristic of the Arabic script as a whole,
namely the connectivity between characters, the segmentation of
Arabic script is very difficult. In this paper we introduce a
multiple classifier system for recognizing Arabic script notes written on a
Starboard. This multiple classifier system combines offline and online
handwriting recognition systems. The Arabic script recognizers are all based on
Hidden Markov Models but vary in their preprocessing and normalization. To
combine the output sequences of the recognizers, we incrementally
align the word sequences using a norm string matching algorithm. With this
combination, we could increase the system performance over the best
character recognizer by about 3%. The proposed technique is also a necessary
step towards character recognition, person identification, and personality
determination, where input data is processed from all perspectives.
Comment: 7 pages, 6 figures, 2 tables. arXiv admin note: text overlap with
arXiv:1110.1488 by other authors
Handwritten Character Recognition In Malayalam Scripts- A Review
Handwritten character recognition (HCR) is one of the most challenging and
ongoing areas of research in the field of pattern recognition. HCR research is
mature for foreign languages like Chinese and Japanese, but the problem is much
more complex for Indian languages. The problem becomes even more complicated for
South Indian languages due to their large character sets and the presence of
vowel modifiers and compound characters. This paper provides an overview of
important contributions and advances in offline as well as online handwritten
character recognition of Malayalam scripts.
Comment: 11 pages, 4 figures, 2 tables
Parsimonious HMMs for Offline Handwritten Chinese Text Recognition
Recently, hidden Markov models (HMMs) have achieved promising results for
offline handwritten Chinese text recognition. However, because the large
vocabulary of Chinese characters is modeled with a uniform and fixed
number of hidden states per character, memory and computation demands are high.
In this study, to address this issue, we present parsimonious HMMs via
state tying, which can fully utilize the similarities among different Chinese
characters. A two-step algorithm with a data-driven question set is adopted to
generate the tied-state pool using the likelihood measure. The proposed
parsimonious HMMs with both Gaussian mixture models (GMMs) and deep neural
networks (DNNs) as the emission distributions not only lead to a compact model
but also improve the recognition accuracy via data sharing for the tied
states and reduced confusion among state classes. Tested on the ICDAR-2013
competition database, in the best configured case, the new parsimonious DNN-HMM
yields a relative character error rate (CER) reduction of 6.2%, a 25%
reduction in model size, and a 60% reduction in decoding time over the
conventional DNN-HMM. In the compact setting with an average of one state per
HMM, our parsimonious DNN-HMM significantly outperforms the conventional
DNN-HMM with a relative CER reduction of 35.5%.
Comment: Accepted by ICFHR201
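The abstract reports relative, not absolute, CER reductions. The short sketch below shows how a relative reduction relates to the two error rates. The function names and the baseline CER of 10.0% are hypothetical and chosen only for illustration; the abstract does not state the absolute error rates.

```python
def relative_reduction(baseline: float, new: float) -> float:
    """Relative reduction of an error rate: (baseline - new) / baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - new) / baseline


def apply_relative_reduction(baseline: float, rel: float) -> float:
    """Error rate that results from reducing `baseline` by fraction `rel`."""
    return baseline * (1.0 - rel)


# Hypothetical baseline CER of 10.0% (not a figure from the paper).
baseline_cer = 10.0
new_cer = apply_relative_reduction(baseline_cer, 0.062)  # about 9.38
rel = relative_reduction(baseline_cer, new_cer)          # about 0.062
```

Under this hypothetical baseline, a 6.2% relative reduction corresponds to an absolute drop of only about 0.62 percentage points.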
Fully Convolutional Recurrent Network for Handwritten Chinese Text Recognition
This paper proposes an end-to-end framework, namely fully convolutional
recurrent network (FCRN) for handwritten Chinese text recognition (HCTR).
Unlike traditional methods that rely heavily on segmentation, our FCRN is
trained with online text data directly and learns to associate the pen-tip
trajectory with a sequence of characters. FCRN consists of four parts: a
path-signature layer to extract signature features from the input pen-tip
trajectory, a fully convolutional network to learn informative representation,
a sequence modeling layer to make per-frame predictions on the input sequence
and a transcription layer to translate the predictions into a label sequence.
The FCRN is end-to-end trainable in contrast to conventional methods whose
components are separately trained and tuned. We also present a refined beam
search method that efficiently integrates the language model to decode the FCRN
and significantly improves the recognition results.
We evaluate the performance of the proposed method on the test sets from the
CASIA-OLHWDB database and the ICDAR 2013 Chinese handwriting recognition
competition, achieving state-of-the-art performance on both with correct rates
of 96.40% and 95.00%, respectively.
Comment: 6 pages, 3 figures, 5 tables
Attribute CNNs for Word Spotting in Handwritten Documents
Word spotting has become a field of strong research interest in document
image analysis in recent years. Recently, AttributeSVMs were proposed, which
predict a binary attribute representation. At the time, this influential
method defined the state of the art in segmentation-based word spotting. In
this work, we present an approach for learning attribute representations with
Convolutional Neural Networks (CNNs). By taking a probabilistic perspective on
training CNNs, we derive two different loss functions for binary and
real-valued word string embeddings. In addition, we propose two different CNN
architectures, specifically designed for word spotting. These architectures are
able to be trained in an end-to-end fashion. In a number of experiments, we
investigate the influence of different word string embeddings and optimization
strategies. We show that our Attribute CNNs achieve state-of-the-art results for
segmentation-based word spotting on a large variety of data sets.
Comment: under review at IJDA
A General Framework for the Recognition of Online Handwritten Graphics
We propose a new framework for the recognition of online handwritten
graphics. Three main features of the framework are its ability to treat symbol
and structural level information in an integrated way, its flexibility with
respect to different families of graphics, and means to control the tradeoff
between recognition effectiveness and computational cost. We model a graphic as
a labeled graph generated from a graph grammar. Non-terminal vertices represent
subcomponents, terminal vertices represent symbols, and edges represent
relations between subcomponents or symbols. We then model the recognition
problem as a graph parsing problem: given an input stroke set, we search for a
parse tree that represents the best interpretation of the input. Our graph
parsing algorithm generates multiple interpretations (consistent with the
grammar) and then we extract an optimal interpretation according to a cost
function that takes into consideration the likelihood scores of symbols and
structures. The parsing algorithm consists in recursively partitioning the
stroke set according to structures defined in the grammar and it does not
impose constraints present in some previous works (e.g. stroke ordering). By
avoiding such constraints and thanks to the powerful representativeness of
graphs, our approach can be adapted to the recognition of different graphic
notations. We show applications to the recognition of mathematical expressions
and flowcharts. Experimentation shows that our method obtains state-of-the-art
accuracy in both applications.
Comment: Submitted to TPAM