A discrete contextual stochastic model for the off-line recognition of handwritten Chinese characters
We study a discrete contextual stochastic (CS) model for complex and variant patterns like handwritten Chinese characters. Three fundamental problems of using CS models for character recognition are discussed, and several practical techniques for solving these problems are investigated. A formulation for discriminative training of CS model parameters is also introduced and its practical usage investigated. To illustrate the characteristics of the various algorithms, comparative experiments are performed on a recognition task with a vocabulary consisting of 50 pairs of highly similar handwritten Chinese characters. The experimental results confirm the effectiveness of the discriminative training for improving recognition performance.
Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition
Online handwritten Chinese text recognition (OHCTR) is a challenging problem
as it involves a large-scale character set, ambiguous segmentation, and
variable-length input sequences. In this paper, we exploit the outstanding
capability of path signature to translate online pen-tip trajectories into
informative signature feature maps using a sliding window-based method,
successfully capturing the analytic and geometric properties of pen strokes
with strong local invariance and robustness. A multi-spatial-context fully
convolutional recurrent network (MCFCRN) is proposed to exploit the multiple
spatial contexts from the signature feature maps and generate a prediction
sequence while completely avoiding the difficult segmentation problem.
Furthermore, an implicit language model is developed to make predictions based
on semantic context within a predicting feature sequence, providing a new
perspective for incorporating lexicon constraints and prior knowledge about a
certain language in the recognition procedure. Experiments on two standard
benchmarks, Dataset-CASIA and Dataset-ICDAR, yielded outstanding results, with
correct rates of 97.10% and 97.15%, respectively, which are significantly
better than the best result reported thus far in the literature. Comment: 14 pages, 9 figures
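The sliding-window signature features mentioned in this abstract can be illustrated with a minimal sketch. It assumes a 2-D pen trajectory given as a list of (x, y) points and truncates the path signature at level 2. The function names `signature_level2` and `sliding_signatures`, the window size, and the stride are hypothetical choices for illustration, not the paper's implementation, and the network that consumes the features is not shown.

```python
def signature_level2(points):
    """Truncated (levels 1 and 2) path signature of a piecewise-linear 2-D path.

    Returns [S1_x, S1_y, S2_xx, S2_xy, S2_yx, S2_yy].
    """
    s1 = [0.0, 0.0]                    # level 1: total displacement so far
    s2 = [[0.0, 0.0], [0.0, 0.0]]      # level 2: iterated integrals
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        d = (x1 - x0, y1 - y0)         # increment of the current segment
        # Chen's identity for appending one linear segment to the path.
        for i in range(2):
            for j in range(2):
                s2[i][j] += s1[i] * d[j] + 0.5 * d[i] * d[j]
        s1[0] += d[0]
        s1[1] += d[1]
    return s1 + [s2[i][j] for i in range(2) for j in range(2)]


def sliding_signatures(points, window=3, stride=1):
    """One signature feature vector per window of consecutive pen points."""
    return [
        signature_level2(points[k:k + window])
        for k in range(0, len(points) - window + 1, stride)
    ]


# Hypothetical pen stroke: right one unit, then up, then right again.
stroke = [(0, 0), (1, 0), (1, 1), (2, 1)]
features = sliding_signatures(stroke, window=3, stride=1)
```

Each window yields a six-dimensional vector here. The antisymmetric part of the level-2 terms, 0.5 * (S2_xy - S2_yx), is the signed area swept by the stroke, which is one reason such features describe local stroke geometry.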
Open Set Chinese Character Recognition using Multi-typed Attributes
Recognition of off-line Chinese characters is still a challenging problem,
especially in historical documents: not only is the number of classes extremely
large in comparison to contemporary image retrieval methods, but new
unseen classes can also be expected under open learning conditions (even for CNN).
Chinese character recognition with zero or a few training samples is a
difficult problem and has not been studied yet. In this paper, we propose a new
Chinese character recognition method using multi-type attributes, which are based
on the pronunciation, structure and radicals of Chinese characters, and apply it to
character recognition in historical books. This intermediate attribute code has
a strong advantage over the common `one-hot' class representation because it
allows for understanding complex and unseen patterns symbolically using
attributes. First, each character is represented by four groups of attribute
types to cover a wide range of character possibilities: Pinyin label, layout
structure, number of strokes, three different input methods such as Cangjie,
Zhengma and Wubi, as well as a four-corner encoding method. A convolutional
neural network (CNN) is trained to learn these attributes. Subsequently,
characters can be easily recognized by these attributes using a distance metric
and a complete lexicon that is encoded in attribute space. We evaluate the
proposed method on two open data sets (printed Chinese characters for
zero-shot learning and historical characters for few-shot learning) and one
closed set (handwritten Chinese characters). Experimental results show good
general classification of seen classes as well as very promising generalization
to unseen characters. Comment: 29 pages, submitted to Pattern Recognition
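The lookup step described in this abstract, matching predicted attributes against a lexicon encoded in attribute space, might look roughly like the sketch below. The attribute codes, the labels `char_A` to `char_C`, the function name `nearest_character`, and the choice of L1 distance are all assumptions made for illustration. The abstract only says that a distance metric is used, and the CNN that produces the attribute scores is not shown.

```python
def nearest_character(predicted, lexicon):
    """Return the lexicon entry whose attribute code is closest to the prediction.

    predicted: list of attribute scores in [0, 1] (e.g. CNN outputs).
    lexicon:   dict mapping a character label to its binary attribute code.
    """
    def l1_distance(code):
        # L1 distance is an assumed metric; the abstract does not name one.
        return sum(abs(p - c) for p, c in zip(predicted, code))

    return min(lexicon, key=lambda label: l1_distance(lexicon[label]))


# Hypothetical 5-bit attribute codes; real codes would be derived from
# pronunciation, structure, stroke count and input-method encodings.
lexicon = {
    "char_A": [1, 0, 0, 1, 0],
    "char_B": [0, 1, 0, 1, 1],
    "char_C": [1, 1, 0, 0, 0],
}

# Hypothetical attribute scores predicted for one character image.
predicted = [0.9, 0.2, 0.1, 0.8, 0.3]
best = nearest_character(predicted, lexicon)
```

Because the lexicon is keyed by attribute code rather than by a trained output unit, a character never seen in training can still be returned as long as its code is in the lexicon, which is the property the abstract relies on for zero-shot recognition.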
Advances in Character Recognition
This book presents advances in character recognition. It consists of 12 chapters that cover a wide range of topics on different aspects of character recognition. Hopefully, this book will serve as a reference source for academic research, for professionals working in the character recognition field, and for all interested in the subject.