A Study On the Use of 8-Directional Features For Online Handwritten Chinese Character Recognition
Sparse arrays of signatures for online character recognition
In mathematics the signature of a path is a collection of iterated integrals,
commonly used for solving differential equations. We show that the path
signature, used as a set of features for consumption by a convolutional neural
network (CNN), improves the accuracy of online character recognition, that is,
the task of reading characters represented as a collection of paths. Using
datasets of letters, numbers, Assamese and Chinese characters, we show that the
first, second, and even the third iterated integrals contain useful information
for consumption by a CNN.
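As a rough illustration of the iterated integrals mentioned above, the following sketch computes the first two signature levels of a piecewise-linear pen stroke using Chen's identity. The function name `signature_level2` and the flat output layout are assumptions made for this example; the abstract does not describe the authors' implementation, and the third level is omitted here for brevity.

```python
import numpy as np

def signature_level2(path):
    """Truncated signature (levels 0, 1, 2) of a piecewise-linear path.

    path: array-like of shape (n_points, d), e.g. (x, y) pen positions.
    Returns a flat vector [1, S1 (d values), S2 (d*d values)].
    Hypothetical helper for illustration only.
    """
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    s1 = np.zeros(d)        # first iterated integrals (total displacement)
    s2 = np.zeros((d, d))   # second iterated integrals
    for delta in np.diff(path, axis=0):
        # Chen's identity for appending one straight segment:
        # the segment alone has level 1 = delta, level 2 = delta (x) delta / 2.
        s2 += np.outer(s1, delta) + np.outer(delta, delta) / 2.0
        s1 += delta
    return np.concatenate(([1.0], s1, s2.ravel()))

# An L-shaped stroke: right one unit, then up one unit.
sig = signature_level2([(0, 0), (1, 0), (1, 1)])
s2 = sig[3:].reshape(2, 2)
levy_area = 0.5 * (s2[0, 1] - s2[1, 0])  # signed area between stroke and chord
```

The antisymmetric part of the second level is the signed area enclosed by the stroke and its chord, which is one way to see why these terms carry shape information beyond the endpoints.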
On the CASIA-OLHWDB1.1 3755 Chinese character dataset, our approach gave a
test error of 3.58%, compared with 5.61% for a traditional CNN [Ciresan et
al.]. A CNN trained on the CASIA-OLHWDB1.0-1.2 datasets won the ICDAR2013
Online Isolated Chinese Character recognition competition.
Computationally, we have developed a sparse CNN implementation that makes it
practical to train CNNs with many layers of max-pooling. Extending the MNIST
dataset by translations, our sparse CNN gets a test error of 0.31%.
Comment: 10 pages, 2 figures
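Extending a dataset by translations can be sketched as below. The shift range (`max_shift`) and zero padding are assumptions for this example; the abstract does not state which translations were used.

```python
import numpy as np

def translate(image, dx, dy):
    """Shift a 2-D image by (dx, dy) pixels, filling exposed pixels with zeros."""
    h, w = image.shape
    out = np.zeros_like(image)
    # Part of the source image that remains visible after the shift.
    src = image[max(0, -dy):min(h, h - dy), max(0, -dx):min(w, w - dx)]
    top, left = max(0, dy), max(0, dx)
    out[top:top + src.shape[0], left:left + src.shape[1]] = src
    return out

def all_translations(image, max_shift=2):
    """Return every shifted copy with |dx|, |dy| <= max_shift (assumed range)."""
    shifts = range(-max_shift, max_shift + 1)
    return [translate(image, dx, dy) for dy in shifts for dx in shifts]

# Usage: a 28x28 image yields (2 * 2 + 1) ** 2 = 25 variants, including itself.
digit = np.zeros((28, 28), dtype=np.uint8)
digit[10:18, 12:16] = 255
augmented = all_translations(digit, max_shift=2)
```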
An Open Source Testing Tool for Evaluating Handwriting Input Methods
This paper presents an open source tool for testing the recognition accuracy
of Chinese handwriting input methods. The tool consists of two modules: a PC
client and an Android mobile client. The PC client reads handwritten samples
stored on the computer and transfers them one at a time to the Android client
over a socket-based communication protocol. After the Android client
receives the data, it simulates the handwriting on the screen of the client device and
triggers the corresponding handwriting recognition method. The recognition
accuracy is recorded by the Android client. We present the design principles
and describe the implementation of the test platform. We construct several test
datasets for evaluating different handwriting recognition systems, and conduct
an objective and comprehensive test using six Chinese handwriting input methods
with five datasets. The test results for the recognition accuracy are then
compared and analyzed.
Comment: 5 pages, 3 figures, 11 tables. Accepted to appear at ICDAR 201
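The abstract does not specify the wire format used between the PC and Android clients. As a minimal sketch of how one handwritten sample could be sent over a socket, the example below assumes a 4-byte big-endian length prefix followed by a JSON payload; the function names and the `label`/`strokes` fields are hypothetical.

```python
import json
import socket
import struct

def send_sample(sock, label, strokes):
    """Send one sample as: 4-byte big-endian length, then UTF-8 JSON (assumed format)."""
    payload = json.dumps({"label": label, "strokes": strokes}).encode("utf-8")
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def _recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-frame")
        buf.extend(chunk)
    return bytes(buf)

def recv_sample(sock):
    """Receive one length-prefixed sample and decode it."""
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))

# Local demonstration with a connected socket pair standing in for PC and device.
pc_side, device_side = socket.socketpair()
send_sample(pc_side, "永", [[[0, 0], [5, 3]], [[2, 1], [2, 9]]])
received = recv_sample(device_side)
pc_side.close()
device_side.close()
```

Length-prefixed framing keeps samples separate on a stream socket, which matters when samples are transferred individually and replayed one at a time on the device.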
Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition
Online handwritten Chinese text recognition (OHCTR) is a challenging problem
as it involves a large-scale character set, ambiguous segmentation, and
variable-length input sequences. In this paper, we exploit the outstanding
capability of the path signature to translate online pen-tip trajectories into
informative signature feature maps using a sliding window-based method,
successfully capturing the analytic and geometric properties of pen strokes
with strong local invariance and robustness. A multi-spatial-context fully
convolutional recurrent network (MCFCRN) is proposed to exploit the multiple
spatial contexts from the signature feature maps and generate a prediction
sequence while completely avoiding the difficult segmentation problem.
Furthermore, an implicit language model is developed to make predictions based
on semantic context within a predicting feature sequence, providing a new
perspective for incorporating lexicon constraints and prior knowledge about a
certain language in the recognition procedure. Experiments on two standard
benchmarks, Dataset-CASIA and Dataset-ICDAR, yielded outstanding results, with
correct rates of 97.10% and 97.15%, respectively, which are significantly
better than the best result reported thus far in the literature.
Comment: 14 pages, 9 figures