Optimizing the Learning Order of Chinese Characters Using a Novel Topological Sort Algorithm
We present a novel algorithm for optimizing the order in which Chinese
characters are learned, one that incorporates the benefits of learning them in
order of usage frequency and in order of their hierarchical structural
relationships. We show that our work outperforms previously published orders
and algorithms. Our algorithm is applicable to any scheduling task where nodes
have intrinsic differences in importance and must be visited in topological
order.
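The scheduling idea in this abstract, visiting nodes in topological order while favoring the more important ones, can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify; it is a plain priority-queue variant of Kahn's topological sort, and the function name, the `frequency` and `components` mappings, and the sample frequency values are all assumptions made up for illustration.

```python
import heapq


def frequency_topological_order(frequency, components):
    """Order characters so that every component comes before the characters
    built from it, picking the most frequent available character first.

    frequency:  dict mapping character -> usage count (hypothetical values)
    components: dict mapping character -> list of its component characters
    """
    # Collect every character mentioned anywhere in the inputs.
    nodes = set(frequency) | set(components)
    for parts in components.values():
        nodes.update(parts)

    # Count distinct unlearned components per character, and record
    # which characters depend on each component.
    remaining = {n: len(set(components.get(n, ()))) for n in nodes}
    dependents = {n: [] for n in nodes}
    for char, parts in components.items():
        for part in set(parts):
            dependents[part].append(char)

    # Max-heap on frequency (negated, because heapq is a min-heap).
    heap = [(-frequency.get(n, 0), n) for n in nodes if remaining[n] == 0]
    heapq.heapify(heap)

    order = []
    while heap:
        _, node = heapq.heappop(heap)
        order.append(node)
        for child in dependents[node]:
            remaining[child] -= 1
            if remaining[child] == 0:
                heapq.heappush(heap, (-frequency.get(child, 0), child))

    if len(order) != len(nodes):
        raise ValueError("component graph contains a cycle")
    return order


# Illustrative data only: the counts are invented, not corpus statistics.
frequency = {"好": 50, "女": 10, "子": 20, "木": 5, "林": 8}
components = {"好": ["女", "子"], "林": ["木", "木"]}
print(frequency_topological_order(frequency, components))
# prints ['子', '女', '好', '木', '林']
```

A greedy rule like this never schedules a character before its components, but it cannot pull a rare component forward just because a very frequent character depends on it, which is the kind of trade-off a more refined ordering would need to address.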
Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition
Online handwritten Chinese text recognition (OHCTR) is a challenging problem
as it involves a large-scale character set, ambiguous segmentation, and
variable-length input sequences. In this paper, we exploit the outstanding
capability of path signature to translate online pen-tip trajectories into
informative signature feature maps using a sliding window-based method,
successfully capturing the analytic and geometric properties of pen strokes
with strong local invariance and robustness. A multi-spatial-context fully
convolutional recurrent network (MCFCRN) is proposed to exploit the multiple
spatial contexts from the signature feature maps and generate a prediction
sequence while completely avoiding the difficult segmentation problem.
Furthermore, an implicit language model is developed to make predictions based
on semantic context within a predicting feature sequence, providing a new
perspective for incorporating lexicon constraints and prior knowledge about a
certain language in the recognition procedure. Experiments on two standard
benchmarks, Dataset-CASIA and Dataset-ICDAR, yielded outstanding results, with
correct rates of 97.10% and 97.15%, respectively, which are significantly
better than the best result reported thus far in the literature. Comment: 14 pages, 9 figures
Fast and Accurate Neural Word Segmentation for Chinese
Neural models with minimal feature engineering have achieved competitive
performance against traditional methods for the task of Chinese word
segmentation. However, both training and working procedures of the current
neural models are computationally inefficient. This paper presents a greedy
neural word segmenter with balanced word and character embedding inputs to
alleviate the existing drawbacks. Our segmenter is truly end-to-end, capable of
performing segmentation much faster and even more accurately than
state-of-the-art neural models on Chinese benchmark datasets. Comment: To appear in ACL201