Parsimonious HMMs for Offline Handwritten Chinese Text Recognition
Recently, hidden Markov models (HMMs) have achieved promising results for offline handwritten Chinese text recognition. However, because the vocabulary of Chinese characters is large and each character is modeled by a uniform, fixed number of hidden states, the demands on memory and computation are high. In this study, we address this issue with parsimonious HMMs built via state tying, which fully exploits the similarities among different Chinese characters. A two-step algorithm with a data-driven question set generates the tied-state pool using a likelihood measure. The proposed parsimonious HMMs, with both Gaussian mixture models (GMMs) and deep neural networks (DNNs) as emission distributions, not only yield a compact model but also improve recognition accuracy through data sharing among the tied states and reduced confusion among state classes. Tested on the ICDAR 2013 competition database, in the best-configured case the new parsimonious DNN-HMM yields a relative character error rate (CER) reduction of 6.2%, a 25% reduction in model size, and a 60% reduction in decoding time over the conventional DNN-HMM. In the compact setting of one state per HMM on average, our parsimonious DNN-HMM significantly outperforms the conventional DNN-HMM with a relative CER reduction of 35.5%.
Comment: Accepted by ICFHR201
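The abstract does not detail the two-step, question-set-driven tying algorithm itself, but the general idea of likelihood-based state tying can be sketched as greedy merging of single-Gaussian states, always choosing the pair whose merge loses the least log-likelihood. This is only an illustrative sketch under simplified assumptions (diagonal single Gaussians, pairwise greedy merging); all function names are hypothetical and the paper's actual algorithm differs:

```python
import numpy as np

def gauss_loglik(n, var):
    # Total log-likelihood of n points under a diagonal Gaussian
    # with per-dimension variance `var` (mean terms cancel out).
    return -0.5 * n * np.sum(np.log(2 * np.pi * var) + 1.0)

def merge_stats(s1, s2):
    # Pooled count, mean, and variance of two Gaussian states.
    n = s1["n"] + s2["n"]
    mu = (s1["n"] * s1["mu"] + s2["n"] * s2["mu"]) / n
    var = (s1["n"] * (s1["var"] + (s1["mu"] - mu) ** 2)
           + s2["n"] * (s2["var"] + (s2["mu"] - mu) ** 2)) / n
    return {"n": n, "mu": mu, "var": var}

def tie_states(states, pool_size):
    """Greedily merge the pair of states whose merge costs the least
    log-likelihood, until only `pool_size` tied states remain."""
    states = list(states)
    while len(states) > pool_size:
        best = None
        for i in range(len(states)):
            for j in range(i + 1, len(states)):
                m = merge_stats(states[i], states[j])
                loss = (gauss_loglik(states[i]["n"], states[i]["var"])
                        + gauss_loglik(states[j]["n"], states[j]["var"])
                        - gauss_loglik(m["n"], m["var"]))
                if best is None or loss < best[0]:
                    best = (loss, i, j, m)
        _, i, j, m = best
        states = [s for k, s in enumerate(states) if k not in (i, j)] + [m]
    return states
```

Acoustically similar states (near-identical means and variances) merge first, which is how tying can shrink the state pool while sharing training data among the survivors.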
Aggregation Cross-Entropy for Sequence Recognition
In this paper, we propose a novel method, aggregation cross-entropy (ACE), for sequence recognition from a brand-new perspective. The ACE loss function performs competitively with CTC and the attention mechanism, while offering a much simpler implementation (it involves only four fundamental formulas), faster inference/back-propagation (approximately O(1) in parallel), lower storage requirements (no parameters and negligible runtime memory), and convenient adoption (simply replace CTC with ACE). Furthermore, the proposed ACE loss function exhibits two noteworthy properties: (1) it can be applied directly to 2D prediction by flattening the 2D prediction into a 1D input, and (2) it requires only the characters and their counts in the sequence annotation for supervision, which allows it to advance beyond sequence recognition, e.g., to counting problems. The code is publicly available at https://github.com/summerlvsong/Aggregation-Cross-Entropy.
Comment: 10 pages, 6 figures, Accepted by CVPR201
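Since supervision uses only characters and their counts, the loss can be read as a cross-entropy between two count distributions: the model's per-class probabilities aggregated over all time steps, and the annotation counts (with a blank class absorbing the remaining steps). A minimal NumPy sketch along those lines, assuming class 0 is the blank and `counts` is a hypothetical mapping from class index to occurrence count:

```python
import numpy as np

def ace_loss(probs, counts):
    """Aggregation-style cross-entropy.

    probs  -- (T, C) per-timestep softmax outputs, class 0 = blank
    counts -- dict mapping class index -> occurrences in the annotation
    """
    T, C = probs.shape
    # Target distribution: character counts, blank absorbs the rest.
    target = np.zeros(C)
    for k, n in counts.items():
        target[k] = n
    target[0] = T - target[1:].sum()
    target /= T
    # Aggregate predictions over time, then normalize.
    agg = probs.sum(axis=0) / T
    return -np.sum(target * np.log(agg + 1e-12))
```

Because the order of characters never enters the computation, the same loss applies unchanged to flattened 2D prediction maps and to pure counting tasks.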