Chinese Spelling Error Detection Using a Fusion Lattice LSTM
Spelling error detection serves as a crucial preprocessing step in many natural
language processing applications. Due to the characteristics of the Chinese
language, Chinese spelling error detection is more challenging than error
detection in English. Existing methods mainly follow a pipeline framework,
which artificially divides the error detection process into two steps. Thus, these
methods suffer from error propagation and cannot always work well due to the
complexity of the language environment. Besides, existing methods adopt only
character or word information, and ignore the positive effect of fusing
character, word, and pinyin information together. We propose an LF-LSTM-CRF model,
which is an extension of the LSTM-CRF with word lattices and
character-pinyin-fusion inputs. Our model takes advantage of the end-to-end
framework to detect errors as a whole process, and dynamically integrates
character, word and pinyin information. Experiments on the SIGHAN data show
that our LF-LSTM-CRF outperforms existing methods with similar external
resources consistently, and confirm the feasibility of adopting the end-to-end
framework and the effectiveness of integrating character, word and pinyin
information.
Comment: 8 pages, 5 figures
Description of HLJU Chinese Spelling Checker for SIGHAN Bakeoff 2013
In this paper, we briefly describe our system for the Chinese Spelling Check Bakeoff sponsored by ACL-SIGHAN. It consists of three main components, namely potential incorrect character detection with a multiple-level analysis, correction candidate generation with similar character sets, and correction scoring with n-grams. We participated in both sub-tasks at the Bakeoff. We also summarize this work and give some analysis of the results.
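
The candidate-generation and n-gram scoring steps named in this abstract can be illustrated with a minimal sketch. It is not the HLJU system: the toy corpus, the `SIMILAR` character sets, the function names, and the add-one smoothed character-bigram model are all assumptions made for illustration. The multiple-level detection step is not reproduced either; here a position is simply treated as suspicious when its character has an entry in `SIMILAR`, and at most one character is replaced per sentence.

```python
import math
from collections import Counter

# Hypothetical toy data: a tiny training corpus and hand-made
# similar-character sets (not taken from the paper).
CORPUS = ["我们去学校", "我们在学校", "他们去学校"]
SIMILAR = {"门": ["们"], "较": ["校"]}


def train_bigrams(sentences):
    """Count character unigrams and bigrams, with sentence boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        chars = ["<s>"] + list(s) + ["</s>"]
        unigrams.update(chars)
        bigrams.update(zip(chars, chars[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed character-bigram log-probability of a sentence."""
    chars = ["<s>"] + list(sentence) + ["</s>"]
    vocab = len(unigrams)
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
        for a, b in zip(chars, chars[1:])
    )


def correct(sentence, unigrams, bigrams, similar=SIMILAR):
    """Try each single-character substitution from the similar-character
    sets and keep the variant with the highest bigram score."""
    best, best_score = sentence, log_prob(sentence, unigrams, bigrams)
    for i, ch in enumerate(sentence):
        for cand in similar.get(ch, []):
            trial = sentence[:i] + cand + sentence[i + 1:]
            score = log_prob(trial, unigrams, bigrams)
            if score > best_score:
                best, best_score = trial, score
    return best


if __name__ == "__main__":
    uni, bi = train_bigrams(CORPUS)
    print(correct("我门去学校", uni, bi))
```

Keeping the original sentence as the baseline means a substitution is accepted only when it raises the language-model score, which is one simple way to avoid overcorrecting text that is already well formed.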