A MORPHEME-BASED LEXICAL CHUNKING SYSTEM FOR CHINESE

By Guo-hong Fu, Chun-yu Kit and Jonathan J. Webster

Abstract

Chinese lexical analysis consists of word segmentation and part-of-speech tagging. Most previous studies treat them as two separate tasks. In this paper, we formalize the two processes as a single chunking task on a sequence of morphemes and present an integrated lexical analysis system for Chinese based on lexicalized hidden Markov models. In this way, both contextual lexical information and word-internal morphological features can be explored statistically and then combined for disambiguation and unknown word resolution. Experimental results show that the proposed system outperforms several baselines, illustrating the benefits of the unified lexical chunking method with morphemes as the basic units.

Keywords: Chinese lexical analysis; Lexical chunking; Wor
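
To make the chunking formulation concrete, the following is a minimal sketch of how a segmented, part-of-speech-tagged sentence could be encoded as one tag per morpheme, and decoded back into words. It rests on several assumptions that are not taken from the paper: each character is treated as one morpheme, the position labels B/M/E/S (begin, middle, end, single) and the `position-POS` tag format are hypothetical, and the function names are illustrative only. The lexicalized hidden Markov model that would predict these tags is not shown.

```python
from typing import List, Tuple

def words_to_chunk_tags(words: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Encode (word, POS) pairs as (morpheme, "position-POS") pairs.

    Assumption: every character is one morpheme. The B/M/E/S position
    labels are a hypothetical scheme chosen for illustration.
    """
    tagged = []
    for word, pos in words:
        units = list(word)
        if len(units) == 1:
            tagged.append((units[0], f"S-{pos}"))
            continue
        last = len(units) - 1
        for i, unit in enumerate(units):
            position = "B" if i == 0 else ("E" if i == last else "M")
            tagged.append((unit, f"{position}-{pos}"))
    return tagged

def chunk_tags_to_words(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Decode (morpheme, "position-POS") pairs back into (word, POS) pairs."""
    words = []
    buffer = []
    pos = ""
    for unit, tag in tagged:
        position, pos = tag.split("-", 1)
        buffer.append(unit)
        if position in ("E", "S"):  # a word boundary closes the chunk
            words.append(("".join(buffer), pos))
            buffer = []
    if buffer:  # tolerate a sequence that ends without a closing tag
        words.append(("".join(buffer), pos))
    return words

# Hypothetical example sentence with illustrative POS labels.
sentence = [("我们", "PN"), ("喜欢", "VV"), ("语言学", "NN")]
tagged = words_to_chunk_tags(sentence)
restored = chunk_tags_to_words(tagged)
```

Under this encoding, a single tag sequence carries both the word boundaries and the part-of-speech labels, so one sequence-labelling pass recovers the segmentation and the tagging together.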

Topics: segmentation, Part-of-speech
Year: 2013
OAI identifier: oai:CiteSeerX.psu:10.1.1.352.6624
Provided by: CiteSeerX