Singular or plural? Exploiting parallel corpora for Chinese number prediction

By Elizabeth Baran and Nianwen Xue

Abstract

We explore a novel approach to automatically predicting noun number in Chinese using a word-aligned Chinese-English parallel corpus. We first map number information from English onto Chinese to create a dataset labeled with a POS tagset enhanced with number information, and then train a model to predict noun number automatically using a combination of lexical and syntactic features. We evaluate the quality of the automatically mapped data and show that the mapping is largely adequate despite a small percentage of errors. Trained on a relatively small dataset, our model achieves a 4% absolute improvement in accuracy over a majority baseline that considers all nouns to be singular.
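
The mapping step and the baseline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the enhanced tag names `NN-SG` and `NN-PL`, the function names, and the rule that any aligned plural English noun makes the Chinese noun plural are all assumptions made for illustration. The English tags follow the Penn Treebank convention, where `NNS` and `NNPS` mark plural nouns.

```python
from typing import Dict, List, Tuple

# Penn Treebank English noun tags, grouped by number.
EN_PLURAL = {"NNS", "NNPS"}
EN_SINGULAR = {"NN", "NNP"}


def project_number(
    zh_tags: List[str],
    en_tags: List[str],
    alignment: List[Tuple[int, int]],
) -> List[str]:
    """Project English noun number onto Chinese common nouns (tag NN).

    `alignment` holds (chinese_index, english_index) pairs.
    The output tags NN-SG / NN-PL are hypothetical names, not the paper's.
    """
    # Collect the English tags aligned to each Chinese token.
    aligned: Dict[int, List[str]] = {}
    for zh_i, en_j in alignment:
        aligned.setdefault(zh_i, []).append(en_tags[en_j])

    out = []
    for i, tag in enumerate(zh_tags):
        en = aligned.get(i, [])
        if tag != "NN":
            out.append(tag)       # only common nouns are relabeled
        elif any(t in EN_PLURAL for t in en):
            out.append("NN-PL")   # assumed rule: any plural link wins
        elif any(t in EN_SINGULAR for t in en):
            out.append("NN-SG")
        else:
            out.append(tag)       # unaligned or aligned to a non-noun
    return out


def majority_baseline_accuracy(gold_tags: List[str]) -> float:
    """Accuracy of labeling every number-marked noun as singular."""
    nouns = [t for t in gold_tags if t in ("NN-SG", "NN-PL")]
    if not nouns:
        return 0.0
    return sum(t == "NN-SG" for t in nouns) / len(nouns)


# Toy example: Chinese tags for "I buy ASP three CL book",
# aligned to the English "I bought three books".
zh = ["PN", "VV", "AS", "CD", "M", "NN"]
en = ["PRP", "VBD", "CD", "NNS"]
links = [(0, 0), (1, 1), (2, 1), (3, 2), (5, 3)]
print(project_number(zh, en, links))
```

In this toy example the final Chinese noun is linked to an English `NNS` token, so it receives the plural label, while every other tag is left unchanged. A trained model would then be compared against `majority_baseline_accuracy` computed on the same labeled nouns.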

Year: 2011
OAI identifier: oai:CiteSeerX.psu:10.1.1.353.1670
Provided by: CiteSeerX