Word Alignment Annotation in a Japanese-Chinese Parallel Corpus

Authors: Hitoshi Isahara, Kiyotaka Uchimoto, Qing Ma, Yujie Zhang, Zhulong Wang
Publication date: 22/06/2010

Parallel corpora are critical resources for machine translation research and development because they contain translation equivalences at various granularities. Manually annotated word alignments provide a gold standard for developing and evaluating both example-based and statistical machine translation models. This paper presents the word alignment annotation of the NICT Japanese-Chinese parallel corpus, constructed at the National Institute of Information and Communications Technology (NICT). We describe the specification for word alignment annotation and the tools developed specifically for the manual annotation. Manual annotation of 17,000 sentence pairs has been completed. We examined the manually annotated word alignment data and extracted translation knowledge from the word-aligned corpus.

CiteSeerX