Alignment-Guided Chunking

By Yanjun Ma and Andy Way

Abstract

We introduce an adaptable monolingual chunking approach, Alignment-Guided Chunking (AGC), which makes use of knowledge of word alignments acquired from bilingual corpora. Our approach is motivated by the observation that a sentence should be chunked differently depending on the foreseen end-task. For example, given the different requirements of translation into (say) French and German, it is inappropriate to chunk an English string in exactly the same way in preparation for translation into either of these languages. We test our chunking approach on two language pairs, French–English and German–English, where the two bilingual corpora share the same English sentences. Two chunkers, trained on French–English (FE-Chunker) and German–English (DE-Chunker) respectively, are used to chunk the same English sentences. We construct two test sets, one suitable for French–English and one for German–English. The performance of each chunker is evaluated on the appropriate test set; with only one reference translation, we report F-scores of 32.63% for the FE-Chunker and 40.41% for the DE-Chunker.

Year: 2009
OAI identifier: oai:CiteSeerX.psu:10.1.1.134.3576
Provided by: CiteSeerX