Building a German/Simple German Parallel Corpus for Automatic Text Simplification

Ebling, S; Klaper, David; Volk, Martin

Authors: S Ebling, David Klaper, Martin Volk
Publication date: 8 August 2013

Abstract

In this paper, we report on our experiments in creating a parallel corpus from German/Simple German documents on the web. We require parallel data to build a statistical machine translation (SMT) system that translates from German into Simple German. Parallel data for SMT systems needs to be aligned at the sentence level, so we applied an existing monolingual sentence alignment algorithm. We show the limits of the algorithm with respect to the language and domain of our data and suggest ways of circumventing them.

Available Versions

ZORA

oai:www.zora.uzh.ch:78610

Last time updated on 09/07/2013