
A memory-based classification approach to marker-based EBMT

Abstract

We describe a novel approach to example-based machine translation that makes use of marker-based chunks, in which the decoder is a memory-based classifier. The classifier is trained to map trigrams of source-language chunks onto trigrams of target-language chunks; then, in a second decoding step, the predicted trigrams are rearranged according to their overlap. We present the first results of this method on a Dutch-to-English translation system using Europarl data. Sparseness of the class space causes the results to lag behind a baseline phrase-based SMT system. In a further comparison, we also apply the method to a word-aligned version of the same data, and report a smaller difference with a word-based SMT system. We explore the scaling abilities of the memory-based approach, and observe linear scaling behavior in training and classification speed and memory costs, and log-linear improvements in BLEU with the number of training examples.
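To illustrate the second decoding step mentioned in the abstract, the following is a minimal sketch, not the authors' implementation, of how predicted target-chunk trigrams for adjacent source positions might be rearranged and joined according to their overlap. All names (e.g. stitch_trigrams) and the greedy merging policy are illustrative assumptions.

    # Minimal sketch (assumed, not the paper's code) of overlap-based
    # rearrangement of predicted target-chunk trigrams.

    def stitch_trigrams(trigrams):
        """Greedily join predicted target-chunk trigrams,
        merging on the longest head/tail overlap (0-2 chunks)."""
        if not trigrams:
            return []
        output = list(trigrams[0])
        for tri in trigrams[1:]:
            # Find the largest k such that the last k chunks of the output
            # equal the first k chunks of the next trigram.
            best_k = 0
            for k in (2, 1):
                if output[-k:] == list(tri[:k]):
                    best_k = k
                    break
            output.extend(tri[best_k:])
        return output


    if __name__ == "__main__":
        # Hypothetical predicted target trigrams for adjacent source positions.
        predicted = [
            ("the committee", "has approved", "the report"),
            ("has approved", "the report", "on agriculture"),
            ("the report", "on agriculture", "."),
        ]
        print(" ".join(stitch_trigrams(predicted)))
        # -> "the committee has approved the report on agriculture ."

In this toy example, each predicted trigram shares two chunks with its neighbor, so the sequence collapses into a single target-chunk sequence; how the actual system resolves weaker or conflicting overlaps is described in the paper itself.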
