Towards using web-crawled data for domain adaptation in statistical machine translation

Giagkou, Maria; Papavassiliou, Vassilis; Pecina, Pavel; Prokopidis, Prokopis; Toral, Antonio; Way, Andy

Authors: Maria Giagkou, Vassilis Papavassiliou, Pavel Pecina, Prokopis Prokopidis, Antonio Toral, Andy Way
Publication date: 30 May 2011

Abstract

This paper reports on ongoing work on domain adaptation of statistical machine translation using domain-specific data obtained by domain-focused web crawling. We present a strategy for crawling monolingual and parallel data and for exploiting them in testing, language modelling, and system tuning in a phrase-based machine translation framework. The proposed approach is evaluated on two domains, Natural Environment and Labour Legislation, and two language pairs, English–French and English–Greek.

Available Versions

DCU Online Research Access Service

oai:doras.dcu.ie:16468

Last time updated on 10/07/2013