The TALP & I2R SMT Systems for IWSLT 2008

Aw, A.; Banchs Martínez, Rafael Enrique; Chen, B.; Henríquez Quintana, Carlos Alberto; Hernández, A.; Khalilov, Maxim; Li, H.; Mariño Acebal, José Bernardo; Rodríguez Fonollosa, José Adrián; Ruiz Costa-Jussà, Marta; Zhang, M.

Authors: A. Aw
Rafael Enrique Banchs Martínez
B. Chen
Carlos Alberto Henríquez Quintana
A. Hernández
Maxim Khalilov
H. Li
José Bernardo Mariño Acebal
José Adrián Rodríguez Fonollosa
Marta Ruiz Costa-Jussà
M. Zhang
Publication date: 31 October 2008
Publisher: NICT/ATR

Abstract

This paper describes the statistical machine translation (SMT) systems developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) for our participation in the IWSLT’08 evaluation campaign. We present Ngram-based (TALPtuples) and phrase-based (TALPphrases) SMT systems. The paper explains the architecture of the 2008 systems and outlines the translation schemes we have used, focusing mainly on the new techniques intended to improve speech-to-speech translation quality. The novelties we have introduced are an improved reordering method, a linear combination of translation and reordering models, and a new technique for inserting punctuation marks in a phrase-based SMT system. This year we focus on the Arabic-English, Chinese-Spanish and pivot Chinese-(English)-Spanish translation tasks.

Postprint (published version)
