149 research outputs found

Discourse Structure in Machine Translation Evaluation

Author: Guzmán Francisco
Joty Shafiq
Màrquez Lluís
Nakov Preslav
Publication date: 01/01/2017

In this article, we explore the potential of using sentence-level discourse structure for machine translation evaluation. We first design discourse-aware similarity measures, which use all-subtree kernels to compare discourse parse trees in accordance with the Rhetorical Structure Theory (RST). Then, we show that a simple linear combination with these measures can help improve various existing machine translation evaluation metrics in terms of correlation with human judgments at both the segment level and the system level. This suggests that discourse information is complementary to the information used by many of the existing evaluation metrics, and thus it could be taken into account when developing richer evaluation metrics, such as the WMT-14 winning combined metric DiscoTKparty. We also provide a detailed analysis of the relevance of various discourse elements and relations from the RST parse trees for machine translation evaluation. In particular, we show that: (i) all aspects of the RST tree are relevant, (ii) nuclearity is more useful than relation type, and (iii) the similarity of the translation RST tree to the reference tree is positively correlated with translation quality.

Comment: machine translation, machine translation evaluation, discourse analysis. Computational Linguistics, 201

arXiv.org e-Print Archive

Directory of Open Access Journals

DR-NTU (Digital Repository of NTU)

Permutation forests for modeling word order in machine translation

Author: Stanojević M.
Publication date: 01/01/2017

International Migration, Integration and Social Cohesion online publications

SemEval-2017 Task 1: semantic textual similarity - multilingual and cross-lingual focused evaluation

Author: Agirre E
Cer D
Diab M
Lopez-Gazpio I
Specia L
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2017

Semantic Textual Similarity (STS) measures the meaning similarity of sentences. Applications include machine translation (MT), summarization, generation, question answering (QA), short answer grading, semantic search, dialog and conversational systems. The STS shared task is a venue for assessing the current state-of-the-art. The 2017 task focuses on multilingual and cross-lingual pairs with one sub-track exploring MT quality estimation (MTQE) data. The task obtained strong participation from 31 teams, with 17 participating in all language tracks. We summarize performance and review a selection of well-performing methods. Analysis highlights common errors, providing insight into the limitations of existing models. To support ongoing work on semantic representations, the STS Benchmark is introduced as a new shared training and evaluation set carefully selected from the corpus of English STS shared task data (2012-2017).

arXiv.org e-Print Archive

Crossref

HAL Descartes

Spiral - Imperial College Digital Repository

Hal-Diderot

The QT21/HimL Combined Machine Translation System

Author: Alkhouli Tamer
Allauzen Alexandre
Aufrant Lauriane
Blain Frédéric
Bojar Ondrej
Braune Fabienne
Burlot Franck
Frank Stella
Fraser Alexander
Haddow Barry
Huck Matthias
Knyazeva Elena
Lavergne Thomas
Ney Hermann
Niehues Jan
Peter Jan-Thorsten
Pinnis Marcis
Sennrich Rico
Specia Lucia
Tamchyna Ales
Waibel Alex
Yvon François
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2016

This paper describes the joint submission of the QT21 and HimL projects for the English→Romanian translation task of the ACL 2016 First Conference on Machine Translation (WMT 2016). The submission is a system combination of twelve different statistical machine translation systems provided by the different groups (RWTH Aachen University, LMU Munich, Charles University in Prague, University of Edinburgh, University of Sheffield, Karlsruhe Institute of Technology, LIMSI, University of Amsterdam, Tilde). The systems are combined using RWTH’s system combination approach. The final submission shows an improvement of 1.0 BLEU compared to the best single system on newstest2016.

Edinburgh Research Explorer

Publikationsserver der RWTH Aachen University

Biblio at Institute of Formal and Applied Linguistics

Wolverhampton Intellectual Repository and E-theses

English-to-Czech MT: Large Data and Beyond

Author: Bojar Ondřej
Publication date: 06/12/2018

CU Digital Repository

Machine Translation: Phrase-Based, Rule-Based and Neural Approaches with Linguistic Evaluation

Author: Avramidis Eleftherios
Burchardt Aljoscha
Helcl Jindrich
Macketanz Vivien
Srivastava Ankit
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 26/06/2017

Edinburgh Research Explorer

Linguistic Structure in Statistical Machine Translation

Author: Herrmann Teresa
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2015

This thesis investigates the influence of linguistic structure in statistical machine translation. We develop a word reordering model based on syntactic parse trees and address the issues of pronouns and morphological agreement with a source discriminative word lexicon predicting the translation for individual words using structural features. When used in phrase-based machine translation, the models improve the translation for language pairs with different word order and morphological variation.

KITopen