More Linguistic Annotation for Statistical Machine Translation

Authors: Haddow Barry; Hoang Hieu; Koehn Philipp; Williams Philip
Publication date: 01/01/2010

Edinburgh Research Explorer

Dutch parallel corpus: a multilingual annotated corpus

Authors: Desmet Piet; Macken Lieve; Paulussen Hans; Rura Lidia; Trushkina Julia; Vandeweghe Willy
Publication date: 01/01/2007

Ghent University Academic Bibliography

Description of the Chinese-to-Spanish rule-based machine translation system developed with a hybrid combination of human annotation and statistical techniques

Authors: Centelles Jordi; Ruiz Costa-Jussà Marta
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2015

Two of the most popular Machine Translation (MT) paradigms are rule-based (RBMT) and corpus-based, the latter including statistical systems (SMT). When only a small parallel corpus is available, RBMT becomes particularly attractive. This is the case for the Chinese-Spanish language pair. This article presents the first RBMT system for Chinese to Spanish. We describe a hybrid method for constructing this system that takes advantage of available resources such as parallel corpora, which are used to extract dictionaries and lexical and structural transfer rules. The final system is freely available online and open source. Although performance lags behind standard SMT systems on an in-domain test set, the results show that the RBMT system's coverage is competitive and that it outperforms the SMT system on an out-of-domain test set. This RBMT system is available to the general public, can be further enhanced, and opens up the possibility of creating future hybrid MT systems.

Peer Reviewed. Postprint (author's final draft)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Fine-grained human evaluation of neural versus phrase-based machine translation

Authors: Klubička Filip; Sánchez-Cartagena Víctor M.; Toral Antonio
Publication venue: Walter de Gruyter GmbH
Publication date: 01/01/2017

We compare three approaches to statistical machine translation (pure phrase-based, factored phrase-based and neural) by performing a fine-grained manual evaluation via error annotation of the systems' outputs. The error types in our annotation are compliant with the multidimensional quality metrics (MQM), and the annotation is performed by two annotators. Inter-annotator agreement is high for such a task, and results show that the best performing system (neural) reduces the errors produced by the worst system (phrase-based) by 54%.

Comment: 12 pages, 2 figures, The Prague Bulletin of Mathematical Linguistics

arXiv.org e-Print Archive

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Bootstrapping machine translation for the language pair English-Kiswahili

Authors: De Pauw G; de Schryver Gilles-Maurice; Wagacha P
Publication date: 01/01/2008

Ghent University Academic Bibliography

Lost in translation: the problems of using mainstream MT evaluation metrics for sign language translation

Authors: Morrissey Sara; Way Andy
Publication date: 01/01/2006

In this paper we consider the problems of applying corpus-based techniques to minority languages that are neither politically recognised nor have a formally accepted writing system, namely sign languages. We discuss the adoption of an annotated form of sign language data as a suitable corpus for the development of a data-driven machine translation (MT) system, and deal with issues that arise from its use. Useful software tools that facilitate easy annotation of video data are also discussed. Furthermore, we address the problems of using traditional MT evaluation metrics for sign language translation. Based on the candidate translations produced by our example-based machine translation system, we discuss why standard metrics fall short of providing an accurate evaluation and suggest more suitable evaluation methods.

CiteSeerX

Irish Universities

DCU Online Research Access Service