Search CORE

32,487 research outputs found

Using F-structures in machine translation evaluation

Author: Graham Yvette
Owczarzak Karolina
van Genabith Josef
Way Andy
Publication venue: CSLI Publications
Publication date: 01/01/2007
Field of study

Despite a growing interest in automatic evaluation methods for Machine Translation (MT) quality, most existing automatic metrics are still limited to surface comparison of translation and reference strings. In this paper we show how Lexical-Functional Grammar (LFG) labelled dependencies obtained from an automatic parse can be used to assess the quality of MT on a deeper linguistic level, giving as a result higher correlations with human judgements

DCU Online Research Access Service

A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena

Author: Bisazza Arianna
Federico Marcello
Publication venue: 'MIT Press - Journals'
Publication date: 14/03/2016
Field of study

Word reordering is one of the most difficult aspects of statistical machine translation (SMT), and an important factor of its quality and efficiency. Despite the vast amount of research published to date, the interest of the community in this problem has not decreased, and no single method appears to be strongly dominant across language pairs. Instead, the choice of the optimal approach for a new translation task still seems to be mostly driven by empirical trials. To orientate the reader in this vast and complex research area, we present a comprehensive survey of word reordering viewed as a statistical modeling challenge and as a natural language phenomenon. The survey describes in detail how word reordering is modeled within different string-based and tree-based SMT frameworks and as a stand-alone task, including systematic overviews of the literature in advanced reordering modeling. We then question why some approaches are more successful than others in different language pairs. We argue that, besides measuring the amount of reordering, it is important to understand which kinds of reordering occur in a given language pair. To this end, we conduct a qualitative analysis of word reordering phenomena in a diverse sample of language pairs, based on a large collection of linguistic knowledge. Empirical results in the SMT literature are shown to support the hypothesis that a few linguistic facts can be very useful to anticipate the reordering characteristics of a language pair and to select the SMT framework that best suits them.Comment: 44 pages, to appear in Computational Linguistic

arXiv.org e-Print Archive

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

UvA-DARE

International Migration, Integration and Social Cohesion online publications

A syntactic skeleton for statistical machine translation

Author: Groves Declan
Mellebeek Bart
Owczarzak Karolina
van Genabith Josef
Way Andy
Publication venue
Publication date: 01/01/2006
Field of study

We present a method for improving statistical machine translation performance by using linguistically motivated syntactic information. Our algorithm recursively decomposes source language sentences into syntactically simpler and shorter chunks, and recomposes their translation to form target language sentences. This improves both the word order and lexical selection of the translation. We report statistically significant relative improvementsof 3.3% BLEU score in an experiment (English!Spanish) carried out on an 800-sentence test set extracted from the Europarl corpus

CiteSeerX

Irish Universities

DCU Online Research Access Service

TransBooster: boosting the performance of wide-coverage machine translation systems

Author: Khasin Anna
Mellebeek Bart
van Genabith Josef
Way Andy
Publication venue
Publication date: 01/01/2005
Field of study

We propose the design, implementation and evaluation of a novel and modular approach to boost the translation performance of existing, wide-coverage, freely available machine translation systems based on reliable and fast automatic decomposition of the translation input and corresponding composition of translation output. We provide details of our method, and experimental results compared to the MT systems SYSTRAN and Logomedia. While many avenues for further experimentation remain, to date we fall just behind the baseline systems on the full 800-sentence testset, but in certain cases our method causes the translation quality obtained via the MT systems to improve

DCU Online Research Access Service

Disambiguation strategies for data-oriented translation

Author: Hearne Mary
Way Andy
Publication venue
Publication date: 01/01/2006
Field of study

The Data-Oriented Translation (DOT) model { originally proposed in (Poutsma, 1998, 2003) and based on Data-Oriented Parsing (DOP) (e.g. (Bod, Scha, & Sima'an, 2003)) { is best described as a hybrid model of translation as it combines examples, linguistic information and a statistical translation model. Although theoretically interesting, it inherits the computational complexity associated with DOP. In this paper, we focus on one computational challenge for this model: efficiently selecting the `best' translation to output. We present four different disambiguation strategies in terms of how they are implemented in our DOT system, along with experiments which investigate how they compare in terms of accuracy and efficiency

CiteSeerX

Irish Universities

DCU Online Research Access Service

Software Usability:A Comparison Between Two Tree-Structured Data Transformation Languages

Author: Sas Corina
Schmidt N.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2004
Field of study

This paper presents the results of a software usability study, involving both subjective and objective evaluation. It compares a popular XML data transformation language (XSLT) and a general purpose rule-based tree manipulation language which addresses some of the XML and XSLT limitations. The benefits of the evaluation study are discussed

Lancaster E-Prints