11,590 research outputs found
Learning labelled dependencies in machine translation evaluation
Recently novel MT evaluation metrics have been presented which go beyond pure string matching, and which correlate
better than other existing metrics with human judgements. Other research in this area has presented machine learning
methods which learn directly from human judgements. In this paper, we present a novel combination of dependency- and
machine learning-based approaches to automatic MT evaluation, and demonstrate greater correlations with human judgement than the existing state-of-the-art methods.
In addition, we examine the extent to which our novel method can be generalised across different tasks and domains
Dependency relations as source context in phrase-based SMT
The Phrase-Based Statistical Machine Translation (PB-SMT) model has recently begun to include source context modeling, under the assumption that the proper lexical
choice of an ambiguous word can be determined from the context in which it appears. Various types of lexical and syntactic features such as words, parts-of-speech, and
supertags have been explored as effective source context in SMT. In this paper, we show that position-independent syntactic dependency relations of the head of a source phrase can be modeled as useful source context to improve target phrase selection and thereby improve overall performance of PB-SMT. On a Dutch—English translation task, by combining dependency relations and syntactic contextual features (part-of-speech), we achieved a 1.0 BLEU (Papineni et al., 2002) point improvement (3.1% relative) over the baseline
An open source rule induction tool for transfer-based SMT
In this paper we describe an open source tool for automatic induction of transfer rules. Transfer rule induction is carried out on pairs of dependency structures and their node alignment to produce all rules consistent with the node alignment. We describe an efficient algorithm for rule induction and give a detailed description of how to use the tool
Target-Side Context for Discriminative Models in Statistical Machine Translation
Discriminative translation models utilizing source context have been shown to
help statistical machine translation performance. We propose a novel extension
of this work using target context information. Surprisingly, we show that this
model can be efficiently integrated directly in the decoding process. Our
approach scales to large training data sizes and results in consistent
improvements in translation quality on four language pairs. We also provide an
analysis comparing the strengths of the baseline source-context model with our
extended source-context and target-context model and we show that our extension
allows us to better capture morphological coherence. Our work is freely
available as part of Moses.Comment: Accepted as a long paper for ACL 201
A Survey of Paraphrasing and Textual Entailment Methods
Paraphrasing methods recognize, generate, or extract phrases, sentences, or
longer natural language expressions that convey almost the same information.
Textual entailment methods, on the other hand, recognize, generate, or extract
pairs of natural language expressions, such that a human who reads (and trusts)
the first element of a pair would most likely infer that the other element is
also true. Paraphrasing can be seen as bidirectional textual entailment and
methods from the two areas are often similar. Both kinds of methods are useful,
at least in principle, in a wide range of natural language processing
applications, including question answering, summarization, text generation, and
machine translation. We summarize key ideas from the two areas by considering
in turn recognition, generation, and extraction methods, also pointing to
prominent articles and resources.Comment: Technical Report, Natural Language Processing Group, Department of
Informatics, Athens University of Economics and Business, Greece, 201
MATREX: the DCU MT system for WMT 2010
This paper describes the DCU machine translation system in the evaluation campaign of the Joint Fifth Workshop on Statistical Machine Translation and Metrics in ACL-2010. We describe the modular design of our multi-engine machine translation (MT) system with particular focus on the components used in this participation.
We participated in the English–Spanish and English–Czech translation tasks, in which we employed our multiengine
architecture to translate. We also participated in the system combination task which was carried out by the MBR
decoder and confusion network decoder
- …