Top-Rank Enhanced Listwise Optimization for Statistical Machine Translation
Pairwise ranking methods are the basis of many widely used discriminative
training approaches for structure prediction problems in natural language
processing (NLP). Decomposing the problem of ranking hypotheses into pairwise
comparisons enables simple and efficient solutions. However, neglecting the
global ordering of the hypothesis list may hinder learning. We propose a
listwise learning framework for structure prediction problems such as machine
translation. Our framework directly models the entire translation list's
ordering to learn parameters which may better fit the given listwise samples.
Furthermore, we propose top-rank enhanced loss functions, which are more
sensitive to ranking errors at higher positions. Experiments on a large-scale
Chinese-English translation task show that both our listwise learning framework
and top-rank enhanced listwise losses lead to significant improvements in
translation quality.
Comment: Accepted to CoNLL 201
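The listwise idea above can be illustrated with a position-weighted ListMLE-style loss. This is a minimal sketch, not the paper's exact formulation: the `weight` function (a DCG-style 1/log2 decay, an assumption here) makes mistakes at the top of the ranking cost more than mistakes lower down.

```python
import math

def top_rank_listmle(scores, gold_order, weight=lambda i: 1.0 / math.log2(i + 2)):
    """Position-weighted ListMLE loss (a sketch, not the paper's exact loss).

    scores: model score for each hypothesis in the n-best list.
    gold_order: hypothesis indices from best to worst (e.g. by sentence BLEU).
    weight: down-weights errors at lower ranks; top positions matter most.
    """
    loss = 0.0
    for i, idx in enumerate(gold_order):
        # negative log-probability that hypothesis `idx` outranks all remaining ones
        denom = sum(math.exp(scores[j]) for j in gold_order[i:])
        loss -= weight(i) * (scores[idx] - math.log(denom))
    return loss
```

Scores that agree with the gold ordering yield a smaller loss than scores that invert it, which is the property the learner exploits.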
Tuning syntactically enhanced word alignment for statistical machine translation
We introduce a syntactically enhanced word alignment model that is more flexible than state-of-the-art generative word
alignment models and can be tuned according to different end tasks. First, this model combines the advantages of
both unsupervised and supervised word alignment approaches by obtaining anchor alignments from unsupervised generative
models and seeding the anchor alignments into a supervised discriminative model. Second, this model offers the flexibility of tuning the alignment according to different
optimisation criteria. Our experiments show that using our word alignment in a Phrase-Based Statistical Machine Translation system yields a 5.38% relative increase
on the IWSLT 2007 task in terms of BLEU score.
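The anchor-seeding step can be sketched as follows. A common high-precision heuristic is to fix the links that both one-directional unsupervised models agree on and leave the rest to the discriminative model; the exact splitting procedure here is an assumption, not necessarily the paper's.

```python
def seed_anchors(fwd_links, rev_links):
    """Split alignment links into fixed anchors and open candidates.

    fwd_links, rev_links: sets of (src_idx, tgt_idx) links from the two
    one-directional unsupervised models. Links in the intersection are
    trusted as anchors; the remaining union is left for a supervised
    discriminative classifier to accept or reject.
    """
    anchors = fwd_links & rev_links                  # high-precision intersection
    candidates = (fwd_links | rev_links) - anchors   # decided by the classifier
    return anchors, candidates
```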
Robust Tuning Datasets for Statistical Machine Translation
We explore the idea of automatically crafting a tuning dataset for
Statistical Machine Translation (SMT) that makes the hyper-parameters of the
SMT system more robust with respect to some specific deficiencies of the
parameter tuning algorithms. This is an under-explored research direction,
which can allow better parameter tuning. In this paper, we achieve this goal by
selecting a subset of the available sentence pairs, which are more suitable for
specific combinations of optimizers, objective functions, and evaluation
measures. We demonstrate the potential of the idea with the pairwise ranking
optimization (PRO) optimizer, which is known to yield too short translations.
We show that this problem can be alleviated by tuning on a subset of
the development set, selected based on sentence length. In particular, using
the longest 50% of the tuning sentences, we achieve two-fold tuning speedup,
and improvements in BLEU score that rival those of alternatives, which fix
BLEU+1's smoothing instead.
Comment: RANLP-201
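The length-based selection described above is simple enough to sketch directly. This is an illustrative implementation, assuming whitespace tokenization and `(source, reference)` pairs; the function name is not from the paper. Since PRO is biased toward overly short translations, keeping the longest half of the tuning set counteracts that bias while also halving tuning time.

```python
def robust_pro_subset(sent_pairs, keep_fraction=0.5):
    """Keep the longest `keep_fraction` of (source, reference) tuning pairs.

    sent_pairs: list of (source_sentence, reference_sentence) strings.
    Returns the pairs with the most source tokens, longest first.
    """
    ranked = sorted(sent_pairs, key=lambda p: len(p[0].split()), reverse=True)
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep]
```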
Adapting Sequence Models for Sentence Correction
In a controlled experiment of sequence-to-sequence approaches for the task of
sentence correction, we find that character-based models are generally more
effective than word-based models and models that encode subword information via
convolutions, and that modeling the output data as a series of diffs improves
effectiveness over standard approaches. Our strongest sequence-to-sequence
model improves over our strongest phrase-based statistical machine translation
model, with access to the same data, by 6 M2 (0.5 GLEU) points. Additionally,
in the data environment of the standard CoNLL-2014 setup, we demonstrate that
modeling (and tuning against) diffs yields similar or better M2 scores with
simpler models and/or significantly less data than previous
sequence-to-sequence approaches.
Comment: EMNLP 201
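The diff-as-output idea can be illustrated with the standard library's sequence matcher. This is a sketch under assumed conventions (whitespace tokens, `(op, old, new)` segments); the paper's actual tag scheme may differ. The point is that a correction is mostly "copy", so a model that emits diffs only has to learn the edits.

```python
import difflib

def sentence_diffs(source, target):
    """Represent a correction as diff segments instead of a full rewrite.

    Returns (op, old_span, new_span) tuples: unchanged tokens appear as
    'equal' segments, while edits become explicit 'replace'/'insert'/
    'delete' segments that a post-processor could apply to the source.
    """
    src, tgt = source.split(), target.split()
    segments = []
    for op, i1, i2, j1, j2 in difflib.SequenceMatcher(a=src, b=tgt).get_opcodes():
        segments.append((op, " ".join(src[i1:i2]), " ".join(tgt[j1:j2])))
    return segments
```

For "he go to school" corrected to "he goes to school", only the single token pair `go` → `goes` surfaces as a non-copy segment.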
Accuracy-based scoring for DOT: towards direct error minimization for data-oriented translation
In this work we present a novel technique to rescore fragments in the Data-Oriented Translation model based on their contribution to translation accuracy. We describe
three new rescoring methods, and present the initial results of a pilot experiment on a small subset of the Europarl corpus. This work is a proof-of-concept, and
is the first step in directly optimizing translation
decisions solely on the hypothesized accuracy of potential translations resulting from those decisions.
Target-Side Context for Discriminative Models in Statistical Machine Translation
Discriminative translation models utilizing source context have been shown to
help statistical machine translation performance. We propose a novel extension
of this work using target context information. Surprisingly, we show that this
model can be efficiently integrated directly into the decoding process. Our
approach scales to large training data sizes and results in consistent
improvements in translation quality on four language pairs. We also provide an
analysis comparing the strengths of the baseline source-context model with our
extended source-context and target-context model and we show that our extension
allows us to better capture morphological coherence. Our work is freely
available as part of Moses.
Comment: Accepted as a long paper for ACL 201