Neural Automatic Post-Editing Using Prior Alignment and Reranking
We present a second-stage machine translation (MT) system based on a neural machine translation (NMT) approach to automatic post-editing (APE) that improves the translation quality provided by a first-stage MT system. Our APE system (APE_Sym) is an extended version of an attention-based NMT model with bilingual symmetry employing bidirectional models, mt → pe and pe → mt. APE translations produced by our system show statistically significant improvements over the first-stage MT, phrase-based APE and the best reported score on the WMT 2016 APE dataset by a previous neural APE system. Re-ranking (APE_Rerank) of the n-best translations from the phrase-based APE and APE_Sym systems provides further substantial improvements over the symmetric neural APE model. Human evaluation confirms that the PE translations generated by APE_Rerank improve on the previous best neural APE system at WMT 2016.
Santanu Pal is supported by the People Programme (Marie Curie Actions) of the European Union’s Framework Programme (FP7/2007-2013) under REA grant agreement no 317471. Sudip Kumar Naskar is supported by Media Lab Asia, MeitY, Government of India, under the Young Faculty Research Fellowship of the Visvesvaraya PhD Scheme for Electronics & IT. Qun Liu and Josef van Genabith are supported by funding from the European Union Horizon 2020 research and innovation programme under grant agreement no 645452 (QT21).
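As a rough illustration of the n-best re-ranking idea described in the abstract above, the sketch below pools candidates from two systems and picks the one with the highest weighted feature score. The feature names, weights, and candidate strings are hypothetical; the abstract does not say which features or learner the authors used, so this is a generic linear re-ranker, not their method.

```python
from typing import Dict, List, Tuple

# A candidate is a translation string plus a dictionary of feature values.
Candidate = Tuple[str, Dict[str, float]]


def rerank(candidates: List[Candidate],
           weights: Dict[str, float]) -> List[Tuple[float, str]]:
    """Score each candidate as a weighted feature sum and sort best-first."""
    scored = []
    for text, feats in candidates:
        # Features without a weight contribute nothing to the score.
        score = sum(weights.get(name, 0.0) * value
                    for name, value in feats.items())
        scored.append((score, text))
    # Highest score first; the sort is stable, so ties keep pool order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Hypothetical pooled n-best list from a phrase-based and a neural APE system.
pool = [
    ("das ist ein Test", {"model_logprob": -1.2, "lm_logprob": -3.0}),
    ("dies ist ein Test", {"model_logprob": -1.5, "lm_logprob": -2.0}),
    ("das ist eine Test", {"model_logprob": -1.0, "lm_logprob": -4.5}),
]
# Hypothetical weights; in practice these would be tuned on held-out data.
weights = {"model_logprob": 1.0, "lm_logprob": 0.5}

best_score, best_text = rerank(pool, weights)[0]
print(best_text)
```

In this toy pool the candidate with the best model score alone is not the one selected, which is the point of combining several signals at re-ranking time.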
LIG-CRIStAL System for the WMT17 Automatic Post-Editing Task
This paper presents the LIG-CRIStAL submission to the shared Automatic Post-Editing
task of WMT 2017. We propose two neural post-editing models: a
monosource model with a task-specific attention mechanism, which performs
particularly well in a low-resource scenario; and a chained architecture which
makes use of the source sentence to provide extra context. This latter
architecture manages to slightly improve our results when more training data is
available. We present and discuss our results on two datasets (en-de and de-en)
that are made available for the task.
Comment: keywords: neural post-edition, attention model
An Exploration of Neural Sequence-to-Sequence Architectures for Automatic Post-Editing
In this work, we explore multiple neural architectures adapted for the task
of automatic post-editing of machine translation output. We focus on neural
end-to-end models that combine both inputs mt (raw MT output) and src
(source language input) in a single neural architecture, modeling {mt, src} → pe directly. Apart from that, we investigate the influence of
hard-attention models which seem to be well-suited for monolingual tasks, as
well as combinations of both ideas. We report results on data sets provided
during the WMT-2016 shared task on automatic post-editing and can demonstrate
that dual-attention models that incorporate all available data in the APE
scenario in a single model improve on the best shared task system and on all
other published results after the shared task. Dual-attention models that are
combined with hard attention remain competitive despite applying fewer changes
to the input.
Comment: Accepted for presentation at IJCNLP 201
Preference Learning for Machine Translation
Automatic translation of natural language is still (as of 2017) a long-standing but unmet promise. While advancing at a fast rate, the underlying methods are still far from being able to reliably capture the syntax or semantics of arbitrary natural-language utterances, let alone transfer the encoded meaning into a second language. However, it is possible to build useful translating machines when the target domain is well known and the machine is able to learn and adapt efficiently and promptly from new inputs. This is possible thanks to efficient and effective machine learning methods which can be applied to automatic translation.
In this work we present and evaluate methods for three distinct scenarios:
a) We develop algorithms that can learn from very large amounts of data by exploiting pairwise preferences defined over competing translations, which can be used to make a machine translation system robust to arbitrary texts from varied sources, but also enable it to learn effectively to adapt to new domains of data;
b) We describe a method that is able to efficiently learn external models which adhere to fine-grained preferences that are extracted from a constricted selection of translated material, e.g. for adapting to users or groups of users in a computer-aided translation scenario;
c) We develop methods for two machine translation paradigms, neural and traditional statistical machine translation, that directly adapt to user-defined preferences in an interactive post-editing scenario, learning precisely adapted machine translation systems.
In all of these settings, we show that machine translation can be made significantly more useful by careful optimization via preference learning.
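To make the notion of learning from pairwise preferences over competing translations more concrete, here is a minimal sketch of a perceptron-style update on feature-vector pairs. It assumes each translation is represented by a small feature vector and that each training pair is ordered (preferred, dispreferred); the feature values, learning rate, and function names are illustrative assumptions, not the algorithms developed in the thesis.

```python
from typing import List, Sequence, Tuple

# A training pair: (features of preferred, features of dispreferred).
Pair = Tuple[Sequence[float], Sequence[float]]


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(w, x))


def train_pairwise(pairs: List[Pair], dim: int, epochs: int = 10,
                   lr: float = 0.1, margin: float = 0.0) -> List[float]:
    """Learn weights so preferred translations score above dispreferred ones."""
    w = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            # Update only when the pair is misranked or within the margin.
            if dot(w, diff) <= margin:
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w


# Hypothetical 2-dimensional features for two preference pairs.
pairs = [
    ([1.0, 0.2], [0.3, 0.9]),
    ([0.8, 0.1], [0.4, 0.7]),
]
w = train_pairwise(pairs, dim=2)

# After training, every preferred translation should outscore its rival.
ranked_correctly = all(dot(w, b) > dot(w, c) for b, c in pairs)
print(ranked_correctly)
```

The update touches the weights only for pairs the current model ranks wrongly, so the cost per example stays small, which is one reason pairwise formulations scale to large amounts of data.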
Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation
Many recent advances in natural language generation have been fueled by
training large language models on internet-scale data. However, this paradigm
can lead to models that generate toxic, inaccurate, and unhelpful content, and
automatic evaluation metrics often fail to identify these behaviors. As models
become more capable, human feedback is an invaluable signal for evaluating and
improving models. This survey aims to provide an overview of the recent
research that has leveraged human feedback to improve natural language
generation. First, we introduce an encompassing formalization of feedback, and
identify and organize existing research into a taxonomy following this
formalization. Next, we discuss how feedback can be described by its format and
objective, and cover the two approaches proposed to use feedback (either for
training or decoding): directly using the feedback or training feedback models.
We also discuss existing datasets for human-feedback data collection, and
concerns surrounding feedback collection. Finally, we provide an overview of
the nascent field of AI feedback, which exploits large language models to make
judgments based on a set of principles and minimize the need for human
intervention.
Comment: Work in Progress
LENS: A Learnable Evaluation Metric for Text Simplification
Training learnable metrics using modern language models has recently emerged
as a promising method for the automatic evaluation of machine translation.
However, existing human evaluation datasets for text simplification have
limited annotations that are based on unitary or outdated models, making them
unsuitable for this approach. To address these issues, we introduce the
SimpEval corpus that contains: SimpEval_past, comprising 12K human ratings on
2.4K simplifications of 24 past systems, and SimpEval_2022, a challenging
simplification benchmark consisting of over 1K human ratings of 360
simplifications including GPT-3.5 generated text. Training on SimpEval, we
present LENS, a Learnable Evaluation Metric for Text Simplification. Extensive
empirical results show that LENS correlates much better with human judgment
than existing metrics, paving the way for future progress in the evaluation of
text simplification. We also introduce Rank and Rate, a human evaluation
framework that rates simplifications from several models in a list-wise manner
using an interactive interface, which ensures both consistency and accuracy in
the evaluation process and is used to create the SimpEval datasets.
Comment: Accepted at ACL 202