A Large-Scale Test Set for the Evaluation of Context-Aware Pronoun Translation in Neural Machine Translation
The translation of pronouns presents a special challenge to machine
translation to this day, since it often requires context outside the current
sentence. Recent work on models that have access to information across sentence
boundaries has seen only moderate improvements in terms of automatic evaluation
metrics such as BLEU. However, metrics that quantify the overall translation
quality are ill-equipped to measure gains from additional context. We argue
that a different kind of evaluation is needed to assess how well models
translate inter-sentential phenomena such as pronouns. This paper therefore
presents a test suite of contrastive translations focused specifically on the
translation of pronouns. Furthermore, we perform experiments with several
context-aware models. We show that, while gains in BLEU are moderate for those
systems, they outperform baselines by a large margin in terms of accuracy on
our contrastive test set. Our experiments also show the effectiveness of
parameter tying for multi-encoder architectures. Comment: Accepted at WMT 2018.
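For readers unfamiliar with contrastive evaluation, the following is a minimal sketch of how a test suite of this kind is typically scored; the `score` function is a hypothetical stand-in for any model that returns the log-probability of a target sentence given the source and its context, and the example instance is illustrative, not taken from the actual test set.

```python
def contrastive_accuracy(examples, score):
    """Each example pairs one correct translation with distractors that
    differ only in the pronoun. The model 'passes' an example if it
    assigns the correct variant the highest log-probability."""
    passed = 0
    for ex in examples:
        gold = score(ex["src"], ex["ctx"], ex["correct"])
        if all(gold > score(ex["src"], ex["ctx"], bad)
               for bad in ex["contrastive"]):
            passed += 1
    return passed / len(examples)

# Illustrative English->German instance: the context sentence fixes the
# grammatical gender of the antecedent, so only "Es" is correct here.
examples = [{
    "ctx": "Das Auto ist neu.",
    "src": "It is fast.",
    "correct": "Es ist schnell.",
    "contrastive": ["Sie ist schnell.", "Er ist schnell."],
}]
```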
Selective Attention for Context-aware Neural Machine Translation
Despite the progress made in sentence-level NMT, current systems still fall
short of achieving fluent, good-quality translation for a full document. Recent
works in context-aware NMT consider only a few previous sentences as context
and may not scale to entire documents. To this end, we propose a novel and
scalable top-down approach to hierarchical attention for context-aware NMT
which uses sparse attention to selectively focus on relevant sentences in the
document context and then attends to key words in those sentences. We also
propose single-level attention approaches based on sentence or word-level
information in the context. The document-level context representation, produced
from these attention modules, is integrated into the encoder or decoder of the
Transformer model depending on whether we use monolingual or bilingual context.
Our experiments and evaluation on English-German datasets in different document
MT settings show that our selective attention approach not only significantly
outperforms context-agnostic baselines but also surpasses context-aware
baselines in most cases. Comment: Accepted at NAACL-HLT 2019.
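As a rough illustration of the top-down idea (not the paper's exact formulation), the numpy sketch below scores context sentences against a current-sentence query, keeps only the top-k sentences as a simple stand-in for sparse attention, and then attends to the words within the surviving sentences to build a document context vector.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def hierarchical_context(query, sent_vecs, word_vecs, k=2):
    """query: (d,) representation of the current sentence.
    sent_vecs: (S, d), one vector per context sentence.
    word_vecs: list of (n_i, d) word vectors for each sentence.
    Top-down: pick the k most relevant sentences, then attend to the
    words inside them; all other sentences receive zero weight."""
    sent_scores = sent_vecs @ query                 # (S,) relevance scores
    keep = np.argsort(sent_scores)[-k:]             # top-k sentence indices
    sent_w = softmax(sent_scores[keep])             # renormalize survivors
    ctx = np.zeros_like(query)
    for w, i in zip(sent_w, keep):
        word_scores = word_vecs[i] @ query
        word_w = softmax(word_scores)               # word-level attention
        ctx += w * (word_w @ word_vecs[i])          # weighted word summary
    return ctx                                      # document context vector
```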
On context span needed for machine translation evaluation
Despite increasing efforts to improve the evaluation of machine translation (MT) by going beyond the sentence level to the document level, the definition of what exactly constitutes a "document level" is still not clear. This work deals with the context span necessary for a more reliable MT evaluation. We report results from a series of surveys involving three domains and 18 target languages, designed to identify the necessary context span as well as issues related to it. Our findings indicate that, although some issues and spans depend strongly on the domain and the target language, a number of common patterns can be observed, so that general guidelines for context-aware MT evaluation can be drawn.
Modeling contextual information in neural machine translation
Machine translation has provided impressive translation quality for many language pairs. The improvements over the past few years are largely due to the introduction of neural networks to the field, resulting in modern sequence-to-sequence neural machine translation (NMT) models. NMT is at the core of many large-scale industrial tools for automatic translation, such as Google Translate, Microsoft Translator, Amazon Translate and many others.
Current NMT models work at the sentence level, meaning they are used to translate individual sentences. However, for most practical use cases, a user is interested in translating a document. In these cases, an MT tool splits a document into individual sentences and translates them independently. As a result, any dependencies between the sentences are ignored. This is likely to result in an incoherent document translation, mainly because of inconsistent translations of ambiguous source words or wrong translations of anaphoric pronouns. For example, it is undesirable to translate "bank" as a financial institution in one sentence and as a river bank later in the same document. Furthermore, the translation of, e.g., the English third-person pronoun "it" into German depends on the grammatical gender of the German translation of its English antecedent.
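To make this failure mode concrete, here is a small sketch contrasting independent sentence translation with a simple concatenation-based context-aware setup; `translate` and the `<brk>` separator token are illustrative assumptions, not the interface of any specific system.

```python
# Hypothetical `translate(text)` stands in for any sentence-level NMT system.

def translate_document_independently(sentences, translate):
    # Each sentence is translated in isolation: "it" or "bank" in
    # sentence i cannot see the antecedent in sentence i-1.
    return [translate(s) for s in sentences]

def translate_document_with_context(sentences, translate, window=1):
    # A common workaround: prepend the previous source sentence(s),
    # separated by a break token, so the model can resolve pronouns
    # and disambiguate word senses across the sentence boundary.
    out = []
    for i, s in enumerate(sentences):
        ctx = sentences[max(0, i - window):i]
        out.append(translate(" <brk> ".join(ctx + [s])))
    return out
```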
NMT has impressive modeling capabilities, but it nevertheless fails to capture discourse-level phenomena, since doing so requires access to contextual information beyond the current sentence. In this work, we study discourse-level phenomena in context-aware NMT. To facilitate the particular studies of interest, we propose several models capable of incorporating contextual information into standard sentence-level NMT models. We focus on several discourse phenomena, namely coreference (anaphora) resolution, coherence and cohesion. We discuss these phenomena in terms of how well they can be modeled by context-aware NMT, how we can improve upon the current state of the art, and the optimal granularity at which they should be modeled. We further investigate domain as a factor in context-aware NMT. Finally, we investigate existing challenge sets for anaphora resolution evaluation and provide a robust alternative.
We make the following contributions:
i) We study the importance of coreference (anaphora) resolution and coherence for context-aware NMT by making use of oracle information specific to these phenomena.
ii) We propose a method for improving performance on anaphora resolution based on curriculum learning which is inspired by the way humans organize learning.
iii) We investigate the use of contextual information for better handling of domain information, in particular in the case of modeling multiple domains at once and when applied to zero-resource domains.
iv) We present several context-aware models to enable us to examine the specific phenomena of interest we already mentioned.
v) We study the optimal way of modeling local and global context and present a model theoretically capable of using very large document context.
vi) We study the robustness of challenge sets for the evaluation of anaphora resolution in MT by means of adversarial attacks, and provide a template test set that robustly evaluates specific steps of an idealized coreference resolution pipeline for MT.
Document-Level Language Models for Machine Translation
Despite the known limitations, most machine translation systems today still
operate on the sentence level. One reason for this is that most parallel
training data is only aligned at the sentence level, without document-level
meta-information available. In this work, we set out to build context-aware
translation systems utilizing document-level monolingual data instead. This can
be achieved by combining any existing sentence-level translation model with a
document-level language model. We improve existing approaches by leveraging
recent advancements in model combination. Additionally, we propose novel
weighting techniques that make the system combination more flexible and
significantly reduce computational overhead. In a comprehensive evaluation on
four diverse translation tasks, we show that our extensions improve
document-targeted scores substantially and are also computationally more
efficient. However, we also find that in most scenarios, back-translation gives
even better results, at the cost of having to re-train the translation system.
Finally, we explore language model fusion in the light of recent advancements
in large language models. Our findings suggest that there might be strong
potential in utilizing large language models via model combination. Comment: Accepted at WMT 2023.
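The combination described above can be read as a log-linear interpolation of the two models, scored per target token. The sketch below assumes that interpretation; the weight `lam` and the two log-probability arrays are illustrative, not the paper's exact weighting scheme.

```python
import numpy as np

def fused_next_token_logprobs(mt_logprobs, lm_logprobs, lam=0.3):
    """Log-linear combination of a sentence-level translation model with
    a document-level language model, per target token:

        score(y_t) = log p_MT(y_t | x, y_<t) + lam * log p_LM(y_t | doc, y_<t)

    `mt_logprobs` and `lm_logprobs` are (V,) arrays over the vocabulary.
    The LM conditions on the full target-side document so far, which is
    how the sentence-level MT system gains cross-sentence context.
    Renormalizing makes the result a proper distribution again."""
    combined = mt_logprobs + lam * lm_logprobs
    return combined - np.logaddexp.reduce(combined)  # log-softmax
```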
Improving Long Context Document-Level Machine Translation
Document-level context for neural machine translation (NMT) is crucial for
improving translation consistency and cohesion, the translation of ambiguous
inputs, and several other linguistic phenomena. Many works have been
published on the topic of document-level NMT, but most restrict the system to
only local context, typically including just the one or two preceding sentences
as additional information. This might be enough to resolve some ambiguous
inputs, but it is probably not sufficient to capture some document-level
information like the topic or style of a conversation. When increasing the
context size beyond just the local context, there are two challenges: (i) the
memory usage increases quadratically, and (ii) the translation performance
starts to degrade. We argue that the widely-used attention mechanism is
responsible for both issues. Therefore, we propose a constrained attention
variant that focuses the attention on the most relevant parts of the sequence,
while simultaneously reducing the memory consumption. For evaluation, we
utilize targeted test sets in combination with novel evaluation techniques to
analyze the translations with regard to specific discourse-related phenomena. We
find that our approach is a good compromise between sentence-level NMT and
attending to the full context, especially in low-resource scenarios. Comment: Accepted at CODI 2023 (ACL workshop).
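One plausible reading of such a constrained attention variant, shown here only as a sketch rather than the paper's exact method, is top-k attention: each query keeps only its k highest-scoring keys, so the softmax and the memory it needs cover a fixed-size slice of the document instead of the full sequence.

```python
import numpy as np

def topk_attention(q, K, V, k=64):
    """q: (d,) query; K, V: (n, d) keys/values over a long document.
    Standard attention materializes all n scores per query; restricting
    each query to its k best keys caps that at k, trading a little
    coverage for much lower memory on long sequences."""
    k = min(k, K.shape[0])                    # guard for short contexts
    scores = K @ q / np.sqrt(q.shape[0])      # (n,) scaled dot products
    idx = np.argpartition(scores, -k)[-k:]    # indices of the k best keys
    s = scores[idx]
    w = np.exp(s - s.max())
    w /= w.sum()                              # softmax over surviving keys
    return w @ V[idx]                         # (d,) context vector
```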