Non-Autoregressive Neural Machine Translation with Enhanced Decoder Input
Non-autoregressive translation (NAT) models, which remove the dependence on
previous target tokens from the inputs of the decoder, achieve a significant
inference speedup but at the cost of inferior accuracy compared to
autoregressive translation (AT) models. Previous work shows that the quality of
the decoder inputs is important and strongly affects model accuracy.
In this paper, we propose two methods to enhance the decoder inputs so as to
improve NAT models. The first one directly leverages a phrase table generated
by conventional SMT approaches to translate source tokens to target tokens,
which are then fed into the decoder as inputs. The second one transforms
source-side word embeddings into target-side word embeddings through
sentence-level alignment and word-level adversarial learning, and then feeds the
transformed word embeddings into the decoder as inputs. Experimental results
show that our method substantially outperforms the NAT baseline (Gu et al., 2017)
in BLEU on the WMT14 English-German and WMT16 English-Romanian tasks.
Comment: AAAI 2019
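As a rough illustration of the first method, here is a minimal Python sketch of building decoder inputs from a phrase table (the table entries, token names, and unknown-token fallback are illustrative assumptions, not the paper's implementation):

def build_decoder_inputs(source_tokens, phrase_table, unk="<unk>"):
    # Translate each source token with the SMT phrase table; unknown tokens
    # fall back to a placeholder so the decoder still receives one input
    # per source position.
    return [phrase_table.get(tok, unk) for tok in source_tokens]

phrase_table = {"das": "the", "Haus": "house", "ist": "is", "klein": "small"}
print(build_decoder_inputs(["das", "Haus", "ist", "klein"], phrase_table))
# ['the', 'house', 'is', 'small'] -- fed to the NAT decoder as its input
# sequence instead of copies of the source embeddings.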
Guiding Non-Autoregressive Neural Machine Translation Decoding with Reordering Information
Non-autoregressive neural machine translation (NAT) generates each target
word in parallel and achieves promising inference acceleration. However,
existing NAT models still fall well short of autoregressive neural machine
translation models in translation quality because of the enormous decoding
search space. To address this problem, we propose a novel NAT framework named
ReorderNAT which explicitly models the reordering information in the decoding
procedure. We further introduce deterministic and non-deterministic decoding
strategies that utilize reordering information to narrow the decoding search
space in our proposed ReorderNAT. Experimental results on various widely-used
datasets show that our proposed model achieves better performance compared to
existing NAT models, and even achieves comparable translation quality as
autoregressive translation models with a significant speedup.
Comment: Accepted by AAAI 2021
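To make the deterministic guiding idea concrete, here is a hedged Python sketch (the permutation is assumed to come from a reordering module; names and data are illustrative, not the paper's code):

def reorder_source(source_tokens, predicted_positions):
    # Arrange source tokens in the predicted target-language order to form
    # a pseudo-translation that guides the parallel decoder.
    return [source_tokens[i] for i in predicted_positions]

src = ["ich", "habe", "den", "Film", "gesehen"]   # "I have seen the film"
print(reorder_source(src, [0, 1, 4, 2, 3]))
# ['ich', 'habe', 'gesehen', 'den', 'Film'] -- source words in target order;
# the NAT decoder then translates this narrowed input in parallel.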
End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification
Autoregressive decoding is the only part of sequence-to-sequence models that
prevents them from massive parallelization at inference time.
Non-autoregressive models enable the decoder to generate all output symbols
independently in parallel. We present a novel non-autoregressive architecture
based on connectionist temporal classification and evaluate it on the task of
neural machine translation. Unlike other non-autoregressive methods which
operate in several steps, our model can be trained end-to-end. We conduct
experiments on the WMT English-Romanian and English-German datasets. Our models
achieve a significant speedup over the autoregressive models, keeping the
translation quality comparable to other non-autoregressive models.
Comment: EMNLP 2018
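The decoding rule such a model relies on is standard CTC collapsing; a minimal Python sketch (the toy frames and blank symbol are illustrative, not the authors' code):

BLANK = "<b>"

def ctc_collapse(frames):
    # Standard CTC post-processing: merge adjacent repeated labels,
    # then drop the blank symbol.
    out, prev = [], None
    for f in frames:
        if f != prev and f != BLANK:
            out.append(f)
        prev = f
    return out

print(ctc_collapse(["the", "the", BLANK, "cat", BLANK, "sat", "sat"]))
# ['the', 'cat', 'sat'] -- the decoder can emit every position in parallel
# without committing to the exact output length in advance.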
Non-autoregressive Machine Translation with Probabilistic Context-free Grammar
Non-autoregressive Transformer (NAT) significantly accelerates the inference
of neural machine translation. However, conventional NAT models suffer from
limited expression power and performance degradation compared to autoregressive
(AT) models due to the assumption of conditional independence among target
tokens. To address these limitations, we propose a novel approach called
PCFG-NAT, which leverages a specially designed Probabilistic Context-Free
Grammar (PCFG) to enhance the ability of NAT models to capture complex
dependencies among output tokens. Experimental results on major machine
translation benchmarks demonstrate that PCFG-NAT further narrows the gap in
translation quality between NAT and AT models. Moreover, PCFG-NAT facilitates a
deeper understanding of the generated sentences, addressing the lack of
satisfactory explainability in neural machine translation. Code is publicly
available at https://github.com/ictnlp/PCFG-NAT.
Comment: NeurIPS 2023
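To show how a PCFG can introduce dependencies among output tokens, here is a toy Python sketch that scores a derivation tree under a hand-written grammar (the grammar, probabilities, and tree format are illustrative assumptions, not the paper's learned PCFG):

import math

rules = {
    "S":  {("NP", "VP"): 1.0},
    "NP": {("the", "cat"): 0.6, ("the", "dog"): 0.4},
    "VP": {("sat",): 0.7, ("ran",): 0.3},
}

def derivation_logprob(tree):
    # Sum the log-probabilities of the rules used in a (lhs, children)
    # derivation tree; terminals are plain strings, nonterminals tuples.
    lhs, children = tree
    labels = tuple(c[0] if isinstance(c, tuple) else c for c in children)
    lp = math.log(rules[lhs][labels])
    for c in children:
        if isinstance(c, tuple):
            lp += derivation_logprob(c)   # recurse into nonterminal nodes
    return lp

tree = ("S", [("NP", ["the", "cat"]), ("VP", ["sat"])])
print(derivation_logprob(tree))   # log(1.0 * 0.6 * 0.7)

Because the sentence's probability factors over grammar rules rather than over independent token positions, sibling tokens in the tree constrain one another, which is the kind of output dependency plain NAT models cannot express.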