Sequence to Sequence Mixture Model for Diverse Machine Translation
Sequence to sequence (SEQ2SEQ) models often lack diversity in their generated
translations. This can be attributed to the limitation of SEQ2SEQ models in
capturing lexical and syntactic variations in a parallel corpus resulting from
different styles, genres, topics, or ambiguity of the translation process. In
this paper, we develop a novel sequence to sequence mixture (S2SMIX) model that
improves both translation diversity and quality by adopting a committee of
specialized translation models rather than a single translation model. Each
mixture component selects its own training dataset via optimization of the
marginal log-likelihood, which leads to a soft clustering of the parallel
corpus. Experiments on four language pairs demonstrate the superiority of our
mixture model compared to a SEQ2SEQ baseline with standard or diversity-boosted
beam search. Our mixture model uses negligible additional parameters and incurs
no extra computational cost during decoding.
Comment: 11 pages, 5 figures, accepted to CoNLL201
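The training objective described above, where each mixture component is weighted by its posterior responsibility for a sentence pair, can be illustrated with a toy sketch. This is not the paper's implementation; the function names and the two-component example values are hypothetical, and the per-component log-likelihoods would in practice come from the individual SEQ2SEQ models.

```python
import math

def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) over a list of floats."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def marginal_log_likelihood(component_logliks, log_priors):
    """Marginal log-likelihood of one sentence pair (x, y) under a
    K-component mixture: log sum_k p(k) * p(y | x, k)."""
    joint = [lp + ll for lp, ll in zip(log_priors, component_logliks)]
    return logsumexp(joint)

def responsibilities(component_logliks, log_priors):
    """Posterior p(k | x, y): the soft assignment of the pair to each
    component. Across the corpus these assignments form the soft
    clustering the abstract refers to."""
    joint = [lp + ll for lp, ll in zip(log_priors, component_logliks)]
    z = logsumexp(joint)
    return [math.exp(j - z) for j in joint]

# Hypothetical example: two components with a uniform prior; component 0
# explains the pair better, so it receives most of the responsibility.
log_priors = [math.log(0.5), math.log(0.5)]
component_logliks = [-1.0, -3.0]
print(marginal_log_likelihood(component_logliks, log_priors))
print(responsibilities(component_logliks, log_priors))
```

Maximizing the marginal log-likelihood, rather than assigning each pair to a single component, is what lets the components specialize gradually while every pair still contributes gradient signal to all of them.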
Generating Diverse Translation by Manipulating Multi-Head Attention
The Transformer model has been widely used for machine translation and has
obtained state-of-the-art results. In this paper, we report an interesting
phenomenon in its encoder-decoder multi-head attention: different attention
heads of the final decoder layer align to different word translation
candidates. We empirically verify this discovery and propose a method to
generate diverse translations by manipulating heads. Furthermore, we make use
of these diverse translations with the back-translation technique for better
data augmentation. Experimental results show that our method generates diverse
translations without a severe drop in translation quality. Experiments also
show that back-translation with these diverse translations brings significant
performance improvements on translation tasks. An auxiliary experiment on a
conversation response generation task confirms the effect of diversity as well.
Comment: Accepted by AAAI 202
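The core observation, that different cross-attention heads in the final decoder layer align to different source words and hence different translation candidates, can be sketched with a toy example. This is a simplified illustration, not the paper's method: the head weights are made-up numbers, and real manipulation would mask heads inside a trained Transformer rather than average precomputed distributions.

```python
def combined_attention(head_weights, keep):
    """Average the cross-attention distributions of only the kept heads,
    i.e. mask out the others. head_weights[h][j] is head h's attention
    on source position j."""
    kept = [head_weights[h] for h in keep]
    return [sum(col) / len(kept) for col in zip(*kept)]

def attended_word(head_weights, keep, source_tokens):
    """Source word the masked attention points at; a proxy for which
    translation candidate the decoder will favor at this step."""
    attn = combined_attention(head_weights, keep)
    return source_tokens[max(range(len(attn)), key=attn.__getitem__)]

# Hypothetical two-head example: head 0 attends source position 1,
# head 1 attends source position 2. Keeping different heads therefore
# selects different word translation candidates.
head_weights = [[0.1, 0.8, 0.1],
                [0.0, 0.2, 0.8]]
source_tokens = ["a", "b", "c"]
print(attended_word(head_weights, [0], source_tokens))     # head 0 only
print(attended_word(head_weights, [1], source_tokens))     # head 1 only
print(attended_word(head_weights, [0, 1], source_tokens))  # both heads
```

Decoding once per kept-head configuration then yields a set of distinct hypotheses, which is the source of diversity the back-translation experiments exploit.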