An Empirical Study of Mini-Batch Creation Strategies for Neural Machine Translation
Training of neural machine translation (NMT) models usually uses mini-batches
for efficiency purposes. During the mini-batched training process, it is
necessary to pad shorter sentences in a mini-batch to be equal in length to the
longest sentence therein for efficient computation. Previous work has noted
that sorting the corpus based on the sentence length before making mini-batches
reduces the amount of padding and increases the processing speed. However,
despite the fact that mini-batch creation is an essential step in NMT training,
widely used NMT toolkits implement disparate strategies for doing so, which
have not been empirically validated or compared. This work investigates
mini-batch creation strategies with experiments over two different datasets.
Our results suggest that the choice of mini-batch creation strategy has a
large effect on NMT training, and that some length-based sorting strategies do
not always work well compared with simple shuffling.

Comment: 8 pages, accepted to the First Workshop on Neural Machine Translation
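The trade-off the abstract describes can be illustrated with a small sketch: grouping sentences of similar length into the same mini-batch reduces the number of padding tokens compared with batching a plain shuffle. This is an illustrative example, not code from any particular NMT toolkit; the function and token names (`make_batches`, `padding_fraction`, `"<pad>"`) are our own.

```python
import random

def make_batches(sentences, batch_size, sort_by_length=True):
    """Group tokenized sentences into padded mini-batches.

    With sort_by_length=True, sentences of similar length land in the
    same batch, so less padding is needed; with False, batches are drawn
    from a simple shuffle.
    """
    if sort_by_length:
        order = sorted(range(len(sentences)), key=lambda i: len(sentences[i]))
    else:
        order = random.sample(range(len(sentences)), len(sentences))
    batches = []
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        max_len = max(len(sentences[i]) for i in idx)
        # Pad each sentence up to the longest sentence in its batch.
        batches.append([sentences[i] + ["<pad>"] * (max_len - len(sentences[i]))
                        for i in idx])
    return batches

def padding_fraction(batches):
    """Fraction of all tokens in the batches that are padding."""
    total = sum(len(s) for b in batches for s in b)
    pads = sum(s.count("<pad>") for b in batches for s in b)
    return pads / total
```

For a toy corpus of sentences with lengths 1 through 8 and a batch size of 2, length-sorted batching pads 4 of 40 tokens (10%), whereas a shuffled order can only do as badly or worse, since sorting adjacent-length sentences together minimizes the per-batch maximum lengths.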