Mask-Predict: Parallel Decoding of Conditional Masked Language Models
Most machine translation systems generate text autoregressively from left to
right. We, instead, use a masked language modeling objective to train a model
to predict any subset of the target words, conditioned on both the input text
and a partially masked target translation. This approach allows for efficient
iterative decoding, where we first predict all of the target words
non-autoregressively, and then repeatedly mask out and regenerate the subset of
words that the model is least confident about. By applying this strategy for a
constant number of iterations, our model improves state-of-the-art performance
levels for non-autoregressive and parallel decoding translation models by over
4 BLEU on average. It is also able to reach within about 1 BLEU point of a
typical left-to-right transformer model, while decoding significantly faster.
Comment: EMNLP 2019
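To make the decoding strategy concrete, the following is a minimal sketch of the mask-predict loop, assuming a hypothetical model(src, tgt) interface that returns per-position probabilities over the target vocabulary; the linear schedule for how many tokens are re-masked each iteration follows the constant-iteration strategy the abstract describes, but the interface and defaults here are illustrative, not the paper's implementation.

```python
import torch

def mask_predict(model, src, tgt_len, iterations=10, mask_id=0):
    # Hypothetical interface: model(src, tgt) -> (tgt_len, vocab) probabilities.
    # First pass: fully masked target, predict every position in parallel.
    tgt = torch.full((tgt_len,), mask_id, dtype=torch.long)
    probs = model(src, tgt)
    conf, tgt = probs.max(dim=-1)  # per-token confidence and token ids

    for t in range(1, iterations):
        # Linearly decay the number of re-masked tokens across iterations.
        n_mask = int(tgt_len * (iterations - t) / iterations)
        if n_mask == 0:
            break
        # Re-mask the positions the model is least confident about.
        remask = conf.argsort()[:n_mask]
        tgt[remask] = mask_id
        # Regenerate only the masked positions, conditioned on the rest.
        probs = model(src, tgt)
        new_conf, new_tok = probs.max(dim=-1)
        tgt[remask] = new_tok[remask]
        conf[remask] = new_conf[remask]
    return tgt
```

Because every iteration predicts all masked positions in one forward pass, the total decoding cost is a fixed number of passes regardless of sentence length, which is where the speedup over left-to-right decoding comes from.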
PREMIER - PRobabilistic Error-correction using Markov Inference in Errored Reads
In this work we present a flexible, probabilistic and reference-free method
of error correction for high-throughput DNA sequencing data. The key is to
exploit the high coverage of sequencing data and model short sequence outputs
as independent realizations of a Hidden Markov Model (HMM). We pose the problem
of error correction of reads as one of maximum likelihood sequence detection
over this HMM. While time and memory considerations rule out an implementation
of the optimal Baum-Welch algorithm (for parameter estimation) and the optimal
Viterbi algorithm (for error correction), we propose low-complexity approximate
versions of both. Specifically, we propose an approximate Viterbi and a
sequential decoding based algorithm for the error correction. Our results show
that when compared with Reptile, a state-of-the-art error correction method,
our methods consistently achieve superior performance on both simulated and
real data sets.
Comment: Submitted to ISIT 2013
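As an illustration of the kind of low-complexity approximation the abstract alludes to, here is a generic beam-pruned Viterbi sketch; this is not PREMIER's exact algorithm, and trans, emit, and init are hypothetical log-probability tables for the HMM.

```python
import math

def approximate_viterbi(obs, states, trans, emit, init, beam=8):
    # Beam-pruned Viterbi: keep only the `beam` best states per step
    # instead of the full state space, trading optimality for speed.
    scores = {s: init[s] + emit[s][obs[0]] for s in states}
    frontier = dict(sorted(scores.items(), key=lambda kv: -kv[1])[:beam])
    back = [{s: None for s in frontier}]

    for o in obs[1:]:
        nxt, ptr = {}, {}
        for t in states:
            # Best predecessor is searched only among surviving states.
            best_s, best = None, -math.inf
            for s, sc in frontier.items():
                cand = sc + trans[s][t]
                if cand > best:
                    best_s, best = s, cand
            nxt[t] = best + emit[t][o]
            ptr[t] = best_s
        # Prune to the top-`beam` states: this is the approximation.
        frontier = dict(sorted(nxt.items(), key=lambda kv: -kv[1])[:beam])
        back.append({t: ptr[t] for t in frontier})

    # Backtrace from the best surviving end state.
    state = max(frontier, key=frontier.get)
    path = [state]
    for ptr in reversed(back[1:]):
        state = ptr[path[-1]]
        path.append(state)
    return list(reversed(path))
```

Pruning the frontier to a fixed beam keeps the per-step cost proportional to beam × |states| rather than |states|², which is the sort of time and memory trade-off that makes exact Viterbi decoding impractical at sequencing scale.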