Robust Beam Search for Encoder-Decoder Attention Based Speech Recognition without Length Bias
As a popular modeling approach for end-to-end speech recognition, attention-based encoder-decoder models are known to suffer from the length bias and the corresponding beam problem. Various approaches have been applied within simple beam search to ease this problem, most of which are heuristic-based and require considerable tuning. We show that such heuristics are not a proper modeling refinement, and that they result in severe performance degradation once the beam size is largely increased. We propose a novel beam search derived from reinterpreting the sequence posterior with explicit length modeling. Applying the reinterpreted probability together with beam pruning yields a robust model modification that allows reliable comparison among output sequences of different lengths. Experimental verification on the LibriSpeech corpus shows that the proposed approach solves the length bias problem without heuristics or additional tuning effort. It provides robust decision making and consistently good performance under both small and very large beam sizes. Compared with the best results of the heuristic baseline, the proposed approach achieves the same WER on the 'clean' sets and a 4% relative improvement on the 'other' sets. We also show that it is
more efficient with the additionally derived early stopping criterion.
Comment: accepted at INTERSPEECH 2020
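To make the length bias concrete, here is a minimal sketch (not the paper's reinterpreted posterior) of how summing token log-probabilities in plain beam search favors shorter hypotheses, and how length normalization, one common example of the heuristic fixes the paper argues against, can flip the decision. All per-token probabilities are invented for illustration:

    # Illustration only: length bias in plain beam-search scoring.
    import math

    # Hypothetical per-token probabilities for two competing hypotheses
    # (each list ends with the end-of-sequence token).
    short_hyp = [0.6, 0.6, 0.9]
    long_hyp = [0.6, 0.6, 0.7, 0.7, 0.9]

    def log_score(token_probs):
        """Standard beam-search score: sum of token log-probabilities."""
        return sum(math.log(p) for p in token_probs)

    def length_normalized(token_probs):
        """Common heuristic (not the paper's method): divide by hypothesis length."""
        return log_score(token_probs) / len(token_probs)

    # The plain sum prefers the shorter hypothesis...
    print(log_score(short_hyp), log_score(long_hyp))
    # ...while length normalization flips the decision toward the longer one.
    print(length_normalized(short_hyp), length_normalized(long_hyp))
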
Unsupervised Natural Question Answering with a Small Model
The recent (2019-02) demonstration of the power of huge language models such
as GPT-2 to memorise the answers to factoid questions raises questions about
the extent to which knowledge is being embedded directly within these large
models. This short paper describes an architecture through which much smaller
models can also answer such questions - by making use of 'raw' external
knowledge. The contribution of this work is that the methods presented here
rely on unsupervised learning techniques, complementing the unsupervised
training of the Language Model. The goal of this line of research is to be able
to add knowledge explicitly, without extensive training.
Comment: Accepted paper for the FEVER workshop at EMNLP-IJCNLP 2019 (4 pages + references)
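The abstract does not detail the architecture itself; purely as a generic illustration of pairing a small model with 'raw' external knowledge through unsupervised retrieval (not the paper's actual pipeline), the sketch below uses TF-IDF similarity from scikit-learn to pick a supporting passage that a small reader model could then consume. The corpus and question are invented examples:

    # Illustrative sketch: unsupervised retrieval of 'raw' external knowledge.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Hypothetical external knowledge source (in practice, e.g. Wikipedia paragraphs).
    passages = [
        "Paris is the capital and most populous city of France.",
        "The Amazon is a river in South America.",
        "GPT-2 is a large autoregressive language model released in 2019.",
    ]
    question = "What is the capital of France?"

    # Fit TF-IDF on passages plus question and rank passages by cosine similarity.
    vectorizer = TfidfVectorizer().fit(passages + [question])
    scores = cosine_similarity(vectorizer.transform([question]),
                               vectorizer.transform(passages))[0]
    best_passage = passages[scores.argmax()]
    print(best_passage)  # this passage would then be handed to a small reader model
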
Masked Language Model Scoring
Pretrained masked language models (MLMs) require finetuning for most NLP
tasks. Instead, we evaluate MLMs out of the box via their pseudo-log-likelihood
scores (PLLs), which are computed by masking tokens one by one. We show that
PLLs outperform scores from autoregressive language models like GPT-2 in a
variety of tasks. By rescoring ASR and NMT hypotheses, RoBERTa reduces an
end-to-end LibriSpeech model's WER by 30% relative and adds up to +1.7 BLEU on
state-of-the-art baselines for low-resource translation pairs, with further
gains from domain adaptation. We attribute this success to PLL's unsupervised
expression of linguistic acceptability without a left-to-right bias, greatly
improving on scores from GPT-2 (+10 points on island effects, NPI licensing in
BLiMP). One can finetune MLMs to give scores without masking, enabling
computation in a single inference pass. In all, PLLs and their associated
pseudo-perplexities (PPPLs) enable plug-and-play use of the growing number of
pretrained MLMs; e.g., we use a single cross-lingual model to rescore
translations in multiple languages. We release our library for language model
scoring at https://github.com/awslabs/mlm-scoring.
Comment: ACL 2020 camera-ready (presented July 2020)
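A minimal sketch of the pseudo-log-likelihood described above, assuming a Hugging Face masked LM such as roberta-base; the authors' full implementation is the linked mlm-scoring library, and this snippet only mirrors the mask-one-token-at-a-time idea:

    # Sketch of a pseudo-log-likelihood (PLL) score with a Hugging Face masked LM.
    import torch
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained("roberta-base")
    model = AutoModelForMaskedLM.from_pretrained("roberta-base").eval()

    def pseudo_log_likelihood(sentence):
        """Mask each token in turn and sum the log-probability of the true token."""
        ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
        pll = 0.0
        for i in range(1, len(ids) - 1):  # skip the BOS/EOS special tokens
            masked = ids.clone()
            masked[i] = tokenizer.mask_token_id
            with torch.no_grad():
                logits = model(masked.unsqueeze(0)).logits[0, i]
            pll += torch.log_softmax(logits, dim=-1)[ids[i]].item()
        return pll

    # Higher PLL = more acceptable under the MLM; usable to rescore ASR/NMT n-best lists.
    print(pseudo_log_likelihood("The cat sat on the mat."))
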
Neural Machine Translation For Low Resource Languages
Neural machine translation is a challenging task due to the inherent complexity and fluidity of natural languages. Nonetheless, in recent years it has achieved state-of-the-art performance for several language pairs. Although multilingual neural machine translation (MNMT) has gained considerable traction in recent years, no comprehensive survey has been done to identify which approaches work well. The goal of this paper is to investigate the realm of low-resource languages and build a neural machine translation model that achieves state-of-the-art results. The paper builds upon the mBART language model and explores strategies to augment it with NLP and deep learning techniques such as back-translation and transfer learning. The implementation unpacks the architecture of the NMT application and identifies the components that offer opportunities to adapt it to the low-resource language problem space.
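As a rough illustration of one of the augmentation strategies mentioned (back-translation on top of a pretrained multilingual model), the sketch below uses the Hugging Face mBART-50 many-to-many checkpoint to turn target-side monolingual text into synthetic parallel pairs; the checkpoint name, language codes, and helper function are assumptions for illustration, not the paper's actual setup:

    # Sketch: back-translation with a pretrained mBART-50 translation model.
    from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

    checkpoint = "facebook/mbart-large-50-many-to-many-mmt"
    model = MBartForConditionalGeneration.from_pretrained(checkpoint)
    tokenizer = MBart50TokenizerFast.from_pretrained(checkpoint)

    def back_translate(target_monolingual, tgt_code="en_XX", src_code="si_LK"):
        """Translate target-side monolingual text into the (low-resource) source
        language, yielding synthetic (source, target) pairs for NMT training."""
        tokenizer.src_lang = tgt_code  # the input text is in the target language
        pairs = []
        for sent in target_monolingual:
            batch = tokenizer(sent, return_tensors="pt")
            out = model.generate(
                **batch, forced_bos_token_id=tokenizer.lang_code_to_id[src_code])
            synthetic_src = tokenizer.batch_decode(out, skip_special_tokens=True)[0]
            pairs.append((synthetic_src, sent))
        return pairs

    # Example: abundant English monolingual data becomes synthetic parallel data.
    print(back_translate(["The weather is nice today."]))
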