Stronger Baselines for Trustable Results in Neural Machine Translation
Interest in neural machine translation has grown rapidly as its effectiveness
has been demonstrated across language and data scenarios. New research
regularly introduces architectural and algorithmic improvements that lead to
significant gains over "vanilla" NMT implementations. However, these new
techniques are rarely evaluated in the context of previously published
techniques, specifically those that are widely used in state-of-the-art
production and shared-task systems. As a result, it is often difficult to
determine whether improvements from research will carry over to systems
deployed for real-world use. In this work, we recommend three specific methods
that are relatively easy to implement and result in much stronger experimental
systems. Beyond reporting significantly higher BLEU scores, we conduct an
in-depth analysis of where improvements originate and what inherent weaknesses
of basic NMT models are being addressed. We then compare the relative gains
afforded by several other techniques proposed in the literature when starting
with vanilla systems versus our stronger baselines, showing that experimental
conclusions may change depending on the baseline chosen. This indicates that
choosing a strong baseline is crucial for reporting reliable experimental
results. Comment: To appear at the Workshop on Neural Machine Translation (WNMT 2017).
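The abstract argues that conclusions depend on baseline strength and on whether BLEU gains are statistically meaningful. As a purely illustrative aside, here is a minimal Python sketch of paired bootstrap resampling (Koehn, 2004), one standard way to check whether a BLEU improvement over a baseline is likely to be noise. It assumes the sacrebleu package is installed; `sys_a`, `sys_b`, and `refs` are hypothetical sentence lists, not data from the paper.

```python
# Minimal sketch of paired bootstrap resampling for BLEU significance.
# Assumes sacrebleu is installed; all variable names are illustrative.
import random
import sacrebleu

def paired_bootstrap(sys_a, sys_b, refs, n_samples=1000, seed=0):
    """Estimate how often system A beats system B on resampled test sets."""
    rng = random.Random(seed)
    n = len(refs)
    wins = 0
    for _ in range(n_samples):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        a = [sys_a[i] for i in idx]
        b = [sys_b[i] for i in idx]
        r = [refs[i] for i in idx]
        if sacrebleu.corpus_bleu(a, [r]).score > sacrebleu.corpus_bleu(b, [r]).score:
            wins += 1
    return wins / n_samples  # near 1.0: A's gain is unlikely to be noise
```

A value near 0.5 means the two systems are indistinguishable on this test set, which is exactly the situation a weak baseline can hide.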
A Syntactic Neural Model for General-Purpose Code Generation
We consider the problem of parsing natural language descriptions into source
code written in a general-purpose programming language like Python. Existing
data-driven methods treat this problem as a language generation task without
considering the underlying syntax of the target programming language. Informed
by previous work in semantic parsing, in this paper we propose a novel neural
architecture powered by a grammar model to explicitly capture the target syntax
as prior knowledge. Experiments find this to be an effective way to scale up to
the generation of complex programs from natural language descriptions, achieving
state-of-the-art results that substantially outperform previous code generation and
semantic parsing approaches. Comment: To appear in ACL 2017.
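The abstract does not spell out the architecture; the toy Python sketch below only illustrates the general idea of grammar-constrained decoding: each step expands the frontier nonterminal with a production licensed by the grammar, so the output is syntactically well-formed by construction. The grammar, the action names, and `score_fn` are invented placeholders for the neural model described in the paper.

```python
# Toy sketch of grammar-constrained decoding (assumed, simplified grammar;
# the real model uses the full Python AST grammar and a neural scorer).
GRAMMAR = {
    "stmt": [("Return", ["expr"]), ("Expr", ["expr"])],
    "expr": [("Call", ["expr", "expr"]), ("Name", ["identifier"]), ("Num", ["number"])],
}

def decode(score_fn, max_steps=50):
    """Expand frontier nonterminals one grammar production at a time."""
    frontier = ["stmt"]  # start symbol
    derivation = []
    for _ in range(max_steps):
        if not frontier:
            break
        head = frontier.pop()
        if head not in GRAMMAR:                 # terminal field: emit a token
            derivation.append(("GEN_TOKEN", head))
            continue
        candidates = GRAMMAR[head]              # only grammatical continuations
        best = max(candidates, key=score_fn)    # stand-in for the neural scorer
        derivation.append(("APPLY_RULE", best[0]))
        frontier.extend(reversed(best[1]))      # expand children left-to-right
    return derivation

# Toy scorer preferring short productions; any scorer yields a valid derivation.
actions = decode(lambda prod: -len(prod[1]))
```

Because candidate actions are filtered by the grammar before scoring, ill-formed programs are never generated, which is the prior-knowledge effect the abstract describes.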
Multi-space Variational Encoder-Decoders for Semi-supervised Labeled Sequence Transduction
Labeled sequence transduction is the task of transforming one sequence into
another sequence that satisfies desiderata specified by a set of labels. In
this paper we propose multi-space variational encoder-decoders, a new model for
labeled sequence transduction with semi-supervised learning. The generative
model uses neural networks to handle both discrete and continuous latent
variables, exploiting various features of the data. Experiments show that our model
not only provides a powerful supervised framework but also can effectively take
advantage of unlabeled data. On the SIGMORPHON morphological inflection
benchmark, our model outperforms single-model state-of-the-art results by a large
margin for the majority of languages. Comment: Accepted by ACL 2017.
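As a hedged illustration of the modeling idea (not the paper's actual network), the numpy sketch below shows how continuous and discrete latent variables can both be kept differentiable: a Gaussian variable via the reparameterization trick and a label variable via a Gumbel-softmax relaxation. All names, shapes, and values are assumptions for the example.

```python
# Sketch: combining continuous and discrete latent variables differentiably.
import numpy as np

rng = np.random.default_rng(0)

def gaussian_latent(mu, log_var):
    """Reparameterization trick: z = mu + sigma * eps keeps gradients flowing."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def gumbel_softmax(logits, temperature=0.5):
    """Differentiable relaxation of sampling a one-hot label from `logits`."""
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / temperature
    e = np.exp(y - y.max())
    return e / e.sum()

mu, log_var = np.zeros(8), np.zeros(8)      # would come from the encoder
label_logits = np.array([0.1, 1.5, -0.3])   # unnormalized label scores
z_cont = gaussian_latent(mu, log_var)       # continuous latent space
z_disc = gumbel_softmax(label_logits)       # relaxed discrete label space
```

The discrete path is what lets the model infer missing labels on unlabeled data while the continuous path captures residual variation, matching the semi-supervised setup the abstract describes.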
Handling Homographs in Neural Machine Translation
Homographs, words with different meanings but the same surface form, have
long caused difficulty for machine translation systems, as it is difficult to
select the correct translation based on the context. However, with the advent
of neural machine translation (NMT) systems, which can theoretically take into
account global sentential context, one may hypothesize that this problem has
been alleviated. In this paper, we first provide empirical evidence that
existing NMT systems in fact still have significant problems in properly
translating ambiguous words. We then proceed to describe methods, inspired by
the word sense disambiguation literature, that model the context of the input
word with context-aware word embeddings that help to differentiate the word
sense before feeding it into the encoder. Experiments on three language pairs
demonstrate that such models improve the performance of NMT systems both in
terms of BLEU score and in the accuracy of translating homographs. Comment: NAACL 2018.
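The abstract's key mechanism, a context-aware embedding computed before the encoder, can be illustrated with a toy numpy sketch. Concatenating a word vector with a windowed context average, as below, is an assumed simplification, not the paper's exact model; the vocabulary and embeddings are invented for the example.

```python
# Sketch: a context-aware embedding so that "bank" near "river" differs
# from "bank" near "loan" before the vector reaches the NMT encoder.
import numpy as np

rng = np.random.default_rng(0)
VOCAB = {w: i for i, w in enumerate(["the", "river", "bank", "loan", "near"])}
EMB = rng.standard_normal((len(VOCAB), 16))  # toy pretrained embeddings

def context_aware(tokens, pos, window=2):
    """Concatenate a word's vector with the mean of its windowed context."""
    word_vec = EMB[VOCAB[tokens[pos]]]
    lo, hi = max(0, pos - window), min(len(tokens), pos + window + 1)
    ctx = [EMB[VOCAB[t]] for i, t in enumerate(tokens[lo:hi], lo) if i != pos]
    return np.concatenate([word_vec, np.mean(ctx, axis=0)])

v_river = context_aware("the river bank".split(), pos=2)  # "bank" as riverside
v_loan = context_aware("the bank loan".split(), pos=1)    # "bank" as institution
```

The two resulting vectors differ only through their contexts, which is what gives the downstream encoder a handle on the word sense.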