Linguistic Input Features Improve Neural Machine Translation
Neural machine translation has recently achieved impressive results, while
using little in the way of external linguistic information. In this paper we
show that the strong learning capability of neural MT models does not make
linguistic features redundant; they can be easily incorporated to provide
further improvements in performance. We generalize the embedding layer of the
encoder in the attentional encoder-decoder architecture to support the
inclusion of arbitrary features, in addition to the baseline word feature. We
add morphological features, part-of-speech tags, and syntactic dependency
labels as input features to English->German and English->Romanian neural
machine translation systems. In experiments on WMT16 training and test sets, we
find that linguistic input features improve model quality according to three
metrics: perplexity, BLEU and CHRF3. An open-source implementation of our
neural MT system is available, as are sample files and configurations.
Comment: WMT16 final version; new EN-RO result
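The generalized embedding layer lends itself to a compact sketch. Below is a minimal, hedged illustration in PyTorch of the idea described above: each input feature (word, POS tag, dependency label, ...) gets its own embedding table, and the per-feature embeddings are concatenated into a single input vector. The class name, feature set, and dimensions are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class FactoredInputEmbedding(nn.Module):
    """One embedding table per input feature; outputs are concatenated."""

    def __init__(self, vocab_sizes, embed_dims):
        super().__init__()
        self.tables = nn.ModuleList(
            nn.Embedding(v, d) for v, d in zip(vocab_sizes, embed_dims)
        )

    def forward(self, features):
        # features: (batch, seq_len, n_features) integer ids,
        # e.g. feature 0 = word, feature 1 = POS tag
        embedded = [tab(features[..., i]) for i, tab in enumerate(self.tables)]
        return torch.cat(embedded, dim=-1)  # (batch, seq_len, sum(embed_dims))

# Toy usage: words plus POS tags, with most capacity given to the word feature.
layer = FactoredInputEmbedding(vocab_sizes=[50000, 50], embed_dims=[500, 12])
ids = torch.stack(
    [torch.randint(0, 50000, (8, 20)), torch.randint(0, 50, (8, 20))], dim=-1
)
vectors = layer(ids)  # shape (8, 20, 512), fed to the encoder as usual
```

Giving the word feature most of the embedding width keeps the layer's total output size, and hence the rest of the network, essentially unchanged.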
Generating High-Quality Surface Realizations Using Data Augmentation and Factored Sequence Models
This work presents a new state of the art in reconstruction of surface
realizations from obfuscated text. We identify the lack of sufficient training
data as the major obstacle to training high-performing models, and solve this
issue by generating large amounts of synthetic training data. We also propose
preprocessing techniques which make the structure contained in the input
features more accessible to sequence models. Our models were ranked first on
all evaluation metrics in the English portion of the 2018 Surface Realization
shared task.
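A rough sketch of the synthetic-data idea, under the assumption that obfuscation amounts to shuffling the tokens of an ordinary sentence (as in the shallow track of the shared task); lemmatization and the authors' other preprocessing steps are omitted, and make_synthetic_pair is a hypothetical helper:

```python
import random

def make_synthetic_pair(sentence, rng=random):
    # Hypothetical helper: shuffle the tokens of a plain sentence to mimic an
    # obfuscated shared-task input; the original sentence is the target.
    tokens = sentence.split()
    shuffled = tokens[:]
    rng.shuffle(shuffled)
    return " ".join(shuffled), sentence  # (source, target)

corpus = ["the cat sat on the mat", "large corpora make cheap training data"]
pairs = [make_synthetic_pair(s) for s in corpus]
for src, tgt in pairs:
    print(f"{src}\t{tgt}")
```

Because any monolingual corpus can be turned into training pairs this way, the approach sidesteps the scarcity of annotated shared-task data.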
Neural Machine Translation by Generating Multiple Linguistic Factors
Factored neural machine translation (FNMT) is founded on the idea of using
the morphological and grammatical decomposition of the words (factors) at the
output side of the neural network. This architecture addresses two well-known
problems occurring in MT, namely the size of the target-language vocabulary
and the number of unknown tokens produced in the translation. The FNMT system
is designed to manage a larger vocabulary and to reduce training time (for
systems with an equivalent target-language vocabulary size). Moreover, it can
produce grammatically correct words that are not part of the vocabulary. The
FNMT model is evaluated on the IWSLT'15 English-to-French task and compared to
the baseline
word-based and BPE-based NMT systems. Promising qualitative and quantitative
results (in terms of BLEU and METEOR) are reported.
Comment: 11 pages, 3 figures, SLSP conference
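The factored output side can be sketched as two softmax heads over a shared decoder state, one predicting the lemma and one the morphological/grammatical factors; the surface word is then recombined from the two predictions. The class name and layer sizes below are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class FactoredOutputHead(nn.Module):
    """Two softmax heads over one shared decoder state: lemma and factors."""

    def __init__(self, hidden_dim, n_lemmas, n_factors):
        super().__init__()
        self.lemma_out = nn.Linear(hidden_dim, n_lemmas)
        self.factor_out = nn.Linear(hidden_dim, n_factors)

    def forward(self, decoder_state):
        # The surface word is recombined afterwards from (lemma, factors),
        # e.g. by a morphological generator, which can produce well-formed
        # words that were never in the training vocabulary.
        return self.lemma_out(decoder_state), self.factor_out(decoder_state)

head = FactoredOutputHead(hidden_dim=512, n_lemmas=30000, n_factors=200)
state = torch.randn(8, 512)  # one decoder step for a batch of 8
lemma_logits, factor_logits = head(state)
```

Since the lemma vocabulary is much smaller than the full word-form vocabulary, both softmax layers stay cheap even for morphologically rich targets such as French.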
Learning to Parse and Translate Improves Neural Machine Translation
There has been relatively little attention to incorporating linguistic priors
into neural machine translation, and much of the previous work was constrained
to priors on the source side. In this paper, we propose a hybrid model, called
NMT+RNNG, that learns to parse and translate by incorporating a recurrent
neural network grammar (RNNG) into attention-based neural machine translation.
Our approach encourages the neural machine translation model to incorporate a
linguistic prior during training and lets it translate on its own afterward.
Extensive experiments with four language pairs
show the effectiveness of the proposed NMT+RNNG.
Comment: Accepted as a short paper at the 55th Annual Meeting of the
Association for Computational Linguistics (ACL 2017)
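One way to read the training setup is as a joint objective: standard translation cross-entropy plus a cross-entropy over the RNNG's parser actions on the target side. The sketch below shows only that loss combination, with an assumed weighting term; in the actual model the decoder and the RNNG share components rather than being independent networks.

```python
import torch.nn.functional as F

def joint_loss(translation_logits, target_ids, action_logits, action_ids,
               parse_weight=1.0):
    # Standard token-level translation cross-entropy.
    mt_loss = F.cross_entropy(
        translation_logits.reshape(-1, translation_logits.size(-1)),
        target_ids.reshape(-1),
    )
    # Cross-entropy over the parser's transition actions (SHIFT, REDUCE, ...).
    parse_loss = F.cross_entropy(
        action_logits.reshape(-1, action_logits.size(-1)),
        action_ids.reshape(-1),
    )
    # The parsing term acts as a linguistic prior during training only;
    # decoding at test time uses the translation path alone.
    return mt_loss + parse_weight * parse_loss
```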
Contextual Sequence Modeling for Recommendation with Recurrent Neural Networks
Recommendations can greatly benefit from good representations of the user
state at recommendation time. Recent approaches that leverage Recurrent Neural
Networks (RNNs) for session-based recommendations have shown that Deep Learning
models can provide useful user representations for recommendation. However,
current RNN modeling approaches summarize the user state by taking into
account only the sequence of items the user has interacted with in the past,
ignoring other essential types of context information such as the types of
user-item interactions, the time gaps between events, and the time of day of
each interaction. To address this, we propose a new class of Contextual
Recurrent Neural Networks for Recommendation (CRNNs) that take contextual
information into account in both the input and output layers, modifying the
behavior of the RNN by combining the context embedding with the item embedding
and, more explicitly in the model dynamics, by parametrizing the hidden unit
transitions as a function of the context information. We compare our CRNN
approach with RNN and non-sequential baselines and show good improvements on
the next-event prediction task.
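A minimal sketch of one plausible reading of the CRNN idea: the context embedding is combined with the item embedding at the input, and it also gates the hidden-state transition, so the RNN's dynamics depend on context. The cell below is an illustrative construction with assumed dimensions, not the authors' exact parametrization.

```python
import torch
import torch.nn as nn

class ContextualCell(nn.Module):
    """GRU cell whose input and hidden transition both depend on context."""

    def __init__(self, item_dim, ctx_dim, hidden_dim):
        super().__init__()
        self.rnn = nn.GRUCell(item_dim + ctx_dim, hidden_dim)
        self.gate = nn.Linear(ctx_dim, hidden_dim)  # context-dependent gate

    def forward(self, item_emb, ctx_emb, h):
        x = torch.cat([item_emb, ctx_emb], dim=-1)  # context at the input
        h_new = self.rnn(x, h)
        g = torch.sigmoid(self.gate(ctx_emb))       # context in the dynamics
        return g * h_new + (1.0 - g) * h            # gated hidden transition

cell = ContextualCell(item_dim=64, ctx_dim=16, hidden_dim=128)
h = torch.zeros(32, 128)
h = cell(torch.randn(32, 64), torch.randn(32, 16), h)  # one interaction step
```

The gate lets context such as a long time gap reset or dampen the carried-over user state rather than merely appending another input feature.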