Graph-to-Sequence Learning using Gated Graph Neural Networks
Many NLP applications can be framed as a graph-to-sequence learning problem.
Previous work proposing neural architectures for this setting obtained promising
results compared to grammar-based approaches, but still relies on linearisation
heuristics and/or standard recurrent networks to achieve the best performance.
In this work, we propose a new model that encodes the full structural
information contained in the graph. Our architecture couples the recently
proposed Gated Graph Neural Networks with an input transformation that allows
nodes and edges to have their own hidden representations, while tackling the
parameter explosion problem present in previous work. Experimental results show
that our model outperforms strong baselines in generation from AMR graphs and
syntax-based neural machine translation. Comment: ACL 201
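The input transformation mentioned in this abstract, which lets nodes and edges each carry their own hidden representation inside a single Gated Graph Neural Network, can be pictured as a Levi-graph-style conversion in which every labelled edge becomes a node of its own. The sketch below is a minimal, hypothetical illustration of that idea under that assumption; the triple format and function name are not taken from the paper.

```python
def to_levi_graph(triples):
    """Turn a labelled graph into an unlabelled one in which every edge
    becomes its own node, so nodes and former edges can each carry a
    hidden state inside one Gated Graph Neural Network.

    triples: iterable of (source, edge_label, target) tuples,
             e.g. ("want-01", "ARG0", "boy") for an AMR fragment.
    Returns (nodes, edges): node identifiers and unlabelled directed edges.
    """
    nodes, edges = set(), []
    for i, (src, label, tgt) in enumerate(triples):
        edge_node = f"{label}#{i}"          # one fresh node per labelled edge
        nodes.update([src, tgt, edge_node])
        edges.append((src, edge_node))      # source -> edge node
        edges.append((edge_node, tgt))      # edge node -> target
    return sorted(nodes), edges


# Toy usage on a small AMR-like graph ("the boy wants to go")
nodes, edges = to_levi_graph([
    ("want-01", "ARG0", "boy"),
    ("want-01", "ARG1", "go-01"),
    ("go-01", "ARG0", "boy"),
])
print(nodes)
print(edges)
```

Because edge labels are now ordinary nodes, a single message-passing network with one set of parameters can update them, which is one way to avoid a per-label parameter explosion.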
Towards Neural Machine Translation with Latent Tree Attention
Building models that take advantage of the hierarchical structure of language
without a priori annotation is a longstanding goal in natural language
processing. We introduce such a model for the task of machine translation,
pairing a recurrent neural network grammar encoder with a novel attentional
RNNG decoder and applying policy gradient reinforcement learning to induce
unsupervised tree structures on both the source and target. When trained on
character-level datasets with no explicit segmentation or parse annotation, the
model learns a plausible segmentation and shallow parse, obtaining performance
close to an attentional baseline. Comment: Presented at SPNLP 201
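The "policy gradient reinforcement learning to induce unsupervised tree structures" mentioned here is, at its core, a REINFORCE-style surrogate objective over discrete structural decisions. The sketch below is a generic, minimal illustration of such an objective assuming PyTorch; the tensor shapes, function name, and toy reward are assumptions and not the paper's implementation.

```python
import torch

def reinforce_structure_loss(action_logits, actions, reward, baseline=0.0):
    """Generic REINFORCE surrogate loss for latent structural decisions.

    action_logits: (T, K) logits over K possible structure actions per step
    actions:       (T,)   sampled action indices (e.g. shift/reduce choices)
    reward:        scalar downstream reward, e.g. the target log-likelihood
    baseline:      scalar subtracted from the reward to reduce variance
    """
    dist = torch.distributions.Categorical(logits=action_logits)
    log_probs = dist.log_prob(actions)             # (T,)
    return -(reward - baseline) * log_probs.sum()  # minimise the negative surrogate


# Toy usage: five binary structure decisions
logits = torch.randn(5, 2, requires_grad=True)
actions = torch.distributions.Categorical(logits=logits).sample()
loss = reinforce_structure_loss(logits, actions, reward=1.3, baseline=0.5)
loss.backward()
```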
Open Vocabulary Learning on Source Code with a Graph-Structured Cache
Machine learning models that take computer program source code as input
typically use Natural Language Processing (NLP) techniques. However, a major
challenge is that code is written using an open, rapidly changing vocabulary
due to, e.g., the coinage of new variable and method names. Reasoning over such
a vocabulary is not something for which most NLP methods are designed. We
introduce a Graph-Structured Cache to address this problem; this cache contains
a node for each new word the model encounters with edges connecting each word
to its occurrences in the code. We find that combining this graph-structured
cache strategy with recent Graph-Neural-Network-based models for supervised
learning on code improves the models' performance on a code completion task and
a variable naming task --- with over relative improvement on the latter
--- at the cost of a moderate increase in computation time. Comment: Published in the International Conference on Machine Learning (ICML 2019), 13 pages
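The graph-structured cache described in this abstract amounts to simple bookkeeping: one node per word the model encounters, with edges from that word node to every location in the code graph where the word occurs. The following is a minimal, hypothetical sketch of that bookkeeping (class and method names are assumptions), not the published implementation.

```python
from collections import defaultdict

class GraphStructuredCache:
    """Minimal sketch: one cache node per encountered word, with edges
    linking the word node to the program-graph nodes where it occurs."""

    def __init__(self):
        self.word_nodes = set()                    # one node per word seen so far
        self.occurrence_edges = defaultdict(list)  # word -> program-graph node ids

    def add_occurrence(self, word, graph_node_id):
        """Record that `word` (e.g. a variable or method name) appears at a
        node of the underlying code graph."""
        self.word_nodes.add(word)
        self.occurrence_edges[word].append(graph_node_id)

    def edges(self):
        """Yield (word_node, occurrence_node) edges to be merged into the
        code graph before message passing."""
        for word, node_ids in self.occurrence_edges.items():
            for node_id in node_ids:
                yield (word, node_id)


# Toy usage: two occurrences of a newly coined identifier
cache = GraphStructuredCache()
cache.add_occurrence("userCount", graph_node_id=17)
cache.add_occurrence("userCount", graph_node_id=42)
print(list(cache.edges()))  # [('userCount', 17), ('userCount', 42)]
```

Merging these word-to-occurrence edges into the program graph lets a graph neural network pass messages through newly coined identifiers without requiring them to exist in a fixed vocabulary.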