Transition-based Semantic Dependency Parsing with Pointer Networks
[Abstract]: Transition-based parsers implemented with Pointer Networks have become the new state of the art in dependency parsing, excelling in producing labelled syntactic trees and outperforming graph-based models in this task. In order to further test the capabilities of these powerful neural networks on a harder NLP problem, we propose a transition system that, thanks to Pointer Networks, can straightforwardly produce labelled directed acyclic graphs and perform semantic dependency parsing. In addition, we enhance our approach with deep contextualized word embeddings extracted from BERT. The resulting system not only outperforms all existing transition-based models, but also matches the best fully-supervised accuracy to date on the SemEval 2015 Task 18 datasets among previous state-of-the-art graph-based parsers. This work has received funding from the European Research Council (ERC), under the European Union’s Horizon 2020 research and innovation programme (FASTPARSE, grant agreement No 714150), from the ANSWER-ASAP project (TIN2017-85160-C2-1-R) from MINECO, and from Xunta de Galicia (ED431B 2017/01, ED431G 2019/01).
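At the core of such parsers is the pointing operation: at each transition, the decoder attends over the encoded input tokens and "points" to one of them, e.g. the head of the current word. A minimal sketch of that step, using plain dot-product attention and made-up vectors rather than the paper's actual architecture:

```python
import numpy as np

def pointer_attention(decoder_state, encoder_states):
    """Score every input position against the current decoder state and
    'point' to the highest-scoring one. A plain dot product stands in for
    the learned attention used in real Pointer Networks."""
    scores = encoder_states @ decoder_state        # one score per token
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()                           # softmax over positions
    return int(np.argmax(probs)), probs

# Hypothetical 4-token sentence, each token encoded as a 3-d vector.
enc = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0],
                [0.5, 0.5, 0.0]])
dec = np.array([0.0, 2.0, 0.0])    # decoder state querying for a head
head, probs = pointer_attention(dec, enc)
```

Because the output is an index into the input rather than a label from a fixed set, the same mechanism can assign a word more than one head, which is what lets a transition system build directed acyclic graphs instead of trees.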
DivGraphPointer: A Graph Pointer Network for Extracting Diverse Keyphrases
Keyphrase extraction from documents is useful to a variety of applications
such as information retrieval and document summarization. This paper presents
an end-to-end method called DivGraphPointer for extracting a set of diversified
keyphrases from a document. DivGraphPointer combines the advantages of
traditional graph-based ranking methods and recent neural network-based
approaches. Specifically, given a document, a word graph is constructed from
the document based on word proximity and is encoded with graph convolutional
networks, which effectively capture document-level word salience by modeling
long-range dependency between words in the document and aggregating multiple
appearances of identical words into one node. Furthermore, we propose a
diversified point network to generate a set of diverse keyphrases out of the
word graph in the decoding process. Experimental results on five benchmark data
sets show that our proposed method significantly outperforms the existing
state-of-the-art approaches.
Comment: Accepted to SIGIR 201
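The word-graph construction described above can be illustrated with a small sketch: every appearance of the same word collapses into a single node, and edge weights count co-occurrences within a proximity window. Function name, window size, and whitespace tokenisation are illustrative assumptions, not the paper's exact procedure:

```python
from collections import defaultdict

def build_word_graph(tokens, window=2):
    """Build an undirected word graph: one node per distinct word
    (multiple appearances merge into the same node), with edge weights
    counting co-occurrences within a sliding window."""
    edges = defaultdict(int)
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            u, v = sorted((w, tokens[j]))
            if u != v:                 # no self-loops for repeated words
                edges[(u, v)] += 1
    return dict(edges)

g = build_word_graph("graph pointer network for graph keyphrases".split())
```

Note how the two appearances of "graph" contribute to a single node, so its edge to "network" accumulates weight from both occurrences; this aggregation is what lets a GCN over the graph capture document-level salience.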
An Encoder-decoder Architecture with Graph Convolutional Networks for Abstractive Summarization
We propose a single-document abstractive summarization system that integrates token relations into a traditional RNN-based encoder-decoder architecture. We employ point-wise mutual information to represent the token relations and adopt Graph Convolutional Networks (GCN) to extract token representations from the relation graph. In our experiment on Gigaword, we consider importing structural information in the form of token (node) representations from the relation graph. We also implement two kinds of GCNs, a spectral-based one and a spatial-based one, to extract this structural information. The results show that the spatial-based GCN-enhanced model with node representations outperforms the classical RNN-based encoder-decoder model.
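The relation graph above rests on point-wise mutual information between tokens. A minimal PMI estimate from per-sentence co-occurrence counts, as a sketch only (the paper's exact estimation may differ):

```python
import math
from collections import Counter
from itertools import combinations

def pmi_pairs(sentences):
    """Estimate PMI(x, y) = log( p(x, y) / (p(x) * p(y)) ) for token pairs,
    where probabilities are fractions of sentences containing the token(s)."""
    n = len(sentences)
    word = Counter()
    pair = Counter()
    for s in sentences:
        toks = set(s.split())          # presence per sentence, not counts
        word.update(toks)
        pair.update(frozenset(p) for p in combinations(sorted(toks), 2))
    pmi = {}
    for p, c in pair.items():
        x, y = sorted(p)
        pmi[(x, y)] = math.log((c / n) / ((word[x] / n) * (word[y] / n)))
    return pmi

# Toy corpus: "a" and "b" always co-occur, so their PMI is positive.
pmi = pmi_pairs(["a b", "a b", "c d"])
```

Edges with high PMI mark token pairs that co-occur more often than chance, and those weights define the relation graph a GCN would then convolve over.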
Learning to Branch in Combinatorial Optimization with Graph Pointer Networks
Branch-and-bound is a typical way to solve combinatorial optimization
problems. This paper proposes a graph pointer network model for learning the
variable selection policy in the branch-and-bound. We extract the graph
features, global features and historical features to represent the solver
state. The proposed model, which combines the graph neural network and the
pointer mechanism, can effectively map from the solver state to the branching
variable decisions. The model is trained to imitate the classic strong
branching expert rule by a designed top-k Kullback-Leibler divergence loss
function. Experiments on a series of benchmark problems demonstrate that the
proposed approach significantly outperforms the widely used expert-designed
branching rules. Our approach also outperforms the state-of-the-art
machine-learning-based branch-and-bound methods in terms of solving speed and
search tree size on all the test instances. In addition, the model can
generalize to unseen instances and scale to larger instances.
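The top-k Kullback-Leibler imitation loss can be sketched as follows; the renormalisation choices here are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def topk_kl_loss(expert_scores, model_probs, k=3):
    """KL divergence restricted to the expert's top-k candidate variables:
    the strong-branching scores are softmax-normalised over their top k,
    and the model's probabilities are renormalised on the same support."""
    idx = np.argsort(expert_scores)[-k:]               # expert's top-k variables
    p = np.exp(expert_scores[idx] - expert_scores[idx].max())
    p /= p.sum()                                       # expert target distribution
    q = model_probs[idx] / model_probs[idx].sum()      # model, same support
    return float(np.sum(p * np.log(p / q)))

expert = np.array([3.0, 1.0, 2.0, 0.5])    # toy strong-branching scores
loss = topk_kl_loss(expert, np.exp(expert))  # model that matches the expert
```

Restricting the divergence to the expert's top-k candidates focuses training on the variables that actually matter for branching, rather than penalising the model on the long tail of clearly bad choices.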
Open Vocabulary Learning on Source Code with a Graph-Structured Cache
Machine learning models that take computer program source code as input
typically use Natural Language Processing (NLP) techniques. However, a major
challenge is that code is written using an open, rapidly changing vocabulary
due to, e.g., the coinage of new variable and method names. Reasoning over such
a vocabulary is not something for which most NLP methods are designed. We
introduce a Graph-Structured Cache to address this problem; this cache contains
a node for each new word the model encounters with edges connecting each word
to its occurrences in the code. We find that combining this graph-structured
cache strategy with recent Graph-Neural-Network-based models for supervised
learning on code improves the models' performance on a code completion task and
a variable naming task --- with a substantial relative improvement on the latter
--- at the cost of a moderate increase in computation time.
Comment: Published in the International Conference on Machine Learning (ICML 2019), 13 pages
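The cache idea can be illustrated with a toy structure: each distinct word gets a node the first time it appears, with edges recording where it occurs in the code. Class and method names here are invented for illustration; in the paper the cache lives inside a graph neural network, not a Python dict:

```python
from collections import defaultdict

class GraphStructuredCache:
    """Toy open-vocabulary cache: one node per word seen so far, with
    edges to the code locations where that word occurs. Unseen words get
    a node on first encounter, so the vocabulary never needs to be fixed."""
    def __init__(self):
        self.occurrences = defaultdict(list)   # word -> [location, ...]

    def observe(self, word, location):
        self.occurrences[word].append(location)

    def edges(self, word):
        return list(self.occurrences[word])

cache = GraphStructuredCache()
for loc, tok in enumerate("my_var = my_var + 1".split()):
    cache.observe(tok, loc)
```

A model can then point back into these occurrence edges when completing code or naming a variable, instead of predicting from a closed softmax vocabulary that cannot contain newly coined identifiers.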