Edge-labeling Graph Neural Network for Few-shot Learning
In this paper, we propose a novel edge-labeling graph neural network (EGNN),
which adapts a deep neural network on the edge-labeling graph, for few-shot
learning. The previous graph neural network (GNN) approaches in few-shot
learning have been based on the node-labeling framework, which implicitly
models the intra-cluster similarity and the inter-cluster dissimilarity. In
contrast, the proposed EGNN learns to predict edge-labels rather than
node-labels on the graph, which enables an explicit clustering to evolve by
iteratively updating the edge-labels with direct exploitation of both the
intra-cluster similarity and the inter-cluster dissimilarity. It is also well
suited to varying numbers of classes without retraining, and can be easily
extended to perform transductive inference. The parameters of the
EGNN are learned by episodic training with an edge-labeling loss to obtain a
well-generalizable model for unseen low-data problems. On both supervised
and semi-supervised few-shot image classification tasks with two benchmark
datasets, the proposed EGNN significantly improves the performances over the
existing GNNs.
Comment: accepted to CVPR 2019
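To make the edge-label update concrete, below is a minimal PyTorch sketch of one such iteration, assuming pairwise node-feature differences feed a small similarity network; the names (EdgeUpdate, f_sim) and shapes are illustrative assumptions, not the authors' architecture.

import torch
import torch.nn as nn

class EdgeUpdate(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # f_sim maps a node-pair difference to an edge "same-class" score
        self.f_sim = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, nodes, edges):
        # nodes: (N, dim); edges: (N, N) current edge labels in [0, 1]
        diff = (nodes.unsqueeze(1) - nodes.unsqueeze(0)).abs()    # (N, N, dim)
        sim = torch.sigmoid(self.f_sim(diff)).squeeze(-1)         # (N, N)
        edges = sim * edges                                       # pull similar pairs together
        return edges / edges.sum(dim=-1, keepdim=True).clamp_min(1e-8)  # row-normalize

nodes = torch.randn(5, 16)
edges = torch.full((5, 5), 0.2)            # uniform initial edge labels
print(EdgeUpdate(16)(nodes, edges).shape)  # torch.Size([5, 5])
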
Relational Graph Attention Networks
We investigate Relational Graph Attention Networks, a class of models that
extends non-relational graph attention mechanisms to incorporate relational
information, opening up these methods to a wider variety of problems. A
thorough evaluation of these models is performed, and comparisons are made
against established benchmarks. To provide a meaningful comparison, we retrain
Relational Graph Convolutional Networks, the spectral counterpart of Relational
Graph Attention Networks, and evaluate them under the same conditions. We find
that Relational Graph Attention Networks perform worse than anticipated,
although some configurations are marginally beneficial for modelling molecular
properties. We provide insights as to why this may be, and suggest both
modifications to evaluation strategies, as well as directions to investigate
for future work.
Comment: 10 pages + 8 pages of appendices. Layer implementation available at
https://github.com/Babylonpartners/rgat
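For intuition, a minimal sketch of relation-aware attention follows, assuming one weight matrix and one attention vector per relation type; this is an illustrative reading of the idea, not the released layer (see the repository above for the official implementation).

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyRGAT(nn.Module):
    def __init__(self, in_dim, out_dim, num_rels):
        super().__init__()
        self.W = nn.Parameter(torch.randn(num_rels, in_dim, out_dim) * 0.1)
        self.a = nn.Parameter(torch.randn(num_rels, 2 * out_dim) * 0.1)

    def forward(self, h, adj):
        # h: (N, in_dim); adj: (R, N, N) binary adjacency per relation
        out = 0
        for r in range(adj.size(0)):
            hr = h @ self.W[r]                                   # (N, out_dim)
            pair = torch.cat([hr.unsqueeze(1).expand(-1, hr.size(0), -1),
                              hr.unsqueeze(0).expand(hr.size(0), -1, -1)], dim=-1)
            e = F.leaky_relu(pair @ self.a[r])                   # (N, N) attention logits
            e = e.masked_fill(adj[r] == 0, float('-inf'))        # attend only along edges
            alpha = torch.softmax(e, dim=-1).nan_to_num(0.0)     # isolated rows -> 0
            out = out + alpha @ hr                               # sum over relations
        return out

h = torch.randn(4, 8)
adj = torch.randint(0, 2, (2, 4, 4)).float()
print(TinyRGAT(8, 16, 2)(h, adj).shape)  # torch.Size([4, 16])
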
Learning to learn via Self-Critique
In few-shot learning, a machine learning system learns from a small set of
labelled examples relating to a specific task, such that it can generalize to
new examples of the same task. Given the limited availability of labelled
examples in such tasks, we wish to make use of all the information we can.
Usually a model learns task-specific information from a small training-set
(support-set) to predict on an unlabelled validation set (target-set). The
target-set contains additional task-specific information which is not utilized
by existing few-shot learning methods. Making use of the target-set examples
via transductive learning requires approaches beyond the current methods; at
inference time, the target-set contains only unlabelled input data-points, and
so discriminative learning cannot be used. In this paper, we propose a
framework called Self-Critique and Adapt (SCA), which learns to learn a
label-free loss function, parameterized as a neural network. A base-model
learns on a support-set using existing methods (e.g. stochastic gradient
descent combined with the cross-entropy loss), and then is updated for the
incoming target-task using the learnt loss function. This label-free loss
function is itself optimized such that the learnt model achieves higher
generalization performance. Experiments demonstrate that SCA offers
substantially reduced error-rates compared to baselines which only adapt on the
support-set, and results in state-of-the-art benchmark performance on
Mini-ImageNet and Caltech-UCSD Birds 200.
Comment: Accepted at NeurIPS 2019
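A minimal sketch of the label-free inner loop, under the assumption that a small critic network scores the base model's target-set predictions and the base model takes one gradient step on that score; the two-layer shapes and names are placeholders, not the paper's architecture.

import torch
import torch.nn as nn

base = nn.Linear(32, 5)                        # stand-in base model
loss_net = nn.Sequential(nn.Linear(5, 16), nn.ReLU(), nn.Linear(16, 1))

target_x = torch.randn(10, 32)                 # unlabelled target-set inputs
probs = base(target_x).softmax(dim=-1)         # predictions, no labels needed
inner_loss = loss_net(probs).mean()            # learned, label-free objective

# One inner-loop adaptation step on the base model only
grads = torch.autograd.grad(inner_loss, list(base.parameters()), create_graph=True)
adapted = [p - 0.01 * g for p, g in zip(base.parameters(), grads)]
# In meta-training, loss_net is optimized through this step (create_graph=True),
# so that the adapted parameters achieve higher generalization performance.
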
Information Extraction from Scientific Literature for Method Recommendation
As a research community grows, more and more papers are published each year.
As a result, there is increasing demand for improved methods for finding
relevant papers, automatically understanding the key ideas and recommending
potential methods for a target problem. Despite advances in search engines, it
is still hard to identify new technologies according to a researcher's need.
Due to the large variety of domains and extremely limited annotated resources,
there has been relatively little work on leveraging natural language processing
in scientific recommendation. In this proposal, we aim at making scientific
recommendations by extracting scientific terms from a large collection of
scientific papers and organizing the terms into a knowledge graph. In
preliminary work, we trained a scientific term extractor using a small amount
of annotated data and obtained state-of-the-art performance by leveraging a
large amount of unannotated papers through multiple semi-supervised
approaches. We propose to construct a knowledge graph in a way that can make
minimal use of hand annotated data, using only the extracted terms,
unsupervised relational signals such as co-occurrence, and structural external
resources such as Wikipedia. Latent relations between scientific terms can be
learned from the graph. Recommendations will be made through graph inference
for both observed and unobserved relational pairs.
Comment: Thesis Proposal. arXiv admin note: text overlap with arXiv:1708.0607
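As a toy illustration of the unsupervised co-occurrence signal described above, the following sketch builds weighted term-pair edges from extractor output; the example terms and support threshold are assumptions.

from collections import Counter
from itertools import combinations

abstracts_terms = [                      # output of a term extractor (toy data)
    {"graph neural network", "few-shot learning", "attention"},
    {"graph neural network", "node classification"},
    {"few-shot learning", "meta-learning"},
]

cooc = Counter()
for terms in abstracts_terms:
    for a, b in combinations(sorted(terms), 2):
        cooc[(a, b)] += 1                # count same-document co-occurrences

# Keep pairs above a minimum support as knowledge-graph edges
edges = {pair: n for pair, n in cooc.items() if n >= 1}
for (a, b), n in edges.items():
    print(f"{a} --({n})-- {b}")
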
Machine Learning on Graphs: A Model and Comprehensive Taxonomy
There has been a surge of recent interest in learning representations for
graph-structured data. Graph representation learning methods have generally
fallen into three main categories, based on the availability of labeled data.
The first, network embedding (such as shallow graph embedding or graph
auto-encoders), focuses on learning unsupervised representations of relational
structure. The second, graph regularized neural networks, leverages graphs to
augment neural network losses with a regularization objective for
semi-supervised learning. The third, graph neural networks, aims to learn
differentiable functions over discrete topologies with arbitrary structure.
However, despite the popularity of these areas there has been surprisingly
little work on unifying the three paradigms. Here, we aim to bridge the gap
between graph neural networks, network embedding and graph regularization
models. We propose a comprehensive taxonomy of representation learning methods
for graph-structured data, aiming to unify several disparate bodies of work.
Specifically, we propose a Graph Encoder Decoder Model (GRAPHEDM), which
generalizes popular algorithms for semi-supervised learning on graphs (e.g.
GraphSage, Graph Convolutional Networks, Graph Attention Networks), and
unsupervised learning of graph representations (e.g. DeepWalk, node2vec, etc.)
into a single consistent approach. To illustrate the generality of this
approach, we fit over thirty existing methods into this framework. We believe
that this unifying view both provides a solid foundation for understanding the
intuition behind these methods, and enables future research in the area.
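The encoder-decoder view can be sketched as an interface, shown below with a shallow-embedding instance (lookup-table encoder, inner-product decoder); the interface names are illustrative, not the paper's notation.

import torch
import torch.nn as nn

class GraphEncoderDecoder(nn.Module):
    def __init__(self, encoder, decoder):
        super().__init__()
        self.encoder, self.decoder = encoder, decoder

    def forward(self, A, X):
        Z = self.encoder(A, X)             # node embeddings
        return Z, self.decoder(Z)          # reconstruction or predictions

# Shallow-embedding instance: methods differ in the loss attached on top
N, d = 6, 4
emb = nn.Embedding(N, d)                   # lookup table (DeepWalk-style)
enc = lambda A, X: emb.weight              # ignores the graph and features
dec = lambda Z: torch.sigmoid(Z @ Z.t())   # inner-product edge decoder

model = GraphEncoderDecoder(enc, dec)
A, X = torch.eye(N), torch.randn(N, 3)
Z, A_hat = model(A, X)
print(Z.shape, A_hat.shape)  # torch.Size([6, 4]) torch.Size([6, 6])
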
Adversarial Attacks on Neural Networks for Graph Data
Deep learning models for graphs have achieved strong performance for the task
of node classification. Despite their proliferation, currently there is no
study of their robustness to adversarial attacks. Yet, in domains where they
are likely to be used, e.g. the web, adversaries are common. Can deep learning
models for graphs be easily fooled? In this work, we introduce the first study
of adversarial attacks on attributed graphs, specifically focusing on models
exploiting ideas of graph convolutions. In addition to attacks at test time, we
tackle the more challenging class of poisoning/causative attacks, which focus
on the training phase of a machine learning model. We generate adversarial
perturbations targeting the node's features and the graph structure, thus
taking the dependencies between instances into account. Moreover, we ensure that
the perturbations remain unnoticeable by preserving important data
characteristics. To cope with the underlying discrete domain, we propose an
efficient algorithm, Nettack, that exploits incremental computations. Our
experimental study shows that the accuracy of node classification drops
significantly even when performing only a few perturbations. Even more, our attacks are
transferable: the learned attacks generalize to other state-of-the-art node
classification models and unsupervised approaches, and likewise are successful
even when only limited knowledge about the graph is given.
Comment: Accepted as a full paper at KDD 2018 on May 6, 2018
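To illustrate the flavour of a discrete, greedy attack (not Nettack's actual algorithm, which uses a surrogate model and incremental scoring), here is a toy sketch that flips single binary features of the target node whenever the flip reduces its correct-class margin; the linear surrogate and budget are assumptions.

import torch

def margin(model, X, A, target, label):
    with torch.no_grad():                       # scoring only, no gradients
        logits = model(A @ X)                   # one-layer linear surrogate
        other = logits[target].clone()
        other[label] = float('-inf')
        return (logits[target, label] - other.max()).item()

def greedy_feature_attack(model, X, A, target, label, budget=3):
    X = X.clone()
    for _ in range(budget):
        best, best_flip = margin(model, X, A, target, label), None
        for j in range(X.size(1)):              # candidate flips on the target node
            X[target, j] = 1 - X[target, j]
            if (m := margin(model, X, A, target, label)) < best:
                best, best_flip = m, j
            X[target, j] = 1 - X[target, j]     # undo
        if best_flip is None:                   # no flip helps any more
            break
        X[target, best_flip] = 1 - X[target, best_flip]
    return X

model = torch.nn.Linear(8, 3)
X = torch.randint(0, 2, (5, 8)).float()
A = torch.eye(5) + 0.5 * torch.ones(5, 5) / 5   # toy normalized adjacency
X_adv = greedy_feature_attack(model, X, A, target=0, label=1)
print((X_adv != X).sum().item(), "features flipped")
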
Lifted Convex Quadratic Programming
Symmetry is the essential element of lifted inference, which has recently
demonstrated the possibility of performing very efficient inference in
highly-connected but symmetric probabilistic models. This raises the question
of whether this holds for optimisation problems in general. Here we show
that for a large class of optimisation methods this is actually the case. More
precisely, we introduce the concept of fractional symmetries of convex
quadratic programs (QPs), which lie at the heart of many machine learning
approaches, and exploit it to lift, i.e., to compress QPs. These lifted QPs can
then be tackled with the usual optimization toolbox (off-the-shelf solvers,
cutting plane algorithms, stochastic gradients etc.). If the original QP
exhibits symmetry, then the lifted one will generally be more compact, and
hence its optimization is likely to be more efficient.
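A toy numeric illustration of the compression idea, assuming a QP whose two variables are interchangeable: an optimum with x1 = x2 exists, so the problem reduces to a single variable. This shows only the symmetry intuition, not the paper's fractional-symmetry machinery.

import numpy as np

Q = np.array([[2.0, 1.0], [1.0, 2.0]])   # invariant under swapping x1 <-> x2
c = np.array([-3.0, -3.0])

# Full unconstrained QP: minimize 0.5 x^T Q x + c^T x  =>  solve Q x = -c
x_full = np.linalg.solve(Q, -c)

# Lifted QP: substitute x1 = x2 = y, giving 0.5 * (sum Q) * y^2 + (sum c) * y
q_lift, c_lift = Q.sum(), c.sum()
y = -c_lift / q_lift
x_lifted = np.array([y, y])

print(x_full, x_lifted)   # both recover x1 = x2 = 1.0
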
Recurrent Collective Classification
We propose a new method for training iterative collective classifiers for
labeling nodes in network data. The iterative classification algorithm (ICA) is
a canonical method for incorporating relational information into
classification. Yet, existing methods for training ICA models rely on the
assumption that relational features reflect the true labels of the nodes. This
unrealistic assumption introduces a bias that is inconsistent with the actual
prediction algorithm. In this paper, we introduce recurrent collective
classification (RCC), a variant of ICA analogous to recurrent neural network
prediction. RCC accommodates any differentiable local classifier and relational
feature functions. We provide gradient-based strategies for optimizing over
model parameters to more directly minimize the loss function. In our
experiments, this direct loss minimization translates to improved accuracy and
robustness on real network data. We demonstrate the robustness of RCC in
settings where local classification is very noisy, settings that are
particularly challenging for ICA.
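A minimal sketch of the unrolled idea: each step feeds a local classifier the node features plus an aggregate of neighbours' current predicted distributions, and gradients flow through all steps so the loss is minimized directly; the mean aggregator and dimensions are illustrative assumptions.

import torch
import torch.nn as nn

class TinyRCC(nn.Module):
    def __init__(self, feat_dim, num_classes, steps=3):
        super().__init__()
        self.clf = nn.Linear(feat_dim + num_classes, num_classes)
        self.steps, self.num_classes = steps, num_classes

    def forward(self, X, A):
        # X: (N, feat_dim); A: (N, N) row-normalized adjacency
        probs = torch.full((X.size(0), self.num_classes), 1.0 / self.num_classes)
        for _ in range(self.steps):                  # unrolled ICA iterations
            rel = A @ probs                          # neighbours' current predictions
            probs = self.clf(torch.cat([X, rel], dim=-1)).softmax(dim=-1)
        return probs                                 # gradients flow through all steps

X, A = torch.randn(6, 10), torch.softmax(torch.randn(6, 6), dim=-1)
y = torch.randint(0, 4, (6,))
model = TinyRCC(10, 4)
loss = nn.functional.nll_loss(model(X, A).log(), y)  # direct loss minimization
loss.backward()
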
Graph U-Nets
We consider the problem of representation learning for graph data.
Convolutional neural networks can naturally operate on images, but have
significant challenges in dealing with graph data. Since images are special
cases of graphs whose nodes lie on 2D lattices, graph embedding tasks have a
natural correspondence with image pixel-wise prediction tasks such as
segmentation. While encoder-decoder architectures like U-Nets have been
successfully applied on many image pixel-wise prediction tasks, similar methods
are lacking for graph data. This is due to the fact that pooling and
up-sampling operations are not natural on graph data. To address these
challenges, we propose novel graph pooling (gPool) and unpooling (gUnpool)
operations in this work. The gPool layer adaptively selects some nodes to form
a smaller graph based on their scalar projection values on a trainable
projection vector. We further propose the gUnpool layer as the inverse
operation of the gPool layer. The gUnpool layer restores the graph into its
original structure using the position information of nodes selected in the
corresponding gPool layer. Based on our proposed gPool and gUnpool layers, we
develop an encoder-decoder model on graphs, known as the graph U-Net. Our
experimental results on node classification and graph classification tasks
demonstrate that our methods achieve consistently better performance than
previous models.
Comment: 10 pages, ICML 2019
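Since the abstract fully specifies the selection rule, a short sketch is possible: project node features onto a trainable vector, keep the top-k nodes, and gate the kept features by a squashed score so the projection vector receives gradients. The tanh gate and shapes here are assumptions.

import torch
import torch.nn as nn

class GPool(nn.Module):
    def __init__(self, dim, k):
        super().__init__()
        self.p = nn.Parameter(torch.randn(dim))   # trainable projection vector
        self.k = k

    def forward(self, X, A):
        scores = X @ self.p / self.p.norm()       # scalar projection per node
        idx = scores.topk(self.k).indices         # select top-k scoring nodes
        X_small = X[idx] * torch.tanh(scores[idx]).unsqueeze(-1)  # gated features
        A_small = A[idx][:, idx]                  # induced subgraph
        return X_small, A_small, idx              # idx lets gUnpool restore positions

X, A = torch.randn(8, 16), (torch.rand(8, 8) > 0.5).float()
X2, A2, idx = GPool(16, k=4)(X, A)
print(X2.shape, A2.shape)  # torch.Size([4, 16]) torch.Size([4, 4])
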
Linearized and Single-Pass Belief Propagation
How can we tell when accounts are fake or real in a social network? And how
can we tell which accounts belong to liberal, conservative or centrist users?
Often, we can answer such questions and label nodes in a network based on the
labels of their neighbors and appropriate assumptions of homophily ("birds of a
feather flock together") or heterophily ("opposites attract"). One of the most
widely used methods for this kind of inference is Belief Propagation (BP) which
iteratively propagates the information from a few nodes with explicit labels
throughout a network until convergence. One main problem with BP, however, is
that there are no known exact guarantees of convergence in graphs with loops.
This paper introduces Linearized Belief Propagation (LinBP), a linearization
of BP that allows a closed-form solution via intuitive matrix equations and,
thus, comes with convergence guarantees. It handles homophily, heterophily, and
more general cases that arise in multi-class settings. Plus, it allows a
compact implementation in SQL. The paper also introduces Single-pass Belief
Propagation (SBP), a "localized" version of LinBP that propagates information
across every edge at most once and for which the final class assignments depend
only on the nearest labeled neighbors. In addition, SBP allows fast incremental
updates in dynamic networks. Our runtime experiments show that LinBP and SBP
are orders of magnitude faster than standard BP.
Comment: 17 pages, 11 figures, 4 algorithms. Includes the following major
changes since v1: renaming of "turbo BP" to "single-pass BP", convergence
criteria now give sufficient *and* necessary conditions, more detailed
experiments, more detailed comparison with prior BP convergence results,
overall improved exposition.
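A small numeric sketch of the linearized, matrix-equation flavour: beliefs B are repeatedly propagated from the explicit labels E over adjacency A with a coupling matrix H. The paper's exact centering and scaling are omitted; this only illustrates the fixed-point form B = E + A B H, with a toy graph and coupling chosen as assumptions.

import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)       # 4-node cycle graph
E = np.zeros((4, 2)); E[0, 0] = 1; E[3, 1] = 1  # two explicitly labelled nodes
H = 0.1 * np.array([[1, -1], [-1, 1]])          # homophilous residual coupling

B = E.copy()
for _ in range(50):                             # fixed-point iteration B = E + A B H
    B = E + A @ B @ H                           # converges when rho(H (x) A) < 1

print(B.round(3))  # beliefs for the unlabelled nodes 1 and 2
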