Invariance-Preserving Localized Activation Functions for Graph Neural Networks
Graph signals are signals with an irregular structure that can be described
by a graph. Graph neural networks (GNNs) are information processing
architectures tailored to these graph signals and made of stacked layers that
compose graph convolutional filters with nonlinear activation functions. Graph
convolutions endow GNNs with invariance to permutations of the graph nodes'
labels. In this paper, we consider the design of trainable nonlinear activation
functions that take into consideration the structure of the graph. This is
accomplished by using graph median filters and graph max filters, which mimic
linear graph convolutions and are shown to retain the permutation invariance of
GNNs. We also discuss modifications to the backpropagation algorithm necessary
to train local activation functions. The advantages of localized activation
function architectures are demonstrated in four numerical experiments: source
localization on synthetic graphs, authorship attribution of 19th century
novels, movie recommender systems and scientific article classification. In all
cases, localized activation functions are shown to improve model capacity.
Comment: Accepted at TS
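As a rough illustration of the graph max filter idea in the abstract above, the sketch below replaces each node's value with the maximum over its closed 1-hop neighborhood and checks that the operation commutes with a relabeling of the nodes. It is a minimal, untrained stand-in: the function names `max_activation` and `permute`, the path graph, and the signal values are all assumptions made for illustration, and the paper's trainable activation is not reproduced here.

```python
def max_activation(adj, x):
    """Replace each node's value by the max over its closed 1-hop neighborhood."""
    n = len(x)
    return [max([x[i]] + [x[j] for j in range(n) if adj[i][j]]) for i in range(n)]


def permute(adj, x, perm):
    """Relabel nodes so that new node a is old node perm[a]."""
    n = len(x)
    adj_p = [[adj[perm[a]][perm[b]] for b in range(n)] for a in range(n)]
    x_p = [x[perm[a]] for a in range(n)]
    return adj_p, x_p


# Hypothetical example: path graph 0-1-2-3 with an arbitrary node signal.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
x = [0.5, -1.0, 2.0, 0.1]
perm = [2, 0, 3, 1]

y = max_activation(adj, x)                 # activation on the original labeling
adj_p, x_p = permute(adj, x, perm)
y_p = max_activation(adj_p, x_p)           # activation on the relabeled graph

# Relabeling then activating equals activating then relabeling.
relabeled_y = [y[perm[a]] for a in range(len(x))]
print(y, y_p == relabeled_y)
```

Because the neighborhood maximum depends only on which nodes are adjacent, not on how they are numbered, the final comparison holds for any permutation.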
Axiomatic Attribution for Deep Networks
We study the problem of attributing the prediction of a deep network to its
input features, a problem previously studied by several other works. We
identify two fundamental axioms, Sensitivity and Implementation Invariance,
that attribution methods ought to satisfy. We show that they are not satisfied
by most known attribution methods, which we consider to be a fundamental
weakness of those methods. We use the axioms to guide the design of a new
attribution method called Integrated Gradients. Our method requires no
modification to the original network and is extremely simple to implement; it
just needs a few calls to the standard gradient operator. We apply this method
to a couple of image models, a couple of text models and a chemistry model,
demonstrating its ability to debug networks, to extract rules from a network,
and to enable users to engage with models better.
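The "few calls to the standard gradient operator" mentioned above can be sketched as a Riemann-sum approximation of a path integral from a baseline input to the actual input. The example below is a minimal sketch under stated assumptions: the toy model `f`, its hand-written gradient `grad_f` (standing in for an autodiff gradient call), the weights, the all-zeros baseline, and the step count are all hypothetical choices, not taken from the paper.

```python
import math

WEIGHTS = [1.5, -2.0, 0.5]  # hypothetical fixed weights of the toy model


def f(x):
    """Toy differentiable model: sigmoid of a linear score."""
    score = sum(w * xi for w, xi in zip(WEIGHTS, x))
    return 1.0 / (1.0 + math.exp(-score))


def grad_f(x):
    """Analytic gradient of f; a real network would use autodiff here."""
    s = f(x)
    return [s * (1.0 - s) * w for w in WEIGHTS]


def integrated_gradients(x, baseline, steps=200):
    """Midpoint Riemann sum of gradients along the straight line baseline -> x."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        total = [t + g for t, g in zip(total, grad_f(point))]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


x = [1.0, -0.5, 2.0]
baseline = [0.0, 0.0, 0.0]
attr = integrated_gradients(x, baseline)

# The attributions should sum (approximately) to f(x) - f(baseline).
gap = abs(sum(attr) - (f(x) - f(baseline)))
print(attr, gap)
```

The final check reflects the property that the attributions add up to the difference between the model output at the input and at the baseline, up to the numerical error of the Riemann sum.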
MeshCNN: A Network with an Edge
Polygonal meshes provide an efficient representation for 3D shapes. They
explicitly capture both shape surface and topology, and leverage non-uniformity
to represent large flat regions as well as sharp, intricate features. This
non-uniformity and irregularity, however, inhibit mesh analysis efforts using
neural networks that combine convolution and pooling operations. In this paper,
we utilize the unique properties of the mesh for a direct analysis of 3D shapes
using MeshCNN, a convolutional neural network designed specifically for
triangular meshes. Analogous to classic CNNs, MeshCNN combines specialized
convolution and pooling layers that operate on the mesh edges, by leveraging
their intrinsic geodesic connections. Convolutions are applied on edges and the
four edges of their incident triangles, and pooling is applied via an edge
collapse operation that retains surface topology, thereby generating new mesh
connectivity for the subsequent convolutions. MeshCNN learns which edges to
collapse, thus forming a task-driven process where the network exposes and
expands the important features while discarding the redundant ones. We
demonstrate the effectiveness of our task-driven pooling on various learning
tasks applied to 3D meshes.
Comment: For a two-minute explanation video see https://bit.ly/meshcnnvide
Machine Learning on Graphs: A Model and Comprehensive Taxonomy
There has been a surge of recent interest in learning representations for
graph-structured data. Graph representation learning methods have generally
fallen into three main categories, based on the availability of labeled data.
The first, network embedding (such as shallow graph embedding or graph
auto-encoders), focuses on learning unsupervised representations of relational
structure. The second, graph regularized neural networks, leverages graphs to
augment neural network losses with a regularization objective for
semi-supervised learning. The third, graph neural networks, aims to learn
differentiable functions over discrete topologies with arbitrary structure.
However, despite the popularity of these areas there has been surprisingly
little work on unifying the three paradigms. Here, we aim to bridge the gap
between graph neural networks, network embedding and graph regularization
models. We propose a comprehensive taxonomy of representation learning methods
for graph-structured data, aiming to unify several disparate bodies of work.
Specifically, we propose a Graph Encoder Decoder Model (GRAPHEDM), which
generalizes popular algorithms for semi-supervised learning on graphs (e.g.
GraphSage, Graph Convolutional Networks, Graph Attention Networks), and
unsupervised learning of graph representations (e.g. DeepWalk, node2vec)
into a single consistent approach. To illustrate the generality of this
approach, we fit over thirty existing methods into this framework. We believe
that this unifying view both provides a solid foundation for understanding the
intuition behind these methods, and enables future research in the area.
A Comprehensive Survey on Graph Neural Networks
Deep learning has revolutionized many machine learning tasks in recent years,
ranging from image classification and video processing to speech recognition
and natural language understanding. The data in these tasks are typically
represented in the Euclidean space. However, there is an increasing number of
applications where data are generated from non-Euclidean domains and are
represented as graphs with complex relationships and interdependency between
objects. The complexity of graph data has imposed significant challenges on
existing machine learning algorithms. Recently, many studies on extending deep
learning approaches for graph data have emerged. In this survey, we provide a
comprehensive overview of graph neural networks (GNNs) in data mining and
machine learning fields. We propose a new taxonomy to divide the
state-of-the-art graph neural networks into four categories, namely recurrent
graph neural networks, convolutional graph neural networks, graph autoencoders,
and spatial-temporal graph neural networks. We further discuss the applications
of graph neural networks across various domains and summarize the open source
codes, benchmark data sets, and model evaluation of graph neural networks.
Finally, we propose potential research directions in this rapidly growing
field.
Comment: Minor revision (updated tables and references
Convolutional Neural Networks for Fast Approximation of Graph Edit Distance
Graph Edit Distance (GED) computation is a core operation of many widely-used
graph applications, such as graph classification, graph matching, and graph
similarity search. However, computing the exact GED between two graphs is
NP-complete. Most current approximate algorithms are based on solving a
combinatorial optimization problem, which involves complicated design and high
time complexity. In this paper, we propose a novel end-to-end neural network
based approach to GED approximation, aiming to alleviate the computational
burden while preserving good performance. The proposed approach, named GSimCNN,
turns GED computation into a learning problem. Each graph is considered as a
set of nodes, represented by learnable embedding vectors. The GED computation
is then considered as a two-set matching problem, where a higher matching score
leads to a lower GED. A Convolutional Neural Network (CNN) based approach is
proposed to tackle the set matching problem. We test our algorithm on three
real graph datasets, and our model achieves significant performance enhancement
against state-of-the-art approximate GED computation algorithms.
Comment: arXiv admin note: text overlap with arXiv:1808.0568
Griffiths phases and the stretching of criticality in brain networks
Hallmarks of criticality, such as power-laws and scale invariance, have been
empirically found in cortical networks and it has been conjectured that
operating at criticality entails functional advantages, such as optimal
computational capabilities, memory, and large dynamical ranges. As critical
behavior requires a high degree of fine tuning to emerge, some type of
self-tuning mechanism needs to be invoked. Here we show that, taking into
account the complex hierarchical-modular architecture of cortical networks, the
singular critical point is replaced by an extended critical-like region which
corresponds, in the jargon of statistical mechanics, to a Griffiths phase.
Using computational and analytical approaches, we find Griffiths phases in
synthetic hierarchical networks and also in empirical brain networks such as
the human connectome and that of Caenorhabditis elegans. Stretched critical
regions, stemming from structural disorder, yield enhanced functionality in a
generic way, facilitating the task of self-organizing, adaptive, and
evolutionary mechanisms selecting for criticality.
Comment: Final version. A misprint in Equation (2) was corrected.
Supplementary Information include
Provably Powerful Graph Networks
Recently, the Weisfeiler-Lehman (WL) graph isomorphism test was used to
measure the expressive power of graph neural networks (GNN). It was shown that
the popular message passing GNN cannot distinguish between graphs that are
indistinguishable by the 1-WL test (Morris et al. 2018; Xu et al. 2019).
Unfortunately, many simple instances of graphs are indistinguishable by the
1-WL test.
In search for more expressive graph learning models we build upon the recent
k-order invariant and equivariant graph neural networks (Maron et al. 2019a,b)
and present two results:
First, we show that such k-order networks can distinguish between
non-isomorphic graphs as well as the k-WL tests, which are provably stronger
than the 1-WL test for k>2. This makes these models strictly stronger than
message passing models. Unfortunately, the higher expressiveness of these
models comes with a computational cost of processing high order tensors.
Second, setting our goal at building a provably stronger, simple and scalable
model, we show that a reduced 2-order network containing just a scaled identity
operator, augmented with a single quadratic operation (matrix multiplication)
has a provable 3-WL expressive power. Put differently, we suggest a simple
model that interleaves applications of standard Multilayer-Perceptron (MLP)
applied to the feature dimension and matrix multiplication. We validate this
model by presenting state of the art results on popular graph classification
and regression tasks. To the best of our knowledge, this is the first practical
invariant/equivariant model with guaranteed 3-WL expressiveness, strictly
stronger than message passing models.
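To make the 1-WL limitation mentioned in this abstract concrete, the sketch below runs plain color refinement on two small graphs that it cannot tell apart: a 6-cycle and two disjoint triangles. The helper name `wl_histogram`, the choice of graphs, and the number of rounds are assumptions made for illustration. Colors are kept as nested tuples rather than compressed integers so that histograms from different graphs are directly comparable.

```python
from collections import Counter


def wl_histogram(adj_list, rounds=3):
    """Run 1-WL color refinement and return the histogram of final node colors."""
    colors = {v: () for v in adj_list}  # every node starts with the same color
    for _ in range(rounds):
        # New color = (own color, sorted multiset of neighbor colors).
        colors = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj_list[v])))
            for v in adj_list
        }
    return Counter(colors.values())


# Two non-isomorphic graphs in which every node has degree 2.
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}

# A 6-node path, included as a contrast that 1-WL does separate.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

same = wl_histogram(hexagon) == wl_histogram(two_triangles)
different = wl_histogram(hexagon) != wl_histogram(path)
print(same, different)
```

Since every node in both 2-regular graphs sees the same multiset of neighbor colors in every round, refinement never splits the initial color class, so the two histograms are identical even though one graph is connected and the other is not.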
Inductive Representation Learning on Large Graphs
Low-dimensional embeddings of nodes in large graphs have proved extremely
useful in a variety of prediction tasks, from content recommendation to
identifying protein functions. However, most existing approaches require that
all nodes in the graph are present during training of the embeddings; these
previous approaches are inherently transductive and do not naturally generalize
to unseen nodes. Here we present GraphSAGE, a general, inductive framework that
leverages node feature information (e.g., text attributes) to efficiently
generate node embeddings for previously unseen data. Instead of training
individual embeddings for each node, we learn a function that generates
embeddings by sampling and aggregating features from a node's local
neighborhood. Our algorithm outperforms strong baselines on three inductive
node-classification benchmarks: we classify the category of unseen nodes in
evolving information graphs based on citation and Reddit post data, and we show
that our algorithm generalizes to completely unseen graphs using a multi-graph
dataset of protein-protein interactions.
Comment: Published in NIPS 2017; version with full appendix and minor
correction
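The sample-and-aggregate idea described in this abstract can be sketched as a single layer that averages the features of a few sampled neighbors and combines the result with the node's own features. This is a minimal sketch, not the authors' implementation: the function name `sage_embed`, the mean aggregator, the hand-picked weight matrices, and the toy graph are all assumptions, and the weights are fixed rather than trained.

```python
import random


def sage_embed(node, features, neighbors, w_self, w_neigh, num_samples, rng):
    """One mean-aggregator layer: ReLU(W_self * h_v + W_neigh * mean(h_u))."""
    nbrs = neighbors[node]
    sampled = rng.sample(nbrs, min(num_samples, len(nbrs)))
    dim = len(features[node])
    if sampled:
        agg = [sum(features[u][d] for u in sampled) / len(sampled) for d in range(dim)]
    else:
        agg = [0.0] * dim  # isolated node: nothing to aggregate
    h = features[node]
    out = []
    for row_s, row_n in zip(w_self, w_neigh):
        z = sum(ws * hs for ws, hs in zip(row_s, h))
        z += sum(wn * an for wn, an in zip(row_n, agg))
        out.append(max(0.0, z))  # ReLU
    return out


# Hypothetical graph seen "during training".
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}

# Hypothetical weights, standing in for learned parameters.
w_self = [[1.0, 0.0], [0.0, 1.0]]
w_neigh = [[0.5, 0.5], [-1.0, 0.0]]

# A previously unseen node arrives with its own features and edges.
features["d"] = [2.0, 0.0]
neighbors["d"] = ["a", "b"]

rng = random.Random(0)
embedding_d = sage_embed("d", features, neighbors, w_self, w_neigh, 2, rng)
print(embedding_d)  # [2.5, 0.0]
```

The point of the sketch is that node "d" gets an embedding from the same function and weights used for the other nodes, without any per-node parameters having been trained for it.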
Emulating malware authors for proactive protection using GANs over a distributed image visualization of dynamic file behavior
Malware authors have always had the advantage of being able to
adversarially test and augment their malicious code, before deploying the
payload, using anti-malware products at their disposal. Anti-malware
developers and threat experts, on the other hand, do not have the privilege
of proactively tuning anti-malware products against zero-day attacks. This
lets malware authors stay a step ahead of the anti-malware products,
fundamentally biasing the cat and mouse game played by the two parties. In this
paper, we propose a way that would enable machine learning based threat
prevention models to bridge that gap by being able to tune against a deep
generative adversarial network (GAN), which takes up the role of a malware
author and generates new types of malware. The GAN is trained over a reversible
distributed RGB image representation of known malware behaviors, encoding the
sequence of API call ngrams and the corresponding term frequencies. The
generated images represent synthetic malware that can be decoded back to the
underlying API call sequence information. The image representation is not only
demonstrated as a general technique of incorporating necessary priors for
exploiting convolutional neural network architectures for generative or
discriminative modeling, but also as a visualization method for easy manual
software or malware categorization, by having individual API ngram information
distributed across the image space. In addition, we also propose using
smart-definitions for detecting malware based on perceptual hashing of these
images. Such hashes are potentially more effective than cryptographic hashes
that do not carry any meaningful similarity metric, and hence, do not
generalize well.
Comment: 22 pages, 12 figures, 4 table
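The contrast drawn above between perceptual and cryptographic hashes can be illustrated with a simple average hash, used here as a stand-in because the abstract does not say which perceptual hash the authors use. The function names, the 4x4 grayscale grids, and the pixel values are all hypothetical.

```python
import hashlib


def average_hash(pixels):
    """Perceptual-style hash: one bit per pixel, set when the pixel exceeds the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming(a, b):
    """Number of differing bits between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))


def sha256_hex(pixels):
    """Cryptographic hash of the raw pixel bytes."""
    return hashlib.sha256(bytes(p for row in pixels for p in row)).hexdigest()


# Hypothetical 4x4 grayscale "behavior image" (a checkerboard).
image = [
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
]

# Near-duplicate: a single pixel nudged from 10 to 12.
tweaked = [row[:] for row in image]
tweaked[0][0] = 12

# Structurally different image: the checkerboard inverted.
inverted = [[210 - p for p in row] for row in image]

near = hamming(average_hash(image), average_hash(tweaked))
far = hamming(average_hash(image), average_hash(inverted))
crypto_equal = sha256_hex(image) == sha256_hex(tweaked)
print(near, far, crypto_equal)
```

The near-duplicate keeps the same average-hash bits while its SHA-256 digest changes completely, which is the sense in which a perceptual hash carries a similarity metric and a cryptographic hash does not.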