Improving Graph Neural Network Expressivity via Subgraph Isomorphism Counting
While Graph Neural Networks (GNNs) have achieved remarkable results in a
variety of applications, recent studies exposed important shortcomings in their
ability to capture the structure of the underlying graph. It has been shown
that the expressive power of standard GNNs is bounded by the Weisfeiler-Leman
(WL) graph isomorphism test, from which they inherit proven limitations such as
the inability to detect and count graph substructures. On the other hand, there
is significant empirical evidence, e.g. in network science and bioinformatics,
that substructures are often intimately related to downstream tasks. To this
end, we propose "Graph Substructure Networks" (GSN), a topologically-aware
message passing scheme based on substructure encoding. We theoretically analyse
the expressive power of our architecture, showing that it is strictly more
expressive than the WL test, and provide sufficient conditions for
universality. Importantly, we do not attempt to adhere to the WL hierarchy;
this allows us to retain multiple attractive properties of standard GNNs such
as locality and linear network complexity, while being able to disambiguate
even hard instances of graph isomorphism. We perform an extensive experimental
evaluation on graph classification and regression tasks and obtain
state-of-the-art results in diverse real-world settings including molecular
graphs and social networks. The code is publicly available at
https://github.com/gbouritsas/graph-substructure-networks
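
The limitation described in the abstract above can be seen on a small example. The following is a minimal sketch, not the GSN implementation: the helper names `wl_colour_histogram` and `triangle_counts` and the two toy graphs are assumptions made for illustration. It runs 1-WL colour refinement on a hexagon and on two disjoint triangles, which the test cannot tell apart, and then shows that per-node triangle counts (one possible substructure feature) do separate them.

```python
from collections import Counter
from itertools import combinations


def wl_colour_histogram(adj, rounds=3):
    """Run 1-WL colour refinement and return the histogram of final colours."""
    colours = {v: 0 for v in adj}  # every node starts with the same colour
    for _ in range(rounds):
        # A node's new colour is its old colour plus the sorted multiset
        # of its neighbours' colours (kept as nested tuples for simplicity).
        colours = {
            v: (colours[v], tuple(sorted(colours[u] for u in adj[v])))
            for v in adj
        }
    return Counter(colours.values())


def triangle_counts(adj):
    """Count, for each node, the triangles that contain it."""
    return {
        v: sum(1 for a, b in combinations(sorted(adj[v]), 2) if b in adj[a])
        for v in adj
    }


# Two 2-regular graphs on six nodes (illustrative inputs, not from the paper).
hexagon = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
two_triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4, 5}, 4: {3, 5}, 5: {3, 4}}

# 1-WL assigns identical colour histograms to both graphs ...
same_wl = wl_colour_histogram(hexagon) == wl_colour_histogram(two_triangles)
# ... while triangle counts differ (0 per node versus 1 per node).
hexagon_triangles = triangle_counts(hexagon)
two_triangles_triangles = triangle_counts(two_triangles)
```

In this sketch the counts would be attached to nodes as extra input features; how GSN actually encodes and uses substructure counts is described in the paper and its repository, not here.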
Accurate and highly interpretable prediction of gene expression from histone modifications
Histone Mark Modifications (HMs) are crucial actors in gene regulation, as they actively remodel chromatin to modulate transcriptional activity: aberrant combinatorial patterns of HMs have been connected with several diseases, including cancer. HMs are, however, reversible modifications: understanding their role in disease would allow the design of 'epigenetic drugs' for specific, non-invasive treatments. Standard statistical techniques were not entirely successful in extracting representative features from raw HM signals over gene locations. On the other hand, deep learning approaches allow for effective automatic feature extraction, but at the expense of model interpretation.
Edge Directionality Improves Learning on Heterophilic Graphs
Graph Neural Networks (GNNs) have become the de facto standard tool for
modeling relational data. However, while many real-world graphs are directed,
the majority of today's GNN models discard this information altogether by
simply making the graph undirected. The reasons for this are historical: 1)
many early variants of spectral GNNs explicitly required undirected graphs, and
2) the first benchmarks on homophilic graphs did not find significant gain from
using direction. In this paper, we show that in heterophilic settings, treating
the graph as directed increases the effective homophily of the graph,
suggesting a potential gain from the correct use of directionality information.
To this end, we introduce Directed Graph Neural Network (Dir-GNN), a novel
general framework for deep learning on directed graphs. Dir-GNN can be used to
extend any Message Passing Neural Network (MPNN) to account for edge
directionality information by performing separate aggregations of the incoming
and outgoing edges. We prove that Dir-GNN matches the expressivity of the
Directed Weisfeiler-Lehman test, exceeding that of conventional MPNNs. In
extensive experiments, we validate that while our framework leaves performance
unchanged on homophilic datasets, it leads to large gains over base models such
as GCN, GAT and GraphSage on heterophilic benchmarks, outperforming much more
complex methods and achieving new state-of-the-art results.
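
The idea of aggregating incoming and outgoing edges separately can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the Dir-GNN code: the function names `mean` and `directional_aggregate` are hypothetical, mean aggregation is chosen arbitrarily, and the learned weight matrices that a real layer would apply to each part are omitted.

```python
def mean(vectors, dim):
    """Element-wise mean of a list of vectors; zeros if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def directional_aggregate(features, edges):
    """For each node, concatenate its own features with the mean of its
    in-neighbours' features and the mean of its out-neighbours' features.

    `edges` is a list of (source, target) pairs of node indices.
    """
    num_nodes, dim = len(features), len(features[0])
    incoming = {v: [] for v in range(num_nodes)}
    outgoing = {v: [] for v in range(num_nodes)}
    for src, dst in edges:
        incoming[dst].append(features[src])  # message along the edge
        outgoing[src].append(features[dst])  # message against the edge
    return [
        features[v] + mean(incoming[v], dim) + mean(outgoing[v], dim)
        for v in range(num_nodes)
    ]


# A directed path 0 -> 1 -> 2 and the same path with every edge reversed.
features = [[1.0], [2.0], [4.0]]
forward = directional_aggregate(features, [(0, 1), (1, 2)])
backward = directional_aggregate(features, [(1, 0), (2, 1)])
```

Node 1 receives `[2.0, 1.0, 4.0]` on the forward path and `[2.0, 4.0, 1.0]` on the reversed one, so the two orientations stay distinguishable. A single aggregation over the undirected neighbourhood would give node 1 the same result in both cases.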
Weisfeiler and Lehman Go Topological: Message Passing Simplicial Networks
The pairwise interaction paradigm of graph machine learning has predominantly
governed the modelling of relational systems. However, graphs alone cannot
capture the multi-level interactions present in many complex systems and the
expressive power of such schemes was proven to be limited. To overcome these
limitations, we propose Message Passing Simplicial Networks (MPSNs), a class of
models that perform message passing on simplicial complexes (SCs). To
theoretically analyse the expressivity of our model we introduce a Simplicial
Weisfeiler-Lehman (SWL) colouring procedure for distinguishing non-isomorphic
SCs. We relate the power of SWL to the problem of distinguishing non-isomorphic
graphs and show that SWL and MPSNs are strictly more powerful than the WL test
and not less powerful than the 3-WL test. We deepen the analysis by comparing
our model with traditional graph neural networks (GNNs) with ReLU activations
in terms of the number of linear regions of the functions they can represent.
We empirically support our theoretical claims by showing that MPSNs can
distinguish challenging strongly regular graphs for which GNNs fail and, when
equipped with orientation equivariant layers, they can improve classification
accuracy in oriented SCs compared to a GNN baseline.
Comment: ICML 2021. Contains 27 pages, 9 figures
Graph Neural Networks for Link Prediction with Subgraph Sketching
Many Graph Neural Networks (GNNs) perform poorly compared to simple
heuristics on Link Prediction (LP) tasks. This is due to limitations in
expressive power such as the inability to count triangles (the backbone of most
LP heuristics) and because they cannot distinguish automorphic nodes (those
having identical structural roles). Both expressiveness issues can be
alleviated by learning link (rather than node) representations and
incorporating structural features such as triangle counts. Since explicit link
representations are often prohibitively expensive, recent works resorted to
subgraph-based methods, which have achieved state-of-the-art performance for
LP, but suffer from poor efficiency due to high levels of redundancy between
subgraphs. We analyze the components of subgraph GNN (SGNN) methods for link
prediction. Based on our analysis, we propose a novel full-graph GNN called
ELPH (Efficient Link Prediction with Hashing) that passes subgraph sketches as
messages to approximate the key components of SGNNs without explicit subgraph
construction. ELPH is provably more expressive than Message Passing GNNs
(MPNNs). It outperforms existing SGNN models on many standard LP benchmarks
while being orders of magnitude faster. However, it shares the common GNN
limitation that it is only efficient when the dataset fits in GPU memory.
Accordingly, we develop a highly scalable model, called BUDDY, which uses
feature precomputation to circumvent this limitation without sacrificing
predictive performance. Our experiments show that BUDDY also outperforms SGNNs
on standard LP benchmarks while being highly scalable and faster than ELPH.
Comment: 29 pages, 19 figures, 6 appendices
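
To give a feel for how set sketches can stand in for explicit neighbourhoods, here is a minimal MinHash sketch. It is an illustration only: the abstract above mentions hashing and subgraph sketches but does not specify the scheme, so the choice of MinHash, the helper names (`minhash`, `merge`, `jaccard_estimate`), the hash function, and the sketch size are all assumptions. Two properties matter for message passing: a sketch of a union is the element-wise minimum of the two sketches, so sketches can be combined without revisiting the sets, and the fraction of matching positions estimates the Jaccard overlap of two neighbourhoods.

```python
import hashlib


def _hash(item, seed):
    """Deterministic 64-bit hash of `item` under a given seed."""
    digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash(items, num_hashes=128):
    """Sketch a non-empty set as the minimum hash value under each seed."""
    return [min(_hash(x, seed) for x in items) for seed in range(num_hashes)]


def merge(sketch_a, sketch_b):
    """Sketch of the union of two sets, computed from their sketches alone."""
    return [min(a, b) for a, b in zip(sketch_a, sketch_b)]


def jaccard_estimate(sketch_a, sketch_b):
    """Fraction of matching positions; estimates |A and B| / |A or B|."""
    return sum(a == b for a, b in zip(sketch_a, sketch_b)) / len(sketch_a)


# Hypothetical neighbourhoods of two nodes; their true Jaccard overlap is 1/3.
neighbours_u = set(range(0, 60))
neighbours_v = set(range(30, 90))

sketch_u = minhash(neighbours_u)
sketch_v = minhash(neighbours_v)
overlap = jaccard_estimate(sketch_u, sketch_v)
union_sketch = merge(sketch_u, sketch_v)
```

With 128 hash functions the estimate is approximate, so `overlap` should land near 1/3 rather than exactly on it; larger sketches trade memory for lower variance.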