Benchmarking Graph Neural Networks
Graph neural networks (GNNs) have become the standard toolkit for analyzing
and learning from data on graphs. As the field grows, it becomes critical to
identify key architectures and validate new ideas that generalize to larger,
more complex datasets. Unfortunately, it has been increasingly difficult to
gauge the effectiveness of new models in the absence of a standardized
benchmark with consistent experimental settings. In this paper, we introduce a
reproducible GNN benchmarking framework that makes it convenient for
researchers to add new models for arbitrary datasets. We demonstrate the
usefulness of our framework by presenting a principled investigation into the
recent Weisfeiler-Lehman GNNs (WL-GNNs) compared to message passing-based graph
convolutional networks (GCNs) for a variety of graph tasks, i.e., graph
regression/classification and node/link prediction, with medium-scale datasets.
Comment: Benchmarking framework on GitHub at
https://github.com/graphdeeplearning/benchmarking-gnns
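As a point of reference for the message passing-based GCNs compared in this
paper, the sketch below shows one symmetric-normalized graph convolution layer
in NumPy. This is a minimal illustration of the layer family, not code from the
benchmarking framework; the toy graph, features, and weights are made up.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One message-passing GCN layer: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])             # adjacency with self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(1))   # D^-1/2 as a vector
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)     # aggregate, transform, ReLU

# Toy 4-node path graph 0-1-2-3 with 2 input and 3 output features.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = np.random.randn(4, 2)
W = np.random.randn(2, 3)
print(gcn_layer(A, H, W).shape)  # (4, 3): one new feature vector per node
```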
Graph Generative Model for Benchmarking Graph Neural Networks
As the field of Graph Neural Networks (GNNs) continues to grow, so does the
need for large, real-world datasets to train and test new GNN models on
challenging, realistic problems. Unfortunately, such
graph datasets are often generated from online, highly privacy-restricted
ecosystems, which makes research and development on these datasets hard, if not
impossible. This greatly reduces the number of benchmark graphs available to
researchers, causing the field to rely on only a handful of publicly available
datasets. To address this problem, we introduce a novel graph generative model,
the Computation Graph Transformer (CGT), which learns and reproduces the
distribution of real-world graphs in a privacy-controlled way. More
specifically, CGT (1)
generates effective benchmark graphs on which GNNs show similar task
performance as on the source graphs, (2) scales to process large-scale graphs,
(3) incorporates off-the-shelf privacy modules to guarantee end-user privacy of
the generated graph. Extensive experiments across a wide range of graph
generative models show that only our model can successfully generate
privacy-controlled, synthetic substitutes of large-scale real-world graphs that
can be effectively used to benchmark GNN models.
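Criterion (1) above amounts to a fidelity check: a fixed GNN should score about
the same on the synthetic graph as on the source graph. A schematic sketch of
that comparison follows; `train_and_eval` is a hypothetical callable, not part
of the CGT paper's API.

```python
# Schematic fidelity check for a generated benchmark graph (criterion 1 above).
# `train_and_eval` is assumed to train a fixed GNN on a graph and return a
# task metric such as node-classification accuracy; it is not the CGT API.

def benchmark_fidelity(source_graph, synthetic_graph, train_and_eval):
    """Gap between GNN performance on the source and the generated graph."""
    score_source = train_and_eval(source_graph)
    score_synth = train_and_eval(synthetic_graph)
    return abs(score_source - score_synth)  # small gap => useful substitute
```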
IPC: A Benchmark Data Set for Learning with Graph-Structured Data
Benchmark data sets are an indispensable ingredient of the evaluation of
graph-based machine learning methods. We release a new data set, compiled from
International Planning Competitions (IPC), for benchmarking graph
classification, regression, and related tasks. Apart from the graph
construction (based on AI planning problems) that is interesting in its own
right, the data set possesses distinctly different characteristics from
popularly used benchmarks. The data set, named IPC, consists of two
self-contained versions, grounded and lifted, both containing graphs that are
large and have heavily skewed size distributions, posing substantial challenges
for the computation of graph models such as graph kernels and graph neural
networks.
The graphs in this data set are directed and the lifted version is acyclic,
offering the opportunity of benchmarking specialized models for directed
(acyclic) structures. Moreover, the graph generation and labeling are fully
programmatic; thus, the data set may easily be extended if a larger scale is
desired. The data set is accessible from
https://github.com/IBM/IPC-graph-data.
Comment: ICML 2019 Workshop on Learning and Reasoning with Graph-Structured
Data. The data set is accessible from https://github.com/IBM/IPC-graph-data
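Since the lifted graphs are directed and acyclic, DAG-specific tooling applies
directly. The snippet below is a small sanity check with networkx; the toy
edges stand in for a real lifted instance, whose on-disk format is not
described here.

```python
import networkx as nx

# Toy DAG standing in for a lifted IPC graph; the real data lives at
# https://github.com/IBM/IPC-graph-data in its own format.
G = nx.DiGraph([(0, 1), (0, 2), (1, 3), (2, 3)])

assert G.is_directed()
assert nx.is_directed_acyclic_graph(G)  # lifted IPC graphs should satisfy this

# A topological order exists only for DAGs and is a typical starting point
# for models specialized to directed acyclic structures.
print(list(nx.topological_sort(G)))  # e.g. [0, 1, 2, 3]
```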
Using Regular Languages to Explore the Representational Capacity of Recurrent Neural Architectures
The presence of Long Distance Dependencies (LDDs) in sequential data poses
significant challenges for computational models. Various recurrent neural
architectures have been designed to mitigate this issue. In order to test these
state-of-the-art architectures, there is a growing need for rich benchmarking
datasets. However, one drawback of existing datasets is the lack of
experimental control with regard to the presence and/or degree of LDDs. This
lack of control limits the analysis of model performance in relation to the
specific challenge posed by LDDs. One way to address this is to use synthetic
data having the properties of subregular languages. The degree of LDDs within
the generated data can be controlled through the parameter k, the length of the
generated strings, and the choice of appropriate forbidden strings. In this
paper, we explore the capacity of different RNN extensions to model LDDs by
evaluating these models on a sequence of synthesized SPk datasets, where each
subsequent dataset exhibits LDDs of greater length. Even though SPk languages
are simple, the presence of LDDs has a significant impact on the performance of
recurrent neural architectures, making them prime candidates for benchmarking
tasks.
Comment: International Conference on Artificial Neural Networks (ICANN) 2018
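To make the generation scheme concrete: an SPk (strictly piecewise) language
forbids certain length-k subsequences, so a brute-force generator only needs a
scattered-subsequence check. The sketch below is illustrative, not the authors'
released generator.

```python
from itertools import product

def contains_subsequence(s, pattern):
    """True if `pattern` occurs in `s` as a possibly non-contiguous subsequence."""
    it = iter(s)
    return all(ch in it for ch in pattern)  # `in` advances the iterator

def sp_k_strings(alphabet, length, forbidden):
    """All strings of a given length avoiding every forbidden subsequence."""
    for chars in product(alphabet, repeat=length):
        s = "".join(chars)
        if not any(contains_subsequence(s, f) for f in forbidden):
            yield s

# SP2 toy example over {a, b}: forbid the subsequence "ab".
print(list(sp_k_strings("ab", 3, ["ab"])))  # ['aaa', 'baa', 'bba', 'bbb']
```

Increasing the string length while holding k fixed stretches the distance over
which a forbidden pattern must be tracked, which is exactly the controllable
LDD knob described in the abstract.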