Graph Residual Flow for Molecular Graph Generation
Statistical generative models for molecular graphs have attracted attention from
researchers in the fields of bio- and chemo-informatics. Among these models,
invertible flow-based approaches have not yet been fully explored. In this
paper, we propose a powerful invertible flow for molecular graphs, called the
graph residual flow (GRF). The GRF is based on residual flows, which are known
to admit more flexible and complex non-linear mappings than traditional
coupling flows. We theoretically derive non-trivial conditions under which the
GRF is invertible, and present a way of keeping the entire flow invertible
throughout training and sampling. Experimental results show that a generative
model based on the proposed GRF achieves comparable generation performance with
a much smaller number of trainable parameters than existing flow-based models.
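For intuition, the sketch below shows a generic residual-flow block of the kind
the abstract builds on: the map y = x + g(x) is invertible whenever g is a
contraction, and the inverse can be recovered by fixed-point iteration. This is
a minimal, generic PyTorch illustration, not the graph-specific GRF
architecture; all names and constants are illustrative.

    import torch
    import torch.nn as nn

    class ResidualFlowBlock(nn.Module):
        """y = x + g(x), invertible whenever Lip(g) < 1."""

        def __init__(self, dim, hidden=64, coeff=0.97):
            super().__init__()
            # Spectral normalization caps each linear map's Lipschitz constant
            # near 1; ELU is 1-Lipschitz; `coeff` keeps Lip(g) strictly below 1.
            self.net = nn.Sequential(
                nn.utils.spectral_norm(nn.Linear(dim, hidden)),
                nn.ELU(),
                nn.utils.spectral_norm(nn.Linear(hidden, dim)),
            )
            self.coeff = coeff

        def forward(self, x):
            return x + self.coeff * self.net(x)

        def inverse(self, y, n_iters=50):
            # Banach fixed-point iteration x <- y - g(x): converges to the
            # unique preimage because g is a contraction.
            x = y.clone()
            for _ in range(n_iters):
                x = y - self.coeff * self.net(x)
            return x

The same contraction argument is what makes conditions on the residual branch,
rather than a structural coupling split, sufficient for invertibility.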
Graph Neural Networks Exponentially Lose Expressive Power for Node Classification
Graph Neural Networks (graph NNs) are a promising deep learning approach for
analyzing graph-structured data. However, it is known that their predictive
performance does not improve (and sometimes worsens) as we stack more layers
and add non-linearity. To tackle this problem, we investigate the expressive
power of graph NNs via their asymptotic behavior as the number of layers tends
to infinity. Our strategy is to generalize the forward propagation of a Graph
Convolutional Network (GCN), which is a popular graph NN variant, as a specific
dynamical system. In the case of a GCN, we show that when its weights satisfy
conditions determined by the spectrum of the (augmented) normalized Laplacian,
its output exponentially approaches the set of signals that carry only the
information of connected components and node degrees for distinguishing nodes.
Our theory enables us to relate the expressive power of
GCNs with the topological information of the underlying graphs inherent in the
graph spectra. To demonstrate this, we characterize the asymptotic behavior of
GCNs on the Erdős-Rényi graph. We show that when the Erdős-Rényi graph is
sufficiently dense and large, a broad range of GCNs on it suffers from this
"information loss" in the limit of infinite layers with high
probability. Based on the theory, we provide a principled guideline for weight
normalization of graph NNs. We experimentally confirm that the proposed weight
scaling enhances the predictive performance of GCNs on real data. Code is
available at https://github.com/delta2323/gnn-asymptotics.
Comment: 9 pages, supplemental material 28 pages. Accepted at the International
Conference on Learning Representations (ICLR) 2020.
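As a rough sketch of what a spectral weight-scaling guideline can look like in
practice: the weight matrix is rescaled so its largest singular value hits a
target. The actual target would come from the graph's Laplacian spectrum as the
paper's theory prescribes; `s_target` here is just an assumed hyperparameter.

    import torch

    @torch.no_grad()
    def rescale_weight_(weight: torch.Tensor, s_target: float) -> None:
        # Largest singular value (spectral norm) of the weight matrix.
        s_max = torch.linalg.matrix_norm(weight, ord=2)
        # Rescale in place so the spectral norm equals s_target.
        weight.mul_(s_target / s_max.clamp_min(1e-12))

Applied to every GCN weight matrix after each optimizer step, this keeps the
singular values away from the regime in which, per the theory above, the output
collapses onto the component-and-degree subspace.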
Understanding and Resolving Performance Degradation in Graph Convolutional Networks
A Graph Convolutional Network (GCN) stacks several layers and in each layer
performs a PROPagation operation (PROP) and a TRANsformation operation (TRAN)
for learning node representations over graph-structured data. Though powerful,
GCNs tend to suffer a performance drop as the model gets deeper. Previous works
focus on PROPs to study and mitigate this issue, but the role of TRANs is
barely investigated. In this work, we study the performance degradation of GCNs
by experimentally examining how stacking only TRANs or only PROPs works. We
find that TRANs contribute significantly, or even more than PROPs, to the
declining performance, and moreover that they tend to amplify node-wise feature
variance in GCNs, causing a variance inflation that we identify as a key factor
behind the performance drop. Motivated by these observations, we propose a
variance-controlling technique termed Node Normalization (NodeNorm), which
scales each node's features using its own standard deviation. Experimental
results validate the effectiveness of NodeNorm in addressing the performance
degradation of GCNs. Specifically, it enables deep GCNs to achieve results
comparable to shallow ones on 6 benchmark datasets, and to outperform shallow
ones in cases where deep models are needed. NodeNorm is a generic plug-in and
generalizes well to other GNN architectures.
Comment: Code is available at https://github.com/miafei/NodeNorm
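Since the abstract specifies the operation concretely, a minimal sketch is easy
to give: each node's feature vector is divided by its own standard deviation.
The eps constant is an assumed numerical-stability choice, not taken from the
paper's exact configuration.

    import torch
    import torch.nn as nn

    class NodeNorm(nn.Module):
        """Scale each node's features by that node's own standard deviation."""

        def __init__(self, eps: float = 1e-5):
            super().__init__()
            self.eps = eps

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: [num_nodes, num_features]; the std is computed per node,
            # across that node's feature dimensions.
            std = x.std(dim=-1, keepdim=True)
            return x / (std + self.eps)

As a plug-in, it would be inserted after each TRAN step to keep the node-wise
variance from inflating with depth.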
Policy-GNN: Aggregation Optimization for Graph Neural Networks
Graph data are pervasive in many real-world applications. Recently, increasing
attention has been paid to graph neural networks (GNNs), which aim
to model the local graph structures and capture the hierarchical patterns by
aggregating the information from neighbors with stackable network modules.
Motivated by the observation that different nodes often require different
iterations of aggregation to fully capture the structural information, in this
paper, we propose to explicitly sample diverse iterations of aggregation for
different nodes to boost the performance of GNNs. It is a challenging task to
develop an effective aggregation strategy for each node, given complex graphs
and sparse features. Moreover, it is not straightforward to derive an efficient
algorithm, since we need to feed the sampled nodes into different numbers of
network layers. To address the above challenges, we propose Policy-GNN, a
meta-policy framework that combines the sampling procedure and the message
passing of GNNs into a joint learning process. Specifically, Policy-GNN uses a
meta-policy to adaptively determine the number of aggregations for each node.
The meta-policy is trained with deep reinforcement learning (RL) by exploiting
the feedback from the model. We further introduce parameter sharing and a
buffer mechanism to boost the training efficiency. Experimental results on
three real-world benchmark datasets suggest that Policy-GNN significantly
outperforms the state-of-the-art alternatives, demonstrating the promise of
aggregation optimization for GNNs.
Comment: Accepted to the ACM SIGKDD'20 research track.
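A schematic sketch of the meta-policy idea follows, under the assumption that
the state is a node's feature vector and the action is a depth in 1..max_depth;
the class and method names are illustrative, not the paper's API.

    import torch
    import torch.nn as nn

    class MetaPolicy(nn.Module):
        """Q-network scoring candidate aggregation depths for a node."""

        def __init__(self, in_dim: int, max_depth: int):
            super().__init__()
            self.max_depth = max_depth
            # Q(state) -> one value per candidate depth (1..max_depth).
            self.q_net = nn.Sequential(
                nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, max_depth)
            )

        def select_depth(self, node_feat: torch.Tensor, eps: float = 0.1) -> int:
            # Epsilon-greedy action selection, as in standard deep Q-learning.
            if torch.rand(()).item() < eps:
                return int(torch.randint(1, self.max_depth + 1, (1,)).item())
            return int(self.q_net(node_feat).argmax().item()) + 1

A node assigned depth k would then be propagated through the first k of a stack
of shared GCN layers (matching the parameter sharing mentioned above), with the
resulting task performance fed back to the policy as the RL reward.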
Self-Supervised Graph Transformer on Large-Scale Molecular Data
How to obtain informative representations of molecules is a crucial
prerequisite for AI-driven drug design and discovery. Recent research abstracts
molecules as graphs and employs Graph Neural Networks (GNNs) for molecular
representation learning. Nevertheless, two issues impede the use of GNNs in
real scenarios: (1) insufficient labeled molecules for supervised training; (2)
poor generalization to newly synthesized molecules. To address them
both, we propose a novel framework, GROVER, which stands for Graph
Representation frOm self-superVised mEssage passing tRansformer. With carefully
designed self-supervised tasks at the node, edge, and graph levels, GROVER can
learn rich structural and semantic information about molecules from enormous
amounts of unlabelled molecular data. Moreover, to encode such complex
information, GROVER
integrates Message Passing Networks into the Transformer-style architecture to
deliver a class of more expressive encoders of molecules. The flexibility of
GROVER allows it to be trained efficiently on large-scale molecular datasets
without requiring any supervision, thus remaining immune to the two issues
mentioned above. We pre-train GROVER with 100 million parameters on 10 million
unlabelled molecules -- the biggest GNN and the largest training dataset in
molecular representation learning. We then leverage the pre-trained GROVER for
molecular property prediction followed by task-specific fine-tuning, where we
observe a large improvement (more than 6% on average) over current
state-of-the-art methods on 11 challenging benchmarks. The insight we gained is
that well-designed self-supervision losses and highly expressive pre-trained
models have significant potential for boosting performance.
Comment: 17 pages, 7 figures
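One plausible reading of "integrating Message Passing Networks into a
Transformer-style architecture" is sketched below: node states are first
updated by neighborhood aggregation, then refined by self-attention over all
nodes. This is a structural illustration only, not GROVER's actual block.

    import torch
    import torch.nn as nn

    class MPTransformerBlock(nn.Module):
        """Message passing followed by multi-head self-attention over nodes."""

        def __init__(self, dim: int, heads: int = 4):
            super().__init__()
            self.msg = nn.Linear(dim, dim)  # per-node message function
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.norm1 = nn.LayerNorm(dim)
            self.norm2 = nn.LayerNorm(dim)

        def forward(self, x, adj):
            # x: [batch, nodes, dim]; adj: [batch, nodes, nodes], row-normalized.
            # One round of message passing aggregates neighbor features.
            x = self.norm1(x + adj @ self.msg(x))
            # Self-attention then lets distant atoms exchange information.
            h, _ = self.attn(x, x, x)
            return self.norm2(x + h)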
DeGNN: Characterizing and Improving Graph Neural Networks with Graph Decomposition
Despite the wide application of Graph Convolutional Network (GCN), one major
limitation is that it does not benefit from the increasing depth and suffers
from the oversmoothing problem. In this work, we first characterize this
phenomenon from an information-theoretic perspective and show that, under
certain conditions, the mutual information between the input of a GCN and its
output after k layers converges to 0 exponentially in k. We also show that, on
the other hand, graph decomposition can potentially weaken the conditions for
such a convergence rate, which enables our analysis for GraphCNN.
Since different graph structures benefit only from their corresponding
decompositions, in practice we propose an automatic connectivity-aware graph
decomposition algorithm, DeGNN, to improve the performance of general graph
neural networks. Extensive experiments on widely adopted benchmark datasets
demonstrate that DeGNN can not only significantly boost the performance of the
corresponding GNNs, but also achieve state-of-the-art performance.
Comment: 20 pages, 5 figures, 5 tables
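As a rough illustration of propagating over a decomposed graph (the
decomposition itself, which DeGNN computes automatically from connectivity, is
assumed here to be given as a list of edge-disjoint adjacency parts):

    import torch
    import torch.nn as nn

    class DecomposedGCNLayer(nn.Module):
        """Propagate each subgraph of a decomposition separately, then combine."""

        def __init__(self, in_dim: int, out_dim: int, num_parts: int = 2):
            super().__init__()
            # One weight matrix per subgraph in the decomposition.
            self.linears = nn.ModuleList(
                nn.Linear(in_dim, out_dim) for _ in range(num_parts)
            )

        def forward(self, x, adj_parts):
            # adj_parts: normalized adjacency matrices summing to the full graph.
            return torch.relu(
                sum(a @ lin(x) for a, lin in zip(adj_parts, self.linears))
            )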