Investigating and Mitigating Degree-Related Biases in Graph Convolutional Networks
Graph Convolutional Networks (GCNs) show promising results on
semi-supervised learning tasks on graphs and have thus become a favorable
choice compared with other approaches. Despite the remarkable success of GCNs,
it is difficult to train them with insufficient supervision: when labeled data
are limited, the performance of GCNs becomes unsatisfying for low-degree
nodes. While some prior work analyzes the successes and failures of GCNs at
the whole-model level, profiling GCNs at the individual-node level is still
underexplored.
In this paper, we analyze GCNs with respect to the node degree distribution.
From empirical observation to theoretical proof, we confirm that GCNs are
biased towards nodes with larger degrees, achieving higher accuracy on them,
even though high-degree nodes are underrepresented in most graphs. We further
develop a novel Self-Supervised-Learning Degree-Specific GCN (SL-DSGC) that
mitigates the degree-related biases of GCNs from both the model and the data
perspectives. First, we propose a degree-specific GCN layer that captures both
the discrepancies and the similarities of nodes with different degrees, which
reduces the model-side biases of GCNs caused by sharing the same parameters
across all nodes. Second, we design a self-supervised-learning algorithm that
assigns pseudo labels with uncertainty scores to unlabeled nodes using a
Bayesian neural network. Pseudo labels increase the chance that low-degree
nodes connect to labeled neighbors, thus reducing the biases of GCNs from the
data perspective. The uncertainty scores are further exploited to weight the
pseudo labels dynamically during stochastic gradient descent when training
SL-DSGC. Experiments on three benchmark datasets show that SL-DSGC not only
outperforms state-of-the-art self-training/self-supervised-learning GCN
methods, but also improves GCN accuracy dramatically for low-degree nodes.
Comment: Accepted to CIKM 2020
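As a rough illustration of the two ideas in this abstract, the PyTorch sketch below shows a graph convolution layer whose shared weight captures cross-node similarities while a per-degree modulation captures degree-group discrepancies, plus an uncertainty-weighted pseudo-label loss. This is a minimal sketch under stated assumptions, not the authors' SL-DSGC implementation: the name `DegreeSpecificGCNLayer`, the `max_degree` bucketing, and the `(1 - uncertainty)` weighting are hypothetical choices, and the uncertainty scores are simply assumed to be a tensor in [0, 1] (in the paper they come from a Bayesian neural network).

```python
# Minimal sketch of the two ideas above; NOT the authors' exact SL-DSGC code.
# Hypothetical: DegreeSpecificGCNLayer, max_degree bucketing, the weighting.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DegreeSpecificGCNLayer(nn.Module):
    """GCN layer with a shared weight (similarities across all nodes) plus a
    per-degree modulation vector (discrepancies between degree groups)."""

    def __init__(self, in_dim, out_dim, max_degree=10):
        super().__init__()
        self.shared = nn.Linear(in_dim, out_dim, bias=False)
        # One learnable scaling vector per degree bucket; nodes with degree
        # above max_degree fall into the last bucket.
        self.degree_scale = nn.Embedding(max_degree + 1, out_dim)
        nn.init.ones_(self.degree_scale.weight)
        self.max_degree = max_degree

    def forward(self, x, adj):
        # x: (N, in_dim) node features; adj: (N, N) normalized adjacency.
        deg = (adj > 0).sum(dim=1).clamp(max=self.max_degree)
        h = adj @ self.shared(x)            # standard GCN propagation
        return h * self.degree_scale(deg)   # degree-specific modulation


def pseudo_label_loss(logits, pseudo_labels, uncertainty):
    # Down-weight each pseudo-labeled node by its uncertainty score,
    # here assumed to be rescaled to [0, 1].
    per_node = F.cross_entropy(logits, pseudo_labels, reduction="none")
    return ((1.0 - uncertainty) * per_node).mean()
```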
Dynamic Self-training Framework for Graph Convolutional Networks
Graph neural networks (GNNs) such as GCN, GAT, and MoNet have achieved
state-of-the-art results on semi-supervised learning on graphs. However, when
the number of labeled nodes is very small, the performance of GNNs degrades
dramatically. Self-training has proven effective at resolving this issue;
however, the performance of self-trained GCNs is still inferior to that of G2G
and DGI in many settings. Moreover, the additional model complexity makes it
more difficult to tune the hyper-parameters and perform model selection. We
argue that the power of self-training has not yet been fully explored for the
node classification task. In this paper, we propose a unified end-to-end
self-training framework called \emph{Dynamic Self-training}, which generalizes
and simplifies prior work. A simple instantiation of the framework based on
GCN is provided, and empirical results show that our framework outperforms all
previous methods, including GNNs, embedding-based methods, and self-trained
GCNs, by a noticeable margin. Moreover, compared with standard self-training,
hyper-parameter tuning for our framework is easier.
Comment: 11 pages
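For context, a generic self-training loop for a GCN looks roughly like the sketch below, assuming a PyTorch model with a `model(x, adj)` interface. It illustrates the standard pattern this paper builds on, not the Dynamic Self-training algorithm itself: the fixed confidence threshold and the confidence-based soft weights are placeholder choices (the paper generalizes how pseudo labels are selected and weighted across rounds), and `self_train` and its parameters are hypothetical names.

```python
# Generic self-training loop for node classification; a minimal sketch,
# NOT the paper's Dynamic Self-training algorithm. Hypothetical choices:
# fixed confidence threshold, confidence-based per-node weights.
import torch
import torch.nn.functional as F


def self_train(model, optimizer, x, adj, y, labeled_mask,
               rounds=5, epochs=100, threshold=0.9):
    pseudo_y = y.clone()
    pseudo_mask = labeled_mask.clone()
    weights = torch.ones_like(y, dtype=torch.float)

    for _ in range(rounds):
        # 1) Train on labeled + currently pseudo-labeled nodes.
        for _ in range(epochs):
            optimizer.zero_grad()
            logits = model(x, adj)
            per_node = F.cross_entropy(logits[pseudo_mask],
                                       pseudo_y[pseudo_mask],
                                       reduction="none")
            loss = (weights[pseudo_mask] * per_node).mean()
            loss.backward()
            optimizer.step()

        # 2) Promote confident predictions on still-unlabeled nodes to
        #    pseudo labels, weighting each by its softmax confidence.
        with torch.no_grad():
            probs = F.softmax(model(x, adj), dim=1)
            conf, pred = probs.max(dim=1)
            new = (~pseudo_mask) & (conf > threshold)
            pseudo_y[new] = pred[new]
            weights[new] = conf[new]
            pseudo_mask |= new
    return model
```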