Structural Imbalance Aware Graph Augmentation Learning
Graph machine learning (GML) has made great progress in node classification,
link prediction, graph classification, and related tasks. However, real-world graphs are
often structurally imbalanced: only a few hub nodes have a dense
local structure and high influence. This imbalance may compromise the
robustness of existing GML models, especially when learning tail nodes. This
paper proposes a selective graph augmentation method (SAug) to solve this
problem. First, a PageRank-based sampling strategy is designed to identify
hub nodes and tail nodes in the graph. Second, a selective augmentation
strategy is proposed, which on the one hand drops the noisy neighbors of hub nodes
and on the other hand discovers latent neighbors and generates pseudo neighbors for tail
nodes. This also alleviates the structural imbalance between
the two types of nodes. Finally, a GNN model is retrained on the augmented
graph. Extensive experiments demonstrate that SAug significantly improves
the backbone GNNs and outperforms competing graph
augmentation methods and hub/tail-aware methods.
Comment: 13 pages, 11 figures, 7 tables
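The first step of the abstract above, separating hub nodes from tail nodes by PageRank, can be illustrated with a minimal sketch. The abstract does not specify how the sampling strategy works, so everything here is an assumption made for illustration: the function names `pagerank` and `split_hub_tail`, the damping factor, and in particular the deterministic top-fraction cutoff `hub_fraction`, which stands in for the paper's sampling strategy and should not be read as SAug itself.

```python
import numpy as np

def pagerank(adj, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a dense adjacency matrix (adj[i, j] > 0 for edge i -> j)."""
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)
    # Row-normalise; nodes without out-edges spread their mass uniformly.
    trans = np.where(out_deg[:, None] > 0,
                     adj / np.maximum(out_deg[:, None], 1),
                     1.0 / n)
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_rank = (1 - damping) / n + damping * (trans.T @ rank)
        if np.abs(new_rank - rank).sum() < tol:
            return new_rank
        rank = new_rank
    return rank

def split_hub_tail(adj, hub_fraction=0.2):
    """Hypothetical split: the top `hub_fraction` of nodes by PageRank are hubs (an assumption)."""
    scores = pagerank(adj)
    n_hub = max(1, int(round(hub_fraction * len(scores))))
    order = np.argsort(-scores)  # node indices by descending PageRank
    return order[:n_hub], order[n_hub:]

# Toy star graph: node 0 is linked to every other node.
adj = np.zeros((6, 6))
adj[0, 1:] = 1
adj[1:, 0] = 1
hubs, tails = split_hub_tail(adj, hub_fraction=0.2)
```

On this toy star graph the centre node is the only hub and the five leaves are tail nodes. A selective augmentation step would then treat the two groups differently, as the abstract describes.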
Active Semi-Supervised Learning Using Sampling Theory for Graph Signals
We consider the problem of offline, pool-based active semi-supervised
learning on graphs. This problem is important when labeled data is scarce
and expensive whereas unlabeled data is readily available. The data points are
represented by the vertices of an undirected graph with the similarity between
them captured by the edge weights. Given a target number of nodes to label, the
goal is to choose those nodes that are most informative and then predict the
unknown labels. We propose a novel framework for this problem based on our
recent results on sampling theory for graph signals. A graph signal is a
real-valued function defined on the nodes of the graph. A notion of frequency
for such signals can be defined using the spectrum of the graph Laplacian
matrix. The sampling theory for graph signals aims to extend the traditional
Nyquist-Shannon sampling theory by allowing us to identify the class of graph
signals that can be reconstructed from their values on a subset of vertices.
This approach allows us to define a criterion for active learning based on
sampling set selection which aims at maximizing the frequency of the signals
that can be reconstructed from their samples on the set. Experiments show the
effectiveness of our method.
Comment: 10 pages, 6 figures, to appear in KDD'1
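The notions of graph frequency and reconstruction from samples used in the abstract above can be made concrete with a small sketch. It assumes the combinatorial Laplacian L = D - W and a plain least-squares fit on the lowest-frequency eigenvectors. The helper names, the toy path graph, and the hand-picked sampling set are all illustrative assumptions. The sketch shows only the reconstruction step, not the paper's sampling set selection criterion.

```python
import numpy as np

def laplacian(weights):
    """Combinatorial Laplacian L = D - W of an undirected weighted graph."""
    return np.diag(weights.sum(axis=1)) - weights

def reconstruct_bandlimited(weights, sample_idx, sample_vals, bandwidth):
    """Least-squares fit of a signal spanned by the `bandwidth` lowest-frequency eigenvectors."""
    # eigh returns eigenvalues in ascending order, i.e. low to high graph frequency.
    _, eigvecs = np.linalg.eigh(laplacian(weights))
    basis = eigvecs[:, :bandwidth]
    coeffs, *_ = np.linalg.lstsq(basis[sample_idx], sample_vals, rcond=None)
    return basis @ coeffs

# Toy path graph on 6 nodes with unit edge weights.
n = 6
weights = np.zeros((n, n))
for i in range(n - 1):
    weights[i, i + 1] = weights[i + 1, i] = 1.0

# Build a signal that is exactly bandlimited to the two lowest frequencies.
_, eigvecs = np.linalg.eigh(laplacian(weights))
signal = eigvecs[:, :2] @ np.array([1.0, -0.5])

sample_idx = [0, 2, 5]  # hand-picked sampling set (an assumption, not the paper's criterion)
recovered = reconstruct_bandlimited(weights, sample_idx, signal[sample_idx], bandwidth=2)
```

Because the signal lies in the span of the two lowest-frequency eigenvectors and the sampled rows of that basis have full column rank, the values on three labeled nodes determine it on all six. An active-learning criterion of the kind the abstract describes would choose `sample_idx` so that this remains possible for signals of as high a frequency as possible.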