3 research outputs found
Graph Convolutional Networks Meet with High Dimensionality Reduction
Recently, Graph Convolutional Networks (GCNs) and their variants have been
receiving many research interests for learning graph-related tasks. While the
GCNs have been successfully applied to this problem, some caveats inherited
from classical deep learning still remain as open research topics in the
context of the node classification problem. One such inherited caveat is that
GCNs only consider the nodes that are a few propagations away from the labeled
nodes to classify them. However, taking only a few propagation steps away nodes
into account defeats the purpose of using the graph topological information in
the GCNs. To remedy this problem, the-state-of-the-art methods leverage the
network diffusion approaches, namely personalized page rank and its variants,
to fully account for the graph topology, {\em after} they use the Neural
Networks in the GCNs. However, these approaches overlook the fact that the
network diffusion methods favour high degree nodes in the graph, resulting in
the propagation of labels to unlabeled centralized, hub, nodes. To address this
biasing hub nodes problem, in this paper, we propose to utilize a
dimensionality reduction technique conjugate with personalized page rank so
that we can both take advantage from graph topology and resolve the hub node
favouring problem for GCNs. Here, our approach opens a new holistic road for
message passing phase of GCNs by suggesting the usage of other proximity
matrices instead of well-known Laplacian. Testing on two real-world networks
that are commonly used in benchmarking GCNs' performance for the node
classification context, we systematically evaluate the performance of the
proposed methodology and show that our approach outperforms existing methods
for wide ranges of parameter values with very limited deep learning training
{\em epochs}
Expanding Label Sets for Graph Convolutional Networks
In recent years, Graph Convolutional Networks (GCNs) and their variants have
been widely utilized in learning tasks that involve graphs. These tasks include
recommendation systems, node classification, among many others. In node
classification problem, the input is a graph in which the edges represent the
association between pairs of nodes, multi-dimensional feature vectors are
associated with the nodes, and some of the nodes in the graph have known
labels. The objective is to predict the labels of the nodes that are not
labeled, using the nodes features, in conjunction with graph topology. While
GCNs have been successfully applied to this problem, the caveats that they
inherit from traditional deep learning models pose significant challenges to
broad utilization of GCNs in node classification. One such caveat is that
training a GCN requires a large number of labeled training instances, which is
often not the case in realistic settings. To remedy this requirement,
state-of-the-art methods leverage network diffusion-based approaches to
propagate labels across the network before training GCNs. However, these
approaches ignore the tendency of the network diffusion methods in biasing
proximity with centrality, resulting in the propagation of labels to the nodes
that are well-connected in the graph. To address this problem, here we present
an alternate approach to extrapolating node labels in GCNs in the following
three steps: (i) clustering of the network to identify communities, (ii) use of
network diffusion algorithms to quantify the proximity of each node to the
communities, thereby obtaining a low-dimensional topological profile for each
node, (iii) comparing these topological profiles to identify nodes that are
most similar to the labeled nodes
Fast Computation of Katz Index for Efficient Processing of Link Prediction Queries
Network proximity computations are among the most common operations in
various data mining applications, including link prediction and collaborative
filtering. A common measure of network proximity is Katz index, which has been
shown to be among the best-performing path-based link prediction algorithms.
With the emergence of very large network databases, such proximity computations
become an important part of query processing in these databases. Consequently,
significant effort has been devoted to developing algorithms for efficient
computation of Katz index between a given pair of nodes or between a query node
and every other node in the network. Here, we present LRC-Katz, an algorithm
based on indexing and low-rank correction to accelerate Katz index-based
network proximity queries. Using a variety of very large real-world networks,
we show that LRC-Katz outperforms the fastest existing method, Conjugate
Gradient, for a wide range of parameter values. We also show that this
acceleration in the computation of Katz index can be used to drastically
improve the efficiency of processing link prediction queries in very large
networks. Motivated by this observation, we propose a new link prediction
algorithm that exploits modularity of networks that are encountered in
practical applications. Our experimental results on the link prediction problem
show that our modularity based algorithm significantly outperforms the
state-of-the-art link prediction Katz method