A Consistent Diffusion-Based Algorithm for Semi-Supervised Graph Learning

Abstract

Semi-supervised classification aims to assign labels to all nodes of a graph based on the labels known for a few nodes, called the seeds. One of the most popular algorithms relies on the principle of heat diffusion: the labels of the seeds are spread by thermoconductance, and the temperature of each node at equilibrium is used as a score function for each label. In this paper, we prove that this algorithm is not consistent unless the temperatures of the nodes at equilibrium are centered before scoring. This crucial step not only makes the algorithm provably consistent on a block model but also brings significant performance gains on real graphs.

Comment: arXiv admin note: substantial text overlap with arXiv:2008.1194
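The diffusion scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an undirected graph given as a dense NumPy adjacency matrix, one diffusion per label (seeds of that label held at temperature 1, all other seeds at 0), and centering done by subtracting the mean temperature over all nodes for each label. The function name `diffusion_classify` and its parameters are hypothetical.

```python
import numpy as np

def diffusion_classify(adjacency, seeds, n_labels, center=True):
    """Label every node from equilibrium temperatures (illustrative sketch).

    adjacency: symmetric (n, n) array of non-negative edge weights.
    seeds: dict mapping seed node index -> label in range(n_labels).
    Assumes every connected component contains at least one seed and
    at least one node is not a seed, so the linear system is solvable.
    """
    n = adjacency.shape[0]
    seed_idx = np.array(sorted(seeds))
    free_idx = np.array([i for i in range(n) if i not in seeds])

    # Graph Laplacian L = D - A.
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    l_free = laplacian[np.ix_(free_idx, free_idx)]
    a_free_seed = adjacency[np.ix_(free_idx, seed_idx)]

    # One column per label: seeds of that label are hot (1), others cold (0).
    t_seed = np.zeros((len(seed_idx), n_labels))
    for row, node in enumerate(seed_idx):
        t_seed[row, seeds[int(node)]] = 1.0

    # At equilibrium each free node's temperature is the weighted average
    # of its neighbors' temperatures: L_ff T_f = A_fs T_s.
    temperatures = np.zeros((n, n_labels))
    temperatures[seed_idx] = t_seed
    temperatures[free_idx] = np.linalg.solve(l_free, a_free_seed @ t_seed)

    if center:
        # Assumed centering: subtract each label's mean temperature.
        temperatures = temperatures - temperatures.mean(axis=0, keepdims=True)

    return temperatures.argmax(axis=1)

# Toy graph: two triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
adjacency = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    adjacency[u, v] = adjacency[v, u] = 1.0

labels = diffusion_classify(adjacency, seeds={0: 0, 5: 1}, n_labels=2)
print(labels.tolist())  # → [0, 0, 0, 1, 1, 1]
```

On this symmetric toy graph the mean temperature is the same for both labels, so the centered and uncentered variants agree. The two can differ when the seeds are unbalanced across labels.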
