Semi-Supervised Learning on Graphs through Reach and Distance Diffusion
Semi-supervised learning (SSL) is an indispensable tool when there are few
labeled entities and many unlabeled entities for which we want to predict
labels. With graph-based methods, entities correspond to nodes in a graph and
edges represent strong relations. At the heart of SSL algorithms is the
specification of a dense {\em kernel} of pairwise affinity values from the
graph structure. A learning algorithm is then trained on the kernel together
with labeled entities. The most popular kernels are {\em spectral} and include
the highly scalable "symmetric" Laplacian methods, which compute soft labels
using Jacobi iterations, and "asymmetric" methods, including Personalized
PageRank (PPR), which use short random walks and apply to directed relations
such as likes, follows, or hyperlinks.
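
As a rough illustration of the "symmetric" Laplacian approach mentioned above, the following is a minimal sketch of soft-label computation by Jacobi iteration on a small undirected graph. It is the textbook harmonic label-propagation scheme, not the kernels introduced in this paper; the function name `jacobi_soft_labels`, its parameters, and the toy path graph are all assumptions made for the example.

```python
import numpy as np

def jacobi_soft_labels(weights, seeds, num_classes, num_iters=100):
    """Hypothetical helper: soft labels via Jacobi iterations.

    weights: symmetric (n, n) array of non-negative edge weights.
    seeds:   dict mapping a labeled node index to its class index.
    Returns an (n, num_classes) array of soft labels.
    """
    n = weights.shape[0]
    # Start every node from a uniform soft label.
    soft = np.full((n, num_classes), 1.0 / num_classes)
    is_seed = np.zeros(n, dtype=bool)
    for node, label in seeds.items():
        soft[node] = 0.0
        soft[node, label] = 1.0
        is_seed[node] = True
    clamped = soft[is_seed].copy()

    degree = weights.sum(axis=1, keepdims=True)
    degree[degree == 0] = 1.0  # avoid division by zero for isolated nodes

    for _ in range(num_iters):
        # Jacobi step: every node takes the weighted average of its
        # neighbours' labels from the previous iterate.
        soft = weights @ soft / degree
        # Labeled nodes keep their known labels.
        soft[is_seed] = clamped
    return soft

# Toy example (assumed data): a path graph 0 - 1 - 2 - 3,
# with node 0 labeled class 0 and node 3 labeled class 1.
weights = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])
soft = jacobi_soft_labels(weights, seeds={0: 0, 3: 1}, num_classes=2)
print(soft.round(3))
```

On this toy graph the unlabeled nodes end up with soft labels that lean toward the nearer seed, which is the behavior the symmetric methods rely on. Because the averaging treats edges as undirected, a sketch like this does not capture directed relations.
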
We introduce {\em Reach diffusion} and {\em Distance diffusion} kernels that
build on powerful social and economic models of centrality and influence in
networks and capture the directed pairwise relations that underlie social
influence. Inspired by the success of social influence as an alternative to
spectral centrality measures such as PageRank, we explore SSL with our kernels and
develop highly scalable algorithms for parameter setting, label learning, and
sampling. We perform preliminary experiments that demonstrate the properties
and potential of our kernels.
Comment: 13 pages, 5 figures