We study a new notion of graph centrality based on absorbing random walks.
Given a graph G=(V,E) and a set of query nodes QβV, we aim to
identify the k most central nodes in G with respect to Q. Specifically,
we consider central nodes to be absorbing for random walks that start at the
query nodes Q. The goal is to find the set of k central nodes that
minimizes the expected length of a random walk until absorption. The proposed
measure, which we call k absorbing random-walk centrality, favors diverse
sets, as it is beneficial to place the k absorbing nodes in different parts
of the graph so as to "intercept" random walks that start from different query
nodes.
Although similar problem definitions have been considered in the literature,
e.g., in information-retrieval settings where the goal is to diversify
web-search results, in this paper we study the problem formally and prove some
of its properties. We show that the problem is NP-hard, while the objective
function is monotone and supermodular, implying that a greedy algorithm
provides solutions with an approximation guarantee. On the other hand, the
greedy algorithm involves expensive matrix operations that make it prohibitive
to employ on large datasets. To confront this challenge, we develop more
efficient algorithms based on spectral clustering and on personalized PageRank.Comment: 11 pages, 11 figures, short paper to appear at ICDM 201