Location of Repository

We consider the problem of approximating the PageRank of a target node using only local information provided by a link server. This problem was originally studied by Chen, Gan, and Suel (CIKM 2004), who presented an algorithm for tackling it. We prove that local approximation of PageRank, even to within modest approximation factors, is infeasible in the worst-case, as it requires probing the link server for Ω(n) nodes, where n is the size of the graph. The difficulty emanates from nodes of high in-degree and/or from slow convergence of the PageRank random walk. We show that when the graph has bounded in-degree and admits fast PageRank convergence, then local PageRank approximation can be done using a small number of queries. Unfortunately, natural graphs, such as the web graph, are abundant with high in-degree nodes, making this algorithm (or any other local approximation algorithm) too costly. On the other hand, reverse natural graphs tend to have low in-degree while maintaining fast PageRank convergence. It follows that calculating Reverse PageRank locally is frequently more feasible than computing PageRank locally. We demonstrate that Reverse PageRank is useful for several applications, including computation of hub scores for web pages, finding influencers in social networks, obtaining good seeds for crawling, and measurement of semantic relatedness between concepts in a taxonomy.

Year: 2008

OAI identifier:
oai:CiteSeerX.psu:10.1.1.135.5845

Provided by:
CiteSeerX

Download PDF:To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.