Detecting dense subgraphs (\emph{communities}) in large sparse graphs is a problem that arises in many real-world domains, such as social networking. A popular approach first computes the \emph{probability~vectors} for \emph{random~walks} of a fixed number of steps on the graph, and then uses these vectors to detect the communities. Latapy and Pons discuss such an approach in \cite{latapypons}. They compute the probability vectors by simple matrix multiplication and define a measure of structural similarity between vertices, which they call \emph{distance}. From the probability vectors they compute the distances between vertices, and from these distances they group the vertices into communities. Their algorithm takes $O(n^2\log n)$ time, where $n$ is the number of vertices in the graph. We focus on the first part of the approach, i.e., the computation of the probability vectors for the random walks, and propose an algorithm that is more efficient than matrix multiplication and computes these vectors in time linear in the size of the output.
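To make the notion of a probability vector concrete, the following is a minimal Python sketch, not the algorithm proposed in this report and not necessarily the exact formulation of Latapy and Pons. It assumes a simple random walk in which, at each step, the walker moves to a uniformly chosen neighbour, and it assumes every vertex has at least one neighbour. The function name `walk_distribution` and the adjacency-list representation `adj` are hypothetical choices made for illustration.

```python
from collections import defaultdict


def walk_distribution(adj, start, steps):
    """Return {vertex: probability} after `steps` steps of a simple
    random walk started at `start`.

    `adj` maps each vertex to a list of its neighbours (assumed non-empty).
    Only vertices with non-zero probability are stored, so each step costs
    time proportional to the degrees of the currently reachable vertices.
    """
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for v, p in dist.items():
            share = p / len(adj[v])  # uniform choice among neighbours
            for w in adj[v]:
                nxt[w] += share
        dist = dict(nxt)
    return dist


# Path graph 0 - 1 - 2 (illustrative data only).
adj = {0: [1], 1: [0, 2], 2: [1]}
two_step = walk_distribution(adj, 0, 2)
```

On this three-vertex path, a two-step walk from vertex 0 must pass through vertex 1 and then return to vertex 0 or continue to vertex 2 with equal probability. Propagating probability mass along adjacency lists in this way avoids forming dense matrix powers, which is the kind of saving a sparse graph makes possible.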

Goel, Gaurav

INRIA a CCSD electronic archive server

Computing the Probability Vectors for Random Walks on Graphs with Bounded Arboricity

∗This internship has been supported by the joint IIT-INRIA internship programme and a grant by the Région Lorraine.

CiteSeerX

https://hal.inria.fr/inria-00000578/file/report-random-walks.pdf
