On the Analysis of a Label Propagation Algorithm for Community Detection

Kothapalli, Kishore; Pemmaraju, Sriram V.; Sardeshmukh, Vivek

Publication date: 13 October 2012

Abstract

This paper initiates formal analysis of a simple, distributed algorithm for community detection on networks. We analyze an algorithm that we call Max-LPA, both in terms of its convergence time and in terms of the "quality" of the communities detected. Max-LPA is an instance of a class of community detection algorithms called label propagation algorithms. As far as we know, most analysis of label propagation algorithms thus far has been empirical in nature, and in this paper we seek a theoretical understanding of them. In our main result, we define a clustered version of Erdős–Rényi random graphs with clusters $V_1, V_2, \ldots, V_k$, where the probability $p$ of an edge connecting nodes within a cluster $V_i$ is higher than $p'$, the probability of an edge connecting nodes in distinct clusters. We show that even with fairly general restrictions on $p$ and $p'$ ($p = \Omega\left(\frac{1}{n^{1/4-\epsilon}}\right)$ for any $\epsilon > 0$ and $p' = O(p^2)$, where $n$ is the number of nodes), Max-LPA detects the clusters $V_1, V_2, \ldots, V_k$ in just two rounds. Based on this and on empirical results, we conjecture that Max-LPA can correctly and quickly identify communities on clustered Erdős–Rényi graphs even when the clusters are much sparser, i.e., with $p = \frac{c \log n}{n}$ for some $c > 1$.

Comment: 17 pages. Submitted to ICDCN 201
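To make the setting concrete, the following is a minimal sketch of the clustered random graph model and of a label propagation round. The abstract does not spell out the update rule of Max-LPA, so the rule used here is an assumption: every node starts with a unique integer label and, in each synchronous round, adopts the most frequent label in its closed neighborhood, with ties going to the larger label. The function names (`clustered_er_graph`, `max_lpa_round`, `max_lpa`) and the example parameters are hypothetical, and the small example values are not claimed to satisfy the asymptotic conditions on $p$ and $p'$.

```python
import random
from collections import Counter


def clustered_er_graph(cluster_sizes, p, p_prime, rng):
    """Sample a clustered Erdos-Renyi graph as a list of neighbor sets.

    Nodes in the same cluster are joined with probability p; nodes in
    different clusters are joined with probability p_prime.
    """
    # cluster_of[v] is the index of the cluster that node v belongs to.
    cluster_of = [i for i, size in enumerate(cluster_sizes) for _ in range(size)]
    n = len(cluster_of)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            prob = p if cluster_of[u] == cluster_of[v] else p_prime
            if rng.random() < prob:
                adj[u].add(v)
                adj[v].add(u)
    return adj, cluster_of


def max_lpa_round(adj, labels):
    """One synchronous round of the assumed update rule.

    Each node adopts the most frequent label in its closed neighborhood
    (its neighbors plus itself); ties go to the larger label.
    """
    new_labels = []
    for u, neighbors in enumerate(adj):
        counts = Counter(labels[v] for v in neighbors)
        counts[labels[u]] += 1  # the node counts its own label too
        # Compare by (frequency, label) so that ties favor the larger label.
        best = max(counts.items(), key=lambda kv: (kv[1], kv[0]))[0]
        new_labels.append(best)
    return new_labels


def max_lpa(adj, rounds):
    """Run the assumed rule for a fixed number of rounds from unique labels."""
    labels = list(range(len(adj)))  # assumption: node id serves as initial label
    for _ in range(rounds):
        labels = max_lpa_round(adj, labels)
    return labels


if __name__ == "__main__":
    rng = random.Random(0)
    adj, cluster_of = clustered_er_graph([50, 50], p=0.8, p_prime=0.01, rng=rng)
    labels = max_lpa(adj, rounds=2)
    print("distinct labels after two rounds:", len(set(labels)))
```

Comparing `labels` against `cluster_of` after two rounds gives an informal check of the two-round claim on small instances; it is an experiment under the stated assumptions, not a reproduction of the paper's analysis.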

Available Versions

CiteSeerX

oai:CiteSeerX.psu:10.1.1.763.5...

Last time updated on 30/10/2017