Fast Detection of Community Structures using Graph Traversal in Social Networks
Finding community structures in social networks is considered to be a
challenging task as many of the proposed algorithms are computationally
expensive and do not scale well for large graphs. Most of the community
detection algorithms proposed to date are unsuitable for applications that
would require detection of communities in real-time, especially for massive
networks. The Louvain method, which uses modularity maximization to detect
clusters, is usually considered to be one of the fastest community detection
algorithms even without any provable bound on its running time. We propose a
novel graph traversal-based community detection framework, which not only runs
faster than the Louvain method but also generates clusters of better quality
for most of the benchmark datasets. We show that our algorithms run in
O(|V| + |E|) time to create an initial cover before using modularity
maximization to get the final cover.
Keywords - community detection; Influenced Neighbor Score; brokers; community
nodes; communities
Comment: 29 pages, 9 tables, and 13 figures. Accepted in "Knowledge and
Information Systems", 201
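The abstract above relies on modularity maximization but does not define modularity. As a minimal sketch, assuming the standard Newman-Girvan definition for an undirected, unweighted graph (the function name `modularity` and the input layout are illustrative choices, not taken from the paper), the score of a given partition can be computed like this:

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity Q of a partition (assumed definition).

    edges: iterable of (u, v) pairs, each undirected edge listed once,
           with no self-loops.
    community: dict mapping every node to its community label.
    """
    m = 0
    intra = defaultdict(int)       # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # sum of node degrees per community
    for u, v in edges:
        m += 1
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    if m == 0:
        return 0.0
    # Q = sum over communities c of (e_c / m) - (d_c / 2m)^2
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {node: "A" for node in range(6)}
print(modularity(edges, two_groups))
print(modularity(edges, one_group))
```

Splitting the two triangles scores higher than placing all nodes in one community, which is the kind of improvement a modularity-maximizing step looks for.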
Distributed Community Detection in Dynamic Graphs
Inspired by the increasing interest in self-organizing social opportunistic
networks, we investigate the problem of distributed detection of unknown
communities in dynamic random graphs. As a formal framework, we consider the
dynamic version of the well-studied \emph{Planted Bisection Model}
$G(n,p,q)$, where the node set of the network is partitioned into two
unknown communities and, at every time step, each possible edge is
active with probability $p$ if both nodes belong to the same community, while
it is active with probability $q$ (with $q < p$) otherwise. We also consider a
time-Markovian generalization of this model.
We propose a distributed protocol based on the popular \emph{Label
Propagation Algorithm} and prove that, when the ratio $p/q$ is larger than
$n^b$ (for an arbitrarily small constant $b > 0$), the protocol finds the right
"planted" partition in $O(\log n)$ time even when the snapshots of the dynamic
graph are sparse and disconnected (i.e. in the case $p = \Theta(1/n)$).
Comment: Version I
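To make the dynamic model concrete, here is a small sketch of how one snapshot could be sampled, assuming that at each time step every node pair is activated independently with an intra-community probability `p` or an inter-community probability `q`. The function name `sample_snapshot` and the list-based community encoding are hypothetical, and this is not the paper's protocol, only the random graph it runs on.

```python
import random
from itertools import combinations


def sample_snapshot(community, p, q, rng):
    """Sample the active edges of one time step.

    community: list where community[u] is the (hidden) label of node u.
    p: activation probability for a pair inside the same community.
    q: activation probability for a pair across communities.
    rng: a random.Random instance, passed in for reproducibility.
    """
    edges = []
    for u, v in combinations(range(len(community)), 2):
        prob = p if community[u] == community[v] else q
        if rng.random() < prob:
            edges.append((u, v))
    return edges


rng = random.Random(0)
labels = [0] * 5 + [1] * 5
# A dynamic graph is a sequence of independently sampled snapshots.
snapshots = [sample_snapshot(labels, p=0.6, q=0.05, rng=rng) for _ in range(3)]
for t, snap in enumerate(snapshots):
    print(t, len(snap))
```

Each snapshot is drawn independently here; the time-Markovian generalization mentioned in the abstract would instead make an edge's state depend on its state at the previous step.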
Community detection and stochastic block models: recent developments
The stochastic block model (SBM) is a random graph model with planted
clusters. It is widely employed as a canonical model to study clustering and
community detection, and generally provides fertile ground for studying the
statistical and computational tradeoffs that arise in network and data
sciences.
This note surveys the recent developments that establish the fundamental
limits for community detection in the SBM, both with respect to
information-theoretic and computational thresholds, and for various recovery
requirements such as exact, partial and weak recovery (a.k.a., detection). The
main results discussed are the phase transitions for exact recovery at the
Chernoff-Hellinger threshold, the phase transition for weak recovery at the
Kesten-Stigum threshold, the optimal distortion-SNR tradeoff for partial
recovery, the learning of the SBM parameters and the gap between
information-theoretic and computational thresholds.
The note also covers some of the algorithms developed in the quest of
achieving the limits, in particular two-round algorithms via graph-splitting,
semi-definite programming, linearized belief propagation, classical and
nonbacktracking spectral methods. A few open problems are also discussed.
Global and Local Information in Clustering Labeled Block Models
The stochastic block model is a classical cluster-exhibiting random graph
model that has been widely studied in statistics, physics and computer science.
In its simplest form, the model is a random graph with two equal-sized
clusters, with intra-cluster edge probability p, and inter-cluster edge
probability q. We focus on the sparse case, i.e., p, q = O(1/n), which is
practically more relevant and also mathematically more challenging. A
conjecture of Decelle, Krzakala, Moore and Zdeborova, based on ideas from
statistical physics, predicted a specific threshold for clustering. The
negative direction of the conjecture was proved by Mossel, Neeman and Sly
(2012), and more recently the positive direction was proven independently by
Massoulie and by Mossel, Neeman, and Sly.
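The abstract does not spell out the threshold. In its usual formulation for two equal-sized clusters, with p = a/n and q = b/n, clustering better than chance is possible exactly when (a - b)^2 > 2(a + b). The sketch below assumes that formulation; the function name `above_threshold` and the parameters `a` and `b` are illustrative.

```python
def above_threshold(a, b):
    """Check the two-cluster sparse threshold (assumed form).

    a, b: constants such that the intra-cluster edge probability is a/n
          and the inter-cluster edge probability is b/n.
    Returns True when (a - b)^2 > 2 * (a + b).
    """
    return (a - b) ** 2 > 2 * (a + b)


print(above_threshold(5, 1))  # (5 - 1)^2 = 16 versus 2 * 6 = 12
print(above_threshold(3, 1))  # (3 - 1)^2 = 4 versus 2 * 4 = 8
```

Under this assumption, a = 5, b = 1 lies above the threshold, while a = 3, b = 1 lies below it even though intra-cluster edges are three times as likely as inter-cluster ones.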
In many real network clustering problems, nodes contain information as well.
We study the interplay between node and network information in clustering by
studying a labeled block model, where in addition to the edge information, the
true cluster labels of a small fraction of the nodes are revealed. In the case
of two clusters, we show that below the threshold, a small amount of node
information does not affect recovery. On the other hand, we show that for any
small amount of information efficient local clustering is achievable as long as
the number of clusters is sufficiently large (as a function of the amount of
revealed information).
Comment: 24 pages, 2 figures. A short abstract describing these results will
appear in proceedings of RANDOM 201