8,150 research outputs found
An Ensemble Framework for Detecting Community Changes in Dynamic Networks
Dynamic networks, especially those representing social networks, undergo
constant evolution of their community structure over time. Nodes can migrate
between different communities, communities can split into multiple new
communities, communities can merge together, etc. In order to represent dynamic
networks with evolving communities it is essential to use a dynamic model
rather than a static one. Here we use a dynamic stochastic block model where
the underlying block model is different at different times. In order to
represent the structural changes expressed by this dynamic model the network
will be split into discrete time segments and a clustering algorithm will
assign block memberships for each segment. In this paper we show that using an
ensemble of clustering assignments accommodates for the variance in scalable
clustering algorithms and produces superior results in terms of
pairwise-precision and pairwise-recall. We also demonstrate that the dynamic
clustering produced by the ensemble can be visualized as a flowchart which
encapsulates the community evolution succinctly.Comment: 6 pages, under submission to HPEC Graph Challeng
Spectral redemption: clustering sparse networks
Spectral algorithms are classic approaches to clustering and community
detection in networks. However, for sparse networks the standard versions of
these algorithms are suboptimal, in some cases completely failing to detect
communities even when other algorithms such as belief propagation can do so.
Here we introduce a new class of spectral algorithms based on a
non-backtracking walk on the directed edges of the graph. The spectrum of this
operator is much better-behaved than that of the adjacency matrix or other
commonly used matrices, maintaining a strong separation between the bulk
eigenvalues and the eigenvalues relevant to community structure even in the
sparse case. We show that our algorithm is optimal for graphs generated by the
stochastic block model, detecting communities all the way down to the
theoretical limit. We also show the spectrum of the non-backtracking operator
for some real-world networks, illustrating its advantages over traditional
spectral clustering.Comment: 11 pages, 6 figures. Clarified to what extent our claims are
rigorous, and to what extent they are conjectures; also added an
interpretation of the eigenvectors of the 2n-dimensional version of the
non-backtracking matri
Online Tensor Methods for Learning Latent Variable Models
We introduce an online tensor decomposition based approach for two latent
variable modeling problems namely, (1) community detection, in which we learn
the latent communities that the social actors in social networks belong to, and
(2) topic modeling, in which we infer hidden topics of text articles. We
consider decomposition of moment tensors using stochastic gradient descent. We
conduct optimization of multilinear operations in SGD and avoid directly
forming the tensors, to save computational and storage costs. We present
optimized algorithm in two platforms. Our GPU-based implementation exploits the
parallelism of SIMD architectures to allow for maximum speed-up by a careful
optimization of storage and data transfer, whereas our CPU-based implementation
uses efficient sparse matrix computations and is suitable for large sparse
datasets. For the community detection problem, we demonstrate accuracy and
computational efficiency on Facebook, Yelp and DBLP datasets, and for the topic
modeling problem, we also demonstrate good performance on the New York Times
dataset. We compare our results to the state-of-the-art algorithms such as the
variational method, and report a gain of accuracy and a gain of several orders
of magnitude in the execution time.Comment: JMLR 201
Scalable and Robust Community Detection with Randomized Sketching
This paper explores and analyzes the unsupervised clustering of large
partially observed graphs. We propose a scalable and provable randomized
framework for clustering graphs generated from the stochastic block model. The
clustering is first applied to a sub-matrix of the graph's adjacency matrix
associated with a reduced graph sketch constructed using random sampling. Then,
the clusters of the full graph are inferred based on the clusters extracted
from the sketch using a correlation-based retrieval step. Uniform random node
sampling is shown to improve the computational complexity over clustering of
the full graph when the cluster sizes are balanced. A new random degree-based
node sampling algorithm is presented which significantly improves upon the
performance of the clustering algorithm even when clusters are unbalanced. This
algorithm improves the phase transitions for matrix-decomposition-based
clustering with regard to computational complexity and minimum cluster size,
which are shown to be nearly dimension-free in the low inter-cluster
connectivity regime. A third sampling technique is shown to improve balance by
randomly sampling nodes based on spatial distribution. We provide analysis and
numerical results using a convex clustering algorithm based on matrix
completion
- …