Distributed Graph Clustering using Modularity and Map Equation

Author: A Lancichinetti
BH Good
C Staudt
DA Bader
G Karypis
J Zeng
L Hubert
M Rosvall
MEJ Newman
S Bae
S Fortunato
T Kawamoto
U Brandes
Vincent D Blondel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 07/06/2018

We study large-scale, distributed graph clustering. Given an undirected graph, our objective is to partition the nodes into disjoint sets called clusters. A cluster should contain many internal edges while being sparsely connected to other clusters. In the context of a social network, a cluster could be a group of friends. Modularity and the map equation are established formalizations of this internally-dense-externally-sparse principle. We present two versions of a simple distributed algorithm to optimize both measures. They are based on Thrill, a distributed big data processing framework that implements an extended MapReduce model. The algorithms for the two measures, DSLM-Mod and DSLM-Map, differ only slightly. Adapting them for similar quality measures is straightforward. We conduct an extensive experimental study on real-world graphs and on synthetic benchmark graphs with up to 68 billion edges. Our algorithms are fast while detecting clusterings similar to those detected by other sequential, parallel and distributed clustering algorithms. Compared to the distributed GossipMap algorithm, DSLM-Map needs less memory, is up to an order of magnitude faster and achieves better quality.

Comment: 14 pages, 3 figures; v3: Camera ready for Euro-Par 2018, more details, more results; v2: extended experiments to include comparison with competing algorithms, shortened for submission to Euro-Par 201

arXiv.org e-Print Archive

Crossref

Space and Time Efficient Parallel Graph Decomposition, Clustering, and Diameter Approximation

Author: Ceccarello Matteo
Pietracaprina Andrea
Pucci Geppino
Upfal Eli
Publication date: 01/01/2015

We develop a novel parallel decomposition strategy for unweighted, undirected graphs, based on growing disjoint connected clusters from batches of centers progressively selected from yet uncovered nodes. With respect to similar previous decompositions, our strategy exercises a tighter control on both the number of clusters and their maximum radius. We present two important applications of our parallel graph decomposition: (1) k-center clustering approximation; and (2) diameter approximation. In both cases, we obtain algorithms which feature a polylogarithmic approximation factor and are amenable to a distributed implementation that is geared for massive (long-diameter) graphs. The total space needed for the computation is linear in the problem size, and the parallel depth is substantially sublinear in the diameter for graphs with low doubling dimension. To the best of our knowledge, ours are the first parallel approximations for these problems which achieve sub-diameter parallel time, for a relevant class of graphs, using only linear space. Besides the theoretical guarantees, our algorithms allow for a very simple implementation on clustered architectures: we report on extensive experiments which demonstrate their effectiveness and efficiency on large graphs as compared to alternative known approaches.

Comment: 14 page

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università di Padova

Efficient Distributed Clustering Algorithms on Star-Schema Heterogeneous Graphs

Author: Chen Lu
Gao Yunjun
Huang Xingrui
Jensen Christian S.
Zheng Bolong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/12/2020

VBN