187 research outputs found

Distributed Graph Clustering using Modularity and Map Equation

Author: A Lancichinetti
BH Good
C Staudt
DA Bader
G Karypis
J Zeng
L Hubert
M Rosvall
MEJ Newman
S Bae
S Fortunato
T Kawamoto
U Brandes
Vincent D Blondel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 07/06/2018

We study large-scale, distributed graph clustering. Given an undirected graph, our objective is to partition the nodes into disjoint sets called clusters. A cluster should contain many internal edges while being sparsely connected to other clusters. In the context of a social network, a cluster could be a group of friends. Modularity and map equation are established formalizations of this internally-dense-externally-sparse principle. We present two versions of a simple distributed algorithm, one for each measure. They are based on Thrill, a distributed big data processing framework that implements an extended MapReduce model. The algorithms for the two measures, DSLM-Mod and DSLM-Map, differ only slightly. Adapting them for similar quality measures is straightforward. We conduct an extensive experimental study on real-world graphs and on synthetic benchmark graphs with up to 68 billion edges. Our algorithms are fast while detecting clusterings similar to those detected by other sequential, parallel and distributed clustering algorithms. Compared to the distributed GossipMap algorithm, DSLM-Map needs less memory, is up to an order of magnitude faster and achieves better quality.

Comment: 14 pages, 3 figures; v3: Camera ready for Euro-Par 2018, more details, more results; v2: extended experiments to include comparison with competing algorithms, shortened for submission to Euro-Par 201
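The internally-dense-externally-sparse principle named in this abstract can be made concrete with modularity. The following is a minimal sequential sketch of the standard Newman-Girvan modularity score for an unweighted graph; it is not the distributed DSLM-Mod algorithm, and the function name `modularity` and the input layout (an edge list plus a node-to-cluster dictionary) are assumptions chosen for illustration.

```python
from collections import defaultdict


def modularity(edges, cluster_of):
    """Newman-Girvan modularity of a clustering of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once, no self-loops.
    cluster_of: dict mapping every node to its cluster label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)  # number of edges inside each cluster
    volume = defaultdict(int)    # sum of node degrees in each cluster
    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        volume[cu] += 1
        volume[cv] += 1
        if cu == cv:
            internal[cu] += 1
    # Fraction of internal edges minus the fraction expected at random.
    return sum(internal[c] / m - (volume[c] / (2 * m)) ** 2 for c in volume)


# Two triangles joined by a single edge (hypothetical toy graph).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(round(modularity(toy_edges, two_clusters), 4))  # 0.3571
```

Placing every node in a single cluster scores 0 on the same graph, so the two-triangle split is preferred, which matches the intuition of dense clusters with few edges between them.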

arXiv.org e-Print Archive

Crossref

Generating realistic scaled complex networks

Author: Gutfraind Alexander
Hamann Michael
Meyerhenke Henning
Safro Ilya
Staudt Christian L.
Publication date: 23/03/2017

Research on generative models is a central project in the emerging field of network science; it studies how statistical patterns found in real networks could be generated by formal rules. Output from these generative models is then the basis for designing and evaluating computational methods on networks, and for verification and simulation studies. During the last two decades, a variety of models has been proposed with the ultimate goal of achieving comprehensive realism for the generated networks. In this study, we (a) introduce a new generator, termed ReCoN; (b) explore how ReCoN and some existing models can be fitted to an original network to produce a structurally similar replica; (c) use ReCoN to produce networks much larger than the original exemplar; and finally (d) discuss open problems and promising research directions. In a comparative experimental study, we find that ReCoN is often superior to many other state-of-the-art network generation methods. We argue that ReCoN is a scalable and effective tool for modeling a given network while preserving important properties at both micro- and macroscopic scales, and for scaling the exemplar data by orders of magnitude in size.

Comment: 26 pages, 13 figures, extended version, a preliminary version of the paper was presented at the 5th International Workshop on Complex Networks and their Application

arXiv.org e-Print Archive

Crossref

KITopen

Directory of Open Access Journals

Scalable Community Detection

Author: Hamann Michael Alexander
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 26/05/2021

KITopen

Parallel and I/O-efficient randomisation of massive networks using global curveball trades

Author: Carstens C. J.
Hamann M.
Meyer U.
Penschuck M.
Tran H.
Wagner D.
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH
Publication date: 01/01/2018

Graph randomisation is a crucial task in the analysis and synthesis of networks. It is typically implemented as an edge switching process (ESMC) that repeatedly swaps the nodes of random edge pairs while maintaining the degrees involved [23]. Curveball is a novel approach that instead considers the whole neighbourhoods of randomly drawn node pairs. Its Markov chain converges to a uniform distribution, and experiments suggest that it requires fewer steps than the established ESMC [6]. Since trades are more expensive, however, we study Curveball’s practical runtime by introducing the first efficient Curveball algorithms: the I/O-efficient EM-CB for simple undirected graphs and its internal-memory counterpart IM-CB. Further, we investigate global trades [6], which process every node in a single super step, and show that undirected global trades converge to a uniform distribution and perform better in practice. We then discuss EM-GCB and EM-PGCB for global trades and give experimental evidence that EM-PGCB achieves the quality of the state-of-the-art ESMC algorithm EM-ES [15] nearly one order of magnitude faster.
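To illustrate the idea of trading whole neighbourhoods described in this abstract, here is a minimal in-memory sketch of a Curveball trade and a global trade on a simple undirected graph. It is not the paper's EM-CB, IM-CB or EM-PGCB implementation; the function names, the adjacency-set representation and the pairing of consecutive nodes in a shuffled order are assumptions made for illustration, and node labels are assumed to be sortable.

```python
import random


def curveball_trade(adj, u, v, rng):
    """One Curveball trade between u and v; adj maps node -> set of neighbours.

    adj must be symmetric and free of self-loops; it is modified in place.
    """
    # Neighbours held by exactly one of the two nodes. u and v themselves are
    # excluded so that no self-loop or multi-edge can be created.
    only_u = adj[u] - adj[v] - {v}
    only_v = adj[v] - adj[u] - {u}
    pool = sorted(only_u | only_v)  # sorted so a seeded rng is reproducible
    rng.shuffle(pool)
    new_u = set(pool[:len(only_u)])  # u keeps its degree
    new_v = set(pool[len(only_u):])  # v keeps its degree
    for w in only_u - new_u:  # neighbour w moves from u to v
        adj[u].discard(w)
        adj[w].discard(u)
        adj[v].add(w)
        adj[w].add(v)
    for w in only_v - new_v:  # neighbour w moves from v to u
        adj[v].discard(w)
        adj[w].discard(v)
        adj[u].add(w)
        adj[w].add(u)


def global_trade(adj, rng):
    """Trade every node once by pairing consecutive nodes of a random order."""
    nodes = sorted(adj)
    rng.shuffle(nodes)
    # With an odd number of nodes, the last one sits out this round.
    for i in range(0, len(nodes) - 1, 2):
        curveball_trade(adj, nodes[i], nodes[i + 1], rng)
```

Every trade exchanges equally many neighbours in both directions, so all node degrees are unchanged and the graph stays simple.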

arXiv.org e-Print Archive

KITopen

Dagstuhl Research Online Publication Server

Algorithms and Software for the Analysis of Large Complex Networks

Author: Staudt Christian Lorenz
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2016

The work presented intersects three main areas, namely graph algorithmics, network science and applied software engineering. Each computational method discussed relates to one of the main tasks of data analysis: to extract structural features from network data, such as methods for community detection; or to transform network data, such as methods to sparsify a network and reduce its size while keeping essential properties; or to realistically model networks through generative models.

KITopen

Fast community structure local uncovering by independent vertex-centred process

Author: Canu Maël
d'Allonnes Adrien Revault
Detyniecki Marcin
Lesot Marie-Jeanne
Publication date: 25/08/2015

This paper addresses the task of community detection and proposes a local approach based on distributed list building, where each vertex broadcasts basic information that depends only on its degree and that of its neighbours. A decentralised external process then unveils the community structure. The relevance of the proposed method is shown experimentally on both artificial and real data.

Comment: 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, Aug 2015, Paris, France. Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

arXiv.org e-Print Archive

Crossref