
    Distributed Graph Clustering using Modularity and Map Equation

    We study large-scale, distributed graph clustering. Given an undirected graph, our objective is to partition the nodes into disjoint sets called clusters. A cluster should contain many internal edges while being sparsely connected to other clusters. In the context of a social network, a cluster could be a group of friends. Modularity and the map equation are established formalizations of this internally-dense-externally-sparse principle. We present two versions of a simple distributed algorithm to optimize both measures. They are based on Thrill, a distributed big-data processing framework that implements an extended MapReduce model. The algorithms for the two measures, DSLM-Mod and DSLM-Map, differ only slightly, and adapting them to similar quality measures is straightforward. We conduct an extensive experimental study on real-world graphs and on synthetic benchmark graphs with up to 68 billion edges. Our algorithms are fast while detecting clusterings similar to those detected by other sequential, parallel, and distributed clustering algorithms. Compared to the distributed GossipMap algorithm, DSLM-Map needs less memory, is up to an order of magnitude faster, and achieves better quality.
    Comment: 14 pages, 3 figures; v3: camera-ready for Euro-Par 2018, more details, more results; v2: extended experiments to include comparison with competing algorithms, shortened for submission to Euro-Par 2018
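    As a reference point for the objective that DSLM-Mod optimizes, here is a minimal sequential sketch of the modularity of a given partition, written in plain Python with networkx. This is not the paper's Thrill-based distributed implementation; the function name and the two-way split of the karate club graph are illustrative choices only.

```python
# Minimal sketch: modularity Q = sum over clusters c of
#   (internal edges of c)/m - (degree sum of c / 2m)^2,
# i.e. observed internal edge fraction minus the expectation
# under the configuration model. Illustrative only.
import networkx as nx

def modularity(G: nx.Graph, communities: dict) -> float:
    """communities maps each node to a cluster id."""
    m = G.number_of_edges()
    if m == 0:
        return 0.0
    q = 0.0
    for c in set(communities.values()):
        nodes = [v for v in G if communities[v] == c]
        internal = G.subgraph(nodes).number_of_edges()
        degree_sum = sum(G.degree(v) for v in nodes)  # degrees in the full graph
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q

G = nx.karate_club_graph()
clusters = {v: 0 if v < 17 else 1 for v in G}  # arbitrary two-way split
print(modularity(G, clusters))
```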

    Generating realistic scaled complex networks

    Research on generative models is a central project in the emerging field of network science: it studies how statistical patterns found in real networks could be generated by formal rules. Output from these generative models is then the basis for designing and evaluating computational methods on networks, and for verification and simulation studies. During the last two decades, a variety of models have been proposed with the ultimate goal of achieving comprehensive realism for the generated networks. In this study, we (a) introduce a new generator, termed ReCoN; (b) explore how ReCoN and some existing models can be fitted to an original network to produce a structurally similar replica; (c) use ReCoN to produce networks much larger than the original exemplar; and finally (d) discuss open problems and promising research directions. In a comparative experimental study, we find that ReCoN is often superior to many other state-of-the-art network generation methods. We argue that ReCoN is a scalable and effective tool for modeling a given network while preserving important properties at both the micro- and macroscopic scales, and for scaling the exemplar data by orders of magnitude in size.
    Comment: 26 pages, 13 figures, extended version; a preliminary version of the paper was presented at the 5th International Workshop on Complex Networks and their Applications
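    To make the replicate-and-scale idea concrete, the following is a rough toy sketch of a ReCoN-style generator, not the authors' implementation: keep each detected community's internal edges intact across k disjoint copies of the graph, then randomly rewire the inter-community edges over the union so the copies merge into one larger network. The function name recon_scale is hypothetical, and louvain_communities stands in for whatever community detection the authors use.

```python
# Toy ReCoN-style replicate-and-rewire sketch (illustrative only).
import random
import networkx as nx

def recon_scale(G: nx.Graph, k: int, seed: int = 0) -> nx.Graph:
    rng = random.Random(seed)
    comms = {v: c for c, part in enumerate(nx.community.louvain_communities(G, seed=seed))
             for v in part}
    H = nx.Graph()
    stubs = []  # endpoints of inter-community edges, pooled over all copies
    for i in range(k):
        H.add_nodes_from((v, i) for v in G)   # node v in copy i
        for u, v in G.edges():
            if comms[u] == comms[v]:
                H.add_edge((u, i), (v, i))    # keep intra-community edges as-is
            else:
                stubs.extend([(u, i), (v, i)])
    rng.shuffle(stubs)
    # Pair stubs to preserve each node's inter-community degree; self-loops
    # are skipped and duplicate edges collapse in this toy version.
    for a, b in zip(stubs[::2], stubs[1::2]):
        if a != b:
            H.add_edge(a, b)
    return H

G = nx.karate_club_graph()
print(recon_scale(G, k=4))
```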

    Computing Vertex Centrality Measures in Massive Real Networks with a Neural Learning Model

    Vertex centrality measures are a multi-purpose analysis tool, commonly used in many application environments to retrieve information and unveil knowledge from graphs and network structural properties. However, the algorithms for such metrics are expensive in terms of computational resources when running real-time applications or processing massive real-world networks. Thus, approximation techniques have been developed to compute the measures in such scenarios. In this paper, we demonstrate and analyze the use of neural network learning algorithms to tackle this task, and compare their performance, in terms of solution quality and computation time, with other techniques from the literature. Our work offers several contributions. We highlight both the pros and cons of approximating centralities through neural learning. By empirical means and statistics, we then show that the regression model generated with a feedforward neural network trained by the Levenberg-Marquardt algorithm is not only the best option considering computational resources, but also achieves the best solution quality for relevant applications and large-scale networks.
    Keywords: Vertex Centrality Measures, Neural Networks, Complex Network Models, Machine Learning, Regression Model
    Comment: 8 pages, 5 tables, 2 figures, version accepted at IJCNN 2018. arXiv admin note: text overlap with arXiv:1810.1176
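    A minimal sketch of the general approach, under stated assumptions: regress an expensive centrality measure (here betweenness) from cheap per-node features. scikit-learn's MLPRegressor with the lbfgs solver stands in for the Levenberg-Marquardt training used in the paper, which scikit-learn does not provide, and the feature set (degree, average neighbor degree) is an illustrative choice.

```python
# Sketch: learn a feedforward regressor that predicts betweenness
# centrality from cheap local features. Illustrative, not the paper's setup.
import networkx as nx
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

G = nx.barabasi_albert_graph(500, 3, seed=1)

# Cheap per-node features: degree and average neighbor degree.
deg = dict(G.degree())
nbr_deg = nx.average_neighbor_degree(G)
X = np.array([[deg[v], nbr_deg[v]] for v in G])

bc = nx.betweenness_centrality(G)  # expensive target, computed once for training
y = np.array([bc[v] for v in G])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)
model = MLPRegressor(hidden_layer_sizes=(16,), solver="lbfgs",
                     max_iter=2000, random_state=1).fit(X_tr, y_tr)
print("R^2 on held-out nodes:", model.score(X_te, y_te))
```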

    Parallel Toolkit for Measuring the Quality of Network Community Structure

    Many networks display community structure, which identifies groups of nodes within which connections are denser than between them. Detecting and characterizing such community structure, which is known as community detection, is one of the fundamental issues in the study of network systems, and it has received considerable attention in recent years. Numerous techniques have been developed for both efficient and effective community detection. Among them, the most efficient algorithm is the label propagation algorithm, whose computational complexity is O(|E|). Although it is linear in the number of edges, the running time is still too long for very large networks, creating the need for parallel community detection. Also, computing community quality metrics for a community structure is computationally expensive both with and without ground truth, yet to date we are not aware of any effort to introduce parallelism for this problem. In this paper, we provide a parallel toolkit to calculate the values of such metrics. We evaluate the parallel algorithms on both a distributed-memory machine and a shared-memory machine. The experimental results show that they yield a significant performance gain over sequential execution in terms of total running time, speedup, and efficiency.
    Comment: 8 pages; in 2014 European Network Intelligence Conference (ENIC)
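    As a small illustration of why such metrics parallelize well (not the paper's toolkit), the sketch below scores each community's conductance in a process pool: every worker scans only its own community's edges, independently of the others. The conductance metric and the multiprocessing layout are illustrative assumptions.

```python
# Toy parallel community-quality evaluation: conductance per community,
# phi(S) = cut(S) / min(vol(S), vol(V \ S)), scored in a process pool.
from concurrent.futures import ProcessPoolExecutor
import networkx as nx

G = None  # set in each worker by _init

def _init(edges):
    global G
    G = nx.Graph(edges)

def conductance(nodes):
    nodes = set(nodes)
    cut = sum(1 for u in nodes for v in G[u] if v not in nodes)
    vol = sum(G.degree(u) for u in nodes)
    vol_rest = 2 * G.number_of_edges() - vol
    return cut / max(1, min(vol, vol_rest))

if __name__ == "__main__":
    g = nx.karate_club_graph()
    comms = [list(c) for c in nx.community.louvain_communities(g, seed=0)]
    with ProcessPoolExecutor(initializer=_init, initargs=(list(g.edges()),)) as ex:
        for c, phi in zip(comms, ex.map(conductance, comms)):
            print(len(c), "nodes, conductance =", round(phi, 3))
```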