7 research outputs found

    On the Analysis of a Label Propagation Algorithm for Community Detection

    Full text link
    This paper initiates formal analysis of a simple, distributed algorithm for community detection on networks. We analyze an algorithm that we call \textsc{Max-LPA}, both in terms of its convergence time and in terms of the "quality" of the communities detected. \textsc{Max-LPA} is an instance of a class of community detection algorithms called \textit{label propagation} algorithms. As far as we know, most analysis of label propagation algorithms thus far has been empirical in nature and in this paper we seek a theoretical understanding of label propagation algorithms. In our main result, we define a clustered version of \er random graphs with clusters V1,V2,...,VkV_1, V_2,..., V_k where the probability pp, of an edge connecting nodes within a cluster ViV_i is higher than pp', the probability of an edge connecting nodes in distinct clusters. We show that even with fairly general restrictions on pp and pp' (p=Ω(1n1/4ϵ)p = \Omega(\frac{1}{n^{1/4-\epsilon}}) for any ϵ>0\epsilon > 0, p=O(p2)p' = O(p^2), where nn is the number of nodes), \textsc{Max-LPA} detects the clusters V1,V2,...,VnV_1, V_2,..., V_n in just two rounds. Based on this and on empirical results, we conjecture that \textsc{Max-LPA} can correctly and quickly identify communities on clustered \er graphs even when the clusters are much sparser, i.e., with p=clognnp = \frac{c\log n}{n} for some c>1c > 1.Comment: 17 pages. Submitted to ICDCN 201

    Super-Fast MST Algorithms in the Congested Clique Using o(m) Messages

    Get PDF
    In a sequence of recent results (PODC 2015 and PODC 2016), the running time of the fastest algorithm for the minimum spanning tree (MST) problem in the Congested Clique model was first improved to O(log(log(log(n)))) from O(log(log(n))) (Hegeman et al., PODC 2015) and then to O(log^*(n)) (Ghaffari and Parter, PODC 2016). All of these algorithms use Theta(n^2) messages independent of the number of edges in the input graph. This paper positively answers a question raised in Hegeman et al., and presents the first "super-fast" MST algorithm with o(m) message complexity for input graphs with m edges. Specifically, we present an algorithm running in O(log^*(n)) rounds, with message complexity ~O(sqrt{m * n}) and then build on this algorithm to derive a family of algorithms, containing for any epsilon, 0 < epsilon <= 1, an algorithm running in O(log^*(n)/epsilon) rounds, using ~O(n^{1 + epsilon}/epsilon) messages. Setting epsilon = log(log(n))/log(n) leads to the first sub-logarithmic round Congested Clique MST algorithm that uses only ~O(n) messages. Our primary tools in achieving these results are (i) a component-wise bound on the number of candidates for MST edges, extending the sampling lemma of Karger, Klein, and Tarjan (Karger, Klein, and Tarjan, JACM 1995) and (ii) Theta(log(n))-wise-independent linear graph sketches (Cormode and Firmani, Dist. Par. Databases, 2014) for generating MST candidate edges

    Task-based Parallel Computation of the Density Matrix in Quantum-based Molecular Dynamics using Graph Partitioning

    Get PDF
    Quantum-based molecular dynamics (QMD) is a highly accurate and transferable method for material science simulations. However, the time scales and system sizes accessible to QMD are typically limited to picoseconds and a few hundred atoms. These constraints arise due to expensive self-consistent ground-state electronic structure calculations that can often scale cubically with the number of atoms. Linearly scaling methods depend on computing the density matrix P from the Hamiltonian matrix H by exploiting the sparsity in both matrices. The second-order spectral projection (SP2) algorithm is an O(N) algorithm that computes P with a sequence of 40-50 matrix-matrix multiplications. In this paper, we present task-based implementations of a recently developed data-parallel graph-based approach to the SP2 algorithm, G-SP2. We represent the density matrix P as an undirected graph and use graph partitioning techniques to divide the computation into smaller independent tasks. The partitions thus obtained are generally not of equal size and give rise to undesirable load imbalances in standard MPI-based implementations. This load-balancing challenge can be mitigated by dynamically scheduling parallel computations at runtime using task-based programming models. We develop task-based implementations of the data-parallel G-SP2 algorithm using both Intel's Concurrent Collections (CnC) as well as the Charm++ programming model and evaluate these implementations for future use. Scaling and performance results of our implementations are investigated for representative segments of QMD simulations for solvated protein systems containing more than 10,000 atoms

    Using graph partitioning for scalable distributed quantum molecular dynamics

    Get PDF
    The simulation of the physical movement of multi-body systems at an atomistic level, with forces calculated from a quantum mechanical description of the electrons, motivates a graph partitioning problem studied in this article. Several advanced algorithms relying on evaluations of matrix polynomials have been published in the literature for such simulations. We aim to use a special type of graph partitioning to efficiently parallelize these computations. For this, we create a graph representing the zero–nonzero structure of a thresholded density matrix, and partition that graph into several components. Each separate submatrix (corresponding to each subgraph) is then substituted into the matrix polynomial, and the result for the full matrix polynomial is reassembled at the end from the individual polynomials. This paper starts by introducing a rigorous definition as well as a mathematical justification of this partitioning problem. We assess the performance of several methods to compute graph partitions with respect to both the quality of the partitioning and their runtime

    Graph Partitioning Methods for Fast Parallel Quantum Molecular Dynamics

    No full text
    We study a graph partitioning problem motivated by the simulation of the physical movement of multi-body systems on an atomistic level, where the forces are calculated from a quantum mechanical description of the electrons. Several advanced algorithms have been published in the literature for such simulations that are based on evaluations of matrix polynomials. We aim at efficiently parallelizing these computations by using a special type of graph partitioning. For this, we represent the zero-nonzero structure of a thresholded matrix as a graph and partition that graph into several components. The matrix polynomial is then evaluated for each separate submatrix corresponding to the subgraphs and the evaluated submatrix polynomials are used to assemble the final result for the full matrix polynomial. The paper provides a rigorous definition as well as a mathematical justification of this partitioning problem. We use several algorithms to compute graph partitions and experimentally evaluate their performance with respect to the quality of the partition obtained with each method and the time needed to produce it
    corecore