
7 research outputs found

On the Analysis of a Label Propagation Algorithm for Community Detection

Author: Kothapalli Kishore
Pemmaraju Sriram V.
Sardeshmukh Vivek
Publication date: 13/10/2012

This paper initiates formal analysis of a simple, distributed algorithm for community detection on networks. We analyze an algorithm that we call Max-LPA, both in terms of its convergence time and in terms of the "quality" of the communities detected. Max-LPA is an instance of a class of community detection algorithms called label propagation algorithms. As far as we know, most analysis of label propagation algorithms thus far has been empirical in nature, and in this paper we seek a theoretical understanding of label propagation algorithms. In our main result, we define a clustered version of Erdős–Rényi random graphs with clusters $V_1, V_2, \ldots, V_k$, where the probability $p$ of an edge connecting nodes within a cluster $V_i$ is higher than $p'$, the probability of an edge connecting nodes in distinct clusters. We show that even with fairly general restrictions on $p$ and $p'$ ($p = \Omega(\frac{1}{n^{1/4-\epsilon}})$ for any $\epsilon > 0$ and $p' = O(p^2)$, where $n$ is the number of nodes), Max-LPA detects the clusters $V_1, V_2, \ldots, V_k$ in just two rounds. Based on this and on empirical results, we conjecture that Max-LPA can correctly and quickly identify communities on clustered Erdős–Rényi graphs even when the clusters are much sparser, i.e., with $p = \frac{c \log n}{n}$ for some $c > 1$.

Comment: 17 pages. Submitted to ICDCN 201
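
The label propagation procedure described in this abstract can be illustrated with a minimal Python sketch. The abstract does not spell out the update rule, so the sketch assumes a common formulation: every node starts with its own index as a label, and in each synchronous round adopts the most frequent label in its closed neighborhood, breaking ties in favor of the larger label. The function names `clustered_er_graph` and `max_lpa` and all parameter names are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def clustered_er_graph(k, size, p_in, p_out, seed=0):
    """Build adjacency sets for k clusters of `size` nodes each.

    Nodes in the same cluster are joined with probability p_in,
    nodes in different clusters with probability p_out.
    """
    rng = random.Random(seed)
    n = k * size
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            same_cluster = (u // size) == (v // size)
            if rng.random() < (p_in if same_cluster else p_out):
                adj[u].add(v)
                adj[v].add(u)
    return adj


def max_lpa(adj, max_rounds=50):
    """Synchronous label propagation with max-label tie-breaking (assumed rule).

    Returns the final labels and the number of rounds in which labels changed.
    """
    labels = list(range(len(adj)))  # assumption: initial label = node index
    for round_no in range(1, max_rounds + 1):
        new_labels = []
        for v, nbrs in enumerate(adj):
            counts = Counter(labels[u] for u in nbrs)
            counts[labels[v]] += 1  # closed neighborhood includes v itself
            best = max(counts.values())
            # Among the most frequent labels, keep the largest one.
            new_labels.append(max(l for l, c in counts.items() if c == best))
        if new_labels == labels:
            return labels, round_no - 1
        labels = new_labels
    return labels, max_rounds  # synchronous updates may oscillate; stop at the cap


if __name__ == "__main__":
    graph = clustered_er_graph(k=3, size=40, p_in=0.6, p_out=0.02, seed=1)
    final_labels, rounds = max_lpa(graph)
    print(len(set(final_labels)), "distinct labels after", rounds, "rounds")
```

On a random instance the number of recovered communities depends on the sampled graph, so the script prints whatever it finds rather than a fixed answer. The round cap is there because synchronous label propagation is not guaranteed to settle on every graph.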

arXiv.org e-Print Archive

CiteSeerX

Super-Fast MST Algorithms in the Congested Clique Using o(m) Messages

Author: Pemmaraju Sriram V.
Sardeshmukh Vivek B.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 36th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2016)
Publication date: 01/01/2016

In a sequence of recent results (PODC 2015 and PODC 2016), the running time of the fastest algorithm for the minimum spanning tree (MST) problem in the Congested Clique model was first improved to $O(\log \log \log n)$ from $O(\log \log n)$ (Hegeman et al., PODC 2015) and then to $O(\log^* n)$ (Ghaffari and Parter, PODC 2016). All of these algorithms use $\Theta(n^2)$ messages independent of the number of edges in the input graph. This paper positively answers a question raised in Hegeman et al., and presents the first "super-fast" MST algorithm with $o(m)$ message complexity for input graphs with $m$ edges. Specifically, we present an algorithm running in $O(\log^* n)$ rounds, with message complexity $\tilde{O}(\sqrt{m \cdot n})$, and then build on this algorithm to derive a family of algorithms, containing for any $\epsilon$, $0 < \epsilon \le 1$, an algorithm running in $O(\log^* n / \epsilon)$ rounds, using $\tilde{O}(n^{1+\epsilon}/\epsilon)$ messages. Setting $\epsilon = \log \log n / \log n$ leads to the first sub-logarithmic round Congested Clique MST algorithm that uses only $\tilde{O}(n)$ messages. Our primary tools in achieving these results are (i) a component-wise bound on the number of candidates for MST edges, extending the sampling lemma of Karger, Klein, and Tarjan (JACM 1995) and (ii) $\Theta(\log n)$-wise-independent linear graph sketches (Cormode and Firmani, Dist. Par. Databases, 2014) for generating MST candidate edges.
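
The choice of epsilon in this abstract can be checked with a short calculation. The derivation below is a reader's sketch, not taken from the paper; it assumes base-2 logarithms and the usual convention that the soft-O notation hides factors polylogarithmic in n.

```latex
\begin{align*}
\text{With } \epsilon = \frac{\log\log n}{\log n}: \qquad
n^{\epsilon} &= 2^{\epsilon \log n} = 2^{\log\log n} = \log n, \\
\tilde{O}\!\left(\frac{n^{1+\epsilon}}{\epsilon}\right)
  &= \tilde{O}\!\left(n \log n \cdot \frac{\log n}{\log\log n}\right)
   = \tilde{O}(n), \\
O\!\left(\frac{\log^* n}{\epsilon}\right)
  &= O\!\left(\frac{\log^* n \cdot \log n}{\log\log n}\right)
   = o(\log n).
\end{align*}
```

The last step uses the fact that $\log^* n$ grows more slowly than $\log\log n$, which is why the resulting round count is sub-logarithmic.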

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Task-based Parallel Computation of the Density Matrix in Quantum-based Molecular Dynamics using Graph Partitioning

Author: Ghale Purnima
Hahn Georg
Kroonblawd Matthew P.
Mniszewski Sue
Negre Christian F. A.
Pavel Robert
Pino Sergio
Sardeshmukh Vivek
Shi Guangjie
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2017

Quantum-based molecular dynamics (QMD) is a highly accurate and transferable method for material science simulations. However, the time scales and system sizes accessible to QMD are typically limited to picoseconds and a few hundred atoms. These constraints arise from expensive self-consistent ground-state electronic structure calculations that can often scale cubically with the number of atoms. Linearly scaling methods depend on computing the density matrix P from the Hamiltonian matrix H by exploiting the sparsity in both matrices. The second-order spectral projection (SP2) algorithm is an O(N) algorithm that computes P with a sequence of 40-50 matrix-matrix multiplications. In this paper, we present task-based implementations of a recently developed data-parallel graph-based approach to the SP2 algorithm, G-SP2. We represent the density matrix P as an undirected graph and use graph partitioning techniques to divide the computation into smaller independent tasks. The partitions thus obtained are generally not of equal size and give rise to undesirable load imbalances in standard MPI-based implementations. This load-balancing challenge can be mitigated by dynamically scheduling parallel computations at runtime using task-based programming models. We develop task-based implementations of the data-parallel G-SP2 algorithm using both Intel's Concurrent Collections (CnC) and the Charm++ programming model, and evaluate these implementations for future use. Scaling and performance results of our implementations are investigated for representative segments of QMD simulations for solvated protein systems containing more than 10,000 atoms.
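
For readers unfamiliar with SP2, the following is a minimal dense sketch of the recursion in plain Python. It is not the G-SP2 or task-based implementation from the paper. It assumes the commonly published form of SP2: rescale H so its spectrum lies in [0, 1], then repeatedly apply either X² or 2X − X², choosing by whether the trace is above or below the number of occupied states. The Gershgorin spectral bounds, the stopping rule, and the names `sp2_density_matrix`, `n_occ`, `tol`, and `max_iter` are illustrative assumptions.

```python
def matmul(a, b):
    """Dense product of two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def sp2_density_matrix(h, n_occ, tol=1e-12, max_iter=100):
    """Approximate the projector onto the n_occ lowest eigenstates of h.

    Assumes h is symmetric, not a multiple of the identity, and has a gap
    between the n_occ-th and (n_occ+1)-th eigenvalue.
    """
    n = len(h)
    # Gershgorin circles give cheap bounds on the spectrum (an assumption
    # of this sketch; other estimates would work too).
    radii = [sum(abs(h[i][j]) for j in range(n) if j != i) for i in range(n)]
    e_min = min(h[i][i] - radii[i] for i in range(n))
    e_max = max(h[i][i] + radii[i] for i in range(n))
    # X0 = (e_max * I - H) / (e_max - e_min): low energies map close to 1.
    x = [[((e_max if i == j else 0.0) - h[i][j]) / (e_max - e_min)
          for j in range(n)] for i in range(n)]
    for _ in range(max_iter):
        x2 = matmul(x, x)
        # Stop once X is numerically idempotent (trace(X) == trace(X^2)).
        if abs(trace(x) - trace(x2)) < tol:
            break
        if trace(x) > n_occ:
            x = x2  # squaring lowers the trace
        else:
            # 2X - X^2 raises the trace
            x = [[2.0 * x[i][j] - x2[i][j] for j in range(n)] for i in range(n)]
    return x


if __name__ == "__main__":
    # Four-site chain with hopping -1 between neighbors, two occupied states.
    chain = [[0.0, -1.0, 0.0, 0.0],
             [-1.0, 0.0, -1.0, 0.0],
             [0.0, -1.0, 0.0, -1.0],
             [0.0, 0.0, -1.0, 0.0]]
    p = sp2_density_matrix(chain, n_occ=2)
    print("trace(P) =", trace(p))
```

Each iteration costs one matrix product, which is why the cost of SP2 is usually quoted as a count of matrix-matrix multiplications. The sketch uses dense lists for clarity; linear scaling depends on exploiting sparsity, which this example does not attempt.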

arXiv.org e-Print Archive

Crossref

Lancaster E-Prints

Using graph partitioning for scalable distributed quantum molecular dynamics

Author: Djidjev Hristo N.
Hahn Georg
Mniszewski Susan M.
Negre Christian F. A.
Niklasson Anders M. N.
Sardeshmukh Vivek B.
Publication venue: 'MDPI AG'
Publication date: 26/06/2019

The simulation of the physical movement of multi-body systems at an atomistic level, with forces calculated from a quantum mechanical description of the electrons, motivates a graph partitioning problem studied in this article. Several advanced algorithms relying on evaluations of matrix polynomials have been published in the literature for such simulations. We aim to use a special type of graph partitioning to efficiently parallelize these computations. For this, we create a graph representing the zero–nonzero structure of a thresholded density matrix, and partition that graph into several components. Each separate submatrix (corresponding to each subgraph) is then substituted into the matrix polynomial, and the result for the full matrix polynomial is reassembled at the end from the individual polynomials. This paper starts by introducing a rigorous definition as well as a mathematical justification of this partitioning problem. We assess the performance of several methods to compute graph partitions with respect to both the quality of the partitioning and their runtime
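
The idea of evaluating a matrix polynomial piecewise can be sketched in Python for the simplest case, in which the graph of the thresholded matrix splits into disconnected components. In that case the polynomial of the full thresholded matrix equals the reassembled polynomials of the submatrices exactly. This is only an illustration under that assumption, not the partitioning scheme defined in the paper, which also has to handle parts that remain connected to each other. All function names and the `threshold` parameter are hypothetical.

```python
def matmul(a, b):
    """Dense product of two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly(m, coeffs):
    """Evaluate c0*I + c1*M + c2*M^2 + ... by Horner's rule (coeffs low to high)."""
    n = len(m)
    result = [[coeffs[-1] if i == j else 0 for j in range(n)] for i in range(n)]
    for c in reversed(coeffs[:-1]):
        result = matmul(result, m)
        for i in range(n):
            result[i][i] += c
    return result


def threshold_matrix(a, threshold=0.0):
    """Zero out entries whose magnitude does not exceed the threshold."""
    return [[x if abs(x) > threshold else 0 for x in row] for row in a]


def components(a):
    """Connected components of the graph with an edge i-j wherever a[i][j] != 0."""
    n = len(a)
    seen, parts = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            i = stack.pop()
            comp.append(i)
            for j in range(n):
                if j not in seen and (a[i][j] != 0 or a[j][i] != 0):
                    seen.add(j)
                    stack.append(j)
        parts.append(sorted(comp))
    return parts


def poly_by_components(a, coeffs, threshold=0.0):
    """Evaluate the polynomial per component and scatter results back."""
    t = threshold_matrix(a, threshold)
    n = len(t)
    out = [[0] * n for _ in range(n)]
    for comp in components(t):
        sub = [[t[i][j] for j in comp] for i in comp]
        sub_p = poly(sub, coeffs)
        for r, i in enumerate(comp):
            for c, j in enumerate(comp):
                out[i][j] = sub_p[r][c]
    return out


if __name__ == "__main__":
    # Two interleaved 2x2 blocks: indices {0, 2} and {1, 3}.
    a = [[2, 0, 1, 0],
         [0, 3, 0, 1],
         [1, 0, 2, 0],
         [0, 1, 0, 3]]
    print(poly_by_components(a, [1, 2, 1]) == poly(a, [1, 2, 1]))
```

Because each component is processed independently, the loop over components is the natural unit of parallel work. The example matrix uses integers so that the piecewise and full evaluations can be compared exactly.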

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Lancaster E-Prints

Graph Partitioning Methods for Fast Parallel Quantum Molecular Dynamics

Author: Djidjev Hristo N.
Hahn Georg
Mniszewski Susan M.
Negre Christian F. A.
Niklasson Anders M. N.
Sardeshmukh Vivek B.
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 10/10/2016

We study a graph partitioning problem motivated by the simulation of the physical movement of multi-body systems on an atomistic level, where the forces are calculated from a quantum mechanical description of the electrons. Several advanced algorithms have been published in the literature for such simulations that are based on evaluations of matrix polynomials. We aim at efficiently parallelizing these computations by using a special type of graph partitioning. For this, we represent the zero-nonzero structure of a thresholded matrix as a graph and partition that graph into several components. The matrix polynomial is then evaluated for each separate submatrix corresponding to the subgraphs and the evaluated submatrix polynomials are used to assemble the final result for the full matrix polynomial. The paper provides a rigorous definition as well as a mathematical justification of this partitioning problem. We use several algorithms to compute graph partitions and experimentally evaluate their performance with respect to the quality of the partition obtained with each method and the time needed to produce it

Crossref

Lancaster E-Prints