
    On the Analysis of a Label Propagation Algorithm for Community Detection

    This paper initiates formal analysis of a simple, distributed algorithm for community detection on networks. We analyze an algorithm that we call \textsc{Max-LPA}, both in terms of its convergence time and in terms of the "quality" of the communities detected. \textsc{Max-LPA} is an instance of a class of community detection algorithms called \textit{label propagation} algorithms. As far as we know, most analysis of label propagation algorithms thus far has been empirical in nature, and in this paper we seek a theoretical understanding of label propagation algorithms. In our main result, we define a clustered version of Erdős-Rényi random graphs with clusters $V_1, V_2, \ldots, V_k$, where the probability $p$ of an edge connecting nodes within a cluster $V_i$ is higher than $p'$, the probability of an edge connecting nodes in distinct clusters. We show that even with fairly general restrictions on $p$ and $p'$ ($p = \Omega\left(\frac{1}{n^{1/4-\epsilon}}\right)$ for any $\epsilon > 0$, $p' = O(p^2)$, where $n$ is the number of nodes), \textsc{Max-LPA} detects the clusters $V_1, V_2, \ldots, V_k$ in just two rounds. Based on this and on empirical results, we conjecture that \textsc{Max-LPA} can correctly and quickly identify communities on clustered Erdős-Rényi graphs even when the clusters are much sparser, i.e., with $p = \frac{c \log n}{n}$ for some $c > 1$. Comment: 17 pages. Submitted to ICDCN 201
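The setting in this abstract can be illustrated with a small sketch. The graph generator follows the clustered Erdős-Rényi model as described (edge probability `p` inside a cluster, `p_prime` across clusters). The update rule is an assumption: the abstract does not define \textsc{Max-LPA}, so the code assumes synchronous rounds in which every node adopts the most frequent label in its closed neighbourhood and breaks ties toward the largest label. All function names are hypothetical.

```python
import random
from collections import Counter


def clustered_er_graph(cluster_sizes, p, p_prime, rng):
    """Sample a clustered Erdős-Rényi graph.

    Nodes in the same cluster are joined with probability p,
    nodes in different clusters with probability p_prime.
    Returns (adjacency sets, cluster index of each node).
    """
    membership = [c for c, size in enumerate(cluster_sizes) for _ in range(size)]
    n = len(membership)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            prob = p if membership[u] == membership[v] else p_prime
            if rng.random() < prob:
                adj[u].add(v)
                adj[v].add(u)
    return adj, membership


def max_lpa_round(adj, labels):
    """One synchronous round (assumed rule, see lead-in).

    Each node counts labels over its neighbours plus itself and adopts
    the most frequent one; ties go to the numerically largest label.
    """
    new_labels = []
    for u, nbrs in enumerate(adj):
        counts = Counter(labels[v] for v in nbrs)
        counts[labels[u]] += 1  # closed neighbourhood includes u itself
        best_label, _ = max(counts.items(), key=lambda kv: (kv[1], kv[0]))
        new_labels.append(best_label)
    return new_labels


def max_lpa(adj, rounds, rng):
    """Start from unique labels in random order and run a fixed number of rounds."""
    labels = list(range(len(adj)))
    rng.shuffle(labels)
    for _ in range(rounds):
        labels = max_lpa_round(adj, labels)
    return labels


if __name__ == "__main__":
    rng = random.Random(0)
    adj, membership = clustered_er_graph([30, 30, 30], p=0.6, p_prime=0.02, rng=rng)
    labels = max_lpa(adj, rounds=2, rng=rng)
    print(len(set(labels)), "distinct labels after two rounds")
```

The parameters in the usage block are arbitrary and far from the asymptotic regime of the theorem, so the number of labels it prints says nothing about the paper's guarantee.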

    Selecting a suitable Parallel Label-propagation based algorithm for Disjoint Community Detection

    Community detection is an essential task in network analysis as it helps identify groups and patterns within a network. High-speed community detection algorithms are necessary to analyze large-scale networks in a reasonable amount of time. Researchers have made significant contributions to the development of such algorithms, particularly in the area of label-propagation based disjoint community detection, and these algorithms have proven highly effective on large-scale networks. However, it is important to evaluate the performance and accuracy of these existing methods to determine which algorithm is best suited for a particular type of network and specific research problem. In this report, we investigate RAK, COPRA, and SLPA, three label-propagation-based static community discovery techniques. We pay close attention to each algorithm's details as we implement both its single-threaded and multi-threaded OpenMP-based variants, making any necessary adjustments or optimizations and determining suitable parameter values. The RAK algorithm is found to perform well with a tolerance of 0.05, and OpenMP-based strict RAK with 12 threads was 6.75x faster than the sequential non-strict RAK. The COPRA algorithm works well with a single label for road networks and max labels of 4-16 for other classes of graphs. The SLPA algorithm performs well with increasing memory size, but overall doesn't offer a favourable return on investment. The RAK algorithm is recommended for label-propagation based disjoint community detection. Comment: 11 pages, 1 tabl
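To make the role of a tolerance parameter concrete, here is a minimal single-threaded sketch of label propagation with a tolerance-based stopping rule. It is an illustration under stated assumptions, not the report's implementation: the abstract does not define "tolerance" or the strict and non-strict variants, so the code assumes that tolerance is the fraction of nodes that changed label in one pass. The function name, `max_iterations` and `seed` are hypothetical.

```python
import random
from collections import Counter


def label_propagation(adj, tolerance=0.05, max_iterations=20, seed=0):
    """Asynchronous label propagation on an undirected graph.

    adj maps each node to the set of its neighbours. Iteration stops
    once the fraction of nodes that changed label in a pass is at most
    `tolerance` (assumed meaning of the parameter, see lead-in).
    """
    rng = random.Random(seed)
    labels = {u: u for u in adj}  # every node starts with a unique label
    order = list(adj)
    for _ in range(max_iterations):
        rng.shuffle(order)  # visit nodes in a fresh random order each pass
        changed = 0
        for u in order:
            if not adj[u]:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[v] for v in adj[u])
            top = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == top)
            if labels[u] in candidates:
                continue  # current label is already among the most frequent
            labels[u] = rng.choice(candidates)
            changed += 1
        if changed / len(adj) <= tolerance:
            break
    return labels


if __name__ == "__main__":
    # Two triangles joined by a single edge (2-3).
    graph = {
        0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
        3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
    }
    print(label_propagation(graph))
```

A larger tolerance stops earlier at the cost of leaving more nodes with unsettled labels, which is the trade-off a value such as 0.05 is meant to balance.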

    Towards real-time community detection in large networks

    The recent boom of large-scale Online Social Networks (OSNs) both enables and necessitates the use of parallelisable and scalable computational techniques for their analysis. We examine the problem of real-time community detection and a recently proposed linear-time ($O(m)$ on a network with $m$ edges) label propagation or "epidemic" community detection algorithm. We identify characteristics and drawbacks of the algorithm and extend it by incorporating different heuristics to facilitate reliable and multifunctional real-time community detection. With limited computational resources, we employ the algorithm on OSN data with 1 million nodes and about 58 million directed edges. Experiments and benchmarks reveal that the extended algorithm is not only faster, but that its community detection accuracy also compares favourably with popular modularity-gain optimization algorithms, which are known to suffer from resolution limits. Comment: 10 pages, 11 figure

    Comparison of Label-Propagation-Based Algorithms for Community Detection in Networks (Primerjava algoritmov za odkrivanje skupnosti v omrežjih na osnovi izmenjave oznak)

    Community structure is an important property of complex networks, since it reveals the organization of the network and the relationships between its members. The analysis of community structure and the development of effective procedures for its detection have therefore been among the main focuses of network theory. Numerous methods have been proposed for detecting community structure in networks \cite{article7}. This thesis presents a heuristic community detection algorithm based on label propagation. Due to its simplicity and low time complexity, the label propagation algorithm should be the first option for gaining an understanding of a network's community structure before examining other, more complex alternatives. We give a brief introduction to graphs and networks, different clustering metrics, and related work in the field of network community detection. Next, we present the basic approach of the label propagation algorithm, discuss its advantages and disadvantages, and review extensions of the method, focusing mainly on consensus clustering and fast consensus clustering. These algorithms are implemented in a Python programming library, which is available at \url{https://github.com/damir1407/label-propagation}. Furthermore, we evaluate these three network clustering methods on different synthetic and real-world networks and present the results. The thesis concludes with a summary of the presented methods and directions for future work.

    A self-organizing algorithm for community structure analysis in complex networks

    Community structure analysis is a critical task for complex network analysis. It helps us to understand the properties of the system that a complex network represents, and is significant for a wide range of real applications. The Label Propagation Algorithm (LPA) is currently the most popular community structure analysis algorithm due to its near-linear time complexity. However, the performance of the LPA has proven to be unstable, and the correctness of its assignment of nodes to communities is unsatisfactory. In this paper, a Self-Organizing Community Detection and Analytic Algorithm (SOCDA2) based on swarm intelligence is proposed. In the algorithm, a network is modeled as a swarm intelligence system, in which each node acts iteratively to join or leave communities based on a set of pre-defined node action rules in order to improve the quality of the communities. When no node changes its community any longer, an optimal community structure emerges as a result. A variety of experiments conducted on both synthesized and real-world networks indicate that the proposed algorithm can effectively detect community structures and that its performance is better than that of the LPA. In addition, the algorithm can be extended for overlapping community detection and be parallelized for large-scale network analysis.

    Distributed Community Detection via Metastability of the 2-Choices Dynamics

    We investigate the behavior of a simple majority dynamics on networks of agents whose interaction topology exhibits a community structure. By leveraging recent advancements in the analysis of dynamics, we prove that, when the states of the nodes are randomly initialized, the system rapidly and stably converges to a configuration in which the communities maintain internal consensus on different states. This is the first analytical result on the behavior of dynamics for non-consensus problems on non-complete topologies, based on the first symmetry-breaking analysis in such a setting. Our result has several implications in different contexts in which dynamics are adopted for computational and biological modeling purposes. In the context of Label Propagation Algorithms, a class of widely used heuristics for community detection, it represents the first theoretical result on the behavior of a distributed label propagation algorithm with quasi-linear message complexity. In the context of evolutionary biology, dynamics such as the Moran process have been used to model the spread of mutations in genetic populations [Lieberman, Hauert, and Nowak 2005]; our result shows that, when the probability of adoption of a given mutation by a node of the evolutionary graph depends super-linearly on the frequency of the mutation in the neighborhood of the node and the underlying evolutionary graph exhibits a community structure, there is a non-negligible probability for species differentiation to occur. Comment: Full version of paper appeared in AAAI-1
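The dynamics named in the title can be sketched in a few lines. The abstract does not spell out the update rule, so the code assumes the usual formulation of 2-Choices: in each synchronous round every node samples two neighbours uniformly at random with replacement and adopts their state only if the two samples agree. The function name and the adjacency-list representation are hypothetical.

```python
import random


def two_choices_round(adj, states, rng):
    """One synchronous round of the 2-Choices dynamics (assumed rule).

    adj[u] is a list of neighbours of node u. Every node samples two
    neighbours with replacement; if their current states agree, the node
    adopts that state, otherwise it keeps its own.
    """
    new_states = list(states)
    for u, nbrs in enumerate(adj):
        if not nbrs:
            continue  # isolated nodes never change state
        first = states[rng.choice(nbrs)]
        second = states[rng.choice(nbrs)]
        if first == second:
            new_states[u] = first
    return new_states


if __name__ == "__main__":
    rng = random.Random(0)
    # Two triangles joined by one edge (2-3), random binary initial states.
    adj = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
    states = [rng.randint(0, 1) for _ in adj]
    for _ in range(10):
        states = two_choices_round(adj, states, rng)
    print(states)
```

Each node reads only two neighbour states per round, which is the kind of low per-round communication the abstract's message-complexity claim refers to. A toy run like this one cannot confirm the convergence result itself.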