
    On Efficiently Detecting Overlapping Communities over Distributed Dynamic Graphs

    Modern networks are both very large and highly dynamic, which challenges the efficiency of community detection algorithms. In this paper, we study the problem of overlapping community detection on distributed and dynamic graphs. Given a distributed, undirected and unweighted graph, the goal is to detect overlapping communities incrementally as the graph changes. We propose an efficient algorithm, called the randomized Speaker-Listener Label Propagation Algorithm (rSLPA), which is based on the Speaker-Listener Label Propagation Algorithm (SLPA) and relaxes the probability distribution of label propagation. Besides detecting high-quality communities, rSLPA can incrementally update the detected communities after a batch of edge insertion and deletion operations. To the best of our knowledge, rSLPA is the first algorithm that can incrementally capture the same communities as those obtained by running the detection algorithm from scratch on the updated graph. Extensive experiments on both synthetic and real-world datasets show that our algorithm achieves high accuracy and efficiency at the same time. Comment: A short version of this paper will be published as ICDE'2018 poste
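
The abstract above names SLPA as the starting point but does not spell out its mechanics. The following is a minimal single-machine sketch of the commonly described speaker-listener scheme, not the rSLPA algorithm of the paper: each node keeps a memory of labels, every neighbour (speaker) sends one label drawn from its memory, and the listener stores the most frequently heard one. The function name `slpa`, the parameters `iterations` and `threshold`, and the random tie-breaking are assumptions made for illustration.

```python
import random
from collections import Counter


def slpa(adj, iterations=20, threshold=0.2, seed=0):
    """Illustrative SLPA sketch; adj maps node -> list of neighbours."""
    rng = random.Random(seed)
    # Every node starts with a memory holding only its own id as a label.
    memory = {u: [u] for u in adj}
    nodes = list(adj)
    for _ in range(iterations):
        rng.shuffle(nodes)
        for listener in nodes:
            if not adj[listener]:
                continue  # isolated nodes hear nothing
            # Each speaker sends a label sampled in proportion to its
            # frequency in the speaker's memory.
            heard = [rng.choice(memory[s]) for s in adj[listener]]
            counts = Counter(heard)
            best = max(counts.values())
            winners = sorted(l for l, c in counts.items() if c == best)
            # Assumption: ties between equally popular labels are random.
            memory[listener].append(rng.choice(winners))
    # Post-processing: a node joins every community whose label fills at
    # least `threshold` of its memory, so communities may overlap.
    communities = {}
    for u, mem in memory.items():
        for label, c in Counter(mem).items():
            if c / len(mem) >= threshold:
                communities.setdefault(label, set()).add(u)
    return memory, communities


# Two disconnected triangles: labels can never cross between them.
triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
memory, communities = slpa(triangles, iterations=10)
```

Because a node can keep several labels above the threshold, the same node may appear in more than one community, which is what makes the scheme suitable for overlapping detection.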

    DHLP 1&2: Giraph based distributed label propagation algorithms on heterogeneous drug-related networks

    Background and Objective: Heterogeneous complex networks are large graphs consisting of different types of nodes and edges. Extracting knowledge from these networks is complicated, and their scale is steadily increasing, so scalable methods are required. Methods: In this paper, we introduce two distributed label propagation algorithms for heterogeneous networks, DHLP-1 and DHLP-2. Biological networks are one type of heterogeneous complex network. As a case study, we measured the efficiency of DHLP-1 and DHLP-2 on a biological network consisting of drugs, diseases, and targets. The task studied on this network is drug repositioning, but the algorithms can be used as general methods for heterogeneous networks beyond biological ones. Results: We compared the proposed algorithms with their non-distributed counterparts, MINProp and Heter-LP. The experiments showed that the algorithms perform well in terms of running time and accuracy. Comment: Source code available for Apache Giraph on Hadoo

    Fast community structure local uncovering by independent vertex-centred process

    This paper addresses the task of community detection and proposes a local approach based on a distributed list building, where each vertex broadcasts basic information that only depends on its degree and that of its neighbours. A decentralised external process then unveils the community structure. The relevance of the proposed method is experimentally shown on both artificial and real data. Comment: 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, Aug 2015, Paris, France. Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Minin

    On the Analysis of a Label Propagation Algorithm for Community Detection

    This paper initiates formal analysis of a simple, distributed algorithm for community detection on networks. We analyze an algorithm that we call Max-LPA, both in terms of its convergence time and in terms of the "quality" of the communities detected. Max-LPA is an instance of a class of community detection algorithms called label propagation algorithms. As far as we know, most analysis of label propagation algorithms thus far has been empirical in nature, and in this paper we seek a theoretical understanding of label propagation algorithms. In our main result, we define a clustered version of Erdős–Rényi random graphs with clusters $V_1, V_2, \ldots, V_k$ where the probability $p$ of an edge connecting nodes within a cluster $V_i$ is higher than $p'$, the probability of an edge connecting nodes in distinct clusters. We show that even with fairly general restrictions on $p$ and $p'$ ($p = \Omega\left(\frac{1}{n^{1/4-\epsilon}}\right)$ for any $\epsilon > 0$, $p' = O(p^2)$, where $n$ is the number of nodes), Max-LPA detects the clusters $V_1, V_2, \ldots, V_k$ in just two rounds. Based on this and on empirical results, we conjecture that Max-LPA can correctly and quickly identify communities on clustered Erdős–Rényi graphs even when the clusters are much sparser, i.e., with $p = \frac{c \log n}{n}$ for some $c > 1$. Comment: 17 pages. Submitted to ICDCN 201
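
The clustered random graph model described in this abstract can be sketched in a few lines. The generator below follows the abstract directly: intra-cluster edges appear with probability `p_in` (the abstract's p) and inter-cluster edges with probability `p_out` (the abstract's p'). The propagation round is only an assumed reading of Max-LPA: the abstract does not define the update rule, so counting a node's own label and breaking ties toward the largest label are illustrative choices, as are all function and parameter names.

```python
import random
from collections import Counter
from itertools import combinations


def clustered_random_graph(cluster_sizes, p_in, p_out, seed=0):
    """Random graph with planted clusters; returns (adjacency, cluster_of)."""
    rng = random.Random(seed)
    cluster_of = []
    for idx, size in enumerate(cluster_sizes):
        cluster_of.extend([idx] * size)
    n = len(cluster_of)
    adj = {u: set() for u in range(n)}
    for u, v in combinations(range(n), 2):
        prob = p_in if cluster_of[u] == cluster_of[v] else p_out
        if rng.random() < prob:
            adj[u].add(v)
            adj[v].add(u)
    return adj, cluster_of


def max_lpa_round(adj, labels):
    """One synchronous round: adopt the most frequent label nearby.

    Assumptions (not taken from the abstract): a node counts its own
    label, and ties go to the numerically largest label.
    """
    new_labels = {}
    for u in adj:
        counts = Counter(labels[v] for v in adj[u])
        counts[labels[u]] += 1
        best = max(counts.values())
        new_labels[u] = max(l for l, c in counts.items() if c == best)
    return new_labels


def run_rounds(adj, rounds):
    labels = {u: u for u in adj}  # every node starts with a unique label
    for _ in range(rounds):
        labels = max_lpa_round(adj, labels)
    return labels


# Degenerate sanity check: p = 1 and p' = 0 yields two disjoint cliques.
adj, cluster_of = clustered_random_graph([4, 5], p_in=1.0, p_out=0.0)
labels = run_rounds(adj, rounds=2)
```

In this degenerate case every node ends up with the largest id in its clique. For intermediate values of `p_in` and `p_out` the sketch makes no claim about recovery; the two-round guarantee in the abstract applies to the paper's own definition of Max-LPA under its stated conditions.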

    Community Detection via Semi-Synchronous Label Propagation Algorithms

    A recently introduced community detection strategy is based on a label propagation algorithm (LPA), which uses the diffusion of information in the network to identify communities. Studies of LPAs showed that the strategy is effective in finding a good community structure. The label propagation step can be performed in parallel on all nodes (synchronous model) or sequentially (asynchronous model); both models have drawbacks, e.g., termination is not guaranteed in the first case, and performance can be worse in the second. In this paper, we present a semi-synchronous version of LPA which aims to combine the advantages of both synchronous and asynchronous models. We prove that our models always converge to a stable labeling. Moreover, we experimentally investigate the effectiveness of the proposed strategy, comparing its performance with the asynchronous model in terms of quality, efficiency and stability. Tests show that the proposed protocol does not harm the quality of the partitioning. Moreover, it is quite efficient: each propagation step is extremely parallelizable, and it is more stable than the asynchronous model because our proposal uses only a small amount of randomization. Comment: In Proc. of The International Workshop on Business Applications of Social Network Analysis (BASNA '10
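
The difference between the two update models mentioned in this abstract can be seen on the smallest possible graph. The sketch below is a generic illustration, not the semi-synchronous protocol proposed in the paper; the helper names and the smallest-label tie-break are assumptions chosen to keep the example deterministic.

```python
from collections import Counter


def most_frequent_label(node, adj, labels):
    """Most common label among a node's neighbours."""
    if not adj[node]:
        return labels[node]  # isolated node keeps its label
    counts = Counter(labels[v] for v in adj[node])
    best = max(counts.values())
    # Assumption: ties are broken toward the smallest label.
    return min(l for l, c in counts.items() if c == best)


def synchronous_step(adj, labels):
    """All nodes update at once, reading only the previous labeling."""
    return {u: most_frequent_label(u, adj, labels) for u in adj}


def asynchronous_step(adj, labels, order):
    """Nodes update one by one and see labels already changed this step."""
    labels = dict(labels)
    for u in order:
        labels[u] = most_frequent_label(u, adj, labels)
    return labels


# A single edge with two distinct labels.
adj = {0: [1], 1: [0]}
start = {0: 0, 1: 1}

sync1 = synchronous_step(adj, start)   # the two nodes swap labels
sync2 = synchronous_step(adj, sync1)   # and swap back: an endless cycle
async1 = asynchronous_step(adj, start, order=[0, 1])  # both end with label 1
```

Under synchronous updates the labeling returns to its starting point after two steps and never stabilizes, while the sequential update reaches a fixed labeling after one pass. This is the termination problem the abstract attributes to the synchronous model.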

    Local Edge Betweenness based Label Propagation for Community Detection in Complex Networks

    Identifying and detecting community structures in complex networks is an important step in extracting useful information from networks. The label propagation algorithm (LPA), with near-linear time complexity, is one of the most popular methods for detecting community structures, yet its uncertainty and randomness are weaknesses. Merging LPA with other community detection metrics would improve its accuracy and reduce its instability. With this in mind, in this paper we use edge betweenness centrality to improve LPA performance. However, calculating edge betweenness centrality is expensive, so as an alternative metric we use local edge betweenness and present LPA-LEB (Label Propagation Algorithm Local Edge Betweenness). Experimental results on both real-world and benchmark networks show that LPA-LEB achieves higher accuracy and stability than LPA when detecting community structures in networks. Comment: 6 page