
    A parallel self-organizing overlapping community detection algorithm based on swarm intelligence for large scale complex networks

    Community detection is a critical task in complex network analysis. It helps us understand the properties of the system that a complex network represents and is significant for a wide range of applications. Although a large number of algorithms have been developed, detecting overlapping communities in large-scale and/or dynamic networks remains challenging. In this paper, a Parallel Self-organizing Overlapping Community Detection (PSOCD) algorithm grounded in the idea of swarm intelligence is proposed. PSOCD treats the analyzed network as a decentralized, self-organized, and self-evolving system in which each vertex iteratively joins or leaves communities according to a set of simple predefined vertex action rules. The algorithm is implemented on a distributed graph processing platform named Giraph++, so it is capable of analyzing large-scale networks. It also handles overlapping community detection well because a vertex can naturally join multiple communities simultaneously. Moreover, if vertices and edges are added to or deleted from the analyzed network, the algorithm only needs to adjust the community assignments of the affected vertices, in the same way as it finds communities for a vertex to join; that is, it inherently supports dynamic network analysis. The proposed PSOCD is evaluated on a variety of large-scale synthesized and real-world networks. Experimental results indicate that the algorithm can effectively discover overlapping communities in large-scale networks and that the quality of its detected overlapping community structures is superior to that of two state-of-the-art algorithms, namely the Speaker Listener Label Propagation Algorithm (SLPA) and the Order Statistics Local Optimization Method (OSLOM), especially on networks with high overlapping density and/or high overlapping diversity.

    Comparative Evaluation of Community Detection Algorithms: A Topological Approach

    Community detection is one of the most active fields in complex network analysis, due to its potential value in practical applications. Many works inspired by different paradigms are devoted to developing algorithms that reveal how a network is structured into such cohesive subgroups. Comparative studies reported in the literature usually rely on a performance measure that treats the community structure as a partition (Rand Index, Normalized Mutual Information, etc.). However, this type of comparison neglects the topological properties of the communities. In this article, we present a comprehensive comparative study of a representative set of community detection methods, in which we adopt both types of evaluation. Community-oriented topological measures are used to qualify the communities and evaluate their deviation from the reference structure. In order to mimic real-world systems, we use artificially generated realistic networks. It turns out that the two approaches are not equivalent: high performance does not necessarily correspond to correct topological properties, and vice versa. They can therefore be considered complementary, and we recommend applying both in order to perform a complete and accurate assessment.
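
To make the notion of a partition-based performance measure concrete, here is a minimal sketch of the Rand Index in plain Python. The function name `rand_index` and the toy label lists are illustrative assumptions, not taken from the article. Note that the function receives only community labels and never sees the edges, which is why a measure of this kind cannot reflect the topological properties of the communities.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of node pairs on which two partitions agree.

    A pair counts as an agreement when both partitions put the two nodes
    in the same community, or both put them in different communities.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("partitions must cover the same nodes")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # fewer than two nodes: nothing to disagree on
    agree = 0
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:
            agree += 1
    return agree / len(pairs)


# Hypothetical toy data: a reference partition and a detected one
# that misplaces a single node (node 2).
reference = [0, 0, 0, 1, 1, 1]
detected = [0, 0, 1, 1, 1, 1]
score = rand_index(reference, detected)
print(f"Rand Index: {score:.4f}")
```

The score depends only on co-membership of node pairs, so renaming the community labels leaves it unchanged.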

    Detecting communities using asymptotical Surprise

    Nodes in real-world networks are repeatedly observed to form dense clusters, often referred to as communities. Methods to detect these groups of nodes usually maximize an objective function, which implicitly contains the definition of a community. We here analyze a recently proposed measure called surprise, which assesses the quality of the partition of a network into communities. In its current form, the formulation of surprise is rather difficult to analyze. We here therefore develop an accurate asymptotic approximation. This allows for the development of an efficient algorithm for optimizing surprise. Incidentally, this leads to a straightforward extension of surprise to weighted graphs. Additionally, the approximation makes it possible to analyze surprise more closely and compare it to other methods, especially modularity. We show that surprise is (nearly) unaffected by the well-known resolution limit, a particular problem for modularity. However, surprise may tend to overestimate the number of communities, whereas modularity may underestimate it. In short, surprise works well in the limit of many small communities, whereas modularity works better in the limit of few large communities. In this sense, surprise is more discriminative than modularity, and may find communities where modularity fails to discern any structure.
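
The abstract does not state the approximation itself. As a hedged sketch, the code below assumes the commonly cited form of asymptotic surprise for an unweighted, undirected graph, S ≈ m · D(q ‖ ⟨q⟩), where m is the number of edges, q is the fraction of edges that fall inside communities, ⟨q⟩ is the fraction of all possible node pairs that fall inside communities, and D is the binary Kullback-Leibler divergence. The function names and the toy graph (two triangles joined by one edge) are illustrative assumptions.

```python
import math
from collections import Counter


def binary_kl(q, p):
    """Binary Kullback-Leibler divergence D(q || p), with 0 * log(0) = 0."""
    total = 0.0
    if q > 0:
        total += q * math.log(q / p)
    if q < 1:
        total += (1 - q) * math.log((1 - q) / (1 - p))
    return total


def asymptotic_surprise(edges, membership):
    """Assumed form: m * D(q || <q>) for a simple undirected graph."""
    n = len(membership)
    m = len(edges)
    # Edges whose endpoints share a community.
    m_int = sum(1 for u, v in edges if membership[u] == membership[v])
    # Node pairs that share a community, out of all node pairs.
    sizes = Counter(membership.values())
    pairs_int = sum(s * (s - 1) // 2 for s in sizes.values())
    pairs_all = n * (n - 1) // 2
    q = m_int / m
    expected_q = pairs_int / pairs_all
    return m * binary_kl(q, expected_q)


# Hypothetical toy graph: two triangles connected by a single bridge edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_block = {v: "a" for v in range(6)}

s_split = asymptotic_surprise(edges, two_triangles)
s_block = asymptotic_surprise(edges, one_block)
print(f"two triangles: {s_split:.4f}, single block: {s_block:.4f}")
```

Under this assumed form, a partition scores highly when far more edges are internal than the community sizes alone would predict, so the trivial single-community partition scores zero.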

    Identifying network communities with a high resolution

    Community structure is an important property of complex networks. An automatic discovery of such structure is a fundamental task in many disciplines, including sociology, biology, engineering, and computer science. Recently, several community discovery algorithms have been proposed based on the optimization of a quantity called modularity (Q). However, the problem of modularity optimization is NP-hard, and the existing approaches often suffer from prohibitively long running time or poor quality. Furthermore, it has been recently pointed out that algorithms based on optimizing Q will have a resolution limit, i.e., communities below a certain scale may not be detected. In this research, we first propose an efficient heuristic algorithm, Qcut, which combines spectral graph partitioning and local search to optimize Q. Using both synthetic and real networks, we show that Qcut can find higher modularities and is more scalable than the existing algorithms. Furthermore, using Qcut as an essential component, we propose a recursive algorithm, HQcut, to solve the resolution limit problem. We show that HQcut can successfully detect communities at a much finer scale and with a higher accuracy than the existing algorithms. Finally, we apply Qcut and HQcut to study a protein-protein interaction network, and show that the combination of the two algorithms can reveal interesting biological results that may be otherwise undetectable. Comment: 14 pages, 5 figures. 1 supplemental file at http://cic.cs.wustl.edu/qcut/supplemental.pd
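
The resolution limit mentioned above can be illustrated with a short sketch. The code below is not Qcut or HQcut; it only evaluates the standard Newman-Girvan modularity, Q = Σ_c [L_c/m − (d_c/2m)²], where L_c is the number of edges inside community c, d_c is the total degree of its vertices, and m is the number of edges. The ring-of-cliques test graph, its parameters, and all function names are assumptions chosen for illustration: on this graph, merging neighbouring cliques in pairs yields a higher Q than keeping each clique as its own community.

```python
from collections import defaultdict
from itertools import combinations


def modularity(edges, membership):
    """Newman-Girvan modularity of a partition of a simple undirected graph."""
    m = len(edges)
    internal = defaultdict(int)  # L_c: edges with both endpoints in c
    degree = defaultdict(int)    # d_c: summed degree of vertices in c
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def ring_of_cliques(num_cliques, clique_size):
    """Complete graphs arranged in a ring, neighbours joined by one edge."""
    edges = []
    for k in range(num_cliques):
        first = k * clique_size
        last = first + clique_size - 1
        edges.extend(combinations(range(first, last + 1), 2))
        # Bridge from the last vertex of clique k to the first of clique k+1.
        next_first = ((k + 1) % num_cliques) * clique_size
        edges.append((last, next_first))
    return edges


# Hypothetical parameters: 30 cliques of 5 vertices each (150 vertices).
edges = ring_of_cliques(30, 5)
one_per_clique = {v: v // 5 for v in range(150)}
merged_pairs = {v: v // 10 for v in range(150)}

q_cliques = modularity(edges, one_per_clique)
q_pairs = modularity(edges, merged_pairs)
print(f"one community per clique: Q = {q_cliques:.4f}")
print(f"adjacent cliques merged:  Q = {q_pairs:.4f}")
```

A method that simply maximizes Q would therefore prefer the merged partition even though each clique is the intuitively correct community, which is the kind of behaviour a recursive refinement step is meant to counter.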