A parallel self-organizing overlapping community detection algorithm based on swarm intelligence for large scale complex networks
Community detection is a critical task in complex network analysis. It helps us understand the properties of the system that a complex network represents and is significant for a wide range of applications. Although a large number of algorithms have been developed, detecting overlapping communities in large-scale and/or dynamic networks remains challenging. In this paper, a Parallel Self-organizing Overlapping Community Detection (PSOCD) algorithm grounded in the idea of swarm intelligence is proposed. PSOCD treats the analyzed network as a decentralized, self-organized, and self-evolving system, in which each vertex iteratively joins or leaves communities according to a set of predefined, simple vertex action rules. The algorithm is implemented on a distributed graph processing platform named Giraph++ and is therefore capable of analyzing large-scale networks. It also handles overlapping community detection well, because a vertex can naturally join multiple communities simultaneously. Moreover, if vertices and edges are added to or deleted from the analyzed network, the algorithm only needs to adjust the community assignments of the affected vertices, in the same way it ordinarily assigns a vertex to communities; i.e., it inherently supports dynamic network analysis. The proposed PSOCD is evaluated on a variety of large-scale synthesized and real-world networks. Experimental results indicate that the algorithm can effectively discover overlapping communities in large-scale networks and that the quality of its detected overlapping community structures is superior to that of two state-of-the-art algorithms, namely the Speaker Listener Label Propagation Algorithm (SLPA) and the Order Statistics Local Optimization Method (OSLOM), especially on networks with high overlapping density and/or high overlapping diversity.
Comparative Evaluation of Community Detection Algorithms: A Topological Approach
Community detection is one of the most active fields in complex networks
analysis, due to its potential value in practical applications. Many works
inspired by different paradigms are devoted to the development of algorithmic
solutions that reveal the network's structure in terms of such cohesive subgroups.
Comparative studies reported in the literature usually rely on a performance
measure considering the community structure as a partition (Rand Index,
Normalized Mutual information, etc.). However, this type of comparison neglects
the topological properties of the communities. In this article, we present a
comprehensive comparative study of a representative set of community detection
methods, in which we adopt both types of evaluation. Community-oriented
topological measures are used to qualify the communities and evaluate their
deviation from the reference structure. In order to mimic real-world systems,
we use artificially generated realistic networks. It turns out there is no
equivalence between both approaches: a high performance does not necessarily
correspond to correct topological properties, and vice-versa. They can
therefore be considered as complementary, and we recommend applying both of
them in order to perform a complete and accurate assessment.
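As a small illustration of the partition-based measures named in the abstract above, the following sketch computes the Rand Index between a reference partition and a detected one. The function name `rand_index` and the toy label vectors are assumptions made for illustration; they are not taken from the paper.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of vertex pairs on which two partitions agree.

    A pair agrees when both partitions put the two vertices in the same
    community, or both put them in different communities.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same vertices")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two vertices")
    agree = 0
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:
            agree += 1
    return agree / len(pairs)


# Hypothetical example: six vertices, vertex 2 is misassigned.
reference = [0, 0, 0, 1, 1, 1]
detected = [0, 0, 1, 1, 1, 1]
score = rand_index(reference, detected)  # 10 of the 15 pairs agree
print(f"Rand Index: {score:.3f}")
```

Note that the score depends only on which vertices share a label; the edges of the network never enter the computation. This is the sense in which such a measure cannot, on its own, reflect the topological properties of the communities.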
Detecting communities using asymptotical Surprise
Nodes in real-world networks are repeatedly observed to form dense clusters,
often referred to as communities. Methods to detect these groups of nodes
usually maximize an objective function, which implicitly contains the
definition of a community. We here analyze a recently proposed measure called
surprise, which assesses the quality of the partition of a network into
communities. In its current form, the formulation of surprise is rather
difficult to analyze. We therefore develop an accurate asymptotic
approximation. This allows for the development of an efficient algorithm for
optimizing surprise. Incidentally, this leads to a straightforward extension of
surprise to weighted graphs. Additionally, the approximation makes it possible
to analyze surprise more closely and compare it to other methods, especially
modularity. We show that surprise is (nearly) unaffected by the well-known
resolution limit, a particular problem for modularity. However, surprise may
tend to overestimate the number of communities, whereas they may be
underestimated by modularity. In short, surprise works well in the limit of
many small communities, whereas modularity works better in the limit of few
large communities. In this sense, surprise is more discriminative than
modularity, and may find communities where modularity fails to discern any
structure.
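The abstract above does not spell out the approximation. The sketch below assumes the commonly cited form S ≈ m · D(q ‖ ⟨q⟩), where m is the number of edges, q is the fraction of edges that fall inside communities, ⟨q⟩ is the fraction of vertex pairs that fall inside communities, and D is the Kullback-Leibler divergence between two Bernoulli distributions. The function names and the toy graph are hypothetical, and the code only scores a given partition of a simple, unweighted, undirected graph; it is not the optimization algorithm from the paper.

```python
import math
from collections import Counter


def binary_kl(q, q_expected):
    """KL divergence between Bernoulli(q) and Bernoulli(q_expected), in nats."""
    def term(x, y):
        # Convention: 0 * log(0 / y) = 0.
        return 0.0 if x == 0 else x * math.log(x / y)
    return term(q, q_expected) + term(1 - q, 1 - q_expected)


def asymptotic_surprise(edges, membership):
    """Score a partition, assuming the form S ~ m * D(q || <q>).

    edges: list of (u, v) pairs of a simple undirected graph.
    membership: dict mapping every vertex to its community label.
    """
    m = len(edges)
    n = len(membership)
    # Observed fraction of edges that are internal to a community.
    m_int = sum(1 for u, v in edges if membership[u] == membership[v])
    q = m_int / m
    # Expected fraction: share of all vertex pairs that lie inside a community.
    pairs_total = n * (n - 1) // 2
    sizes = Counter(membership.values())
    pairs_int = sum(s * (s - 1) // 2 for s in sizes.values())
    q_expected = pairs_int / pairs_total
    return m * binary_kl(q, q_expected)


# Hypothetical toy graph: two triangles joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
two_triangles = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
one_block = {v: 0 for v in range(6)}

print(asymptotic_surprise(edges, two_triangles))  # positive
print(asymptotic_surprise(edges, one_block))      # 0.0: q equals <q>
```

Placing every vertex in a single community gives q = ⟨q⟩ = 1 and therefore a score of zero, while the two-triangle partition concentrates 6 of the 7 edges on only 6 of the 15 vertex pairs and scores higher.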
Identifying network communities with a high resolution
Community structure is an important property of complex networks. An
automatic discovery of such structure is a fundamental task in many
disciplines, including sociology, biology, engineering, and computer science.
Recently, several community discovery algorithms have been proposed based on
the optimization of a quantity called modularity (Q). However, the problem of
modularity optimization is NP-hard, and the existing approaches often suffer
from prohibitively long running time or poor quality. Furthermore, it has been
recently pointed out that algorithms based on optimizing Q will have a
resolution limit, i.e., communities below a certain scale may not be detected.
In this research, we first propose an efficient heuristic algorithm, Qcut,
which combines spectral graph partitioning and local search to optimize Q.
Using both synthetic and real networks, we show that Qcut can find higher
modularities and is more scalable than the existing algorithms. Furthermore,
using Qcut as an essential component, we propose a recursive algorithm, HQcut,
to solve the resolution limit problem. We show that HQcut can successfully
detect communities at a much finer scale and with a higher accuracy than the
existing algorithms. Finally, we apply Qcut and HQcut to study a
protein-protein interaction network, and show that the combination of the two
algorithms can reveal interesting biological results that may be otherwise
undetectable.
Comment: 14 pages, 5 figures. 1 supplemental file at
http://cic.cs.wustl.edu/qcut/supplemental.pd
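The resolution limit described in the abstract above can be reproduced on a toy graph. The sketch below uses the standard Newman-Girvan definition Q = Σ_c [ l_c / m − (d_c / 2m)² ], where l_c is the number of edges inside community c, d_c is the total degree of its vertices, and m is the number of edges. It does not implement Qcut or HQcut. The helper names and the test graph (a ring of 30 cliques of 5 vertices each) are assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def modularity(edges, membership):
    """Newman-Girvan modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = defaultdict(int)    # l_c: edges with both endpoints in community c
    degree_sum = defaultdict(int)  # d_c: summed degree of the vertices in c
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


def ring_of_cliques(num_cliques, clique_size):
    """Cliques arranged in a ring, each linked to the next by one edge."""
    edges = []
    for k in range(num_cliques):
        nodes = [(k, i) for i in range(clique_size)]
        edges.extend(combinations(nodes, 2))
        # Link the last vertex of clique k to the first vertex of clique k + 1.
        edges.append(((k, clique_size - 1), ((k + 1) % num_cliques, 0)))
    return edges


edges = ring_of_cliques(30, 5)
vertices = {v for edge in edges for v in edge}

# Partition 1: every clique is its own community.
one_per_clique = {v: v[0] for v in vertices}
# Partition 2: adjacent cliques merged in pairs.
merged_pairs = {v: v[0] // 2 for v in vertices}

q_single = modularity(edges, one_per_clique)
q_pairs = modularity(edges, merged_pairs)
print(f"one community per clique: Q = {q_single:.3f}")
print(f"cliques merged in pairs:  Q = {q_pairs:.3f}")
```

On this graph the merged partition obtains the higher score (Q = 293/330, about 0.888, against 289/330, about 0.876), even though each clique is the intuitive community. A method that only maximizes Q would therefore prefer the coarser grouping, which is the behavior a recursive refinement step is meant to correct.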