1,475 research outputs found
Community detection algorithms: a comparative analysis
Uncovering the community structure exhibited by real networks is a crucial
step towards an understanding of complex systems that goes beyond the local
organization of their constituents. Many algorithms have been proposed so far,
but none of them has been subjected to strict tests to evaluate their
performance. Most of the sporadic tests performed so far involved small
networks with known community structure and/or artificial graphs with a
simplified structure, which is very uncommon in real systems. Here we test
several methods against a recently introduced class of benchmark graphs, with
heterogeneous distributions of degree and community size. The methods are also
tested against the benchmark by Girvan and Newman and on random graphs. As a
result of our analysis, three recent algorithms introduced by Rosvall and
Bergstrom, Blondel et al. and Ronhovde and Nussinov, respectively, have an
excellent performance, with the additional advantage of low computational
complexity, which enables one to analyze large systems.
Comment: 12 pages, 8 figures. The software to compute the values of our
general normalized mutual information is available at
http://santo.fortunato.googlepages.com/inthepress
A General Optimization Technique for High Quality Community Detection in Complex Networks
Recent years have witnessed the development of a large body of algorithms for
community detection in complex networks. Most of them are based upon the
optimization of objective functions, among which modularity is the most common,
though a number of alternatives have been suggested in the scientific
literature. We present here an effective general search strategy for the
optimization of various objective functions for community detection purposes.
When applied to modularity, on both real-world and synthetic networks, our
search strategy substantially outperforms the best existing algorithms in terms
of final scores of the objective function; for description length, its
performance is on par with the original Infomap algorithm. The execution time
of our algorithm is on par with non-greedy alternatives in the literature,
and networks of up to 10,000 nodes can be analyzed in time spans ranging from
minutes to a few hours on average workstations, making our approach readily
applicable to tasks which require the quality of partitioning to be as high as
possible, and are not limited by strict time constraints. Finally, based on the
most effective of the available optimization techniques, we compare the
performance of modularity and code length as objective functions, in terms of
the quality of the partitions one can achieve by optimizing them. To this end,
we evaluated the ability of each objective function to reconstruct the
underlying structure of a large set of synthetic and real-world networks.
Comment: Main text: 14 pages, 4 figures, 1 table. Supplementary information: 19
pages, 8 figures, 5 tables
Seeding for pervasively overlapping communities
In some social and biological networks, the majority of nodes belong to
multiple communities. It has recently been shown that a number of the
algorithms that are designed to detect overlapping communities do not perform
well in such highly overlapping settings. Here, we consider one class of these
algorithms, those which optimize a local fitness measure, typically by using a
greedy heuristic to expand a seed into a community. We perform synthetic
benchmarks which indicate that an appropriate seeding strategy becomes
increasingly important as the extent of community overlap increases. We find
that distinct cliques provide the best seeds. We find further support for this
seeding strategy with benchmarks on a Facebook network and the yeast
interactome.
Comment: 8 pages
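The greedy seed expansion described in the abstract above can be illustrated with a minimal sketch. The abstract does not say which local fitness measure is used, so the function below assumes a commonly used one, internal degree divided by total degree raised to a power `alpha`. The names `fitness`, `expand_seed`, `alpha` and the toy graph `ADJ` are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
def fitness(adj, community, alpha=1.0):
    """Assumed local fitness: k_in / (k_in + k_out) ** alpha.

    k_in counts edge endpoints pointing inside the community (twice the
    number of internal edges); k_out counts edges leaving the community.
    """
    k_in = 0
    k_out = 0
    for u in community:
        for v in adj[u]:
            if v in community:
                k_in += 1
            else:
                k_out += 1
    total = k_in + k_out
    return k_in / total ** alpha if total else 0.0


def expand_seed(adj, seed, alpha=1.0):
    """Greedily grow a seed: repeatedly add the neighbouring node that
    increases the fitness most, and stop when no neighbour improves it."""
    community = set(seed)
    while True:
        frontier = {v for u in community for v in adj[u]} - community
        current = fitness(adj, community, alpha)
        best_node, best_gain = None, 0.0
        for v in sorted(frontier):  # sorted only to make ties deterministic
            gain = fitness(adj, community | {v}, alpha) - current
            if gain > best_gain:
                best_node, best_gain = v, gain
        if best_node is None:
            return community
        community.add(best_node)


# Toy graph (hypothetical): two triangles joined by the single edge 2-3.
ADJ = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}

if __name__ == "__main__":
    print(expand_seed(ADJ, {0, 1, 2}))  # seeded with a clique
    print(expand_seed(ADJ, {0}))        # seeded with a single node
```

On this toy graph a clique seed is already a local optimum of the assumed fitness, while a single-node seed has to grow into it. The sketch does not reproduce the overlapping benchmarks discussed in the abstract.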
A Systematic Analysis of Community Detection in Complex Networks
Researchers have proposed numerous techniques to uncover the hidden patterns of real-world complex networks. Finding hidden communities is one of the crucial tasks in this area. Despite the many methods available for community detection, identifying the best-performing method across different complex networks is still an open research question. In this article, we analyze eight state-of-the-art community detection algorithms on nine complex networks of varying sizes, covering domains including animal, biomedical, terrorist, social, and human-contact networks. The objective is to identify the best-performing algorithm for community detection in real-world complex networks of various sizes and from different domains. The results over 100 iterations show that the multi-scale method outperformed the other techniques in terms of accuracy. The multi-scale method achieved an average modularity of 0.458, whereas multiple screening resolution, unfolding fast, greedy, multi-resolution, local fitness optimization, the sparse Geosocial community detection algorithm, and spectral clustering obtained modularity values of 0.455, 0.441, 0.436, 0.421, 0.368, 0.341, and 0.340, respectively.
Community detection in networks via nonlinear modularity eigenvectors
Revealing a community structure in a network or dataset is a central problem
arising in many scientific areas. The modularity function is an established
measure quantifying the quality of a community, a community being identified
as a set of nodes having high modularity. In our terminology, a set of nodes
with positive modularity is called a \textit{module} and a set that maximizes
the modularity is thus called a \textit{leading module}. Finding a leading
module in a network is an important task; however, the dimension of real-world
problems makes the maximization of the modularity unfeasible. This creates the
need for approximation techniques, which are typically based on a linear
relaxation of the modularity, induced by the spectrum of the modularity matrix.
In this work we propose a nonlinear relaxation which is instead based on the
spectrum of a nonlinear modularity operator. We show that the extremal
eigenvalues of this operator provide an exact relaxation of the modularity
measure, however at the price of being more challenging to compute than those
of the modularity matrix. Thus we extend the work done on nonlinear Laplacians
by proposing a computational scheme, named \textit{generalized RatioDCA}, to
address such extremal eigenvalues. We show monotonic ascent and convergence of
the method. We finally apply the new method to several synthetic and real-world
data sets, showing both the effectiveness of the model and the performance of
the method.
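As a rough illustration of what "a set of nodes with positive modularity" means, the sketch below computes the modularity of a node set under the standard Newman-Girvan null model, that is, the sum over pairs i, j in the set of A_ij - k_i k_j / (2m). The abstract does not give its exact definition, so this formula is an assumption, as are the names `set_modularity` and `EDGES`. The graph is assumed to be simple and undirected, with no self-loops.

```python
def set_modularity(edges, nodes_in_set):
    """Unnormalized modularity of a node set S (assumed Newman-Girvan form):
    sum over i, j in S of A_ij - k_i * k_j / (2m)."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    two_m = sum(degree.values())  # 2m: twice the number of edges
    s = set(nodes_in_set)
    # Each undirected edge inside S contributes A_uv + A_vu = 2.
    internal = sum(2 for u, v in edges if u in s and v in s)
    # Volume of S: sum of the degrees of its nodes.
    volume = sum(degree.get(n, 0) for n in s)
    return internal - volume * volume / two_m


# Toy graph (hypothetical): two triangles joined by the single edge 2-3.
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

if __name__ == "__main__":
    print(set_modularity(EDGES, {0, 1, 2}))  # one triangle: positive
    print(set_modularity(EDGES, {2, 3}))     # the bridge endpoints: negative
```

Under this assumed definition, one triangle of the toy graph is a module and the two bridge endpoints are not. Finding the set that maximizes this quantity is the hard problem the abstract addresses, and the sketch does not attempt it.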
Exploring the Free Energy Landscape: From Dynamics to Networks and Back
The knowledge of the Free Energy Landscape topology is the essential key to
understand many biochemical processes. The determination of the conformers of a
protein and their basins of attraction takes a central role for studying
molecular isomerization reactions. In this work, we present a novel framework
to unveil the features of a Free Energy Landscape, answering questions such as
how many meta-stable conformers there are, what the hierarchical relationship
among them is, or what the structure and kinetics of the transition paths are.
Exploring the landscape by molecular dynamics simulations, the microscopic data
of the trajectory are encoded into a Conformational Markov Network. The
structure of this graph reveals the regions of the conformational space
corresponding to the basins of attraction. In addition, analysis of the
Conformational Markov Network yields relevant kinetic magnitudes, such as dwell
times or rate constants, and the hierarchical relationship among basins,
completing the global picture of the landscape. We show the power of the
analysis by studying a toy model of a funnel-like potential and by efficiently
computing the conformers of a short peptide, dialanine, paving the way to a
systematic study of the Free Energy Landscape in large peptides.
Comment: PLoS Computational Biology (in press)