Detecting communities using asymptotical Surprise
Nodes in real-world networks are repeatedly observed to form dense clusters,
often referred to as communities. Methods to detect these groups of nodes
usually maximize an objective function, which implicitly contains the
definition of a community. We here analyze a recently proposed measure called
surprise, which assesses the quality of the partition of a network into
communities. In its current form, the formulation of surprise is rather
difficult to analyze. We here therefore develop an accurate asymptotic
approximation. This allows for the development of an efficient algorithm for
optimizing surprise. Incidentally, this leads to a straightforward extension of
surprise to weighted graphs. Additionally, the approximation makes it possible
to analyze surprise more closely and compare it to other methods, especially
modularity. We show that surprise is (nearly) unaffected by the well-known
resolution limit, a particular problem for modularity. However, surprise may
tend to overestimate the number of communities, whereas modularity may
underestimate it. In short, surprise works well in the limit of
many small communities, whereas modularity works better in the limit of few
large communities. In this sense, surprise is more discriminative than
modularity, and may find communities where modularity fails to discern any
structure.
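As a rough illustration of the asymptotic approximation mentioned in this abstract, the sketch below assumes the commonly cited form S ≈ m · D(q ‖ ⟨q⟩), where q is the observed fraction of intra-community edges, ⟨q⟩ is the fraction of node pairs that fall inside a community, and D is the binary Kullback-Leibler divergence. The abstract itself does not state this formula, so the form, the function names and the toy graph are assumptions made for illustration only.

```python
import math

def binary_kl(q, p):
    """KL divergence D(q || p) between two Bernoulli distributions."""
    total = 0.0
    if q > 0:
        total += q * math.log(q / p)
    if q < 1:
        total += (1 - q) * math.log((1 - q) / (1 - p))
    return total

def asymptotic_surprise(n, edges, membership):
    """Assumed asymptotic form: m * D(q || <q>) for an unweighted, undirected graph.

    n          -- number of nodes
    edges      -- list of (u, v) pairs, each edge listed once
    membership -- dict mapping node -> community label
    """
    m = len(edges)
    m_int = sum(1 for u, v in edges if membership[u] == membership[v])
    sizes = {}
    for label in membership.values():
        sizes[label] = sizes.get(label, 0) + 1
    pairs_total = n * (n - 1) // 2
    pairs_int = sum(s * (s - 1) // 2 for s in sizes.values())
    q = m_int / m                     # observed fraction of internal edges
    q_exp = pairs_int / pairs_total   # expected fraction under random placement
    return m * binary_kl(q, q_exp)

# Toy graph (hypothetical): two triangles joined by a single edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_block = {node: "a" for node in range(6)}
score_split = asymptotic_surprise(6, toy_edges, two_triangles)
score_single = asymptotic_surprise(6, toy_edges, one_block)
```

Under this assumed form, the partition into two triangles scores higher than the trivial single-community partition, which scores zero because q and ⟨q⟩ coincide.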
Graph analysis and modularity of brain functional connectivity networks: searching for the optimal threshold
Neuroimaging data can be represented as networks of nodes and edges that
capture the topological organization of the brain connectivity. Graph theory
provides a general and powerful framework to study these networks and their
structure at various scales. By way of example, community detection methods
have been widely applied to investigate the modular structure of many natural
networks, including brain functional connectivity networks. Sparsification
procedures are often applied to remove the weakest edges, which are the most
affected by experimental noise, and to reduce the density of the graph, thus
making it theoretically and computationally more tractable. However, weak links
may also contain significant structural information, and procedures to identify
the optimal tradeoff are the subject of active research. Here, we explore the
use of percolation analysis, a method grounded in statistical physics, to
identify the optimal sparsification threshold for community detection in brain
connectivity networks. By using synthetic networks endowed with a ground-truth
modular structure and realistic topological features typical of human brain
functional connectivity networks, we show that percolation analysis can be
applied to identify the optimal sparsification threshold that maximizes
information on the networks' community structure. We validate this approach
using three different community detection methods widely applied to the
analysis of brain connectivity networks: Newman's modularity, InfoMap and
Asymptotical Surprise. Importantly, we test the effects of noise and data
variability, which are critical factors to determine the optimal threshold.
This data-driven method should prove particularly useful in the analysis of the
community structure of brain networks in populations characterized by different
connectivity strengths, such as patients and controls.
Comment: 15 pages, 7 figures
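One simple way to picture a percolation-based sparsification threshold is as the largest edge weight cut-off that still leaves the graph connected. The sketch below implements only that strict variant with a union-find structure; it is not the authors' procedure, which the abstract does not spell out, and the function name and toy weights are hypothetical.

```python
def percolation_threshold(n, weighted_edges):
    """Largest weight t such that keeping edges with weight >= t connects all n nodes.

    weighted_edges -- list of (u, v, weight) with nodes numbered 0..n-1.
    Returns None if the full graph is not connected (or has fewer than two nodes).
    """
    parent = list(range(n))

    def find(x):
        # Path-halving find for the union-find forest.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    # Add edges from strongest to weakest; the edge that finally merges
    # everything into one component sets the threshold.
    for u, v, w in sorted(weighted_edges, key=lambda e: e[2], reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1
            if components == 1:
                return w
    return None

# Hypothetical toy connectivity weights.
toy = [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.3), (0, 2, 0.5), (3, 4, 0.6)]
threshold = percolation_threshold(5, toy)
```

Edges weaker than the returned value can be dropped without disconnecting the graph; removing the edge at the threshold itself would split it.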
Detecting Core-Periphery Structures by Surprise
Detecting the presence of mesoscale structures in complex networks is of
primary importance. This is especially true for financial networks, whose
structural organization deeply affects their resilience to events like default
cascades, shock propagation, etc. Several methods have been proposed, so far,
to detect communities, i.e. groups of nodes whose connectivity is significantly
large. Communities, however, do not represent the only kind of mesoscale
structures characterizing real-world networks: other examples are provided by
bow-tie structures, core-periphery structures and bipartite structures. Here we
propose a novel method to detect statistically significant bimodular structures,
i.e. either bipartite or core-periphery ones. It is based on a modification of
the surprise, recently proposed for detecting communities. Our variant allows
for bimodular node partitions to be revealed, by letting links be placed
either 1) within the core part and between the core and the periphery parts or
2) just between the (empty) layers of a bipartite network. From a technical
point of view, this is achieved by employing a multinomial hypergeometric
distribution instead of the traditional (binomial) hypergeometric one; as in
the latter case, this allows a p-value to be assigned to any given
(bi)partition of the nodes. To illustrate the performance of our method, we
report the results of its application to several real-world networks, including
social, economic and financial ones.
Comment: 11 pages, 10 figures. Python code freely available at
https://github.com/jeroenvldj/bimodular_surpris
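For orientation, the classic (two-category) hypergeometric p-value that underlies the original community-detection surprise can be sketched as below. This is not the multinomial variant proposed in the abstract and not the code from the linked repository; the function name and the toy numbers are hypothetical.

```python
import math

def surprise_p_value(pairs_total, pairs_int, m, m_int):
    """Probability of seeing at least m_int internal edges by chance.

    pairs_total -- number of node pairs in the graph
    pairs_int   -- number of node pairs inside communities
    m           -- number of edges
    m_int       -- number of edges inside communities
    """
    upper = min(m, pairs_int)
    numerator = sum(
        math.comb(pairs_int, i) * math.comb(pairs_total - pairs_int, m - i)
        for i in range(m_int, upper + 1)
    )
    return numerator / math.comb(pairs_total, m)

# Two triangles joined by one edge, split into the two triangles:
# 15 node pairs, 6 of them internal, 7 edges, 6 of them internal.
p = surprise_p_value(15, 6, 7, 6)
surprise = -math.log(p)
```

A smaller p-value (larger surprise) indicates a partition whose internal edge count is harder to explain by random edge placement.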
The Bayan Algorithm: Detecting Communities in Networks Through Exact and Approximate Optimization of Modularity
Community detection is a classic problem in network science with extensive
applications in various fields. Among numerous approaches, the most common
method is modularity maximization. Despite their design philosophy and wide
adoption, heuristic modularity maximization algorithms rarely return an optimal
partition or anything similar. We propose a specialized algorithm, Bayan, which
returns partitions with a guarantee of either optimality or proximity to an
optimal partition. At the core of the Bayan algorithm is a branch-and-cut
scheme that solves an integer programming formulation of the problem to
optimality or approximates it within a factor. We demonstrate Bayan's
distinctive accuracy and stability over 21 other algorithms in retrieving
ground-truth communities in synthetic benchmarks and node labels in real
networks. Bayan is several times faster than open-source and commercial solvers
for modularity maximization, making it capable of finding optimal partitions for
instances that cannot be optimized by any other existing method. Overall, our
assessments point to Bayan as a suitable choice for exact maximization of
modularity in networks with up to 3000 edges (in their largest connected
component) and approximating maximum modularity in larger networks on ordinary
computers.
Comment: 6 pages, 2 figures, 1 table
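The objective that modularity-maximization methods optimize can be evaluated for a given partition as in the minimal sketch below, which uses Newman's standard definition Q = Σ_c [ m_c / m − (d_c / 2m)² ] for an unweighted, undirected graph. It only scores a partition; it does not reproduce Bayan's branch-and-cut search, and the function name and toy graph are hypothetical.

```python
def modularity(edges, membership):
    """Newman modularity of a partition of an unweighted, undirected graph.

    edges      -- list of (u, v) pairs, each edge listed once
    membership -- dict mapping node -> community label
    """
    m = len(edges)
    internal = {}    # edges with both endpoints in the community
    degree_sum = {}  # total degree of the community's nodes
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )

# Two triangles joined by a single edge (hypothetical toy graph).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q_split = modularity(toy_edges, split)
q_single = modularity(toy_edges, {node: "a" for node in range(6)})
```

Placing every node in one community always gives Q = 0, so any partition with positive Q captures more internal edges than the degree-preserving null model expects.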