10 research outputs found
Phase transition in the detection of modules in sparse networks
We present an asymptotically exact analysis of the problem of detecting
communities in sparse random networks. Our results are also applicable to
detection of functional modules, partitions, and colorings in noisy planted
models. Using a cavity method analysis, we unveil a phase transition from a
region where the original group assignment is undetectable to one where
detection is possible. In some cases, the detectable region splits into an
algorithmically hard region and an easy one. Our approach naturally translates
into a practical algorithm for detecting modules in sparse networks, and
learning the parameters of the underlying model.
Comment: 4 pages, 4 figures
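As an illustration of the detectable/undetectable transition described above, the sketch below checks the condition commonly quoted for the symmetric sparse stochastic block model with q equal-sized groups: detection is possible when |c_in - c_out| > q * sqrt(c), where c is the average degree. The abstract itself does not state this formula; it is taken here as an assumption from the wider literature on this model, and the function and parameter names are hypothetical.

```python
import math


def average_degree(c_in: float, c_out: float, q: int) -> float:
    """Average degree of a symmetric block model with q equal-sized groups.

    c_in and c_out are the rescaled within-group and between-group
    connectivities (edge probability times the number of nodes).
    """
    return (c_in + (q - 1) * c_out) / q


def is_detectable(c_in: float, c_out: float, q: int = 2) -> bool:
    """Check the assumed threshold |c_in - c_out| > q * sqrt(c).

    This is a sketch of the commonly quoted condition, not code from the paper.
    """
    c = average_degree(c_in, c_out, q)
    return abs(c_in - c_out) > q * math.sqrt(c)


if __name__ == "__main__":
    # Two groups with average degree 3 in both cases.
    print(is_detectable(5.0, 1.0, q=2))  # strong contrast between groups
    print(is_detectable(4.0, 2.0, q=2))  # weak contrast between groups
```

Both example calls use the same average degree, so only the contrast between c_in and c_out decides which side of the assumed threshold the model falls on.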
Spectral redemption: clustering sparse networks
Spectral algorithms are classic approaches to clustering and community
detection in networks. However, for sparse networks the standard versions of
these algorithms are suboptimal, in some cases completely failing to detect
communities even when other algorithms such as belief propagation can do so.
Here we introduce a new class of spectral algorithms based on a
non-backtracking walk on the directed edges of the graph. The spectrum of this
operator is much better-behaved than that of the adjacency matrix or other
commonly used matrices, maintaining a strong separation between the bulk
eigenvalues and the eigenvalues relevant to community structure even in the
sparse case. We show that our algorithm is optimal for graphs generated by the
stochastic block model, detecting communities all the way down to the
theoretical limit. We also show the spectrum of the non-backtracking operator
for some real-world networks, illustrating its advantages over traditional
spectral clustering.
Comment: 11 pages, 6 figures. Clarified to what extent our claims are
rigorous, and to what extent they are conjectures; also added an
interpretation of the eigenvectors of the 2n-dimensional version of the
non-backtracking matrix
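The non-backtracking operator described above can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: it builds the matrix on directed edges (entry 1 when edge v->x follows edge u->v and x != u), takes the eigenvector of the eigenvalue with the second-largest real part, and sums its entries over the edges pointing into each node to split the nodes into two groups. The function names, the dense O(m^2) construction, and the planted test graph are all assumptions made for illustration.

```python
import numpy as np


def non_backtracking_matrix(edges):
    """Build the non-backtracking matrix of an undirected graph.

    Each undirected edge (u, v) becomes two directed edges. Entry [e, f] is 1
    when f starts where e ends and f does not return to where e started.
    """
    darts = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
    size = len(darts)
    B = np.zeros((size, size))
    for i, (u, v) in enumerate(darts):
        for j, (w, x) in enumerate(darts):
            if v == w and x != u:
                B[i, j] = 1.0
    return B, darts


def two_group_labels(edges, n):
    """Label n nodes 0/1 using the second eigenvector of the matrix above."""
    B, darts = non_backtracking_matrix(edges)
    vals, vecs = np.linalg.eig(B)
    second = np.argsort(-vals.real)[1]  # second-largest real part
    g = vecs[:, second].real
    score = np.zeros(n)
    for (_, v), g_e in zip(darts, g):
        score[v] += g_e  # sum over directed edges pointing into v
    return (score > 0).astype(int)


def planted_two_groups(n, p_in, p_out, seed=0):
    """Sample a small two-group planted graph (hypothetical test helper)."""
    rng = np.random.default_rng(seed)
    truth = np.arange(n) % 2
    edges = [
        (u, v)
        for u in range(n)
        for v in range(u + 1, n)
        if rng.random() < (p_in if truth[u] == truth[v] else p_out)
    ]
    return edges, truth


if __name__ == "__main__":
    edges, truth = planted_two_groups(40, 0.5, 0.02, seed=1)
    labels = two_group_labels(edges, 40)
    agree = np.mean(labels == truth)
    print("agreement up to label swap:", max(agree, 1 - agree))
```

The dense double loop is only practical for small graphs; a sparse matrix and an iterative eigensolver would be the natural choice at scale.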
Spatial correlations in attribute communities
Community detection is an important tool for exploring and classifying the
properties of large complex networks and should be of great help for spatial
networks. Indeed, in addition to their location, nodes in spatial networks can
have attributes, such as the language spoken by individuals or any other
socio-economic feature, that we would like to identify in communities. In this
paper we discuss a crucial aspect that was not considered in previous
studies: the possible existence of correlations between space and
attributes. Introducing a simple toy model in which both space and node
attributes are considered, we discuss the effect of space-attribute
correlations on the results of various community detection methods proposed for
spatial networks in this paper and in previous studies. When space is
irrelevant, our model is equivalent to the stochastic block model which has
been shown to display a detectability/non-detectability transition. In the
regime where space dominates the link formation process, most methods can fail
to recover the communities, an effect which is particularly marked when
space-attribute correlations are strong. In this latter case, community
detection methods which remove the spatial component of the network can miss a
large part of the community structure and can lead to incorrect results.
Comment: 10 pages and 7 figures
Inference of hidden structures in complex physical systems by multi-scale clustering
We survey the application of a relatively new branch of statistical
physics, "community detection", to data mining. In particular, we focus on the
diagnosis of materials and automated image segmentation. Community detection
describes the quest of partitioning a complex system involving many elements
into optimally decoupled subsets or communities of such elements. We review a
multiresolution variant which is used to ascertain structures at different
spatial and temporal scales. Significant patterns are obtained by examining the
correlations between different independent solvers. Similar to other
combinatorial optimization problems in the NP complexity class, community
detection exhibits several phases. Typically, illuminating orders are revealed
by choosing parameters that lead to extremal information theory correlations.
Comment: 25 pages, 16 Figures; a review of earlier work
A Replica Inference Approach to Unsupervised Multi-Scale Image Segmentation
We apply a replica inference based Potts model method to unsupervised image
segmentation on multiple scales. This approach was inspired by the statistical
mechanics problem of "community detection" and its phase diagram. Specifically,
the problem is cast as identifying tightly bound clusters ("communities" or
"solutes") against a background or "solvent". Within our multiresolution
approach, we compute information theory based correlations among multiple
solutions ("replicas") of the same graph over a range of resolutions.
Significant multiresolution structures are identified by replica correlations
as manifest in information theory overlaps. With the aid of these correlations
as well as thermodynamic measures, the phase diagram of the corresponding Potts
model is analyzed both at zero and finite temperatures. Optimal parameters
that yield a sensible unsupervised segmentation correspond to the "easy
phase" of the Potts model. Our algorithm is fast and is shown to be at least as
accurate as the best algorithms to date and to be especially suited to the
detection of camouflaged images.
Comment: 26 pages, 22 figures
Probabilistic Reconstruction in Compressed Sensing: Algorithms, Phase Diagrams, and Threshold Achieving Matrices
Compressed sensing is a signal processing method that acquires data directly
in a compressed form. This allows one to make fewer measurements than were
considered necessary to record a signal, enabling faster or more precise
measurement protocols in a wide range of applications. Using an
interdisciplinary approach, we have recently proposed in [arXiv:1109.4424] a
strategy that allows compressed sensing to be performed at acquisition rates
approaching the theoretical optimal limits. In this paper, we give a more
thorough presentation of our approach, and introduce many new results. We
present the probabilistic approach to reconstruction and discuss its optimality
and robustness. We detail the derivation of the message passing algorithm for
reconstruction and expectation maximization learning of signal-model
parameters. We further develop the asymptotic analysis of the corresponding
phase diagrams with and without measurement noise, for different distributions
of signals, and discuss the best possible reconstruction performances
regardless of the algorithm. We also present new efficient seeding matrices,
test them on synthetic data and analyze their performance asymptotically.
Comment: 42 pages, 37 figures, 3 appendixes
Statistical physics of network communities in economic systems
In the last decade, the study of big networked systems has received a great deal of attention
thanks to the increased availability of large datasets and the technology to analyze them. To unravel
regularities and behaviours from this enormous quantity of data and supply suitable models, we need
appropriate tools, one of them being community detection. Finding meaningful communities in a
network is still a difficult task but essential to unveil functional relations between the parts.
The research presented here has been carried out focusing on community detection; in particular,
we considered cases where the spatial component was relevant or intrinsic. It is indeed true that,
nowadays, many systems, represented as complex networks, are affected, more or less naturally,
by the geographical distance, location and organization. This holds true even for economic events:
it has been proved that trade and exchanges between countries are necessarily suffocated by the
geographical proximity or impeded by natural obstacles.
Still, community detection alone is not sufficient to describe the whole picture, since it gives
no information about the internal structure of a community. Therefore we developed a novel
core detection method, a natural counterpart of the community detection algorithm meant to be
performed alongside it, which is at the same time simple and powerful.
We aim to apply community detection and core detection methodologies to the analysis of the
global market and its functioning, in order to understand the origin of economic turmoil and critical
events.
In this work we analyze different economic systems from a complex network perspective and
find some interesting results: we study patent data in order to measure internationalization of
European countries and assess the effectiveness of EU policies; we examine the dynamics of network
effects on the performances of individual countries and trade relationships in the International Trade
Network; we represent World Input-Output data as an interdependent complex network and study
its properties, showing evidence of the crisis.
Thanks to both community and core detection, we are able to gain deeper insight into the
inner workings of community formation; we can identify the leading members in a group and reveal
influence basins, unknown otherwise.
Multidimensional Network analysis
This thesis is focused on the study of multidimensional networks. A multidimensional network is a network in which among the nodes there may be multiple different qualitative and quantitative relations. Traditionally, complex network analysis has focused on networks with only one kind of relation. Even with this constraint, monodimensional networks posed many analytic challenges, being representations of ubiquitous complex systems in nature. However, it is a matter of common experience that the constraint of considering only one single relation at a time limits the set of real world phenomena that can be represented with complex networks. When multiple different relations act at the same time, traditional complex network analysis cannot provide suitable analytic tools. To provide the suitable tools for this scenario is exactly the aim of this thesis: the creation and study of a Multidimensional Network Analysis, to extend the toolbox of complex network analysis and grasp the complexity of real world phenomena. The urgency and need for a multidimensional network analysis is here presented, along with an empirical proof of the ubiquity of this multifaceted reality in different complex networks, and some related works that in the last two years were proposed in this novel setting, yet to be systematically defined. Then, we tackle the foundations of the multidimensional setting at different levels, both by looking at the basic extensions of the known model and by developing novel algorithms and frameworks for well-understood and useful problems, such as community discovery (our main case study), temporal analysis, link prediction and more. 
We conclude this thesis with two real world scenarios: a monodimensional study of international trade, which may be improved with our proposed multidimensional analysis; and the analysis of literature and bibliography in the field of classical archaeology, used to show how natural and useful the choice of a multidimensional network analysis strategy is in a problem traditionally tackled with different techniques.