Statistical Mechanics of Community Detection
Starting from a general \textit{ansatz}, we show how community detection can
be interpreted as finding the ground state of an infinite range spin glass. Our
approach applies to weighted and directed networks alike. It contains the
\textit{ad hoc} introduced quality function from \cite{ReichardtPRL} and the
modularity as defined by Newman and Girvan \cite{Girvan03} as special
cases. The community structure of the network is interpreted as the spin
configuration that minimizes the energy of the spin glass with the spin states
being the community indices. We elucidate the properties of the ground state
configuration to give a concise definition of communities as cohesive subgroups
in networks that is adaptive to the specific class of network under study.
Further, we show how hierarchies and overlap in the community structure can be
detected. Computationally effective local update rules for optimization
procedures to find the ground state are given. We show how the \textit{ansatz}
may be used to discover the community around a given node without detecting all
communities in the full network and we give benchmarks for the performance of
this extension. Finally, we give expectation values for the modularity of
random graphs, which can be used in the assessment of statistical significance
of community structure.
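The relation between the spin-glass energy and modularity can be illustrated with a minimal sketch. It assumes an unweighted, undirected simple graph, a configuration-model null term k_i k_j / 2m, and an energy summed over all ordered node pairs; the function names and the `gamma` parameter are illustrative choices, not taken from the abstract, and the paper's own Hamiltonian may be normalized differently.

```python
from collections import defaultdict


def spin_glass_energy(edges, spins, gamma=1.0):
    """Energy H = -sum_{i,j} (A_ij - gamma * k_i * k_j / 2m) * delta(s_i, s_j).

    Assumption: `edges` lists each undirected edge once, with no self-loops,
    and `spins` maps every node to its community index (the spin state).
    """
    m = len(edges)
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    # Each intra-community edge contributes twice to the sum over ordered pairs.
    internal = sum(2 for u, v in edges if spins[u] == spins[v])

    # Summing k_i * k_j / 2m over ordered pairs in one community gives d_c^2 / 2m,
    # where d_c is the total degree of that community.
    community_degree = defaultdict(int)
    for node, k in degree.items():
        community_degree[spins[node]] += k
    expected = sum(d * d for d in community_degree.values()) / (2 * m)

    return -(internal - gamma * expected)


def modularity(edges, spins):
    """Newman-Girvan modularity, recovered as Q = -H / 2m at gamma = 1."""
    return -spin_glass_energy(edges, spins, gamma=1.0) / (2 * len(edges))


# Two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
one_group = {node: 0 for node in range(6)}
print(modularity(edges, two_groups), modularity(edges, one_group))
```

Under these assumptions, lowering the energy is the same as raising the modularity, so the two-triangle split scores higher than placing all nodes in one community.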
Graph Theory and Networks in Biology
In this paper, we present a survey of the use of graph theoretical techniques
in Biology. In particular, we discuss recent work on identifying and modelling
the structure of bio-molecular networks, as well as the application of
centrality measures to interaction networks and research on the hierarchical
structure of such networks and network motifs. Work on the link between
structural network properties and dynamics is also described, with emphasis on
synchronization and disease propagation.
Comment: 52 pages, 5 figures, Survey Paper
A nonuniform popularity-similarity optimization (nPSO) model to efficiently generate realistic complex networks with communities
The hidden metric space behind complex network topologies is an active topic
in current network science and the hyperbolic space is one of the most studied,
because it seems associated with the structural organization of many real complex
systems. The Popularity-Similarity-Optimization (PSO) model simulates how
random geometric graphs grow in the hyperbolic space, reproducing strong
clustering and a scale-free degree distribution. However, it fails to reproduce
an important feature of real complex networks, namely their community
organization. The Geometrical-Preferential-Attachment (GPA) model was recently
developed to give the PSO a community structure as well, which is obtained by
forcing different angular regions of the hyperbolic disk to have variable levels
of attractiveness. However, the number and size of the communities cannot be
explicitly controlled in the GPA, which is a clear limitation for real
applications. Here, we introduce the nonuniform PSO (nPSO) model that,
unlike GPA, forces heterogeneous angular node attractiveness by
sampling the angular coordinates from a tailored nonuniform probability
distribution, for instance a mixture of Gaussians. The nPSO differs from GPA in
three other aspects: it allows the number and size of communities to be fixed
explicitly; it allows their mixing property to be tuned through the network
temperature; and it efficiently generates networks with high clustering. After
several tests, we propose the nPSO as a valid and efficient model to generate
networks with communities in the hyperbolic space, which can be adopted as a
realistic benchmark for different tasks such as community detection and link
prediction.
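The angular sampling step described in this abstract can be sketched as follows. This is an illustration only: the equally spaced component means, the equal mixture weights, the shared standard deviation `sigma`, and the function name are all assumptions, and the sketch covers only the draw of angular coordinates, not the radial growth or link formation of the full model.

```python
import math
import random


def sample_angular_coordinates(n_nodes, n_communities, sigma, seed=None):
    """Draw angles in [0, 2*pi) from an equal-weight mixture of Gaussians.

    Assumption: component c is centred at 2*pi*c / n_communities, and all
    components share the same standard deviation `sigma`.
    Returns the angles and the mixture component each node was drawn from.
    """
    rng = random.Random(seed)
    two_pi = 2.0 * math.pi
    angles, labels = [], []
    for _ in range(n_nodes):
        component = rng.randrange(n_communities)  # equal mixture weights
        mean = two_pi * component / n_communities
        angle = rng.gauss(mean, sigma) % two_pi   # wrap onto the circle
        angles.append(angle)
        labels.append(component)
    return angles, labels


angles, labels = sample_angular_coordinates(200, 4, 0.2, seed=1)
print(len(angles), sorted(set(labels)))
```

Fixing the number of mixture components fixes the number of angular clusters, and changing the weights (held equal here) would change their relative sizes, which is the kind of explicit control the abstract attributes to the nPSO.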
Communication Theoretic Data Analytics
Widespread use of the Internet and social networks drives the generation of
big data, which is proving to be useful in a number of applications. To deal
with explosively growing amounts of data, data analytics has emerged as a
critical technology related to computing, signal processing, and information
networking. In this paper, a formalism is considered in which data is modeled
as a generalized social network and communication theory and information theory
are thereby extended to data analytics. First, the creation of an equalizer to
optimize information transfer between two data variables is considered, and
financial data is used to demonstrate the advantages. Then, an information
coupling approach based on information geometry is applied for dimensionality
reduction, with a pattern recognition example to illustrate the effectiveness.
These initial trials suggest the potential of communication theoretic data
analytics for a wide range of applications.
Comment: Published in IEEE Journal on Selected Areas in Communications, Jan. 201
Edge Label Inference in Generalized Stochastic Block Models: from Spectral Theory to Impossibility Results
The classical setting of community detection consists of networks exhibiting
a clustered structure. To more accurately model real systems we consider a
class of networks (i) whose edges may carry labels and (ii) which may lack a
clustered structure. Specifically, we assume that nodes possess latent
attributes drawn from a general compact space and edges between two nodes are
randomly generated and labeled according to some unknown distribution as a
function of their latent attributes. Our goal is then to infer the edge label
distributions from a partially observed network. We propose a computationally
efficient spectral algorithm and show that it allows for asymptotically correct
inference even when the average node degree is as low as logarithmic in the
total number of nodes. Conversely, if the average node degree is below a
specific constant threshold, we show that no algorithm can achieve better
inference than guessing without using the observations. As a byproduct of our
analysis, we show that our model provides a general procedure to construct
random graph models with a spectrum asymptotic to a pre-specified eigenvalue
distribution such as a power-law distribution.
Comment: 17 pages
Modeling Relational Data via Latent Factor Blockmodel
In this paper we address the problem of modeling relational data, which
appear in many applications such as social network analysis, recommender
systems and bioinformatics. Previous studies either consider latent-feature-based
models while disregarding local structure in the network, or focus
exclusively on capturing local structure of objects based on latent blockmodels
without coupling with latent characteristics of objects. To combine the
benefits of the previous work, we propose a novel model that can simultaneously
incorporate the effect of latent features and covariates, if any, as well as the
effect of latent structure that may exist in the data. To achieve this, we
model the relation graph as a function of both latent feature factors and
latent cluster memberships of objects to collectively discover globally
predictive intrinsic properties of objects and capture latent block structure
in the network to improve prediction performance. We also develop an
optimization transfer algorithm based on the generalized EM-style strategy to
learn the latent factors. We demonstrate the efficacy of our proposed model through
the link prediction task and cluster analysis task, and extensive experiments
on synthetic data and several real-world datasets suggest that our proposed
LFBM model outperforms the other state-of-the-art approaches in the evaluated
tasks.
Comment: 10 pages, 12 figures
Distance entropy cartography characterises centrality in complex networks
We introduce distance entropy as a measure of homogeneity in the distribution
of path lengths between a given node and its neighbours in a complex network.
Distance entropy defines a new centrality measure whose properties are
investigated for a variety of synthetic network models. By coupling distance
entropy information with closeness centrality, we introduce a network
cartography which allows one to reduce the degeneracy of ranking based on
closeness alone. We apply this methodology to the empirical multiplex lexical
network encoding the linguistic relationships known to English speaking
toddlers. We show that the distance entropy cartography better predicts how
children learn words compared to closeness centrality. Our results highlight
the importance of distance entropy for gaining insights from distance patterns
in complex networks.
Comment: 11 pages
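As a rough illustration of the idea of distance entropy, the sketch below computes the Shannon entropy of the shortest-path-length distribution from one node to every other node it can reach. This is an assumption-laden reading of the abstract: the paper's exact definition (for example its normalization, or its treatment of multiplex layers) may differ, and the function name and adjacency-list input format are hypothetical.

```python
import math
from collections import Counter, deque


def distance_entropy(adjacency, source):
    """Shannon entropy (natural log) of path lengths from `source`.

    Assumption: `adjacency` maps each node to a list of its neighbours in an
    unweighted graph; unreachable nodes are ignored and no normalization is
    applied.
    """
    # Breadth-first search gives shortest path lengths in an unweighted graph.
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)

    # Histogram of distances to all other reachable nodes.
    counts = Counter(d for node, d in distance.items() if node != source)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Path graph 0 - 1 - 2: the middle node sees every other node at distance 1.
path = {0: [1], 1: [0, 2], 2: [1]}
print(distance_entropy(path, 1), distance_entropy(path, 0))
```

In this toy path graph the middle node has zero entropy because all its distances are equal, while an end node sees two different distances and therefore a positive entropy.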