Model selection and hypothesis testing for large-scale network models with overlapping groups
The effort to understand network systems in increasing detail has resulted in
a diversity of methods designed to extract their large-scale structure from
data. Unfortunately, many of these methods yield diverging descriptions of the
same network, making both the comparison and understanding of their results a
difficult challenge. A possible solution to this outstanding issue is to shift
the focus away from ad hoc methods and move towards more principled approaches
based on statistical inference of generative models. As a result, we instead
face the better-defined task of selecting between competing generative
processes, which can be done under a unified probabilistic framework. Here, we
consider the comparison between a variety of generative models including
features such as degree correction, where nodes with arbitrary degrees can
belong to the same group, and community overlap, where nodes are allowed to
belong to more than one group. Because such model variants possess an
increasing number of parameters, they become prone to overfitting. In this
work, we present a method of model selection based on the minimum description
length criterion and posterior odds ratios that is capable of fully accounting
for the increased degrees of freedom of the larger models, and selects the best
one according to the statistical evidence available in the data. In applying
this method to many empirical unweighted networks from different fields, we
observe that community overlap is very often not supported by statistical
evidence and is selected as a better model only for a minority of them. On the
other hand, we find that degree correction tends to be almost universally
favored by the available data, implying that intrinsic node properties (as
opposed to group properties) are often an essential ingredient of network
formation.
Comment: 20 pages, 7 figures, 1 tabl
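The link between description length and posterior odds mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each candidate model has already been fitted and summarized by a description length in nats (the negative log of the joint probability of data and parameters), and that the model classes have equal prior probability. The function names and the example values are hypothetical.

```python
import math


def log_posterior_odds(dl_a, dl_b):
    """Log posterior odds of model a over model b.

    Assumes dl_a and dl_b are description lengths in nats and that
    both model classes are equally likely a priori (an assumption of
    this sketch).
    """
    return dl_b - dl_a


def posterior_odds(dl_a, dl_b):
    """Posterior odds of model a over model b: exp(-(dl_a - dl_b)).

    May overflow for very large differences; use the log version then.
    """
    return math.exp(log_posterior_odds(dl_a, dl_b))


def select_model(description_lengths):
    """Return the name of the model with the smallest description length."""
    return min(description_lengths, key=description_lengths.get)


# Hypothetical description lengths (nats) for three model variants.
candidates = {
    "non-overlapping": 1520.0,
    "degree-corrected": 1495.5,
    "overlapping": 1510.2,
}
best = select_model(candidates)
```

A shorter description length means stronger statistical evidence, so a model with more parameters is preferred only when its better fit outweighs the cost of describing those parameters.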
Information dynamics algorithm for detecting communities in networks
The problem of community detection is relevant in many scientific
disciplines, from social science to statistical physics. Given the impact of
community detection in many areas, such as psychology and the social sciences, we
address the problem of modifying existing well-performing algorithms by
incorporating elements from the application domain, i.e. making them domain-inspired.
We focus on an approach inspired by psychology and social networks, which
may help strengthen the link between social network studies
and the mathematics of community detection. Here we introduce a community-detection
algorithm derived from van Dongen's Markov Cluster (MCL) algorithm
by treating the nodes of a network as agents capable of making decisions. In this
framework we introduce a memory factor to mimic a typical human behavior,
the oblivion effect. The method is based on information diffusion and
includes a non-linear processing phase. We test it on two classical
community benchmarks and on computer-generated networks with known community
structure. Our approach has three important features: the capacity to detect
overlapping communities, the capability to identify communities from an
individual point of view, and the fine-tuning of community detectability with
respect to prior knowledge of the data. Finally, we discuss how to use a Shannon
entropy measure for parameter estimation in complex networks.
Comment: Submitted to "Communication in Nonlinear Science and Numerical
Simulation
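For orientation, the plain Markov Cluster (MCL) procedure that the abstract above builds on can be sketched as below. This is a minimal pure-Python sketch of standard MCL only (expansion, inflation, column normalization); it does not include the memory factor or the agent-based decisions the authors describe, and the function names, default parameters, and the toy graph are assumptions made for illustration.

```python
def normalize_columns(m):
    """Rescale each column so it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    out = [row[:] for row in m]
    for j in range(n):
        s = sum(m[i][j] for i in range(n))
        if s > 0:
            for i in range(n):
                out[i][j] = m[i][j] / s
    return out


def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adj, inflation=2.0, iterations=50, tol=1e-6):
    """Run a fixed number of MCL iterations on an adjacency matrix.

    Returns a set of frozensets; each frozenset holds the nodes that
    share a nonzero row of the final flow matrix.
    """
    n = len(adj)
    # Self-loops are a common MCL convention to stabilize the flow.
    m = [[float(adj[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = matmul(m, m)                                  # expansion
        m = [[x ** inflation for x in row] for row in m]  # inflation
        m = normalize_columns(m)
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > tol)
        if members:
            clusters.add(members)
    return clusters


# Toy graph: two disconnected triangles, nodes 0-2 and 3-5.
triangles = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
found = mcl(triangles)
```

Because a node can appear in more than one returned set, the row-based readout leaves room for overlapping clusters; the inflation exponent controls how fine-grained the result is.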
Detecting Cohesive and 2-mode Communities in Directed and Undirected Networks
Networks are a general language for representing relational information among
objects. An effective way to model, reason about, and summarize networks is to
discover sets of nodes with common connectivity patterns. Such sets are
commonly referred to as network communities. Research on network community
detection has predominantly focused on identifying communities of densely
connected nodes in undirected networks.
In this paper we develop a novel overlapping community detection method that
scales to networks of millions of nodes and edges and advances research along
two dimensions: the connectivity structure of communities, and the use of edge
directedness for community detection. First, we extend traditional definitions
of network communities by building on the observation that nodes can be densely
interlinked in two different ways: In cohesive communities nodes link to each
other, while in 2-mode communities nodes link in a bipartite fashion, where
links predominate between the two partitions rather than inside them. Our
method successfully detects both 2-mode and cohesive communities, which
may also overlap or be hierarchically nested. Second, while most existing
community detection methods treat directed edges as though they were
undirected, our method accounts for edge directions and is able to identify
novel and meaningful community structures in both directed and undirected
networks, using data from social, biological, and ecological domains.
Comment: Published in the proceedings of WSDM '1
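The distinction between cohesive and 2-mode communities drawn in the abstract above can be illustrated with a small edge count. This sketch only illustrates the definition and is not the detection method of the paper: it assumes two disjoint, already known node groups and a list of directed edges, and the function name and toy data are hypothetical.

```python
def link_pattern(edges, group_a, group_b):
    """Count edges inside the two groups versus between them.

    Assumes group_a and group_b are disjoint sets of node ids.
    Edges touching nodes outside both groups are ignored.
    """
    inside = 0
    between = 0
    for u, v in edges:
        u_a, v_a = u in group_a, v in group_a
        u_b, v_b = u in group_b, v in group_b
        if (u_a and v_a) or (u_b and v_b):
            inside += 1
        elif (u_a and v_b) or (u_b and v_a):
            between += 1
    return inside, between


# 2-mode pattern: every edge runs from {0, 1} to {2, 3}.
two_mode_edges = [(0, 2), (0, 3), (1, 2), (1, 3)]
two_mode_counts = link_pattern(two_mode_edges, {0, 1}, {2, 3})

# Cohesive pattern: edges stay within each group.
cohesive_edges = [(0, 1), (1, 0), (2, 3)]
cohesive_counts = link_pattern(cohesive_edges, {0, 1}, {2, 3})
```

When the "between" count dominates, the pair of groups behaves like a 2-mode (bipartite) community; when the "inside" count dominates, each group looks cohesive.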