Obtaining Communities with a Fitness Growth Process
The study of community structure has been a hot topic of research in
recent years. But, while successfully applied in several areas, the concept lacks
a general and precise definition. Facts like the hierarchical structure and
heterogeneity of complex networks make it difficult to unify the idea of
community and its evaluation. The global functional known as modularity is
probably the most used technique in this area. Nevertheless, its limits have
been deeply studied. Local techniques such as the ones by Lancichinetti et al. and
Palla et al. arose as an answer to the resolution limit and degeneracies that
modularity has.
Here we start from the algorithm by Lancichinetti et al. and propose a unique
growth process for a fitness function that, while being local, finds a
community partition that covers the whole network, updating the scale parameter
dynamically. We test the quality of our results by using a set of benchmarks of
heterogeneous graphs. We discuss alternative measures for evaluating the
community structure and, in the light of them, infer possible explanations for
the better performance of local methods compared to global ones in these cases.
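As a rough illustration of the kind of local fitness this abstract starts from, the sketch below uses the fitness function of Lancichinetti et al., f = k_in / (k_in + k_out)^alpha, and grows a single community greedily from a seed node. The adjacency-dict representation, the fixed `alpha`, and the names `fitness` and `grow_community` are assumptions made for illustration. The sketch omits the node-removal step of the original algorithm and does not reproduce the dynamic update of the scale parameter proposed here.

```python
def fitness(adj, community, alpha=1.0):
    """Local fitness k_in / (k_in + k_out)**alpha of a node set."""
    k_in = k_out = 0
    for u in community:
        for v in adj[u]:
            if v in community:
                k_in += 1   # each internal edge is counted from both ends
            else:
                k_out += 1  # edge leaving the community
    total = k_in + k_out
    return k_in / total ** alpha if total else 0.0


def grow_community(adj, seed, alpha=1.0):
    """Greedily add the neighbor that most increases the fitness."""
    community = {seed}
    while True:
        frontier = {v for u in community for v in adj[u]} - community
        current = fitness(adj, community, alpha)
        best, best_gain = None, 0.0
        for v in sorted(frontier):
            gain = fitness(adj, community | {v}, alpha) - current
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:  # no neighbor improves the fitness
            return community
        community.add(best)


# Hypothetical toy graph: two triangles joined by the edge 2-3.
toy_graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
             3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(grow_community(toy_graph, seed=0))
```

With `alpha` fixed at 1, growth from node 0 stops at the first triangle, because absorbing node 3 would lower the fitness. Larger values of `alpha` favor smaller communities and smaller values favor larger ones, which is why the scale parameter matters.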
Community detection algorithms: a comparative analysis
Uncovering the community structure exhibited by real networks is a crucial
step towards an understanding of complex systems that goes beyond the local
organization of their constituents. Many algorithms have been proposed so far,
but none of them has been subjected to strict tests to evaluate their
performance. Most of the sporadic tests performed so far involved small
networks with known community structure and/or artificial graphs with a
simplified structure, which is very uncommon in real systems. Here we test
several methods against a recently introduced class of benchmark graphs, with
heterogeneous distributions of degree and community size. The methods are also
tested against the benchmark by Girvan and Newman and on random graphs. As a
result of our analysis, three recent algorithms introduced by Rosvall and
Bergstrom, Blondel et al. and Ronhovde and Nussinov, respectively, have an
excellent performance, with the additional advantage of low computational
complexity, which enables one to analyze large systems.
Comment: 12 pages, 8 figures. The software to compute the values of our
general normalized mutual information is available at
http://santo.fortunato.googlepages.com/inthepress
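Benchmark tests of this kind typically score a detected partition against the planted one with normalized mutual information. The sketch below computes the standard NMI for non-overlapping partitions, 2 I(A;B) / (H(A) + H(B)). It is not the generalized measure distributed by the authors, and the function name and the label-list input format are assumptions.

```python
from collections import Counter
from math import log


def normalized_mutual_information(labels_a, labels_b):
    """Standard NMI between two hard partitions given as label lists."""
    if len(labels_a) != len(labels_b):
        raise ValueError("partitions must label the same nodes")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    # Shannon entropies of the two partitions.
    h_a = -sum(c / n * log(c / n) for c in count_a.values())
    h_b = -sum(c / n * log(c / n) for c in count_b.values())
    # Mutual information from the joint label counts.
    mutual = sum(
        c / n * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    if h_a + h_b == 0:
        return 1.0  # both partitions put every node in one cluster
    return 2 * mutual / (h_a + h_b)


planted = [0, 0, 1, 1]
print(normalized_mutual_information(planted, [1, 1, 0, 0]))
print(normalized_mutual_information(planted, [0, 1, 0, 1]))
```

The score depends only on how nodes are grouped, not on the label names, so a relabeled copy of the planted partition still scores as a perfect match.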
Semi-Supervised Overlapping Community Finding based on Label Propagation with Pairwise Constraints
Algorithms for detecting communities in complex networks are generally
unsupervised, relying solely on the structure of the network. However, these
methods can often fail to uncover meaningful groupings that reflect the
underlying communities in the data, particularly when those structures are
highly overlapping. One way to improve the usefulness of these algorithms is by
incorporating additional background information, which can be used as a source
of constraints to direct the community detection process. In this work, we
explore the potential of semi-supervised strategies to improve algorithms for
finding overlapping communities in networks. Specifically, we propose a new
method, based on label propagation, for finding communities using a limited
number of pairwise constraints. Evaluations on synthetic and real-world
datasets demonstrate the potential of this approach for uncovering meaningful
community structures in cases where each node can potentially belong to more
than one community.
Modularity functions maximization with nonnegative relaxation facilitates community detection in networks
We show here that the problem of maximizing a family of quantitative
functions, encompassing both the modularity (Q-measure) and modularity density
(D-measure), for community detection can be uniformly understood as a
combinatorial optimization involving the trace of a matrix called the modularity
Laplacian. Instead of using traditional spectral relaxation, we add
a nonnegativity constraint to this graph clustering problem and design
efficient algorithms to optimize the new objective. With the explicit
nonnegative constraint, our solutions are very close to the ideal community
indicator matrix and can directly assign nodes to communities. The
near-orthogonal columns of the solution can be reformulated as the posterior
probability of the corresponding node belonging to each community. Therefore, the
proposed method can be exploited to identify the fuzzy or overlapping
communities and thus facilitates the understanding of the intrinsic structure
of networks. Experimental results show that our new algorithm consistently,
sometimes significantly, outperforms the traditional spectral relaxation
approaches.
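For reference, the Q-measure that this family of objectives includes can be evaluated directly for a given partition. The sketch below computes Newman-Girvan modularity for an undirected, unweighted graph as the sum over communities of l_c/m - (d_c/(2m))^2, where l_c is the number of internal edges, d_c the total degree, and m the number of edges. It only evaluates Q and does not implement the nonnegative relaxation described above; the edge-list input and the function name are assumptions.

```python
def modularity(edges, membership):
    """Newman-Girvan modularity Q of a hard partition.

    edges: list of (u, v) pairs of an undirected, unweighted graph.
    membership: dict mapping each node to its community label.
    """
    m = len(edges)
    internal = {}  # number of edges inside each community
    degree = {}    # total degree of each community
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree.items()
    )


# Hypothetical toy graph: two triangles joined by the edge 2-3.
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(toy_edges, two_groups))
```

On the toy graph, splitting along the bridge gives Q = 5/14, while putting all nodes in one community gives Q = 0.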
Robustness of journal rankings by network flows with different amounts of memory
As the number of scientific journals has multiplied, journal rankings have
become increasingly important for scientific decisions. From submissions and
subscriptions to grants and hirings, researchers, policy makers, and funding
agencies make important decisions with influence from journal rankings such as
the ISI journal impact factor. Typically, the rankings are derived from the
citation network between a selection of journals and unavoidably depend on this
selection. However, little is known about how robust rankings are to the
selection of included journals. Here we compare the robustness of three journal
rankings based on network flows induced on citation networks. They model
pathways of researchers navigating scholarly literature, stepping between
journals and remembering their previous steps to different degrees: zero-step
memory as impact factor, one-step memory as Eigenfactor, and two-step memory,
corresponding to zero-, first-, and second-order Markov models of citation flow
between journals. We conclude that higher-order Markov models perform better
and are more robust to the selection of journals. Whereas our analysis
indicates that higher-order models perform better, the performance gain for the
second-order Markov model comes at the cost of requiring more citation data
over a longer time period.
Comment: 9 pages, 5 figures
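The difference between the zero- and first-order models can be sketched on a small citation matrix. In the simplified version below, the zero-order flow is each journal's share of all received citations, and the first-order flow is the stationary distribution of a random walk that follows citations, with uniform teleportation. This is not the actual impact factor or Eigenfactor computation; the `teleport` value, the matrix layout (`citations[i][j]` is the number of citations from journal i to journal j), and the function names are all assumptions.

```python
def zero_order_flow(citations):
    """Share of all citations received by each journal (no memory)."""
    n = len(citations)
    received = [sum(citations[i][j] for i in range(n)) for j in range(n)]
    total = sum(received)
    return [r / total for r in received]


def first_order_flow(citations, teleport=0.15, iterations=200):
    """Stationary visit rates of a one-step-memory random walk."""
    n = len(citations)
    out = [sum(row) for row in citations]
    flow = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [teleport / n] * n
        for i in range(n):
            for j in range(n):
                # Journals with no outgoing citations spread flow uniformly.
                p = citations[i][j] / out[i] if out[i] else 1.0 / n
                nxt[j] += (1 - teleport) * flow[i] * p
        flow = nxt
    return flow


# Hypothetical counts: journal 0 is cited by journals 1 and 2.
toy_citations = [[0, 1, 1],
                 [1, 0, 0],
                 [1, 0, 0]]
print(zero_order_flow(toy_citations))
print(first_order_flow(toy_citations))
```

A second-order model would condition each step on the previously visited journal as well, which needs citation pathways rather than pairwise counts. That is the extra data requirement the abstract refers to.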
Mapping bilateral information interests using the activity of Wikipedia editors
We live in a global village where electronic communication has eliminated the
geographical barriers of information exchange. The road is now open to
worldwide convergence of information interests, shared values, and
understanding. Nevertheless, interests still vary between countries around the
world. This raises important questions about what today's world map of
information interests actually looks like and what factors cause the barriers of
information exchange between countries. To quantitatively construct a world map
of information interests, we devise a scalable statistical model that
identifies countries with similar information interests and measures the
countries' bilateral similarities. From the similarities we connect countries
in a global network and find that countries can be mapped into 18 clusters with
similar information interests. Through regression we find that language and
religion best explain the strength of the bilateral ties and formation of
clusters. Our findings provide a quantitative basis for further studies to
better understand the complex interplay between shared interests and conflict
on a global scale. The methodology can also be extended to track changes over
time and capture important trends in global information exchange.
Comment: 11 pages, 3 figures in Palgrave Communications 1 (2015)
Identifying modular flows on multilayer networks reveals highly overlapping organization in social systems
Unveiling the community structure of networks is a powerful methodology to
comprehend interconnected systems across the social and natural sciences. To
identify different types of functional modules in interaction data aggregated
in a single network layer, researchers have developed many powerful methods.
For example, flow-based methods have proven useful for identifying modular
dynamics in weighted and directed networks that capture constraints on flow in
the systems they represent. However, many networked systems consist of agents
or components that exhibit multiple layers of interactions. Inevitably,
representing this intricate network of networks as a single aggregated network
leads to information loss and may obscure the actual organization. Here we
propose a method based on compression of network flows that can identify
modular flows in non-aggregated multilayer networks. Our numerical experiments
on synthetic networks show that the method can accurately identify modules that
cannot be identified in aggregated networks or by analyzing the layers
separately. We capitalize on our findings and reveal the community structure of
two multilayer collaboration networks: scientists affiliated to the Pierre
Auger Observatory and scientists publishing works on networks on the arXiv.
Compared to conventional aggregated methods, the multilayer method reveals
smaller modules with more overlap that better capture the actual organization.
Consensus clustering in complex networks
The community structure of complex networks reveals both their organization
and hidden relationships among their constituents. Most community detection
methods currently available are not deterministic, and their results typically
depend on the specific random seeds, initial conditions and tie-break rules
adopted for their execution. Consensus clustering is used in data analysis to
generate stable results out of a set of partitions delivered by stochastic
methods. Here we show that consensus clustering can be combined with any
existing method in a self-consistent way, enhancing considerably both the
stability and the accuracy of the resulting partitions. This framework is also
particularly suitable to monitor the evolution of community structure in
temporal networks. An application of consensus clustering to a large citation
network of physics papers demonstrates its capability to keep track of the
birth, death and diversification of topics.
Comment: 11 pages, 12 figures. Published in Scientific Reports
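The core object in consensus clustering is a matrix that records how often two nodes are grouped together across repeated runs of a stochastic method. The sketch below builds that matrix from a list of partitions and drops weak entries below a threshold. The threshold value `tau`, the label-list input format, and the function names are assumptions; re-running the detection method on the thresholded matrix until the partitions agree is only noted here, not implemented.

```python
def consensus_matrix(partitions):
    """Fraction of partitions in which each node pair shares a cluster."""
    runs = len(partitions)
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / runs for c in row] for row in counts]


def threshold(matrix, tau):
    """Zero out entries below tau to suppress unstable co-assignments."""
    return [[w if w >= tau else 0.0 for w in row] for row in matrix]


# Hypothetical output of three runs of a stochastic method on four nodes.
runs = [[0, 0, 1, 1],
        [0, 0, 0, 1],
        [1, 1, 0, 0]]
weights = threshold(consensus_matrix(runs), tau=0.5)
for row in weights:
    print(row)
```

Nodes 0 and 1 are grouped together in every run and keep weight 1, while the pair 1 and 2, grouped together only once, is filtered out.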