Link-Prediction Enhanced Consensus Clustering for Complex Networks
Many real networks that are inferred or collected from data are incomplete
due to missing edges. Missing edges can be inherent to the dataset (Facebook
friend links will never be complete) or the result of sampling (one may only
have access to a portion of the data). The consequence is that downstream
analyses that consume the network will often yield less accurate results than
if the edges were complete. Community detection algorithms, in particular,
often suffer when critical intra-community edges are missing. We propose a
novel consensus clustering algorithm to enhance community detection on
incomplete networks. Our framework utilizes existing community detection
algorithms that process networks imputed by our link prediction based
algorithm. The framework then merges their multiple outputs into a final
consensus output. On average, our method boosts the performance of existing
algorithms by 7% on artificial data and 17% on ego networks collected from
Facebook.
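
The two stages this abstract describes (impute likely missing edges, then merge several partitions into one) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the common-neighbors score, the `threshold` parameter, and the function names are all hypothetical choices, and the base community detection step is replaced by hand-written partitions.

```python
from itertools import combinations

def common_neighbor_scores(adj):
    """Score every non-adjacent node pair by its number of shared neighbors."""
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v not in adj[u]:
            scores[(u, v)] = len(adj[u] & adj[v])
    return scores

def impute_edges(adj, k):
    """Return a copy of adj with the k highest-scoring missing edges added."""
    scores = common_neighbor_scores(adj)
    ranked = sorted(scores, key=lambda pair: (-scores[pair], pair))
    new_adj = {u: set(nbrs) for u, nbrs in adj.items()}
    for u, v in ranked[:k]:
        new_adj[u].add(v)
        new_adj[v].add(u)
    return new_adj

def consensus(partitions, threshold=0.5):
    """Link nodes that share a cluster in more than `threshold` of the
    partitions, then return the connected components of those links."""
    nodes = sorted(partitions[0])
    parent = {u: u for u in nodes}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for u, v in combinations(nodes, 2):
        agree = sum(p[u] == p[v] for p in partitions) / len(partitions)
        if agree > threshold:
            parent[find(u)] = find(v)
    groups = {}
    for u in nodes:
        groups.setdefault(find(u), set()).add(u)
    return sorted(groups.values(), key=min)

# Toy network: two 4-cliques joined by edge 3-4, with edge 0-1 missing.
adj = {0: {2, 3}, 1: {2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2, 4},
       4: {3, 5, 6, 7}, 5: {4, 6, 7}, 6: {4, 5, 7}, 7: {4, 5, 6}}
imputed = impute_edges(adj, k=1)

# Hand-written stand-ins for the outputs of three detection runs.
p1 = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1}
p2 = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1}  # node 3 misplaced
p3 = {0: 1, 1: 1, 2: 1, 3: 1, 4: 0, 5: 0, 6: 0, 7: 0}  # same split, labels swapped
merged = consensus([p1, p2, p3])
```

In this toy case the pair (0, 1) is the only non-edge with two shared neighbors, so it is the one edge imputed, and the majority vote outweighs the single run that misplaces node 3.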
Bayesian stochastic blockmodeling
This chapter provides a self-contained introduction to the use of Bayesian
inference to extract large-scale modular structures from network data, based on
the stochastic blockmodel (SBM), as well as its degree-corrected and
overlapping generalizations. We focus on nonparametric formulations that allow
their inference in a manner that prevents overfitting, and enables model
selection. We discuss aspects of the choice of priors, in particular how to
avoid underfitting via increased Bayesian hierarchies, and we contrast the task
of sampling network partitions from the posterior distribution with finding the
single point estimate that maximizes it, while describing efficient algorithms
to perform either one. We also show how inferring the SBM can be used to
predict missing and spurious links, and shed light on the fundamental
limitations of the detectability of modular structures in networks.
Comment: 44 pages, 16 figures. Code is freely available as part of graph-tool
at https://graph-tool.skewed.de. See also the HOWTO at
https://graph-tool.skewed.de/static/doc/demos/inference/inference.htm
Element-centric clustering comparison unifies overlaps and hierarchy
Clustering is one of the most universal approaches for understanding complex
data. A pivotal aspect of clustering analysis is quantitatively comparing
clusterings; clustering comparison is the basis for many tasks such as
clustering evaluation, consensus clustering, and tracking the temporal
evolution of clusters. In particular, the extrinsic evaluation of clustering
methods requires comparing the uncovered clusterings to planted clusterings or
known metadata. Yet, as we demonstrate, existing clustering comparison measures
have critical biases which undermine their usefulness, and no measure
accommodates both overlapping and hierarchical clusterings. Here we unify the
comparison of disjoint, overlapping, and hierarchically structured clusterings
by proposing a new element-centric framework: elements are compared based on
the relationships induced by the cluster structure, as opposed to the
traditional cluster-centric philosophy. We demonstrate that, in contrast to
standard clustering similarity measures, our framework does not suffer from
critical biases and naturally provides unique insights into how the clusterings
differ. We illustrate the strengths of our framework by revealing new insights
into the organization of clusters in two applications: the improved
classification of schizophrenia based on the overlapping and hierarchical
community structure of fMRI brain networks, and the disentanglement of various
social homophily factors in Facebook social networks. The universality of
clustering suggests far-reaching impact of our framework throughout all areas
of science.
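
To make the element-centric idea concrete, here is a small sketch restricted to disjoint partitions. It is an illustration under assumptions rather than the paper's framework: each element is scored by the overlap of its two clusters divided by the size of the larger one (equivalently, one minus the total variation distance between uniform distributions over the two clusters). The function name and this particular score are choices made here, and overlapping or hierarchical clusterings are not handled.

```python
from collections import Counter

def element_scores(part_a, part_b):
    """Per-element agreement between two disjoint partitions.

    part_a and part_b map each element to a cluster label. The score for
    element i is |a ∩ b| / max(|a|, |b|), where a and b are the clusters
    containing i in each partition (an assumed, simplified score).
    """
    size_a = Counter(part_a.values())
    size_b = Counter(part_b.values())
    scores = {}
    for i in part_a:
        a, b = part_a[i], part_b[i]
        overlap = sum(1 for j in part_a if part_a[j] == a and part_b[j] == b)
        scores[i] = overlap / max(size_a[a], size_b[b])
    return scores

# Two partitions of six elements that disagree only about element 2.
A = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
B = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 1}
scores = element_scores(A, B)
overall = sum(scores.values()) / len(scores)
```

Because the score is computed per element, it shows where two clusterings differ: here element 2, the one that switches clusters, receives the lowest score.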
A Method Based on Total Variation for Network Modularity Optimization using the MBO Scheme
The study of network structure is pervasive in sociology, biology, computer
science, and many other disciplines. One of the most important areas of network
science is the algorithmic detection of cohesive groups of nodes called
"communities". One popular approach to find communities is to maximize a
quality function known as modularity to achieve some sort of optimal
clustering of nodes. In this paper, we interpret the modularity function from a
novel perspective: we reformulate modularity optimization as a minimization
problem of an energy functional that consists of a total variation term and a
balance term. By employing numerical techniques from image processing
and compressive sensing -- such as convex splitting and the
Merriman-Bence-Osher (MBO) scheme -- we develop a variational algorithm for the
minimization problem. We present our computational results using both synthetic
benchmark networks and real data.
Comment: 23 pages
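
For reference, the quality function being optimized can be computed directly. The sketch below evaluates the standard Newman-Girvan modularity of a given partition on an unweighted, undirected graph. It does not implement the total variation reformulation or the MBO scheme from the abstract, and the function name and data layout are illustrative assumptions.

```python
def modularity(adj, community):
    """Standard modularity Q = sum_c [ L_c / m - (d_c / 2m)^2 ].

    adj maps each node to the set of its neighbors (undirected, no
    self-loops); community maps each node to a label. L_c is the number
    of edges inside community c, d_c its total degree, m the edge count.
    """
    two_m = sum(len(nbrs) for nbrs in adj.values())
    internal = {}  # twice the number of internal edges per community
    degree = {}    # total degree per community
    for u, nbrs in adj.items():
        c = community[u]
        degree[c] = degree.get(c, 0) + len(nbrs)
        internal[c] = internal.get(c, 0) + sum(1 for v in nbrs if community[v] == c)
    return sum(internal[c] / two_m - (degree[c] / two_m) ** 2 for c in degree)

# Two triangles joined by the single edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
       3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
lumped = {u: "a" for u in adj}
q_split = modularity(adj, split)
q_lumped = modularity(adj, lumped)
```

On this graph the two-triangle split gives Q = 5/14, while placing every node in one community gives Q = 0, which is why maximizing Q favors the split.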