Learning multifractal structure in large networks
Generating random graphs to model networks has a rich history. In this paper,
we analyze and improve upon the multifractal network generator (MFNG)
introduced by Palla et al. We provide a new result on the probability of
subgraphs existing in graphs generated with MFNG. From this result it follows
that we can quickly compute moments of an important set of graph properties,
such as the expected number of edges, stars, and cliques. Specifically, we show
how to compute these moments in time complexity independent of the size of the
graph and the number of recursive levels in the generative model. We leverage
this theory to develop a new method-of-moments algorithm for fitting large networks to
MFNG. Empirically, this new approach effectively simulates properties of
several social and information networks. In terms of matching subgraph counts,
our method outperforms similar algorithms used with the Stochastic Kronecker
Graph model. Furthermore, we present a fast approximation algorithm to generate
graph instances following the multifractal structure. The approximation
scheme is an improvement over previous methods, which ran in time complexity
quadratic in the number of vertices. Combined, our method of moments and fast
sampling scheme provide the first scalable framework for effectively modeling
large networks with MFNG.
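To make the generative model concrete, here is a minimal sketch of a multifractal-style generator. It assumes the usual description of the model: each node independently draws one of m categories at each of k recursive levels (with probabilities `lengths`), and two nodes are linked with probability equal to the product, over levels, of the matching entries of a symmetric matrix `probs`. The function names, the parameterization, and the naive quadratic sampler are illustrative assumptions, not the fast algorithm from the paper. Under these assumptions the expected edge count has a closed form whose cost depends only on m, which illustrates the kind of size-independent moment computation the abstract refers to.

```python
import itertools
import math
import random


def expected_edges(n, lengths, probs, k):
    """Expected edge count under the assumed model.

    A single pair is linked with probability s**k, where
    s = sum_ij lengths[i] * lengths[j] * probs[i][j], because the k levels
    are independent. There are C(n, 2) pairs.
    """
    m = len(lengths)
    s = sum(lengths[i] * lengths[j] * probs[i][j]
            for i in range(m) for j in range(m))
    return math.comb(n, 2) * s ** k


def sample_naive(n, lengths, probs, k, seed=0):
    """Naive O(n^2 * k) sampler (hypothetical helper, for illustration only)."""
    rng = random.Random(seed)
    m = len(lengths)
    # One category per level for every node, drawn with weights `lengths`.
    cats = [rng.choices(range(m), weights=lengths, k=k) for _ in range(n)]
    edges = []
    for u, v in itertools.combinations(range(n), 2):
        p = 1.0
        for level in range(k):
            p *= probs[cats[u][level]][cats[v][level]]
        if rng.random() < p:
            edges.append((u, v))
    return edges
```

Comparing `len(sample_naive(...))`, averaged over several seeds, with `expected_edges(...)` is a quick sanity check of the closed form.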
Degree-based goodness-of-fit tests for heterogeneous random graph models: independent and exchangeable cases
The degrees are a classical and relevant way to study the topology of a
network. They can be used to assess the goodness-of-fit for a given random
graph model. In this paper we introduce goodness-of-fit tests for two classes
of models. First, we consider the case of independent graph models such as the
heterogeneous Erdős-Rényi model in which the edges have different
connection probabilities. Second, we consider a generic model for exchangeable
random graphs called the W-graph. The stochastic block model and the expected
degree distribution model fall within this framework. We prove the asymptotic
normality of the degree mean square under these independent and exchangeable
models and derive formal tests. We study the power of the proposed tests and we
prove the asymptotic normality under specific sparsity regimes. The tests are
illustrated on real networks from the social sciences and ecology, and their
performance is assessed via a simulation study.
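As a rough illustration of a degree-based goodness-of-fit check, the sketch below computes a degree mean square statistic and calibrates it by simulation under a heterogeneous Erdős-Rényi null. Two points are assumptions: the statistic is taken to be the plain average of squared degrees (the paper's normalization may differ), and the Monte Carlo p-value is a stand-in for the asymptotic normal test that the paper derives. All function names are hypothetical.

```python
import random


def degree_mean_square(n, edges):
    """Average of squared degrees (assumed normalization)."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return sum(d * d for d in deg) / n


def sample_heterogeneous_er(p, rng):
    """Draw each edge (i, j), i < j, independently with probability p[i][j]."""
    n = len(p)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p[i][j]]


def monte_carlo_pvalue(observed_edges, p, n_sim=500, seed=0):
    """Two-sided Monte Carlo p-value; not the paper's asymptotic test."""
    rng = random.Random(seed)
    n = len(p)
    obs = degree_mean_square(n, observed_edges)
    sims = [degree_mean_square(n, sample_heterogeneous_er(p, rng))
            for _ in range(n_sim)]
    center = sum(sims) / n_sim
    extreme = sum(abs(s - center) >= abs(obs - center) for s in sims)
    return (extreme + 1) / (n_sim + 1)
```

A small p-value suggests that the observed degree heterogeneity is not well explained by the connection probabilities in `p`.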
An Ensemble Framework for Detecting Community Changes in Dynamic Networks
Dynamic networks, especially those representing social networks, undergo
constant evolution of their community structure over time. Nodes can migrate
between different communities, communities can split into multiple new
communities, communities can merge together, etc. In order to represent dynamic
networks with evolving communities it is essential to use a dynamic model
rather than a static one. Here we use a dynamic stochastic block model where
the underlying block model is different at different times. In order to
represent the structural changes expressed by this dynamic model the network
will be split into discrete time segments and a clustering algorithm will
assign block memberships for each segment. In this paper we show that using an
ensemble of clustering assignments accommodates the variance in scalable
clustering algorithms and produces superior results in terms of
pairwise-precision and pairwise-recall. We also demonstrate that the dynamic
clustering produced by the ensemble can be visualized as a flowchart which
encapsulates the community evolution succinctly.
Comment: 6 pages, under submission to HPEC Graph Challeng
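The two evaluation metrics named in this abstract can be sketched as follows, using the common pair-counting definitions: a pair of nodes counts as a true positive when both the reference and the predicted partition place the two nodes in the same block. The function name and the convention of returning 1.0 when a denominator is zero are assumptions made for this illustration.

```python
from itertools import combinations


def pairwise_precision_recall(truth, pred):
    """Pair-counting precision and recall for two block assignments."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_pred = pred[i] == pred[j]
        if same_pred and same_truth:
            tp += 1          # pair grouped together in both partitions
        elif same_pred:
            fp += 1          # grouped together only in the prediction
        elif same_truth:
            fn += 1          # grouped together only in the reference
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall
```

For example, `pairwise_precision_recall([0, 0, 1, 1], [0, 0, 0, 1])` has one true-positive pair, two false-positive pairs, and one false-negative pair, giving precision 1/3 and recall 1/2.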
Covariate-assisted spectral clustering
Biological and social systems consist of myriad interacting units. The
interactions can be represented in the form of a graph or network. Measurements
of these graphs can reveal the underlying structure of these interactions,
which provides insight into the systems that generated the graphs. Moreover, in
applications such as connectomics, social networks, and genomics, graph data
are accompanied by contextualizing measures on each node. We utilize these node
covariates to help uncover latent communities in a graph, using a modification
of spectral clustering. Statistical guarantees are provided under a joint
mixture model that we call the node-contextualized stochastic blockmodel,
including a bound on the mis-clustering rate. The bound is used to derive
conditions for achieving perfect clustering. For most simulated cases,
covariate-assisted spectral clustering yields results superior to regularized
spectral clustering without node covariates and to an adaptation of canonical
correlation analysis. We apply our clustering method to large brain graphs
derived from diffusion MRI data, using the node locations or neurological
region membership as covariates. In both cases, covariate-assisted spectral
clustering yields clusters that are easier to interpret neurologically.
Comment: 28 pages, 4 figures, includes substantial changes to theoretical
result
Connectivity of Random Annulus Graphs and the Geometric Block Model
We provide new connectivity results for vertex-random graphs, or random
annulus graphs, which are significant generalizations of random
geometric graphs. Random geometric graphs (RGGs) are among the most basic
models of random graphs for spatial networks; they were proposed by Gilbert in 1961,
shortly after the introduction of Erdős-Rényi random graphs. They
resemble social networks in many ways (e.g., by spontaneously creating clusters
of nodes with high modularity). The connectivity properties of RGGs have been
studied since their introduction, and analyzing them has been significantly
harder than analyzing their Erdős-Rényi counterparts because of correlated edge
formation.
Our next contribution is in using the connectivity of random annulus graphs
to provide necessary and sufficient conditions for efficient recovery of
communities in the geometric block model (GBM). The GBM is a
probabilistic model for community detection defined over an RGG, in a similar
spirit to the popular stochastic block model, which is defined over an
Erdős-Rényi random graph. The geometric block model inherits the
transitivity properties of RGGs and thus models communities better than a
stochastic block model. However, analyzing it requires fresh perspectives, as
all prior tools fail due to correlation in edge formation. We provide a simple
and efficient algorithm that can recover communities in the GBM exactly with high
probability in the regime of connectivity.
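The models in this abstract can be illustrated with a small sampler. The sketch assumes the one-dimensional setting: vertices are placed uniformly at random on a circle of circumference 1, a random annulus graph links two vertices when their circular distance lies in [r1, r2], and a two-community geometric block model links same-community pairs within distance `r_same` and cross-community pairs within distance `r_diff`. The parameter names, the alternating community labels, and the restriction to one dimension are illustrative assumptions; the recovery algorithm from the paper is not reproduced here.

```python
import random


def circle_distance(x, y):
    """Distance between two points on a circle of circumference 1."""
    d = abs(x - y)
    return min(d, 1.0 - d)


def annulus_edge(x, y, r1, r2):
    """Random annulus graph rule: link iff r1 <= distance <= r2."""
    return r1 <= circle_distance(x, y) <= r2


def sample_gbm(n, r_same, r_diff, seed=0):
    """Sample positions, labels, and edges of a two-community GBM (sketch)."""
    rng = random.Random(seed)
    pos = [rng.random() for _ in range(n)]
    label = [i % 2 for i in range(n)]  # two balanced communities (assumption)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            radius = r_same if label[u] == label[v] else r_diff
            if circle_distance(pos[u], pos[v]) <= radius:
                edges.append((u, v))
    return pos, label, edges
```

Choosing `r_same` larger than `r_diff` makes same-community pairs more likely to be linked; because links depend on shared positions, edges are correlated, which is the difficulty the abstract points to.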