Variational Bayes model averaging for graphon functions and motif frequencies inference in W-graph models
W-graph refers to a general class of random graph models that can be seen as
a random graph limit. It is characterized by both its graphon function and its
motif frequencies. In this paper, relying on an existing variational Bayes
algorithm for the stochastic block models along with the corresponding weights
for model averaging, we derive an estimate of the graphon function as an
average of stochastic block models with increasing number of blocks. In the
same framework, we derive the variational posterior frequency of any motif. A
simulation study and an illustration on a social network complete our work.
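
The averaging step described in this abstract can be illustrated with a minimal sketch. It assumes that each stochastic block model is given by its block proportions and connection matrix, that blocks are already ordered consistently across models, and that the model weights are supplied. The names `sbm_graphon` and `averaged_graphon` and the numbers below are hypothetical and do not come from the paper; the variational Bayes fitting that would produce the weights is not shown.

```python
from bisect import bisect_right
from itertools import accumulate


def sbm_graphon(alpha, pi):
    """Step-function graphon of one SBM.

    alpha: block proportions (summing to 1); pi: K x K connection probabilities.
    """
    edges = list(accumulate(alpha))  # right endpoints of the block intervals

    def block(u):
        # Clip so that u = 1.0 (or rounding in the cumulative sum) maps to the last block.
        return min(bisect_right(edges, u), len(alpha) - 1)

    def W(u, v):
        return pi[block(u)][block(v)]

    return W


def averaged_graphon(models, weights):
    """Weighted average of SBM step functions (weights are assumed given)."""
    graphons = [sbm_graphon(alpha, pi) for alpha, pi in models]

    def W_bar(u, v):
        return sum(w * W(u, v) for w, W in zip(weights, graphons))

    return W_bar


# Illustrative inputs only: a 1-block and a 2-block model with made-up weights.
models = [
    ([1.0], [[0.5]]),
    ([0.5, 0.5], [[0.8, 0.2], [0.2, 0.8]]),
]
weights = [0.25, 0.75]
W_bar = averaged_graphon(models, weights)
print(W_bar(0.1, 0.1), W_bar(0.1, 0.9))
```

Averaging step functions with different numbers of blocks yields a finer step function, which is one way to read "an average of stochastic block models with increasing number of blocks".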
On sparsity, power-law and clustering properties of graphex processes
This paper investigates properties of the class of graphs based on
exchangeable point processes. We provide asymptotic expressions for the number
of edges, number of nodes and degree distributions, identifying four regimes:
(i) a dense regime, (ii) a sparse almost dense regime, (iii) a sparse regime
with power-law behaviour, and (iv) an almost extremely sparse regime. We show
that under mild assumptions, both the global and local clustering coefficients
converge to constants which may or may not be the same. We also derive a
central limit theorem for the number of nodes. Finally, we propose a class of
models within this framework where one can separately control the latent
structure and the global sparsity/power-law properties of the graph.
Degree-based goodness-of-fit tests for heterogeneous random graph models: independent and exchangeable cases
The degrees are a classical and relevant way to study the topology of a
network. They can be used to assess the goodness-of-fit for a given random
graph model. In this paper we introduce goodness-of-fit tests for two classes
of models. First, we consider the case of independent graph models such as the
heterogeneous Erdős-Rényi model in which the edges have different
connection probabilities. Second, we consider a generic model for exchangeable
random graphs called the W-graph. The stochastic block model and the expected
degree distribution model fall within this framework. We prove the asymptotic
normality of the degree mean square under these independent and exchangeable
models and derive formal tests. We study the power of the proposed tests and we
prove the asymptotic normality under specific sparsity regimes. The tests are
illustrated on real networks from social sciences and ecology, and their
performances are assessed via a simulation study.
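
As a rough illustration of a degree-based statistic, the sketch below computes the mean of the squared degrees of a graph and its exact expectation under an independent-edge model with known connection probabilities. The normalization used in the paper may differ, and the asymptotic variance needed for a formal test is not reproduced here. The function names are hypothetical.

```python
def degree_mean_square(adj):
    """Mean of squared degrees of an undirected simple graph (0/1 adjacency matrix)."""
    n = len(adj)
    return sum(sum(row) ** 2 for row in adj) / n


def expected_degree_mean_square(p):
    """Expectation of the statistic when edges are independent with P(i ~ j) = p[i][j].

    Uses E[D_i^2] = Var(D_i) + E[D_i]^2, where D_i is a sum of independent
    Bernoulli variables.
    """
    n = len(p)
    total = 0.0
    for i in range(n):
        probs = [p[i][j] for j in range(n) if j != i]
        mean = sum(probs)
        var = sum(q * (1 - q) for q in probs)
        total += var + mean ** 2
    return total / n


# Illustrative data: a triangle compared with a homogeneous null with p = 0.5.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
null_p = [[0.5] * 3 for _ in range(3)]
print(degree_mean_square(triangle), expected_degree_mean_square(null_p))
```

A large gap between the observed value and its expectation under the null is what a test of this kind would measure, once scaled by the appropriate standard deviation.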
Estimation of subgraph density in noisy networks
While it is common practice in applied network analysis to report various
standard network summary statistics, these numbers are rarely accompanied by
uncertainty quantification. Yet any error inherent in the measurements
underlying the construction of the network, or in the network construction
procedure itself, necessarily must propagate to any summary statistics
reported. Here we study the problem of estimating the density of an arbitrary
subgraph, given a noisy version of some underlying network as data. Under a
simple model of network error, we show that consistent estimation of such
densities is impossible when the rates of error are unknown and only a single
network is observed. Accordingly, we develop method-of-moment estimators of
network subgraph densities and error rates for the case where a minimal number
of network replicates are available. These estimators are shown to be
asymptotically normal as the number of vertices increases to infinity. We also
provide confidence intervals for quantifying the uncertainty in these estimates
based on the asymptotic normality. To construct the confidence intervals, a new
and non-standard bootstrap method is proposed to compute asymptotic variances,
which is infeasible otherwise. We illustrate the proposed methods in the
context of gene coexpression networks.
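
The impossibility claim can be made concrete for the simplest subgraph, a single edge. The sketch assumes an error model in which each non-edge is falsely observed with probability `alpha` and each true edge is missed with probability `beta`, independently. This model and the function names are assumptions made for illustration, not taken from the paper, and the replicate-based estimators are not reproduced.

```python
def expected_observed_density(d, alpha, beta):
    """Expected observed edge density given true density d and error rates."""
    return (1 - beta) * d + alpha * (1 - d)


def corrected_density(observed, alpha, beta):
    """Invert the relation above; only possible when alpha and beta are known."""
    return (observed - alpha) / (1 - alpha - beta)


# Two different parameter settings give the same expected observation, so a
# single noisy network cannot separate the true density from the error rates.
noisy = expected_observed_density(0.20, alpha=0.1, beta=0.3)
clean = expected_observed_density(0.22, alpha=0.0, beta=0.0)
print(noisy, clean)

# With known error rates the true density is recovered.
print(corrected_density(noisy, alpha=0.1, beta=0.3))
```

One equation in three unknowns is the source of the non-identifiability; additional moments, such as those available from replicates, are needed to pin down the rates.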
Centrality measures for graphons: Accounting for uncertainty in networks
As relational datasets modeled as graphs keep increasing in size and their
data-acquisition is permeated by uncertainty, graph-based analysis techniques
can become computationally and conceptually challenging. In particular, node
centrality measures rely on the assumption that the graph is perfectly known --
a premise not necessarily fulfilled for large, uncertain networks. Accordingly,
centrality measures may fail to faithfully extract the importance of nodes in
the presence of uncertainty. To mitigate these problems, we suggest a
statistical approach based on graphon theory: we introduce formal definitions
of centrality measures for graphons and establish their connections to
classical graph centrality measures. A key advantage of this approach is that
centrality measures defined at the modeling level of graphons are inherently
robust to stochastic variations of specific graph realizations. Using the
theory of linear integral operators, we define degree, eigenvector, Katz and
PageRank centrality functions for graphons and establish concentration
inequalities demonstrating that graphon centrality functions arise naturally as
limits of their counterparts defined on sequences of graphs of increasing size.
The same concentration inequalities also provide high-probability bounds
between the graphon centrality functions and the centrality measures on any
sampled graph, thereby establishing a measure of uncertainty of the measured
centrality score.
Comment: Authors ordered alphabetically, all authors contributed equally. 21 pages, 7 figures
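
A minimal sketch of the degree case, assuming the degree centrality function of a graphon W is c(x) = ∫ W(x, y) dy and using the illustrative graphon W(x, y) = xy, for which c(x) = x/2. The function names, grid size, sample size and seed are hypothetical choices, not taken from the paper.

```python
import random


def graphon_degree_centrality(W, x, grid=1000):
    """Midpoint-rule approximation of c(x) = integral of W(x, y) over y in [0, 1]."""
    h = 1.0 / grid
    return h * sum(W(x, (k + 0.5) * h) for k in range(grid))


def sampled_normalized_degree(W, x, n, rng):
    """Normalized degree of a node with latent position x in a sampled graph.

    The other n nodes get independent uniform latent positions u, and each
    edge to them is drawn independently with probability W(x, u).
    """
    degree = 0
    for _ in range(n):
        u = rng.random()
        if rng.random() < W(x, u):
            degree += 1
    return degree / n


def W(x, y):
    return x * y  # illustrative graphon


rng = random.Random(0)
c_exact = graphon_degree_centrality(W, 0.4)
c_sampled = sampled_normalized_degree(W, 0.4, n=20000, rng=rng)
print(c_exact, c_sampled)
```

The sampled normalized degree fluctuates around the graphon value, and the fluctuation shrinks as n grows, which is the kind of behaviour the concentration inequalities quantify.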