A survey of statistical network models
Networks are ubiquitous in science and have become a focal point for
discussion in everyday life. Formal statistical models for the analysis of
network data have emerged as a major topic of interest in diverse areas of
study, and most of these involve a form of graphical representation.
Probability models on graphs date back to 1959. Along with empirical studies in
social psychology and sociology from the 1960s, these early works generated an
active network community and a substantial literature in the 1970s. This effort
moved into the statistical literature in the late 1970s and 1980s, and the past
decade has seen a burgeoning network literature in statistical physics and
computer science. The growth of the World Wide Web and the emergence of online
networking communities such as Facebook, MySpace, and LinkedIn, and a host of
more specialized professional network communities has intensified interest in
the study of networks and network data. Our goal in this review is to provide
the reader with an entry point to this burgeoning literature. We begin with an
overview of the historical development of statistical network modeling and then
we introduce a number of examples that have been studied in the network
literature. Our subsequent discussion focuses on a number of prominent static
and dynamic network models and their interconnections. We emphasize formal
model descriptions, and pay special attention to the interpretation of
parameters and their estimation. We end with a description of some open
problems and challenges for machine learning and statistics. Comment: 96 pages, 14 figures, 333 references
Stochastic blockmodels with growing number of classes
We present asymptotic and finite-sample results on the use of stochastic
blockmodels for the analysis of network data. We show that the fraction of
misclassified network nodes converges in probability to zero under maximum
likelihood fitting when the number of classes is allowed to grow as the root of
the network size and the average network degree grows at least
poly-logarithmically in this size. We also establish finite-sample confidence
bounds on maximum-likelihood blockmodel parameter estimates from data
comprising independent Bernoulli random variates; these results hold uniformly
over class assignment. We provide simulations verifying the conditions
sufficient for our results, and conclude by fitting a logit parameterization of
a stochastic blockmodel with covariates to a network data example comprising a
collection of Facebook profiles, resulting in block estimates that reveal
residual structure. Comment: 12 pages, 3 figures; revised version
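
As a rough illustration of the model class described above, the sketch below samples an undirected graph whose edges are independent Bernoulli variates determined by class membership, then computes the block probability estimates for a fixed class assignment. It is a minimal sketch under stated assumptions, not the authors' procedure: the function names `sample_sbm` and `block_mle`, the two-class example, and the probability values are all hypothetical, and the assignment is treated as known, whereas the abstract's maximum-likelihood fitting also searches over assignments.

```python
import random
from itertools import combinations

def sample_sbm(labels, probs, seed=0):
    """Sample an undirected, loop-free graph.

    Edge {i, j} is an independent Bernoulli(probs[a][b]) draw, where a and b
    are the classes of nodes i and j. Edges are stored as (i, j) with i < j.
    """
    rng = random.Random(seed)
    edges = set()
    for i, j in combinations(range(len(labels)), 2):
        if rng.random() < probs[labels[i]][labels[j]]:
            edges.add((i, j))
    return edges

def block_mle(labels, edges, k):
    """Block probability estimates for a FIXED class assignment.

    With the assignment held fixed, the likelihood factorizes over blocks, so
    each estimate is (edges seen between two classes) / (node pairs between them).
    Blocks with no node pairs are left as None.
    """
    pairs = [[0] * k for _ in range(k)]
    hits = [[0] * k for _ in range(k)]
    for i, j in combinations(range(len(labels)), 2):
        a, b = sorted((labels[i], labels[j]))
        pairs[a][b] += 1
        if (i, j) in edges:
            hits[a][b] += 1
    est = [[None] * k for _ in range(k)]
    for a in range(k):
        for b in range(a, k):
            if pairs[a][b]:
                est[a][b] = est[b][a] = hits[a][b] / pairs[a][b]
    return est

# Hypothetical example: two classes of 30 nodes, denser within than between.
labels = [0] * 30 + [1] * 30
probs = [[0.6, 0.1], [0.1, 0.5]]
edges = sample_sbm(labels, probs)
estimate = block_mle(labels, edges, k=2)
```

With larger classes the entries of `estimate` should concentrate around the entries of `probs`, which is the kind of behavior the finite-sample bounds in the abstract quantify.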
Link Prediction in Complex Networks: A Survey
Link prediction in complex networks has attracted increasing attention from
both physical and computer science communities. The algorithms can be used to
extract missing information, identify spurious interactions, evaluate network
evolving mechanisms, and so on. This article summarizes recent progress on
link prediction algorithms, emphasizing the contributions from physical
perspectives and approaches, such as the random-walk-based methods and the
maximum likelihood methods. We also introduce three typical applications:
reconstruction of networks, evaluation of network evolving mechanisms and
classification of partially labelled networks. Finally, we introduce some
applications and outline future challenges of link prediction algorithms. Comment: 44 pages, 5 figures
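
To make the task concrete, the sketch below ranks unobserved node pairs by how likely they are to be missing links. It uses the common-neighbors count, a simple similarity baseline chosen here for brevity; it is not one of the random-walk or maximum likelihood methods the abstract emphasizes. The function name `common_neighbor_scores` and the toy edge list are hypothetical.

```python
from itertools import combinations

def common_neighbor_scores(edges):
    """Score every non-adjacent node pair by its number of shared neighbors."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        if v not in neighbors[u]:
            scores[(u, v)] = len(neighbors[u] & neighbors[v])
    # Highest score first; ties are broken by pair so the order is stable.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical toy network; the top-ranked pair is the predicted missing link.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
ranking = common_neighbor_scores(edges)
```

In this toy network the pair `("a", "d")` ranks first because the two nodes share the neighbors `b` and `c`.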
Nonparametric Bayes dynamic modeling of relational data
Symmetric binary matrices representing relations among entities are commonly
collected in many areas. Our focus is on dynamically evolving binary relational
matrices, with interest being in inference on the relationship structure and
prediction. We propose a nonparametric Bayesian dynamic model, which reduces
dimensionality in characterizing the binary matrix through a lower-dimensional
latent space representation, with the latent coordinates evolving in continuous
time via Gaussian processes. By using a logistic mapping function from the
probability matrix space to the latent relational space, we obtain a flexible
and computationally tractable formulation. Employing Pólya-Gamma data
augmentation, an efficient Gibbs sampler is developed for posterior
computation, with the dimension of the latent space automatically inferred. We
provide some theoretical results on flexibility of the model, and illustrate
performance via simulation experiments. We also consider an application to
co-movements in world financial markets.
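
The link between latent coordinates and relation probabilities can be sketched for a single time point as follows. This is only an illustration under assumptions: the abstract does not give the exact form of the mapping, so the dot-product similarity, the `baseline` term, and the function name `edge_probabilities` are hypothetical choices, and the Gaussian-process dynamics and Gibbs sampler are not shown.

```python
import math

def edge_probabilities(coords, baseline=0.0):
    """Map latent coordinates to a symmetric matrix of relation probabilities.

    Assumed form: P(i ~ j) = logistic(baseline + <x_i, x_j>).
    The diagonal is left at 0.0 because self-relations are not modeled.
    """
    n = len(coords)
    probs = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            score = baseline + sum(a * b for a, b in zip(coords[i], coords[j]))
            probs[i][j] = probs[j][i] = 1.0 / (1.0 + math.exp(-score))
    return probs

# Hypothetical coordinates: entities 0 and 1 lie close together, entity 2 opposite.
coords = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
probs = edge_probabilities(coords)
```

Entities whose coordinates point in similar directions receive a probability above one half, and entities pointing in opposite directions receive a probability below it.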
Detection of Epigenomic Network Community Oncomarkers
In this paper we propose network methodology to infer prognostic cancer
biomarkers based on the epigenetic pattern DNA methylation. Epigenetic
processes such as DNA methylation reflect environmental risk factors, and are
increasingly recognised for their fundamental role in diseases such as cancer.
DNA methylation is a gene-regulatory pattern, and hence provides a means by
which to assess genomic regulatory interactions. Network models are a natural
way to represent and analyse groups of such interactions. The utility of
network models also increases as the quantity of data and number of variables
increase, making them increasingly relevant to large-scale genomic studies. We
propose methodology to infer prognostic genomic networks from a DNA
methylation-based measure of genomic interaction and association. We then show
how to identify prognostic biomarkers from such networks, which we term
`network community oncomarkers'. We illustrate the power of our proposed
methodology in the context of a large publicly available breast cancer dataset
- …