statnet: Software Tools for the Representation, Visualization, Analysis and Simulation of Network Data
statnet is a suite of software packages for statistical network analysis. The packages implement recent advances in network modeling based on exponential-family random graph models (ERGM). The components of the package provide a comprehensive framework for ERGM-based network modeling, including tools for model estimation, model evaluation, model-based network simulation, and network visualization. This broad functionality is powered by a central Markov chain Monte Carlo (MCMC) algorithm. The coding is optimized for speed and robustness.
ergm: A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks
We describe some of the capabilities of the ergm package and the statistical theory underlying it. This package contains tools for accomplishing three important, and inter-related, tasks involving exponential-family random graph models (ERGMs): estimation, simulation, and goodness of fit. More precisely, ergm has the capability of approximating a maximum likelihood estimator for an ERGM given a network data set; simulating new network data sets from a fitted ERGM using Markov chain Monte Carlo; and assessing how well a fitted ERGM does at capturing characteristics of a particular network data set.
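The simulation task described above can be illustrated with a minimal Python sketch. This is not the ergm package's implementation (ergm is an R package with its own MCMC machinery); it is a plain Metropolis sampler over edge toggles for an undirected network, assuming a model with only two statistics, edge count and triangle count. The names change_stats and simulate_ergm are hypothetical.

```python
import math
import random


def change_stats(adj, i, j):
    """Change in (edges, triangles) from switching dyad (i, j) on.

    Adding the edge creates one triangle per partner shared by i and j.
    """
    shared = sum(1 for k in range(len(adj)) if adj[i][k] and adj[j][k])
    return (1, shared)


def simulate_ergm(n_nodes, theta, n_toggles, seed=0):
    """Draw one undirected network by Metropolis edge toggles.

    theta = (edge coefficient, triangle coefficient); both are
    illustrative assumptions, not values from any fitted model.
    """
    rng = random.Random(seed)
    adj = [[False] * n_nodes for _ in range(n_nodes)]
    for _ in range(n_toggles):
        i, j = rng.sample(range(n_nodes), 2)  # a random dyad, i != j
        d_edges, d_tri = change_stats(adj, i, j)
        # Removing an existing edge reverses the sign of the change.
        sign = -1 if adj[i][j] else 1
        log_ratio = sign * (theta[0] * d_edges + theta[1] * d_tri)
        # Symmetric proposal, so accept with probability min(1, ratio).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            adj[i][j] = adj[j][i] = not adj[i][j]
    return adj


def density(adj):
    """Fraction of dyads that carry an edge."""
    n = len(adj)
    edges = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    return edges / (n * (n - 1) / 2)
```

With the triangle coefficient set to zero the model reduces to independent edges, so density(simulate_ergm(20, (-1.1, 0.0), 20000)) should land near exp(-1.1) / (1 + exp(-1.1)), roughly 0.25, up to sampling noise.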
Noisy Hamiltonian Monte Carlo for doubly-intractable distributions
Hamiltonian Monte Carlo (HMC) has been progressively incorporated within the
statistician's toolbox as an alternative sampling method in settings when
standard Metropolis-Hastings is inefficient. HMC generates a Markov chain on an
augmented state space with transitions based on a deterministic differential
flow derived from Hamiltonian mechanics. In practice, the evolution of
Hamiltonian systems cannot be solved analytically, requiring numerical
integration schemes. Under numerical integration, the resulting approximate
solution no longer preserves the measure of the target distribution, therefore
an accept-reject step is used to correct the bias. For doubly-intractable
distributions -- such as posterior distributions based on Gibbs random fields
-- HMC runs into computational difficulties: both the gradients in the
differential flow and the accept-reject step are difficult to compute.
In this paper, we study the behaviour of HMC when these quantities
are replaced by Monte Carlo estimates.
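For orientation, the standard (exact-gradient) HMC step that the abstract starts from can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a one-dimensional standard normal target, a leapfrog integrator, and a unit-mass Gaussian momentum. It does not implement the noisy variant studied in the paper, and the names leapfrog, hmc_step and run_hmc are hypothetical.

```python
import math
import random


def potential(q):
    """Negative log density of the assumed target, a standard normal."""
    return 0.5 * q * q


def grad_potential(q):
    """Gradient of the potential; tractable here, unlike the paper's setting."""
    return q


def hamiltonian(q, p):
    """Potential plus kinetic energy with unit mass."""
    return potential(q) + 0.5 * p * p


def leapfrog(q, p, step_size, n_steps):
    """Numerically integrate the Hamiltonian flow (leapfrog scheme)."""
    p = p - 0.5 * step_size * grad_potential(q)  # initial half step
    for step in range(n_steps):
        q = q + step_size * p
        if step != n_steps - 1:
            p = p - step_size * grad_potential(q)
    p = p - 0.5 * step_size * grad_potential(q)  # final half step
    return q, p


def hmc_step(q, step_size, n_steps, rng):
    """One HMC transition: resample momentum, integrate, accept or reject."""
    p = rng.gauss(0.0, 1.0)
    q_new, p_new = leapfrog(q, p, step_size, n_steps)
    # The integrator only approximately conserves energy, so the
    # accept-reject step corrects the resulting bias.
    log_accept = hamiltonian(q, p) - hamiltonian(q_new, p_new)
    if rng.random() < math.exp(min(0.0, log_accept)):
        return q_new, True
    return q, False


def run_hmc(n_samples, step_size=0.3, n_steps=5, seed=0):
    """Run the chain from q = 0; return the draws and the acceptance rate."""
    rng = random.Random(seed)
    q, draws, accepted = 0.0, [], 0
    for _ in range(n_samples):
        q, ok = hmc_step(q, step_size, n_steps, rng)
        draws.append(q)
        accepted += ok
    return draws, accepted / n_samples
```

In the doubly-intractable case, grad_potential and the energy difference in hmc_step are the two quantities that would have to be replaced by Monte Carlo estimates.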
Specification of Exponential-Family Random Graph Models: Terms and Computational Aspects
Exponential-family random graph models (ERGMs) represent the processes that govern the formation of links in networks through the terms selected by the user. The terms specify network statistics that are sufficient to represent the probability distribution over the space of networks of that size. Many classes of statistics can be used. In this article we describe the classes of statistics that are currently available in the ergm package. We also describe means for controlling the Markov chain Monte Carlo (MCMC) algorithm that the package uses for estimation. These controls affect either the proposal distribution on the sample space used by the underlying Metropolis-Hastings algorithm or the constraints on the sample space itself. Finally, we describe various other arguments to core functions of the ergm package.
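To make concrete how a choice of terms determines a probability distribution over networks of a fixed size, the following Python sketch enumerates every undirected network on a handful of nodes and normalizes exp(theta . s(y)) exactly. It assumes just two statistics (edge count and triangle count) and is only feasible for tiny networks, which is why MCMC is needed in practice. The function names network_stats and exact_ergm are hypothetical and unrelated to the ergm package's API.

```python
import itertools
import math


def network_stats(edges, n_nodes):
    """Return (edge count, triangle count) for an undirected edge list.

    Edges are pairs (a, b) with a < b.
    """
    edge_set = set(edges)
    triangles = sum(
        1
        for a, b, c in itertools.combinations(range(n_nodes), 3)
        if (a, b) in edge_set and (a, c) in edge_set and (b, c) in edge_set
    )
    return (len(edge_set), triangles)


def exact_ergm(n_nodes, theta):
    """Exact ERGM probabilities by brute-force enumeration.

    Returns a dict mapping each network (a tuple of edges) to its
    probability under P(y) proportional to exp(theta . s(y)).
    """
    dyads = list(itertools.combinations(range(n_nodes), 2))
    weights = {}
    for size in range(len(dyads) + 1):
        for edges in itertools.combinations(dyads, size):
            stats = network_stats(edges, n_nodes)
            weights[edges] = math.exp(sum(t * s for t, s in zip(theta, stats)))
    normalizer = sum(weights.values())  # intractable for realistic sizes
    return {graph: w / normalizer for graph, w in weights.items()}
```

For four nodes there are 2^6 = 64 networks, so exact_ergm(4, (0.0, 1.0)) returns 64 probabilities that sum to one, with the complete graph favoured over the empty graph by a factor of exp(4) because it contains four triangles.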
Patterns of Scalable Bayesian Inference
Datasets are growing not just in size but in complexity, creating a demand
for rich models and quantification of uncertainty. Bayesian methods are an
excellent fit for this demand, but scaling Bayesian inference is a challenge.
In response to this challenge, there has been considerable recent work based on
varying assumptions about model structure, underlying computational resources,
and the importance of asymptotic correctness. As a result, there is a zoo of
ideas with few clear overarching principles.
In this paper, we seek to identify unifying principles, patterns, and
intuitions for scaling Bayesian inference. We review existing work on utilizing
modern computing resources with both MCMC and variational approximation
techniques. From this taxonomy of ideas, we characterize the general principles
that have proven successful for designing scalable inference procedures and
comment on the path forward.
Efficient computational strategies for doubly intractable problems with applications to Bayesian social networks
Powerful ideas that recently appeared in the literature are adapted and combined
to design improved samplers for Bayesian exponential random graph models.
Different forms of adaptive Metropolis-Hastings proposals (vertical, horizontal
and rectangular) are tested and combined with the delayed rejection (DR)
strategy with the aim of reducing the variance of the resulting Markov chain
Monte Carlo estimators for a given computational time. In the examples treated
in this paper the best combination, namely horizontal adaptation with delayed
rejection, leads to a variance reduction that varies between 92% and 144%
relative to the adaptive direction sampling approximate exchange algorithm of
Caimo and Friel (2011). These results correspond to an increased performance
which varies from 10% to 94% if we take simulation time into account. The
highest improvements are obtained when highly correlated posterior
distributions are considered.
Comment: 23 pages, 8 figures. Accepted to appear in Statistics and Computing.