Parameter estimation by implicit sampling
Implicit sampling is a weighted sampling method that is used in data
assimilation, where one sequentially updates estimates of the state of a
stochastic model based on a stream of noisy or incomplete data. Here we
describe how to use implicit sampling in parameter estimation problems, where
the goal is to find parameters of a numerical model, e.g. a partial
differential equation (PDE), such that the output of the numerical model is
compatible with (noisy) data. We use the Bayesian approach to parameter
estimation, in which a posterior probability density describes the probability
of the parameter conditioned on the data, and we compute an empirical estimate
of this posterior with implicit sampling. Our approach generates independent samples,
so that some of the practical difficulties one encounters with Markov Chain
Monte Carlo methods, e.g. burn-in time or correlations among dependent samples,
are avoided. We describe a new implementation of implicit sampling for
parameter estimation problems that makes use of multiple grids (coarse to fine)
and BFGS optimization coupled to adjoint equations for the required gradient
calculations. The implementation is "dimension independent", in the sense that
a well-defined finite dimensional subspace is sampled as the mesh used for
discretization of the PDE is refined. We illustrate the algorithm with an
example where we estimate a diffusion coefficient in an elliptic equation from
sparse and noisy pressure measurements. In the example, dimension/mesh-independence is achieved via Karhunen-Loève expansions.
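The sampling strategy described above can be sketched in one dimension: minimise the negative log-posterior F to find the MAP point, then for each reference draw ξ solve F(θ) = min F + ξ²/2 along the direction of ξ, and weight the sample by the Jacobian of the resulting map. The quartic F below is a toy non-Gaussian posterior assumed for illustration, not the paper's PDE example:

```python
import numpy as np

def F(theta):
    # toy non-Gaussian negative log-posterior (an assumption for illustration)
    return theta**4 / 4.0 + theta**2 / 2.0

def dF(theta):
    return theta**3 + theta

theta_map, phi = 0.0, 0.0  # minimiser of F and its minimum value

def solve_scale(xi, lo=1e-9, hi=50.0, iters=80):
    # bisection for lam > 0 solving F(theta_map + lam*xi) = phi + xi^2/2
    target = phi + 0.5 * xi**2
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if F(theta_map + mid * xi) - target < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = np.random.default_rng(0)
thetas, weights = [], []
for xi in rng.standard_normal(2000):
    if abs(xi) < 1e-8:
        continue  # xi = 0 maps to the MAP point; skip the degenerate solve
    theta = theta_map + solve_scale(xi) * xi
    # 1-D Jacobian of the map xi -> theta: differentiating the defining
    # equation gives dtheta/dxi = xi / F'(theta); the weight is |Jacobian|
    weights.append(abs(xi / dF(theta)))
    thetas.append(theta)

thetas, weights = np.array(thetas), np.array(weights)
weights /= weights.sum()
post_mean = np.sum(weights * thetas)  # approximately 0 by symmetry here
```

Because the samples are drawn independently, no burn-in or autocorrelation diagnostics are needed; in the paper's setting the bisection is replaced by BFGS with adjoint gradients.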
A Bayesian approach for inferring neuronal connectivity from calcium fluorescent imaging data
Deducing the structure of neural circuits is one of the central problems of
modern neuroscience. Recently-introduced calcium fluorescent imaging methods
permit experimentalists to observe network activity in large populations of
neurons, but these techniques provide only indirect observations of neural
spike trains, with limited time resolution and signal quality. In this work we
present a Bayesian approach for inferring neural circuitry given this type of
imaging data. We model the network activity in terms of a collection of coupled
hidden Markov chains, with each chain corresponding to a single neuron in the
network and the coupling between the chains reflecting the network's
connectivity matrix. We derive a Monte Carlo Expectation--Maximization
algorithm for fitting the model parameters; to obtain the sufficient statistics
in a computationally-efficient manner, we introduce a specialized
blockwise-Gibbs algorithm for sampling from the joint activity of all observed
neurons given the observed fluorescence data. We perform large-scale
simulations of randomly connected neuronal networks with biophysically
realistic parameters and find that the proposed methods can accurately infer
the connectivity in these networks given reasonable experimental and
computational constraints. In addition, the estimation accuracy may be improved
significantly by incorporating prior knowledge about the sparseness of
connectivity in the network, via standard L1 penalization methods.

Comment: Published at http://dx.doi.org/10.1214/09-AOAS303 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
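The generative model described above, coupled binary spike chains driving a slowly decaying calcium trace observed through noise, can be simulated forward in a few lines. The parameter values below (coupling sparsity, decay time, noise level) are illustrative assumptions, not the paper's fitted values:

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 5, 1000                 # neurons, time bins (illustrative sizes)
dt, tau, A = 0.01, 0.5, 1.0    # bin width, calcium decay time, jump per spike

# sparse random connectivity matrix (the unknown to be inferred in the paper)
W = rng.normal(0.0, 0.5, (N, N)) * (rng.random((N, N)) < 0.2)
b = -3.0                       # baseline log-odds of firing

spikes = np.zeros((N, T))
calcium = np.zeros((N, T))
for t in range(1, T):
    # coupled chains: each neuron's firing probability depends on the
    # previous spikes of all neurons through the connectivity matrix W
    p = 1.0 / (1.0 + np.exp(-(b + W @ spikes[:, t - 1])))
    spikes[:, t] = rng.random(N) < p
    # calcium decays exponentially and jumps by A at each spike
    calcium[:, t] = calcium[:, t - 1] * (1.0 - dt / tau) + A * spikes[:, t]

# fluorescence: noisy, indirect observation of the calcium traces
fluor = calcium + 0.2 * rng.standard_normal((N, T))
```

The inference problem is the reverse direction: given only `fluor`, recover `W` via the Monte Carlo EM and blockwise-Gibbs machinery of the paper.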
Trans-dimensional inverse problems, model comparison and the evidence
In most geophysical inverse problems the properties of interest are parametrized using a fixed number of unknowns. In some cases arguments can be used to bound the maximum number of parameters that need to be considered. In others the number of unknowns is set at some arbitrary value and regularization is used to encourage simple, non-extravagant models. In recent times variable or self-adaptive parametrizations have gained in popularity. Rarely, however, is the number of unknowns itself directly treated as an unknown. This situation leads to a trans-dimensional inverse problem, that is, one where the dimension of the parameter space is a variable to be solved for. This paper discusses trans-dimensional inverse problems from the Bayesian viewpoint. A particular type of Markov chain Monte Carlo (MCMC) sampling algorithm is highlighted which allows probabilistic sampling in variable-dimension spaces. A quantity termed the evidence or marginal likelihood plays a key role in this type of problem. It is shown that once evidence calculations are performed, the results of complex variable-dimension sampling algorithms can be replicated with simpler and more familiar fixed-dimensional MCMC sampling techniques. Numerical examples are used to illustrate the main points. The evidence can be difficult to calculate, especially in high-dimensional non-linear inverse problems. Nevertheless some general strategies are discussed and analytical expressions given for certain linear problems.
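For the linear-Gaussian case mentioned at the end, the evidence has a closed form: with coefficient prior N(0, τ²I) and noise variance σ², the marginal likelihood is y ~ N(0, σ²I + τ²XXᵀ). A sketch of evidence-based comparison across models of different dimension (the polynomial data and hyperparameters are assumptions for illustration):

```python
import numpy as np

def log_evidence(y, X, sigma2=0.1, tau2=1.0):
    # closed-form log marginal likelihood of a linear-Gaussian model:
    # coefficients ~ N(0, tau2*I), noise ~ N(0, sigma2*I), hence
    # y ~ N(0, C) with C = sigma2*I + tau2 * X @ X.T
    n = len(y)
    C = sigma2 * np.eye(n) + tau2 * (X @ X.T)
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + y @ np.linalg.solve(C, y))

rng = np.random.default_rng(2)
t = np.linspace(-1.0, 1.0, 50)
y = 1.0 - 2.0 * t + np.sqrt(0.1) * rng.standard_normal(50)  # true degree is 1

# compare polynomial models of degree 0..4; the evidence automatically
# penalises unnecessary dimensions (the Occam factor)
evidences = [log_evidence(y, np.vander(t, k + 1)) for k in range(5)]
best_degree = int(np.argmax(evidences))
```

Ranking fixed-dimension models by evidence in this way reproduces what a trans-dimensional sampler would estimate as posterior model probabilities, which is the paper's central point.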
Monte Carlo strategies for calibration in climate models
Intensive computational methods have been used by Earth scientists in a wide range of problems in data inversion and uncertainty quantification, such as earthquake epicenter location and climate projections. To quantify the uncertainties resulting from a range of plausible model configurations it is necessary to estimate a multidimensional probability distribution. For geoscience applications, estimating these distributions with traditional methods such as Metropolis/Gibbs algorithms is computationally impractical, because simulation costs limit the number of experiments that can reasonably be performed. Several alternative sampling strategies have been proposed that could improve on the sampling efficiency, including Multiple Very Fast Simulated Annealing (MVFSA) and Adaptive Metropolis algorithms. In this research, the performance of these proposed sampling strategies is evaluated with a surrogate climate model that is able to approximate the noise and response behavior of a realistic atmospheric general circulation model (AGCM). The surrogate model is fast enough that its evaluation can be embedded in these Monte Carlo algorithms. The goal of this thesis is to show that adaptive methods can be superior to MVFSA in approximating the known posterior distribution with fewer forward evaluations. However, the adaptive methods can also be limited by inadequate sample mixing. The Single Component and Delayed Rejection Adaptive Metropolis algorithms were found to resolve these limitations, although challenges remain in approximating multi-modal distributions. The results show that these advanced methods of statistical inference can provide practical solutions to the climate model calibration problem and to challenges in quantifying climate projection uncertainties. The computational methods would also be useful for problems outside climate prediction, particularly those where sampling is limited by the availability of computational resources.
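The Adaptive Metropolis idea evaluated in the thesis can be sketched briefly: after a burn-in period, the proposal covariance is rebuilt from the chain's own history, using the standard 2.38²/d scaling. The 2-D Gaussian target below stands in for the surrogate model's posterior and is an assumption for illustration:

```python
import numpy as np

def log_post(x):
    # stand-in posterior: a 2-D Gaussian with variances (4, 1);
    # in the thesis this role is played by the surrogate climate model
    return -0.5 * (x[0]**2 / 4.0 + x[1]**2)

def adaptive_metropolis(n_iter=5000, d=2, t0=500, eps=1e-6, seed=3):
    rng = np.random.default_rng(seed)
    sd = 2.38**2 / d                 # standard adaptive-scaling factor
    chain = np.zeros((n_iter, d))
    x, lp = np.zeros(d), log_post(np.zeros(d))
    cov = np.eye(d)                  # fixed proposal during burn-in
    for t in range(n_iter):
        if t >= t0:
            # adapt: rebuild the proposal covariance from the chain history
            cov = sd * (np.cov(chain[:t].T) + eps * np.eye(d))
        prop = rng.multivariate_normal(x, cov)
        lp_prop = log_post(prop)
        if np.log(rng.random()) < lp_prop - lp:   # Metropolis accept/reject
            x, lp = prop, lp_prop
        chain[t] = x
    return chain

chain = adaptive_metropolis()
```

The appeal for expensive forward models is that each iteration costs one posterior evaluation, and the adapted covariance steadily improves mixing without hand-tuning the proposal.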
Statistical inference in generative models using scoring rules
Statistical models which allow generating simulations without providing access to the density of the distribution are called simulator models. They are commonly developed by scientists to represent natural phenomena and depend on physically meaningful parameters. Analogously, generative networks produce samples from a probability distribution by transforming draws from a noise (or latent) distribution via a neural network; as for simulator models, the density is unavailable. These two frameworks, developed independently from different communities, can be grouped into the class of generative models; compared to statistical models that explicitly specify the density, they are more powerful and flexible.
For generative networks, typically, a single point estimate for the parameters (or weights) is obtained by minimizing an objective function through gradient descent enabled by automatic differentiation. In contrast, for simulator models, samples from a probability distribution for the parameters are usually obtained via some statistical algorithm. Nevertheless, in both cases, the inference methods rely on common principles that exploit simulations. In this thesis, I follow the principle of assessing how well a probabilistic model matches an observation using Scoring Rules. This generalises common statistical practices based on the density function and, with specific Scoring Rules, allows tackling generative models.
After a detailed introduction and literature review in Chapter 1, the first part of this thesis (Chapters 2 and 3) is concerned with methods to infer probability distributions for the parameters of simulator models. Specifically, Chapter 2 contributes to the traditional Bayesian Likelihood-Free Inference literature with a new way to learn summary statistics, defined as the sufficient statistics of the best exponential family approximation to the simulator model. In contrast, Chapter 3 departs from tradition by defining a new posterior distribution based on the generalised Bayesian inference framework, rather than motivating it as an approximation to the standard posterior. The posterior is defined through Scoring Rules computable for simulator models and is robust to outliers.
In the second part of the thesis (Chapters 4 and 5), I study Scoring Rule Minimization to determine the weights of generative networks; for specific choices of Scoring Rules, this approach better captures the variability of the data than popular alternatives. I apply generative networks trained in this way to uncertainty-sensitive tasks: in Chapter 4 I use them to provide a probability distribution over the parameters of simulator models, thus falling back to the theme of Chapters 2 and 3; instead, in Chapter 5, I consider probabilistic forecasting, also establishing consistency of the training objective with dependent training data.
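The kind of Scoring Rule central to this programme can be made concrete with the energy score: for samples X, X′ from the model and observation y, ES(P, y) = E‖X − y‖^β − ½ E‖X − X′‖^β, with lower values indicating a better match. A sketch of its unbiased sample estimate (the toy models below are assumptions for illustration):

```python
import numpy as np

def energy_score(samples, y, beta=1.0):
    # unbiased sample estimate of the energy score
    # ES(P, y) = E||X - y||^beta - 0.5 * E||X - X'||^beta,  X, X' ~ P
    # (lower is better; a strictly proper Scoring Rule for 0 < beta < 2)
    m = len(samples)
    term1 = np.mean(np.linalg.norm(samples - y, axis=1) ** beta)
    pair = np.linalg.norm(samples[:, None, :] - samples[None, :, :], axis=-1) ** beta
    term2 = pair.sum() / (m * (m - 1))  # diagonal terms are zero
    return term1 - 0.5 * term2

rng = np.random.default_rng(4)
y = np.array([0.0, 0.0])                      # the observation
good = rng.standard_normal((200, 2))          # samples from a well-matched model
bad = rng.standard_normal((200, 2)) + 3.0     # samples from a shifted model
# the well-matched model attains the lower (better) score
```

Because the estimate needs only simulations from the model, it applies equally to simulator models and generative networks, and minimising it over network weights is the training objective studied in Chapters 4 and 5.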
Finally, Chapter 6 offers some concluding thoughts and directions for future work.