Distilling importance sampling
The two main approaches to Bayesian inference are sampling and optimisation
methods. However, many complicated posteriors are difficult to approximate by
either. Therefore we propose a novel approach combining features of both. We
use a flexible parameterised family of densities, such as a normalising flow.
Given a density from this family approximating the posterior, we use importance
sampling to produce a weighted sample from a more accurate posterior
approximation. This sample is then used in optimisation to update the
parameters of the approximate density, which we view as distilling the
importance sampling results. We iterate these steps and gradually improve the
quality of the posterior approximation. We illustrate our method in two
challenging examples: a queueing model and a stochastic differential equation
model.
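A minimal sketch of this iterate, with a one-dimensional Gaussian family standing in for the normalising flow (the toy log-posterior and all names below are illustrative assumptions, not the paper's code): draw from the current approximation q, importance-weight the draws under the unnormalised posterior, then refit q by weighted maximum likelihood, which is the distillation step.

```python
import numpy as np
from scipy.stats import norm

def log_target(theta):
    # Illustrative unnormalised log-posterior (a skewed toy density);
    # in the paper this would be a complicated posterior.
    return -0.5 * theta**2 + np.log1p(np.exp(-2.0 * theta))

mu, sigma = 0.0, 3.0          # parameters of the approximating density q
rng = np.random.default_rng(0)

for step in range(50):
    # 1. Importance sampling: draw from q, weight under the target.
    theta = rng.normal(mu, sigma, size=2000)
    log_w = log_target(theta) - norm.logpdf(theta, loc=mu, scale=sigma)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    # 2. Distillation: refit q by weighted maximum likelihood.
    mu = np.sum(w * theta)
    sigma = np.sqrt(np.sum(w * (theta - mu) ** 2))

print(f"fitted q: mu={mu:.3f}, sigma={sigma:.3f}")
```

The weighted refit minimises a self-normalised importance sampling estimate of the inclusive KL divergence from the posterior to q, so each pass should tighten the approximation; with a normalising flow the closed-form update would be replaced by stochastic-gradient optimisation of the weighted log-likelihood.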
Quantifying Model Uncertainty in Inverse Problems via Bayesian Deep Gradient Descent
Recent advances in reconstruction methods for inverse problems leverage
powerful data-driven models, e.g., deep neural networks. These techniques have
demonstrated state-of-the-art performance for several imaging tasks, but they
often do not provide uncertainty estimates for the obtained reconstructions. In this work,
we develop a novel scalable data-driven knowledge-aided computational framework
to quantify the model uncertainty via Bayesian neural networks. The approach
builds on and extends deep gradient descent, a recently developed greedy
iterative training scheme, and recasts it within a probabilistic framework.
Scalability is achieved by being hybrid in the architecture: only the last
layer of each block is Bayesian, while the others remain deterministic, and by
being greedy in training. The framework is showcased on one representative
medical imaging modality, viz. computed tomography with either sparse view or
limited view data, and exhibits competitive performance with respect to
state-of-the-art benchmarks, e.g., total variation, deep gradient descent and
learned primal-dual.
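The hybrid principle, deterministic layers with only the last layer of each block Bayesian, can be sketched as below. This is a generic mean-field, reparameterisation-trick illustration in PyTorch, not the paper's deep gradient descent architecture; the layer sizes and class names are assumptions.

```python
import torch
import torch.nn as nn

class BayesianLinear(nn.Module):
    """Mean-field Gaussian weight posterior, sampled with the
    reparameterisation trick on each forward pass."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(d_out, d_in))
        self.log_sigma = nn.Parameter(torch.full((d_out, d_in), -3.0))

    def forward(self, x):
        # Draw one weight sample per forward pass.
        w = self.mu + torch.exp(self.log_sigma) * torch.randn_like(self.mu)
        return x @ w.T

class HybridBlock(nn.Module):
    """Deterministic layers followed by a single Bayesian output layer,
    mirroring the 'only the last layer of each block is Bayesian' idea."""
    def __init__(self, d):
        super().__init__()
        self.det = nn.Sequential(nn.Linear(d, d), nn.ReLU())
        self.bayes = BayesianLinear(d, d)

    def forward(self, x):
        return self.bayes(self.det(x))

# Repeated stochastic forward passes yield a predictive spread, i.e. a
# cheap model-uncertainty estimate for the block's output.
block = HybridBlock(16)
x = torch.randn(4, 16)
samples = torch.stack([block(x) for _ in range(20)])
print(samples.mean(0).shape, samples.std(0).shape)
```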
Adaptive MCMC for Bayesian variable selection in generalised linear models and survival models
Developing an efficient computational scheme for high-dimensional Bayesian
variable selection in generalised linear models and survival models has always
been a challenging problem due to the absence of closed-form solutions for the
marginal likelihood. The RJMCMC approach can be employed to sample models and
coefficients jointly, but effective design of its transdimensional jumps can be
challenging, making it hard to implement. Alternatively, the marginal likelihood
can be derived using a data-augmentation scheme (e.g. Pólya-gamma data
augmentation for logistic regression) or through other estimation methods.
However, suitable data-augmentation schemes are not available for every
generalised linear model and survival model, and using estimators such as the
Laplace approximation or the correlated pseudo-marginal method to derive the
marginal likelihood within a locally informed proposal can be computationally
expensive in "large n, large p" settings. In this paper, three main
contributions are presented. Firstly, we present an extended Point-wise
implementation of the Adaptive Random Neighbourhood Informed proposal (PARNI)
to efficiently sample models directly from the marginal posterior distribution
in both generalised linear models and survival models. Secondly, in light of
the approximate Laplace approximation, we describe an efficient and accurate
estimation method for the marginal likelihood which involves adaptive
parameters. Additionally, we describe a new method for adapting the algorithmic
tuning parameters of the PARNI proposal, replacing the Rao-Blackwellised
estimates with a combination of a warm-start estimate and an ergodic average.
We present numerous numerical results from simulated data and eight
high-dimensional gene fine-mapping datasets to showcase the efficiency of the
novel PARNI proposal compared with the baseline add-delete-swap proposal.
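For orientation, here is a minimal sketch of the baseline add-delete-swap sampler that PARNI is compared against. `log_model_post` is a hypothetical stand-in for an (approximate) log marginal posterior over models, e.g. a Laplace-approximated marginal likelihood plus a model prior, and the move mixture is a common textbook choice rather than the exact baseline of the paper.

```python
import numpy as np

def add_delete_swap(log_model_post, p, n_iter=2000, seed=1):
    """Baseline add-delete-swap Metropolis-Hastings over inclusion
    vectors gamma in {0,1}^p."""
    rng = np.random.default_rng(seed)
    gamma = np.zeros(p, dtype=bool)
    lp = log_model_post(gamma)
    incl = np.zeros(p)                    # running inclusion frequencies
    for _ in range(n_iter):
        prop = gamma.copy()
        inc, exc = np.flatnonzero(gamma), np.flatnonzero(~gamma)
        if rng.random() < 0.5 or inc.size == 0 or exc.size == 0:
            prop[rng.integers(p)] ^= True          # add or delete one variable
        else:
            prop[rng.choice(inc)] = False          # swap: one out ...
            prop[rng.choice(exc)] = True           # ... and one in
        lp_prop = log_model_post(prop)
        # Both move types are (essentially) symmetric, so the Hastings
        # ratio reduces to the posterior ratio; boundary cases where the
        # swap is unavailable are ignored in this sketch.
        if np.log(rng.random()) < lp_prop - lp:
            gamma, lp = prop, lp_prop
        incl += gamma
    return incl / n_iter
```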
Ensemble Transport Adaptive Importance Sampling
Markov chain Monte Carlo methods are a powerful and commonly used family of
numerical methods for sampling from complex probability distributions. As
applications of these methods increase in size and complexity, the need for
efficient methods increases. In this paper, we present a particle ensemble
algorithm. At each iteration, an importance sampling proposal distribution is
formed using an ensemble of particles. A stratified sample is taken from this
distribution and weighted under the posterior; a state-of-the-art ensemble
transport resampling method is then used to create an evenly weighted sample
ready for the next iteration. We demonstrate that this ensemble transport
adaptive importance sampling (ETAIS) method outperforms MCMC methods with
equivalent proposal distributions for low dimensional problems, and in fact
shows better than linear improvements in convergence rates with respect to the
number of ensemble members. We also introduce a new resampling strategy,
multinomial transformation (MT), which, while not as accurate as the ensemble
transport resampler, is substantially less costly for large ensemble sizes, and
can then be used in conjunction with ETAIS for complex problems. We also focus
on how algorithmic parameters regarding the mixture proposal can be quickly
tuned to optimise performance. In particular, we demonstrate this methodology's
superior sampling for multimodal problems, such as those arising from inference
for mixture models, and for problems with expensive likelihoods requiring the
solution of a differential equation, for which speed-ups of orders of magnitude
are demonstrated. Likelihood evaluations of the ensemble could be computed in a
distributed manner, suggesting that this methodology is a good candidate for
parallel Bayesian computations.
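A compact sketch of one such ensemble iteration follows: a Gaussian-mixture proposal centred on the current particles, importance weighting under the posterior, then resampling back to equal weights. Plain multinomial resampling stands in here for the ensemble transport and MT resamplers of the paper, and the bandwidth `h` and toy bimodal target are assumptions.

```python
import numpy as np
from scipy.special import logsumexp

def ensemble_ais_step(ensemble, log_post, h=0.5, rng=None):
    """One ensemble adaptive importance sampling iteration."""
    rng = np.random.default_rng() if rng is None else rng
    M, d = ensemble.shape
    # Stratified draw: one proposal from the component at each particle.
    prop = ensemble + h * rng.standard_normal((M, d))
    # Log-density of the equal-weight Gaussian mixture at each proposal.
    sq = ((prop[:, None, :] - ensemble[None, :, :]) ** 2).sum(-1)
    log_comp = -0.5 * sq / h**2 - 0.5 * d * np.log(2 * np.pi * h**2)
    log_q = logsumexp(log_comp, axis=1) - np.log(M)
    # Self-normalised importance weights under the posterior.
    log_w = np.array([log_post(x) for x in prop]) - log_q
    w = np.exp(log_w - logsumexp(log_w))
    w = w / w.sum()
    # Multinomial resampling to an evenly weighted ensemble.
    return prop[rng.choice(M, size=M, p=w)]

# Toy bimodal target, where a single random-walk chain mixes poorly.
log_post = lambda x: np.logaddexp(-0.5 * ((x - 3) ** 2).sum(),
                                  -0.5 * ((x + 3) ** 2).sum())
rng = np.random.default_rng(0)
ens = rng.standard_normal((200, 2))
for _ in range(30):
    ens = ensemble_ais_step(ens, log_post, rng=rng)
```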
Evolving Deep DenseBlock Architecture Ensembles for Image Classification
Automatic deep architecture generation is a challenging task, owing to the large number of controlling parameters inherent in the construction of deep networks. Combinations of these parameters create large, complex search spaces that are practically impossible to navigate properly without huge resources for parallelisation. To address these challenges, we propose a Swarm Optimised DenseBlock Architecture Ensemble (SODBAE) method, a joint optimisation and training process that explores a constrained search space over a skeleton DenseBlock Convolutional Neural Network (CNN) architecture. Specifically, we employ novel weight inheritance learning mechanisms, a DenseBlock skeleton architecture, and adaptive Particle Swarm Optimisation (PSO) with cosine search coefficients to devise networks whilst maintaining practical computational costs. Moreover, the architecture design takes advantage of recent advances in residual connections and dense connectivity to yield CNN models with a much wider variety of structural variations. The proposed weight inheritance learning schemes perform joint optimisation and training of the architectures to reduce computational costs. Evaluated on the CIFAR-10 dataset, the proposed model outperforms other state-of-the-art methods in classification performance while offering greater versatility in architecture generation.
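The generic PSO update behind such a search can be sketched as below. The DenseBlock encoding and weight-inheritance scheme are not reproduced; the cosine schedule shown, with the personal-pull coefficient decaying while the social-pull coefficient grows, is one plausible reading of "cosine search coefficients", and the sphere objective is a toy stand-in for the validation error of a candidate architecture encoding.

```python
import numpy as np

def pso_cosine(f, d, n=20, iters=100, seed=0):
    """Minimal PSO minimiser with cosine-scheduled acceleration
    coefficients (an assumed schedule, not the exact SODBAE one)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n, d))
    v = np.zeros((n, d))
    pbest, pval = x.copy(), np.apply_along_axis(f, 1, x)
    g = pbest[pval.argmin()]
    for t in range(iters):
        c1 = 1.5 + np.cos(np.pi * t / iters) / 2   # decays 2.0 -> 1.0
        c2 = 1.5 - np.cos(np.pi * t / iters) / 2   # grows 1.0 -> 2.0
        r1, r2 = rng.random((n, d)), rng.random((n, d))
        v = 0.7 * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        val = np.apply_along_axis(f, 1, x)
        better = val < pval                        # update personal bests
        pbest[better], pval[better] = x[better], val[better]
        g = pbest[pval.argmin()]                   # update global best
    return g, pval.min()

best, score = pso_cosine(lambda z: (z**2).sum(), d=5)
print(score)
```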
Accelerating Parallel Tempering: Quantile Tempering Algorithm (QuanTA)
Using MCMC to sample from a target distribution on a d-dimensional state space
can be a difficult and computationally expensive problem. In particular, when
the target exhibits multimodality, traditional methods can fail to explore the
entire state space, and this results in biased sample output. Methods to
overcome this issue include the parallel tempering algorithm, which utilises an
augmented state space approach to help the Markov chain traverse regions of low
probability density and reach other modes. This method suffers from the curse
of dimensionality, which dramatically slows the transfer of mixing information
from the auxiliary targets to the target of interest as d → ∞. This paper
introduces a novel
prototype algorithm, QuanTA, that uses a Gaussian motivated transformation in
an attempt to accelerate the mixing through the temperature schedule of a
parallel tempering algorithm. This new algorithm is accompanied by a
comprehensive theoretical analysis quantifying the improved efficiency and
scalability of the approach; concluding that under weak regularity conditions
the new approach gives accelerated mixing through the temperature schedule.
Empirical evidence of the effectiveness of this new algorithm is illustrated on
canonical examples.
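For reference, a minimal vanilla parallel tempering sampler, the baseline that QuanTA accelerates, is sketched below: random-walk Metropolis within each tempered chain plus standard adjacent-temperature swap moves. The Gaussian-motivated QuanTA transformation of the exchanged states is not reproduced here, and the bimodal toy target and temperature ladder are illustrative assumptions.

```python
import numpy as np

def parallel_tempering(log_target, betas, n_iter=5000, step=0.8, seed=0):
    """Vanilla parallel tempering for a 1-D target; betas are inverse
    temperatures in increasing order, ending at 1 (the target chain)."""
    rng = np.random.default_rng(seed)
    K = len(betas)
    x = rng.normal(size=K)
    out = []
    for _ in range(n_iter):
        for k in range(K):                        # within-temperature moves
            prop = x[k] + step / np.sqrt(betas[k]) * rng.normal()
            if np.log(rng.random()) < betas[k] * (log_target(prop)
                                                  - log_target(x[k])):
                x[k] = prop
        k = rng.integers(K - 1)                   # adjacent-pair swap move
        log_a = (betas[k] - betas[k + 1]) * (log_target(x[k + 1])
                                             - log_target(x[k]))
        if np.log(rng.random()) < log_a:
            x[k], x[k + 1] = x[k + 1], x[k]
        out.append(x[-1])                         # record the target chain
    return np.array(out)

# Bimodal toy target that defeats a single random-walk chain.
log_target = lambda z: np.logaddexp(-0.5 * (z - 10)**2, -0.5 * (z + 10)**2)
draws = parallel_tempering(log_target, betas=[0.05, 0.2, 0.5, 1.0])
```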