
    Distilling importance sampling

    The two main approaches to Bayesian inference are sampling and optimisation methods. However, many complicated posteriors are difficult to approximate by either. We therefore propose a novel approach combining features of both. We use a flexible parameterised family of densities, such as a normalising flow. Given a density from this family approximating the posterior, we use importance sampling to produce a weighted sample from a more accurate posterior approximation. This sample is then used in optimisation to update the parameters of the approximate density, which we view as distilling the importance sampling results. We iterate these steps, gradually improving the quality of the posterior approximation. We illustrate our method on two challenging examples: a queueing model and a stochastic differential equation model.
    Comment: This version adds a second application and fixes some minor errors.
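    The cycle described above is easy to make concrete. The following is a minimal sketch of the distillation loop, using a parameterised Gaussian in place of a normalising flow and a toy one-dimensional unnormalised posterior; all names and settings are illustrative assumptions, not the authors' code.

        # Distillation loop sketch: sample from q, importance-weight under
        # the target, then refit q's parameters by weighted maximum likelihood.
        import numpy as np

        def log_target(x):
            # Toy unnormalised log posterior (an assumption for illustration).
            return -0.5 * ((x - 2.0) ** 2) / 0.25

        rng = np.random.default_rng(0)
        mu, sigma = 0.0, 3.0  # initial approximation q(x) = N(mu, sigma^2)

        for _ in range(50):
            x = rng.normal(mu, sigma, size=2000)          # sample from q
            # log q up to an additive constant (constants cancel below)
            log_q = -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma)
            log_w = log_target(x) - log_q                 # importance weights
            w = np.exp(log_w - log_w.max())
            w /= w.sum()                                  # self-normalise
            # "Distil": weighted MLE update of the approximating density
            mu = np.sum(w * x)
            sigma = np.sqrt(np.sum(w * (x - mu) ** 2))

        print(f"distilled approximation: N({mu:.3f}, {sigma:.3f}^2)")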

    Quantifying Model Uncertainty in Inverse Problems via Bayesian Deep Gradient Descent

    Recent advances in reconstruction methods for inverse problems leverage powerful data-driven models, e.g., deep neural networks. These techniques have demonstrated state-of-the-art performance for several imaging tasks, but they often do not provide uncertainty estimates for the obtained reconstructions. In this work, we develop a novel scalable data-driven knowledge-aided computational framework to quantify model uncertainty via Bayesian neural networks. The approach builds on and extends deep gradient descent, a recently developed greedy iterative training scheme, and recasts it within a probabilistic framework. Scalability is achieved by being hybrid in the architecture (only the last layer of each block is Bayesian, while the others remain deterministic) and by being greedy in training. The framework is showcased on one representative medical imaging modality, viz. computed tomography with either sparse-view or limited-view data, and exhibits competitive performance with respect to state-of-the-art benchmarks, e.g., total variation, deep gradient descent and learned primal-dual.
    Comment: 8 pages, 6 figures.
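    To make the hybrid architecture concrete, here is a minimal sketch of one unrolled block in which only the final layer carries a mean-field Gaussian posterior over its weights, sampled by reparameterisation, while the preceding layers stay deterministic. This is an illustrative reconstruction under stated assumptions, not the authors' implementation.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class BayesianConv(nn.Module):
            """3x3 conv with factorised Gaussian weights (reparameterisation trick)."""
            def __init__(self, cin, cout):
                super().__init__()
                self.w_mu = nn.Parameter(torch.zeros(cout, cin, 3, 3))
                self.w_logsig = nn.Parameter(torch.full((cout, cin, 3, 3), -5.0))

            def forward(self, x):
                eps = torch.randn_like(self.w_mu)
                w = self.w_mu + torch.exp(self.w_logsig) * eps  # one posterior sample
                return F.conv2d(x, w, padding=1)

        class HybridBlock(nn.Module):
            """Deterministic layers followed by a single Bayesian output layer."""
            def __init__(self, channels=32):
                super().__init__()
                self.det = nn.Sequential(
                    nn.Conv2d(2, channels, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
                )
                self.bayes = BayesianConv(channels, 1)

            def forward(self, x, grad):
                # Inputs: current reconstruction and the data-fidelity gradient,
                # mirroring a deep-gradient-descent-style unrolled iteration.
                h = self.det(torch.cat([x, grad], dim=1))
                return x + self.bayes(h)  # residual update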

    Adaptive MCMC for Bayesian variable selection in generalised linear models and survival models

    Developing an efficient computational scheme for high-dimensional Bayesian variable selection in generalised linear models and survival models has long been a challenging problem, owing to the absence of closed-form solutions for the marginal likelihood. The RJMCMC approach can be employed to sample models and coefficients jointly, but effective design of its transdimensional jumps can be challenging, making the method hard to implement. Alternatively, the marginal likelihood can be derived using a data-augmentation scheme (e.g. Polya-gamma data augmentation for logistic regression) or through other estimation methods. However, suitable data-augmentation schemes are not available for every generalised linear model and survival model, and using estimates such as the Laplace approximation or the correlated pseudo-marginal method to derive the marginal likelihood within a locally informed proposal can be computationally expensive in "large n, large p" settings. In this paper, three main contributions are presented. Firstly, we present an extended Point-wise implementation of the Adaptive Random Neighbourhood Informed proposal (PARNI) to efficiently sample models directly from the marginal posterior distribution in both generalised linear models and survival models. Secondly, in light of the approximate Laplace approximation, we describe an efficient and accurate estimation method for the marginal likelihood that involves adaptive parameters. Additionally, we describe a new method for adapting the algorithmic tuning parameters of the PARNI proposal, replacing the Rao-Blackwellised estimates with a combination of a warm-start estimate and an ergodic average. We present extensive numerical results from simulated data and eight high-dimensional gene fine-mapping data-sets to showcase the efficiency of the novel PARNI proposal compared to the baseline add-delete-swap proposal.
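    For context, the add-delete-swap baseline mentioned above is a simple Metropolis-Hastings chain on binary inclusion vectors. A minimal sketch follows; log_marginal is a stand-in for any (possibly approximate) log marginal likelihood plus model prior, and the Hastings correction for unequal numbers of available moves is omitted for brevity.

        import numpy as np

        def add_delete_swap_step(gamma, log_marginal, rng):
            """One MH step on a boolean inclusion vector gamma."""
            prop = gamma.copy()
            included = np.flatnonzero(gamma)
            excluded = np.flatnonzero(~gamma)
            move = rng.choice(["add", "delete", "swap"])
            if move == "add" and excluded.size:
                prop[rng.choice(excluded)] = True       # include one variable
            elif move == "delete" and included.size:
                prop[rng.choice(included)] = False      # drop one variable
            elif move == "swap" and included.size and excluded.size:
                prop[rng.choice(included)] = False      # exchange a pair
                prop[rng.choice(excluded)] = True
            else:
                return gamma                            # move not available
            # Accept/reject (proposal asymmetry ignored in this sketch).
            if np.log(rng.uniform()) < log_marginal(prop) - log_marginal(gamma):
                return prop
            return gamma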

    Ensemble Transport Adaptive Importance Sampling

    Markov chain Monte Carlo methods are a powerful and commonly used family of numerical methods for sampling from complex probability distributions. As applications of these methods increase in size and complexity, the need for efficient methods grows. In this paper, we present a particle ensemble algorithm. At each iteration, an importance sampling proposal distribution is formed using an ensemble of particles. A stratified sample is taken from this distribution and weighted under the posterior; a state-of-the-art ensemble transport resampling method is then used to create an evenly weighted sample ready for the next iteration. We demonstrate that this ensemble transport adaptive importance sampling (ETAIS) method outperforms MCMC methods with equivalent proposal distributions for low-dimensional problems, and in fact shows better-than-linear improvements in convergence rates with respect to the number of ensemble members. We also introduce a new resampling strategy, multinomial transformation (MT), which, while not as accurate as the ensemble transport resampler, is substantially less costly for large ensemble sizes and can be used in conjunction with ETAIS for complex problems. We also focus on how the algorithmic parameters of the mixture proposal can be quickly tuned to optimise performance. In particular, we demonstrate this methodology's superior sampling for multimodal problems, such as those arising from inference for mixture models, and for problems with expensive likelihoods requiring the solution of a differential equation, for which speed-ups of orders of magnitude are demonstrated. Likelihood evaluations of the ensemble can be computed in a distributed manner, suggesting that this methodology is a good candidate for parallel Bayesian computations.
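    The iteration described above can be sketched in a few lines. The code below follows the same cycle (ensemble mixture proposal, importance weighting under the posterior, resampling back to even weights), but substitutes plain multinomial resampling for the ensemble transport resampler; log_post and all settings are placeholders.

        import numpy as np

        def etais_sketch(log_post, n=500, iters=100, h=0.5, dim=2, seed=0):
            rng = np.random.default_rng(seed)
            ens = rng.normal(size=(n, dim))                  # initial ensemble
            for _ in range(iters):
                # Proposal: equal-weight mixture of N(ens_j, h^2 I) kernels.
                centres = ens[rng.integers(n, size=n)]
                x = centres + h * rng.normal(size=(n, dim))
                diff = x[:, None, :] - ens[None, :, :]
                log_k = -0.5 * np.sum(diff ** 2, axis=2) / h ** 2
                # Mixture log density up to a constant (cancels when normalising).
                log_q = np.logaddexp.reduce(log_k, axis=1) - np.log(n)
                log_w = np.array([log_post(xi) for xi in x]) - log_q
                w = np.exp(log_w - log_w.max())
                w /= w.sum()
                # Multinomial resampling to an evenly weighted ensemble.
                ens = x[rng.choice(n, size=n, p=w)]
            return ens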

    Evolving Deep DenseBlock Architecture Ensembles for Image Classification

    Automatic deep architecture generation is a challenging task, owing to the large number of controlling parameters inherent in the construction of deep networks. The combination of these parameters leads to large, complex search spaces that are infeasible to navigate properly without substantial resources for parallelisation. To deal with such challenges, in this research we propose a Swarm Optimised DenseBlock Architecture Ensemble (SODBAE) method, a joint optimisation and training process that explores a constrained search space over a skeleton DenseBlock Convolutional Neural Network (CNN) architecture. Specifically, we employ novel weight inheritance learning mechanisms, a DenseBlock skeleton architecture, and adaptive Particle Swarm Optimisation (PSO) with cosine search coefficients to devise networks whilst maintaining practical computational costs. Moreover, the architecture design takes advantage of recent advances in residual connections and dense connectivity to yield CNN models with a much wider variety of structural variations. The proposed weight inheritance learning schemes perform joint optimisation and training of the architectures to reduce computational costs. Evaluated on the CIFAR-10 dataset, the proposed model shows great superiority in classification performance over other state-of-the-art methods while exhibiting greater versatility in architecture generation.
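    As a small illustration of the adaptive PSO ingredient mentioned above, the sketch below applies a velocity update with cosine-varying search coefficients; the particular schedule constants are assumptions for illustration, not the paper's settings.

        import numpy as np

        def pso_step(pos, vel, pbest, gbest, t, T, rng, inertia=0.7):
            """One PSO update with cosine-annealed acceleration coefficients."""
            # Cognitive coefficient decays over time, social coefficient grows,
            # shifting the swarm from exploration towards exploitation.
            c1 = 1.0 + 1.5 * np.cos(np.pi * t / (2 * T))
            c2 = 2.5 - 1.5 * np.cos(np.pi * t / (2 * T))
            r1 = rng.uniform(size=pos.shape)
            r2 = rng.uniform(size=pos.shape)
            vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
            return pos + vel, vel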

    Accelerating Parallel Tempering: Quantile Tempering Algorithm (QuanTA)

    Using MCMC to sample from a target distribution $\pi(x)$ on a $d$-dimensional state space can be a difficult and computationally expensive problem. In particular, when the target exhibits multimodality, traditional methods can fail to explore the entire state space, resulting in biased sample output. Methods to overcome this issue include the parallel tempering algorithm, which utilises an augmented state space approach to help the Markov chain traverse regions of low probability density and reach other modes. This method suffers from the curse of dimensionality, which dramatically slows the transfer of mixing information from the auxiliary targets to the target of interest as $d \rightarrow \infty$. This paper introduces a novel prototype algorithm, QuanTA, that uses a Gaussian-motivated transformation in an attempt to accelerate mixing through the temperature schedule of a parallel tempering algorithm. The new algorithm is accompanied by a comprehensive theoretical analysis quantifying its improved efficiency and scalability, concluding that under weak regularity conditions the new approach gives accelerated mixing through the temperature schedule. Empirical evidence of the effectiveness of this new algorithm is illustrated on canonical examples.
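    For reference, the temperature-swap move that QuanTA seeks to accelerate looks as follows in standard parallel tempering; the paper's Gaussian-motivated transformation is not reproduced here.

        import numpy as np

        def swap_move(states, betas, log_target, rng):
            """Propose exchanging the states of adjacent temperature levels."""
            i = rng.integers(len(betas) - 1)
            lp_i, lp_j = log_target(states[i]), log_target(states[i + 1])
            # Standard exchange acceptance ratio between inverse temperatures.
            log_alpha = (betas[i] - betas[i + 1]) * (lp_j - lp_i)
            if np.log(rng.uniform()) < log_alpha:
                states[i], states[i + 1] = states[i + 1], states[i]
            return states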