
    Free energy Sequential Monte Carlo, application to mixture modelling

    We introduce a new class of Sequential Monte Carlo (SMC) methods, which we call free energy SMC. This class is inspired by free energy methods, which originate from physics, and in which one samples from a biased distribution such that a given function $\xi(\theta)$ of the state $\theta$ is forced to be uniformly distributed over a given interval. From an initial sequence of distributions $(\pi_t)$ of interest, and a particular choice of $\xi(\theta)$, a free energy SMC sampler sequentially computes a sequence of biased distributions $(\tilde{\pi}_t)$ with the following properties: (a) the marginal distribution of $\xi(\theta)$ with respect to $\tilde{\pi}_t$ is approximately uniform over a specified interval, and (b) $\tilde{\pi}_t$ and $\pi_t$ have the same conditional distribution given $\xi(\theta)$. We apply our methodology to mixture posterior distributions, which are highly multimodal. In the mixture context, forcing certain hyper-parameters to higher values greatly facilitates mode swapping and makes it possible to recover a symmetric output. We illustrate our approach with univariate and bivariate Gaussian mixtures and two real-world datasets. Comment: presented at "Bayesian Statistics 9" (Valencia meetings, 4-8 June 2010, Benidorm).
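    The biasing mechanism can be illustrated with a short sketch. The following is a minimal, illustrative implementation of the core idea only, not the authors' algorithm: a stand-in Gaussian target, $\xi(\theta)$ taken as the first coordinate, and a histogram-based estimate of the free-energy bias (all names and constants are assumptions) that reweights and moves particles so the marginal of $\xi$ flattens over a chosen interval.

```python
# A minimal sketch of a free-energy-biased SMC-style sampler, assuming a
# stand-in Gaussian target and xi(theta) = theta[0]. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def log_pi(theta):                 # stand-in target: standard bivariate Gaussian
    return -0.5 * np.sum(theta**2, axis=-1)

def xi(theta):                     # biased coordinate; here the first component
    return theta[..., 0]

edges = np.linspace(-3.0, 3.0, 21)           # interval over which xi is flattened
n = 2000
theta = rng.normal(size=(n, 2))              # particles from the initial target
log_A = np.zeros(len(edges) - 1)             # current free-energy bias estimate

def bin_of(x):
    return np.clip(np.digitize(x, edges) - 1, 0, len(edges) - 2)

for t in range(20):
    # Update the bias from the particle histogram so under-visited bins of
    # xi get boosted, pushing its marginal towards uniformity.
    hist, _ = np.histogram(xi(theta), bins=edges)
    log_A = -np.log(hist / n + 1e-6)

    # Reweight particles towards the biased target pi_tilde and resample.
    log_w = log_A[bin_of(xi(theta))]
    w = np.exp(log_w - log_w.max()); w /= w.sum()
    theta = theta[rng.choice(n, size=n, p=w)]

    # One random-walk Metropolis move targeting pi_tilde, as the move step.
    prop = theta + 0.3 * rng.normal(size=theta.shape)
    log_ratio = (log_pi(prop) + log_A[bin_of(xi(prop))]
                 - log_pi(theta) - log_A[bin_of(xi(theta))])
    accept = np.log(rng.uniform(size=n)) < log_ratio
    theta[accept] = prop[accept]

print("std of xi under the biased sampler:", xi(theta).std())  # > 1.0
```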

    Thermostat-assisted continuously-tempered Hamiltonian Monte Carlo for Bayesian learning

    We propose a new sampling method, the thermostat-assisted continuously-tempered Hamiltonian Monte Carlo, for Bayesian learning on large datasets and multimodal distributions. It simulates the Nosé-Hoover dynamics of a continuously-tempered Hamiltonian system built on the distribution of interest. A significant advantage of this method is that it is not only able to efficiently draw representative i.i.d. samples when the distribution contains multiple isolated modes, but also capable of adaptively neutralising the noise arising from mini-batches and maintaining accurate sampling. While the properties of this method have been studied using synthetic distributions, experiments on three real datasets also demonstrate its performance gains over several strong baselines with various types of neural networks plugged in.
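    As a rough illustration of the underlying Nosé-Hoover mechanics (not the paper's full continuously-tempered, mini-batch scheme), the sketch below runs thermostatted dynamics on a 1-D double-well potential; the potential, step size, and thermostat mass are all illustrative assumptions.

```python
# A minimal sketch of Nose-Hoover thermostatted dynamics on a potential
# U(theta) = -log pi(theta), using a 1-D double-well target U = (theta^2 - 1)^2
# to mimic multimodality. The thermostat variable z pulls the kinetic
# energy towards kT, which keeps the trajectory exploring both wells.
import numpy as np

def grad_U(theta):                      # gradient of the double-well potential
    return 4.0 * theta * (theta**2 - 1.0)

dt, Q, kT = 0.01, 1.0, 1.0              # step size, thermostat mass, temperature
theta, p, z = 0.0, 1.0, 0.0             # position, momentum, thermostat variable
samples = []

for step in range(200_000):
    p -= dt * (grad_U(theta) + z * p)   # momentum update with thermostat friction
    theta += dt * p
    z += dt * (p * p - kT) / Q          # thermostat: drives kinetic energy to kT
    samples.append(theta)

samples = np.array(samples[50_000:])    # discard burn-in
print("fraction in right mode:", np.mean(samples > 0))  # ~0.5 if mixing well
```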

    Two adaptive rejection sampling schemes for probability density functions with log-convex tails

    Monte Carlo methods are often necessary for the implementation of optimal Bayesian estimators. A fundamental technique that can be used to generate samples from virtually any target probability distribution is the so-called rejection sampling method, which generates candidate samples from a proposal distribution and then accepts or rejects them by testing the ratio of the target and proposal densities. The class of adaptive rejection sampling (ARS) algorithms is particularly interesting because they can achieve high acceptance rates. However, the standard ARS method can only be used with log-concave target densities. For this reason, many generalizations have been proposed. In this work, we investigate two different adaptive schemes that can be used to draw exactly from a large family of univariate probability density functions (pdfs), not necessarily log-concave, possibly multimodal and with tails of arbitrary concavity. These techniques are adaptive in the sense that every time a candidate sample is rejected, the acceptance rate is improved. The two proposed algorithms work properly when the target pdf is multimodal, when its first and second derivatives are analytically intractable, and when the tails are log-convex over an infinite domain. Therefore, they can be applied in a number of scenarios in which other generalizations of the standard ARS fail. Two illustrative numerical examples are shown.
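    The adaptive idea, namely that each rejection refines the proposal so the acceptance rate improves, can be sketched as follows. This toy version uses a piecewise-constant envelope on a bounded domain with grid-based upper bounds (an assumption standing in for the paper's analytic constructions, which also handle log-convex tails on infinite domains); the bimodal target is illustrative.

```python
# A minimal sketch of adaptive rejection sampling with a piecewise-constant
# envelope: every rejection splits the interval containing the rejected
# point, so the envelope tightens and the acceptance rate improves.
import numpy as np

rng = np.random.default_rng(1)

def f(x):  # unnormalised bimodal target (illustrative)
    return np.exp(-0.5 * (x - 2)**2) + np.exp(-0.5 * (x + 2)**2)

def envelope_heights(knots):
    # Upper bound on each piece via a dense grid evaluation (an assumption
    # standing in for the analytic bounds used in proper ARS schemes).
    return np.array([f(np.linspace(a, b, 200)).max()
                     for a, b in zip(knots[:-1], knots[1:])])

knots = np.linspace(-6.0, 6.0, 5)        # initial coarse partition
heights = envelope_heights(knots)

def draw(n):
    global knots, heights
    out = []
    while len(out) < n:
        widths = np.diff(knots)
        probs = heights * widths / np.sum(heights * widths)
        i = rng.choice(len(heights), p=probs)        # pick a piece by area
        x = rng.uniform(knots[i], knots[i + 1])      # sample within the piece
        if rng.uniform() < f(x) / heights[i]:
            out.append(x)                            # accept
        else:
            knots = np.sort(np.append(knots, x))     # refine envelope on rejection
            heights = envelope_heights(knots)
    return np.array(out)

samples = draw(5000)
print("sample mean (should be ~0 by symmetry):", samples.mean())
```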

    Nonparametric Hierarchical Clustering of Functional Data

    In this paper, we deal with the problem of clustering curves. We propose a nonparametric method which partitions the curves into clusters and discretizes the dimensions of the curve points into intervals. The cross-product of these partitions forms a data grid, which is obtained using a Bayesian model selection approach while making no assumptions about the curves. Finally, a post-processing technique, aimed at reducing the number of clusters in order to improve the interpretability of the clustering, is proposed. It optimally merges the clusters step by step, which amounts to an agglomerative hierarchical classification whose dissimilarity measure is the variation of the criterion. Interestingly, this measure is none other than the sum of the Kullback-Leibler divergences between the cluster distributions before and after the merges. The practical interest of the approach for exploratory analysis of functional data is presented and compared with an alternative approach on an artificial and a real-world data set.
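    The merging criterion lends itself to a compact sketch. Below, each cluster is summarised by a histogram over the discretised intervals (the toy counts are purely illustrative), and at each step the pair whose merge minimises the weighted sum of KL divergences between the cluster distributions before and after the merge is merged first.

```python
# A minimal sketch of the agglomerative post-processing step: merge cost is
# the weighted sum of KL divergences from each cluster's distribution to the
# merged one, and the cheapest pair is merged first.
import numpy as np

def kl(p, q):
    mask = p > 0                             # 0 * log(0/q) contributes nothing
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

def merge_cost(c1, c2):                      # c: histogram of interval counts
    n1, n2 = c1.sum(), c2.sum()
    merged = (c1 + c2) / (n1 + n2)           # distribution after the merge
    return n1 * kl(c1 / n1, merged) + n2 * kl(c2 / n2, merged)

clusters = [np.array(c, dtype=float) for c in
            [[40, 5, 1], [35, 8, 2], [2, 6, 50], [1, 9, 45]]]  # toy histograms

while len(clusters) > 2:                     # merge until 2 clusters remain
    pairs = [(merge_cost(clusters[i], clusters[j]), i, j)
             for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
    _, i, j = min(pairs)                     # cheapest merge first
    clusters[i] = clusters[i] + clusters[j]
    clusters.pop(j)

print([c / c.sum() for c in clusters])       # two well-separated profiles remain
```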