74 research outputs found
Proximal Markov chain Monte Carlo algorithms
This paper presents a new Metropolis-adjusted Langevin algorithm (MALA) that uses convex analysis to simulate efficiently from high-dimensional densities that are log-concave, a class of probability distributions that is widely used in modern high-dimensional statistics and data analysis. The method is based on a new first-order approximation for Langevin diffusions that exploits log-concavity to construct Markov chains with favourable convergence properties. This approximation is closely related to Moreau-Yosida regularisations for convex functions and uses proximity mappings instead of gradient mappings to approximate the continuous-time process. The proposed method complements existing MALA methods in two ways. First, the method is shown to have very robust stability properties and to converge geometrically for many target densities for which other MALA algorithms are not geometric, or are geometric only if the step size is sufficiently small. Second, the method can be applied to high-dimensional target densities that are not continuously differentiable, a class of distributions that is increasingly used in image processing and machine learning and that is beyond the scope of existing MALA and HMC algorithms. To use this method it is necessary to compute, or to approximate efficiently, the proximity mappings of the logarithm of the target density. For several popular models, including many Bayesian models used in modern signal and image processing and machine learning, this can be achieved with convex optimisation algorithms and with approximations based on proximal splitting techniques, which can be implemented in parallel. The proposed method is demonstrated on two challenging high-dimensional and non-differentiable models, related to image resolution enhancement and low-rank matrix estimation, that are not well addressed by existing MCMC methodology.
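The proximal proposal can be sketched on a toy one-dimensional Laplace target, whose proximity mapping has the closed form of soft thresholding. This is a minimal illustration assuming the proposal mean is the prox step prox_{delta/2}(x) with a Gaussian perturbation of variance delta, followed by a standard Metropolis-Hastings correction; the function names are mine, not the paper's.

```python
import numpy as np

def soft_threshold(x, lam):
    # Proximity mapping of g(u) = lam * |u| (closed form for the Laplace potential).
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def pmala(n_iter=20000, delta=0.5, seed=0):
    """Proximal MALA sketch for the 1-d Laplace target pi(x) ~ exp(-|x|).

    The proposal mean is the proximity mapping of g(x) = |x| with
    parameter delta/2, used instead of a gradient step; the log target
    is non-differentiable at 0, which ordinary MALA cannot handle.
    """
    rng = np.random.default_rng(seed)
    log_pi = lambda x: -np.abs(x)
    x = 1.0
    samples = np.empty(n_iter)
    for i in range(n_iter):
        mean_x = soft_threshold(x, delta / 2)       # prox step from current state
        y = mean_x + np.sqrt(delta) * rng.standard_normal()
        mean_y = soft_threshold(y, delta / 2)       # prox step from the proposal
        # log Metropolis-Hastings ratio with Gaussian proposal densities
        log_q_xy = -(x - mean_y) ** 2 / (2 * delta)
        log_q_yx = -(y - mean_x) ** 2 / (2 * delta)
        if np.log(rng.uniform()) < log_pi(y) - log_pi(x) + log_q_xy - log_q_yx:
            x = y
        samples[i] = x
    return samples
```

For the standard Laplace distribution the samples should have mean near 0 and variance near 2, which gives a quick sanity check of the chain.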
Bayesian wavelet de-noising with the caravan prior
According to both domain expert knowledge and empirical evidence, wavelet
coefficients of real signals tend to exhibit clustering patterns, in that they
contain connected regions of coefficients of similar magnitude (large or
small). A wavelet de-noising approach that takes into account such a feature of
the signal may in practice outperform other, more vanilla methods, both in
terms of the estimation error and visual appearance of the estimates. Motivated
by this observation, we present a Bayesian approach to wavelet de-noising,
where dependencies between neighbouring wavelet coefficients are a priori
modelled via a Markov chain-based prior, that we term the caravan prior.
Posterior computations in our method are performed via the Gibbs sampler. Using
representative synthetic and real data examples, we conduct a detailed
comparison of our approach with a benchmark empirical Bayes de-noising method
(due to Johnstone and Silverman). We show that the caravan prior fares well and
is therefore a useful addition to the wavelet de-noising toolbox.
Comment: 32 pages, 15 figures, 4 tables
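For contrast, a "vanilla" coefficient-wise method of the kind the caravan prior is compared against can be sketched as a one-level Haar transform with soft thresholding of the detail coefficients. This is an illustrative baseline only, not the paper's method and not the Johnstone-Silverman procedure; the threshold value is arbitrary.

```python
import numpy as np

def haar_denoise(signal, lam):
    """One-level Haar wavelet de-noising: transform, soft-threshold the
    detail coefficients independently (ignoring any clustering between
    neighbours), and invert the transform."""
    a = (signal[0::2] + signal[1::2]) / np.sqrt(2)    # approximation coefficients
    d = (signal[0::2] - signal[1::2]) / np.sqrt(2)    # detail coefficients
    d = np.sign(d) * np.maximum(np.abs(d) - lam, 0.0) # coefficient-wise shrinkage
    out = np.empty_like(signal)
    out[0::2] = (a + d) / np.sqrt(2)                  # inverse Haar step
    out[1::2] = (a - d) / np.sqrt(2)
    return out
```

Because each detail coefficient is shrunk on its own, this baseline cannot exploit the connected regions of similar-magnitude coefficients that motivate the caravan prior.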
Oscillation of adaptive Metropolis-Hastings and simulated annealing algorithms around penalized least squares estimator
In this work we study, as the temperature goes to zero, the oscillation of
the Metropolis-Hastings algorithm around the Basis Pursuit De-noising solutions.
We derive new criteria for choosing the proposal distribution and the
temperature in the Metropolis-Hastings algorithm. Finally, we apply these
results to compare the Metropolis-Hastings and simulated annealing algorithms.
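The setting can be illustrated on a toy one-dimensional problem, where the Basis Pursuit De-noising energy E(x) = (x - b)^2/2 + lam*|x| has the closed-form minimiser soft_threshold(b, lam). The sketch below assumes a plain random-walk Metropolis chain targeting exp(-E(x)/T) with a simple square-root cooling schedule; the proposal and temperature choices are arbitrary placeholders, not the criteria derived in the paper.

```python
import numpy as np

def anneal_bpdn(b=2.0, lam=0.5, n_iter=20000, seed=0):
    """Simulated annealing sketch for the 1-d BPDN energy
    E(x) = (x - b)^2 / 2 + lam * |x|.

    As the temperature T decreases, the Metropolis chain oscillates in an
    ever smaller neighbourhood of the BPDN solution, which here is
    sign(b) * max(|b| - lam, 0) = 1.5.
    """
    rng = np.random.default_rng(seed)
    energy = lambda x: 0.5 * (x - b) ** 2 + lam * abs(x)
    x = 0.0
    for i in range(1, n_iter + 1):
        T = 1.0 / np.sqrt(i)                 # illustrative cooling schedule
        y = x + 0.5 * rng.standard_normal()  # random-walk proposal
        # Metropolis acceptance for the target exp(-E(x) / T)
        if np.log(rng.uniform()) < (energy(x) - energy(y)) / T:
            x = y
    return x
```

The oscillation scale around the minimiser is governed by the final temperature, which is the quantity the abstract's criteria are designed to control.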
Bayesian linear regression with sparse priors
We study full Bayesian procedures for high-dimensional linear regression
under sparsity constraints. The prior is a mixture of point masses at zero and
continuous distributions. Under compatibility conditions on the design matrix,
the posterior distribution is shown to contract at the optimal rate for
recovery of the unknown sparse vector, and to give optimal prediction of the
response vector. It is also shown to select the correct sparse model, or at
least the coefficients that are significantly different from zero. The
asymptotic shape of the posterior distribution is characterized and employed
in the construction and study of credible sets for uncertainty quantification.
Comment: Published at http://dx.doi.org/10.1214/15-AOS1334 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
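The prior in question can be sketched directly: each coefficient is a point mass at zero with high probability and is otherwise drawn from a continuous slab. The inclusion weight and the Laplace slab below are illustrative placeholders; the paper's contraction results hinge on how these quantities are calibrated, which the sketch does not attempt.

```python
import numpy as np

def spike_slab_draw(p, w=0.1, scale=1.0, seed=0):
    """Draw one p-dimensional coefficient vector from a spike-and-slab
    prior: each theta_j equals 0 with probability 1 - w (the point mass,
    or 'spike') and is Laplace(scale) with probability w (the continuous
    'slab'), independently across coordinates."""
    rng = np.random.default_rng(seed)
    active = rng.uniform(size=p) < w   # which coordinates receive the slab
    theta = np.where(active, rng.laplace(scale=scale, size=p), 0.0)
    return theta
```

A draw with small w is exactly sparse with high probability, which is what lets the posterior place mass on low-dimensional models.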
Gibbs Sampling using Anti-correlation Gaussian Data Augmentation, with Applications to L1-ball-type Models
L1-ball-type priors are a recent generalization of spike-and-slab priors.
By transforming a continuous precursor distribution to the L1-ball boundary, it
induces exact zeros with positive prior and posterior probabilities. With great
flexibility in choosing the precursor and threshold distributions, we can
easily specify models under structured sparsity, such as those with dependent
probability for zeros and smoothness among the non-zeros. To significantly
accelerate the posterior computation, we propose a new data
augmentation that leads to a fast block Gibbs sampling algorithm. The latent
variable, named "anti-correlation Gaussian", cancels out the quadratic
exponent term in the latent Gaussian distribution, making the parameters of
interest conditionally independent so that they can be updated in a block.
Compared to existing algorithms such as the No-U-Turn sampler, the new blocked
Gibbs sampler has a very low computing cost per iteration and shows rapid
mixing of Markov chains. We establish the geometric ergodicity guarantee of the
algorithm in linear models. Further, we show useful extensions of our algorithm
for posterior estimation of general latent Gaussian models, such as those
involving multivariate truncated Gaussian or latent Gaussian process.
Keywords:
Blocked Gibbs sampler; Fast Mixing of Markov Chains; Latent Gaussian Models;
Soft-thresholding
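The zero-inducing transform behind L1-ball-type priors can be illustrated with the standard sort-based Euclidean projection onto the L1 ball: a continuous precursor vector is mapped to a vector whose small-magnitude coordinates become exactly zero. This is a sketch of one common realisation of such a transform, not the paper's precursor-to-boundary map or its data-augmentation algorithm.

```python
import numpy as np

def project_l1_ball(v, r):
    """Euclidean projection of v onto the L1 ball of radius r via the
    usual sort-and-threshold construction.  The projection soft-thresholds
    the coordinates at a data-dependent level, so a continuous Gaussian
    precursor is mapped to a vector with exact zeros, giving positive
    probability to sparse configurations."""
    if np.abs(v).sum() <= r:
        return v.copy()                       # already inside the ball
    u = np.sort(np.abs(v))[::-1]              # magnitudes in decreasing order
    cssv = np.cumsum(u)
    # largest index rho with u[rho] * (rho + 1) > cssv[rho] - r
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > cssv - r)[0][-1]
    theta = (cssv[rho] - r) / (rho + 1.0)     # data-dependent threshold
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)
```

For example, projecting (3, 1, -0.5) onto the radius-2 ball zeroes the two smaller coordinates, which is exactly the mechanism that gives positive posterior probability to sparse models.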
- …