Misspecification in infinite-dimensional Bayesian statistics
We consider the asymptotic behavior of posterior distributions when the model is misspecified. Given a prior distribution and a random sample from a distribution P0, which may not be in the support of the prior, we show that the posterior concentrates its mass near the points in the support of the prior that minimize the Kullback–Leibler divergence with respect to P0. An entropy condition and a prior-mass condition determine the rate of convergence. The method is applied to several examples, with special interest in infinite-dimensional models. These include Gaussian mixtures, nonparametric regression and parametric models.
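In symbols, and only as a hedged paraphrase of the abstract (the notation below is mine, not taken from the paper), the statement is roughly the following.

```latex
% Hedged sketch of the concentration statement; P_0 is the data-generating
% distribution, \Pi the prior, d a suitable metric, and \varepsilon_n the rate
% determined by an entropy condition and a prior-mass condition.
\[
  P^* \in \operatorname*{arg\,min}_{P \in \operatorname{supp}(\Pi)}
      \operatorname{KL}(P_0 \,\|\, P),
  \qquad
  \operatorname{KL}(P_0 \,\|\, P) = \int p_0 \log \frac{p_0}{p}\, d\mu ,
\]
\[
  \Pi\bigl( P : d(P, P^*) > M \varepsilon_n \mid X_1,\dots,X_n \bigr)
  \;\xrightarrow{\;P_0\;}\; 0 .
\]
```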
Information rates of nonparametric Gaussian process methods
We consider the quality of learning a response function by a nonparametric Bayesian approach using a Gaussian process (GP) prior on the response function. We upper bound the quadratic risk of the learning procedure, which in turn is an upper bound on the Kullback–Leibler information between the predictive and true data distributions. The upper bound is expressed in terms of small-ball probabilities and concentration measures of the GP prior. We illustrate the computation of the upper bound for the Matérn and squared exponential kernels. For these priors the risk, and hence the information criterion, tends to zero for all continuous response functions. However, the rate at which this happens depends on the combination of true response function and Gaussian prior, and can be expressed in terms of a certain concentration function. In particular, the results show that for good performance, the regularity of the GP prior should match the regularity of the unknown response function.
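As a purely illustrative companion to this abstract, the following Python sketch fits GP regression with a Matérn kernel and with a squared exponential (RBF) kernel and reports an empirical quadratic risk. The data, hyperparameters, and the use of scikit-learn are my assumptions, not part of the paper.

```python
# Minimal sketch: nonparametric regression with a GP prior, comparing a Matern
# kernel with a squared-exponential (RBF) kernel. Data and settings are
# illustrative only; they are not taken from the paper.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, RBF

rng = np.random.default_rng(0)
n = 200
X = rng.uniform(0, 1, size=(n, 1))
f_true = lambda x: np.sin(8 * x[:, 0])           # "true" response function
y = f_true(X) + 0.1 * rng.standard_normal(n)     # noisy observations

X_test = np.linspace(0, 1, 500).reshape(-1, 1)

for kernel in (Matern(length_scale=0.2, nu=1.5),  # regularity controlled by nu
               RBF(length_scale=0.2)):            # squared exponential kernel
    gp = GaussianProcessRegressor(kernel=kernel, alpha=0.1**2, normalize_y=True)
    gp.fit(X, y)
    f_hat = gp.predict(X_test)
    risk = np.mean((f_hat - f_true(X_test)) ** 2)  # empirical quadratic risk
    print(f"{kernel!r}: empirical L2 risk = {risk:.4f}")
```

The point of the comparison is only to make the abstract's message concrete: how well the procedure does depends on how the kernel's implied regularity relates to the regularity of the true response function.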
Bayesian inference with rescaled Gaussian process priors
We use rescaled Gaussian processes as prior models for functional parameters in nonparametric statistical models. We show how the rate of contraction of the posterior distributions depends on the scaling factor. In particular, we exhibit rescaled Gaussian process priors yielding posteriors that contract around the true parameter at optimal convergence rates. To derive our results we establish bounds on small deviation probabilities for smooth stationary Gaussian processes.
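The following short Python sketch draws sample paths from a squared-exponential Gaussian process prior under different rescalings of the time axis (informally, W_t replaced by W_{ct}). The kernel and the scaling values are illustrative choices of mine; the sketch only shows how the scaling factor changes how rough the prior draws look.

```python
# Sketch: draws from a squared-exponential GP prior under different rescalings
# of the time axis (W_t -> W_{ct}). Purely illustrative; the kernel and the
# scaling values c are my choices, not the paper's.
import numpy as np

def se_cov(t, c):
    """Squared-exponential covariance of the time-rescaled process t -> W_{ct}."""
    d = t[:, None] - t[None, :]
    return np.exp(-0.5 * (c * d) ** 2)

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 400)
for c in (1.0, 5.0, 25.0):                       # larger c gives rougher-looking draws
    K = se_cov(t, c) + 1e-10 * np.eye(t.size)    # jitter for numerical stability
    sample = rng.multivariate_normal(np.zeros(t.size), K)
    print(f"c = {c:5.1f}: sample std of increments = {np.std(np.diff(sample)):.3f}")
```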
Needles and Straw in a Haystack: Posterior concentration for possibly sparse sequences
We consider full Bayesian inference in the multivariate normal mean model in the situation that the mean vector is sparse. The prior distribution on the vector of means is constructed hierarchically by first choosing a collection of nonzero means and next a prior on the nonzero values. We consider the posterior distribution in the frequentist set-up that the observations are generated according to a fixed mean vector, and are interested in the posterior distribution of the number of nonzero components and the contraction of the posterior distribution to the true mean vector. We find various combinations of priors on the number of nonzero coefficients and on these coefficients that give desirable performance. We also find priors that give suboptimal convergence, for instance, Gaussian priors on the nonzero coefficients. We illustrate the results by simulations. Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics; DOI: http://dx.doi.org/10.1214/12-AOS1029.
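A minimal sketch, under assumed prior choices, of how one might simulate from the hierarchical prior this abstract describes: first a prior on the number of nonzero components, then uniformly chosen positions, then a slab prior on the nonzero values. The dimension prior, the constant kappa, and the Laplace slab are illustrative assumptions; the abstract notes that a Gaussian slab gives suboptimal contraction.

```python
# Sketch of the hierarchical prior: draw the number of nonzero means, then
# their positions, then the nonzero values. The dimension prior and the
# Laplace slab are illustrative choices, not the paper's exact specification.
import numpy as np

def draw_sparse_mean(n, rng, slab="laplace", kappa=0.02):
    # Prior on the number of nonzero components, pi(s) proportional to exp(-kappa * s * log n)
    s_grid = np.arange(n + 1)
    log_w = -kappa * s_grid * np.log(max(n, 2))
    probs = np.exp(log_w - log_w.max())
    probs /= probs.sum()
    s = rng.choice(s_grid, p=probs)

    theta = np.zeros(n)
    support = rng.choice(n, size=s, replace=False)       # positions of nonzero means
    if slab == "laplace":
        theta[support] = rng.laplace(scale=1.0, size=s)  # heavy-tailed slab
    else:
        theta[support] = rng.standard_normal(s)          # Gaussian slab (suboptimal)
    return theta

rng = np.random.default_rng(2)
theta = draw_sparse_mean(n=500, rng=rng)
x = theta + rng.standard_normal(theta.size)              # observations from N(theta, I)
print("number of nonzero means:", np.count_nonzero(theta))
```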
Existence and consistency of maximum likelihood in upgraded mixture models
Suppose one observes a sample of size m from the mixture density ∫ p(x|z) dη(z) and a sample of size n from the distribution η. The kernel p(x|z) is known. We show existence of the maximum likelihood estimator for η, characterize its support, and prove consistency as m, n → ∞.
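A plausible form of the joint nonparametric likelihood maximized over mixing distributions η, written in the abstract's notation; the paper's exact formulation may differ.

```latex
% Hedged sketch of the joint nonparametric likelihood over mixing
% distributions \eta; this is a paraphrase in the abstract's notation,
% not a quotation from the paper.
\[
  L_{m,n}(\eta)
  = \prod_{i=1}^{m} \int p(x_i \mid z)\, d\eta(z)
    \;\times\;
    \prod_{j=1}^{n} \eta\bigl(\{z_j\}\bigr).
\]
% Any maximizer must put positive mass on each directly observed z_j,
% since otherwise the second product vanishes.
```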
Convergence rates of posterior distributions.
We consider the asymptotic behavior of posterior distributions and Bayes estimators for infinite-dimensional statistical models. We give general results on the rate of convergence of the posterior measure. These are applied to several examples, including priors on finite sieves, log-spline models, Dirichlet processes and interval censoring.
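Schematically, and with constants and the sieve condition suppressed, the two sufficient conditions for a contraction rate typically take the following form; this is a standard paraphrase, not a quotation from the paper.

```latex
% Schematic sufficient conditions for a posterior contraction rate
% \varepsilon_n; N(.) is a covering number, d a suitable metric, and the
% prior-mass condition is stated over a Kullback-Leibler-type neighborhood.
\[
  \log N\bigl(\varepsilon_n, \mathcal{P}_n, d\bigr) \;\lesssim\; n\varepsilon_n^2
  \qquad\text{(entropy condition)},
\]
\[
  \Pi\Bigl( p : -P_0\log\tfrac{p}{p_0} \le \varepsilon_n^2,\;
            P_0\bigl(\log\tfrac{p}{p_0}\bigr)^2 \le \varepsilon_n^2 \Bigr)
  \;\ge\; e^{-c\, n\varepsilon_n^2}
  \qquad\text{(prior-mass condition)},
\]
% under which the posterior concentrates on d-balls of radius a large
% multiple of \varepsilon_n around p_0.
```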
Honest Bayesian confidence sets for the L2-norm
We investigate the problem of constructing Bayesian credible sets that are honest and adaptive for the L2-loss over a scale of Sobolev classes with regularity ranging over [D, 2D], for some given D, in the context of the signal-in-white-noise model. We consider a scale of prior distributions indexed by a regularity hyper-parameter and choose the hyper-parameter both by marginal likelihood empirical Bayes and by a hierarchical Bayes method. Next we consider a ball centered around the corresponding posterior mean with prescribed posterior probability. We show by theory and examples that both the empirical Bayes and the hierarchical Bayes credible sets give misleading, overconfident uncertainty quantification for certain oddly behaving truths. We then construct a new empirical Bayes method based on risk estimation, which provides the correct uncertainty quantification and optimal size.
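The following Python sketch illustrates the marginal likelihood empirical Bayes construction in the sequence-space form of the signal-in-white-noise model, with prior variances i^(-1-2*alpha) and a ball around the posterior mean. The grid over alpha, the chosen true signal, and the crude radius calibration are my assumptions for illustration, not the paper's procedure.

```python
# Sketch of marginal-likelihood empirical Bayes in the sequence model:
# prior theta_i ~ N(0, i^(-1-2*alpha)), alpha chosen by maximizing the
# marginal likelihood, then an L2 ball around the posterior mean. The radius
# calibration here is a crude stand-in; the actual method fixes the radius by
# a prescribed posterior probability.
import numpy as np

def posterior_mean_and_var(y, n, alpha):
    i = np.arange(1, y.size + 1)
    lam = i ** (-1.0 - 2.0 * alpha)                  # prior variances
    shrink = lam / (lam + 1.0 / n)
    return shrink * y, shrink / n                    # posterior means and variances

def log_marginal_likelihood(y, n, alpha):
    i = np.arange(1, y.size + 1)
    v = i ** (-1.0 - 2.0 * alpha) + 1.0 / n          # marginal variance of y_i
    return -0.5 * np.sum(np.log(v) + y**2 / v)

rng = np.random.default_rng(3)
n, N = 1000, 2000
i = np.arange(1, N + 1)
theta0 = i ** (-1.5) * np.cos(i)                     # an illustrative true signal
y = theta0 + rng.standard_normal(N) / np.sqrt(n)     # sequence-model observations

alphas = np.linspace(0.1, 3.0, 60)
alpha_hat = alphas[np.argmax([log_marginal_likelihood(y, n, a) for a in alphas])]

mean, var = posterior_mean_and_var(y, n, alpha_hat)
radius = 2.0 * np.sqrt(np.sum(var))                  # crude radius, illustrative only
covered = np.sum((mean - theta0) ** 2) <= radius ** 2
print(f"alpha_hat = {alpha_hat:.2f}, L2 credible ball covers truth: {covered}")
```

Running this for different true signals gives a feel for the phenomenon in the abstract: for some truths the empirical Bayes ball can be small yet miss the truth, which is what motivates the risk-estimation-based modification.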