102 research outputs found

    Adaptive Bayesian Predictive Inference

    Bayesian predictive inference provides a coherent description of entire predictive uncertainty through predictive distributions. We examine several widely used sparsity priors from the predictive (as opposed to estimation) inference viewpoint. Our context is estimating a predictive distribution of a high-dimensional Gaussian observation with a known variance but an unknown sparse mean under the Kullback-Leibler loss. First, we show that LASSO (Laplace) priors are incapable of achieving rate-optimal performance. This new result contributes to the literature on negative findings about Bayesian LASSO posteriors. However, by deploying the Laplace prior inside the Spike-and-Slab framework (for example, with the Spike-and-Slab LASSO prior), rate-minimax performance can be attained with properly tuned parameters (depending on the sparsity level $s_n$). We highlight the discrepancy between prior calibration for the purpose of prediction and estimation. Going further, we investigate popular hierarchical priors which are known to attain adaptive rate-minimax performance for estimation. Whether or not they are rate-minimax also for predictive inference has, until now, been unclear. We answer affirmatively by showing that hierarchical Spike-and-Slab priors are adaptive and attain the minimax rate without knowledge of $s_n$. This is the first rate-adaptive result in the literature on predictive density estimation in sparse setups. This finding celebrates the benefits of fully Bayesian inference.

    Simultaneous Variable and Covariance Selection with the Multivariate Spike-and-Slab Lasso

    We propose a Bayesian procedure for simultaneous variable and covariance selection using continuous spike-and-slab priors in multivariate linear regression models where q possibly correlated responses are regressed onto p predictors. Rather than relying on a stochastic search through the high-dimensional model space, we develop an ECM algorithm, similar to the EMVS procedure of Rockova & George (2014), targeting modal estimates of the matrix of regression coefficients and the residual precision matrix. Varying the scale of the continuous spike densities facilitates dynamic posterior exploration and allows us to gradually filter out negligible regression coefficients and partial covariances. Our method is seen to substantially outperform regularization competitors on simulated data. We demonstrate our method with a re-examination of data from a recent observational study of the effect of playing high school football on several later-life cognitive, psychological, and socio-economic outcomes.

    The Median Probability Model and Correlated Variables

    The median probability model (MPM) of Barbieri and Berger (2004) is defined as the model consisting of those variables whose marginal posterior probability of inclusion is at least 0.5. The MPM rule yields the best single model for prediction in orthogonal and nested correlated designs. This result was originally conceived under a specific class of priors, such as the point mass mixtures of non-informative and g-type priors. The MPM rule, however, has become so popular that it is now being deployed for a wider variety of priors and under correlated designs, where the properties of MPM are not yet completely understood. The main thrust of this work is to shed light on properties of MPM in these contexts by (a) characterizing situations when MPM is still safe under correlated designs and (b) providing significant generalizations of MPM to a broader class of priors (such as continuous spike-and-slab priors). We also provide new supporting evidence for the suitability of g-priors, as opposed to independent product priors, using new predictive matching arguments. Furthermore, we emphasize the importance of prior model probabilities and highlight the merits of non-uniform prior probability assignments using the notion of model aggregates.
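
    The MPM definition in the abstract above can be illustrated with a minimal sketch. It assumes posterior model probabilities are already available for an enumerated set of models; the function names and the toy numbers are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def inclusion_probabilities(
    models: Sequence[Tuple[int, ...]], probs: Sequence[float]
) -> List[float]:
    """Marginal posterior inclusion probability of each variable.

    `models` holds 0/1 inclusion indicators (one tuple per model) and
    `probs` holds the matching posterior model probabilities.
    """
    total = sum(probs)  # normalize in case the weights do not sum to 1
    n_vars = len(models[0])
    incl = [0.0] * n_vars
    for model, p in zip(models, probs):
        for j, included in enumerate(model):
            if included:
                incl[j] += p / total
    return incl


def median_probability_model(
    models: Sequence[Tuple[int, ...]],
    probs: Sequence[float],
    threshold: float = 0.5,
) -> List[int]:
    """Indices of variables whose inclusion probability is at least `threshold`."""
    incl = inclusion_probabilities(models, probs)
    return [j for j, p in enumerate(incl) if p >= threshold]


# Toy posterior over three candidate models on three variables (made-up numbers).
toy_models = [(1, 0, 0), (1, 1, 0), (0, 1, 1)]
toy_probs = [0.3, 0.3, 0.4]

mpm = median_probability_model(toy_models, toy_probs)
# Highest-posterior-probability model, for comparison with the MPM.
hpm = max(zip(toy_models, toy_probs), key=lambda mp: mp[1])[0]
print("inclusion probabilities:", inclusion_probabilities(toy_models, toy_probs))
print("MPM variables:", mpm)
print("highest-probability model:", hpm)
```

    In this toy posterior the MPM keeps variables 0 and 1, while the single most probable model is (0, 1, 1), so the two selection rules need not agree.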

    The art of BART: On flexibility of Bayesian forests

    Considerable effort has been directed to developing asymptotically minimax procedures in problems of recovering functions and densities. These methods often rely on somewhat arbitrary and restrictive assumptions such as isotropy or spatial homogeneity. This work enhances theoretical understanding of Bayesian forests (including BART) under substantially relaxed smoothness assumptions. In particular, we provide a comprehensive study of asymptotic optimality and posterior contraction of Bayesian forests when the regression function has anisotropic smoothness that possibly varies over the function domain. We introduce a new class of sparse piecewise heterogeneous anisotropic Hölder functions and derive their minimax rate of estimation in high-dimensional scenarios under the $L_2$ loss. Next, we find that the default Bayesian CART prior, coupled with a subset selection prior for sparse estimation in high-dimensional scenarios, adapts to unknown heterogeneous smoothness and sparsity. These results show that Bayesian forests are uniquely suited for more general estimation problems which would render other default machine learning tools, such as Gaussian processes, suboptimal. Beyond nonparametric regression, we also show that Bayesian forests can be successfully applied to many other problems including density estimation and binary classification.