Bayesian Variable Selection for Ultrahigh-dimensional Sparse Linear Models

Abstract

We propose a Bayesian variable selection procedure for ultrahigh-dimensional linear regression models. The number of regressors, p_n, is allowed to grow exponentially with the sample size n. Assuming the true model to be sparse, in the sense that only a small number of regressors contribute to it, we propose a set of priors suitable for this regime. The model selection procedure based on the proposed priors is shown to be variable selection consistent when all 2^{p_n} models are considered. In the ultrahigh-dimensional setting, selecting the true model among all 2^{p_n} candidates is computationally prohibitive. To cope with this, we present a two-step model selection algorithm based on screening and Gibbs sampling. The screening step discards a large set of unimportant covariates and retains a smaller set that contains all the active covariates with probability tending to one. In the second step, we search for the best model among the covariates retained by the screening step. The procedure is computationally fast, simple, and intuitive. We demonstrate competitive performance of the proposed algorithm on a variety of simulated and real data sets in comparison with several frequentist as well as Bayesian methods.
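The two-step procedure described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's method: the screening step here uses a marginal-correlation statistic (in the spirit of sure independence screening), and the search step uses a Gibbs-style single-flip sampler over inclusion indicators scored by BIC as a stand-in for the paper's posterior model probabilities. The function names, the cutoff d, and the scoring criterion are all assumptions made for illustration.

```python
import numpy as np

def screen(X, y, d):
    """Step 1 (screening): keep the d covariates with the largest
    absolute marginal correlation with y (illustrative surrogate;
    the paper's actual screening statistic may differ)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    scores = np.abs(Xc.T @ yc) / np.linalg.norm(Xc, axis=0)
    return np.argsort(scores)[::-1][:d]

def gibbs_search(X, y, n_iter=800, seed=0):
    """Step 2 (search): Gibbs-style sampler over inclusion indicators
    gamma, flipping one coordinate at a time and scoring models by
    BIC (a placeholder for the posterior used in the paper)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape

    def bic(g):
        k = int(g.sum())
        if k == 0:
            rss = np.sum((y - y.mean()) ** 2)
        else:
            beta, *_ = np.linalg.lstsq(X[:, g], y, rcond=None)
            rss = np.sum((y - X[:, g] @ beta) ** 2)
        return n * np.log(rss / n) + k * np.log(n)

    gamma = np.zeros(p, dtype=bool)
    best, best_score = gamma.copy(), bic(gamma)
    for _ in range(n_iter):
        j = rng.integers(p)
        cur = bic(gamma)
        gamma[j] = ~gamma[j]          # propose flipping indicator j
        prop = bic(gamma)
        # accept the flip with probability proportional to exp(-BIC/2)
        if np.log(rng.uniform()) > (cur - prop) / 2:
            gamma[j] = ~gamma[j]      # reject: revert the flip
        score = bic(gamma)
        if score < best_score:
            best, best_score = gamma.copy(), score
    return best
```

On synthetic sparse data (e.g. n = 100, p_n = 200, three active covariates with strong signal), screening to d = 20 typically retains the active set, after which the sampler concentrates on the true model.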
