Power-Expected-Posterior Priors for Variable Selection in Gaussian Linear Models
In the context of the expected-posterior prior (EPP) approach to Bayesian
variable selection in linear models, we combine ideas from power-prior and
unit-information-prior methodologies to simultaneously produce a
minimally-informative prior and diminish the effect of training samples. The
result is that in practice our power-expected-posterior (PEP) methodology is
sufficiently insensitive to the size n* of the training sample, due to PEP's
unit-information construction, that one may take n* equal to the full-data
sample size n and dispense with training samples altogether. In this paper we
focus on Gaussian linear models and develop our method under two different
baseline prior choices: the independence Jeffreys (or reference) prior,
yielding the J-PEP posterior, and the Zellner g-prior, leading to Z-PEP. We
find that, under the reference baseline prior, the asymptotics of PEP Bayes
factors are equivalent to those of Schwarz's BIC criterion, ensuring
consistency of the PEP approach to model selection. We compare the performance
of our method, in simulation studies and a real example involving prediction of
air-pollutant concentrations from meteorological covariates, with that of a
variety of previously-defined variants on Bayes factors for objective variable
selection. Our prior, due to its unit-information structure, leads to a
variable-selection procedure that (1) is systematically more parsimonious than
the basic EPP with minimal training sample, while sacrificing no desirable
performance characteristics to achieve this parsimony; (2) is robust to the
size of the training sample, thus enjoying the advantages described above
arising from the avoidance of training samples altogether; and (3) identifies
maximum-a-posteriori models that achieve good out-of-sample predictive
performance
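Since, under the reference baseline, PEP Bayes factors are asymptotically equivalent to BIC, an exhaustive BIC search conveys the flavor of the resulting selection behavior. A minimal illustrative sketch with simulated toy data and a plain BIC score (not the PEP machinery itself; all names and effect sizes below are made up):

```python
import itertools

import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 4
X = rng.standard_normal((n, p))
# Toy data: only the first two covariates enter the true model.
y = 1.5 * X[:, 0] - 2.0 * X[:, 1] + rng.standard_normal(n)

def bic(X_sub, y):
    """BIC (up to an additive constant) for a Gaussian linear model
    fit by least squares, intercept always included."""
    n = len(y)
    Z = np.column_stack([np.ones(n), X_sub]) if X_sub.size else np.ones((n, 1))
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    rss = float(np.sum((y - Z @ beta) ** 2))
    return n * np.log(rss / n) + Z.shape[1] * np.log(n)

# Enumerate all 2^p subsets and pick the BIC-minimizing model.
models = [m for r in range(p + 1) for m in itertools.combinations(range(p), r)]
best = min(models, key=lambda m: bic(X[:, list(m)], y))
```

With strong signals like these, the BIC-minimizing subset contains the two true covariates; the BIC penalty is what makes the search lean parsimonious, mirroring the consistency property cited above.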
Prior distributions for objective Bayesian analysis
We provide a review of prior distributions for objective Bayesian analysis. We start by examining some foundational issues and then organize our exposition into priors for: i) estimation or prediction; ii) model selection; iii) high-dimensional models. With regard to i), we present some basic notions, and then move to more recent contributions on discrete parameter spaces, hierarchical models, nonparametric models, and penalized complexity priors. Point ii) is the focus of this paper: it discusses principles for objective Bayesian model comparison, and singles out some major concepts for building priors, which are subsequently illustrated in some detail for the classic problem of variable selection in normal linear models. We also present some recent contributions in the area of objective priors on model space. With regard to point iii), we only provide a short summary of some default priors for high-dimensional models, a rapidly growing area of research
Training samples in objective Bayesian model selection
Central to several objective approaches to Bayesian model selection is the
use of training samples (subsets of the data), so as to allow utilization of
improper objective priors. The most common prescription for choosing training
samples is to choose them to be as small as possible, subject to yielding
proper posteriors; these are called minimal training samples.
When data can vary widely in terms of either information content or impact on
the improper priors, use of minimal training samples can be inadequate.
Important examples include certain cases of discrete data, the presence of
censored observations, and certain situations involving linear models and
explanatory variables. Such situations require more sophisticated methods of
choosing training samples. A variety of such methods are developed in this
paper, and successfully applied in challenging situations
Optimal predictive model selection
Often the goal of model selection is to choose a model for future prediction,
and it is natural to measure the accuracy of a future prediction by squared
error loss. Under the Bayesian approach, it is commonly perceived that the
optimal predictive model is the model with highest posterior probability, but
this is not necessarily the case. In this paper we show that, for selection
among normal linear models, the optimal predictive model is often the median
probability model, which is defined as the model consisting of those variables
which have overall posterior probability greater than or equal to 1/2 of being
in a model. The median probability model often differs from the highest
probability model
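The median probability model is straightforward to compute once posterior model probabilities are available. A minimal sketch with made-up toy probabilities over three candidate variables:

```python
# Toy posterior model probabilities over subsets of p = 3 candidate
# variables (made-up numbers; models are tuples of included indices).
post = {
    (): 0.04, (0,): 0.08, (1,): 0.04, (2,): 0.08,
    (0, 1): 0.32, (0, 2): 0.08, (1, 2): 0.08, (0, 1, 2): 0.28,
}

def inclusion_probs(post, p):
    """Overall posterior probability that each variable is in the model."""
    return [sum(pr for m, pr in post.items() if j in m) for j in range(p)]

def median_probability_model(post, p):
    """Variables whose posterior inclusion probability is >= 1/2."""
    return tuple(j for j, q in enumerate(inclusion_probs(post, p)) if q >= 0.5)

def highest_probability_model(post):
    return max(post, key=post.get)
```

With these numbers the median probability model is {0, 1, 2} (inclusion probabilities 0.76, 0.72, 0.52) while the highest probability model is {0, 1}, illustrating how the two can differ.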
Mixtures of g-priors in Generalized Linear Models
Mixtures of Zellner's g-priors have been studied extensively in linear models
and have been shown to have numerous desirable properties for Bayesian variable
selection and model averaging. Several extensions of g-priors to Generalized
Linear Models (GLMs) have been proposed in the literature; however, the choice
of prior distribution of g and resulting properties for inference have received
considerably less attention. In this paper, we unify mixtures of g-priors in
GLMs by assigning the truncated Compound Confluent Hypergeometric (tCCH)
distribution to 1/(1 + g), which encompasses as special cases several mixtures
of g-priors in the literature, such as the hyper-g, Beta-prime, truncated
Gamma, incomplete inverse-Gamma, benchmark, robust, hyper-g/n, and intrinsic
priors. Through an integrated Laplace approximation, the posterior distribution
of 1/(1 + g) is in turn a tCCH distribution, and approximate marginal
likelihoods are thus available analytically, leading to "Compound
Hypergeometric Information Criteria" for model selection. We discuss the local
geometric properties of the g-prior in GLMs and show how the desiderata for
model selection proposed by Bayarri et al., such as asymptotic model selection
consistency, intrinsic consistency, and measurement invariance, may be used to
justify the prior and specific choices of the hyperparameters. We illustrate
inference using these priors and contrast them to other approaches via
simulation and real data examples. The methodology is implemented in the R
package BAS and freely available on CRAN
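For orientation, in the ordinary Gaussian linear model (not the GLM setting of this paper) the Bayes factor against the intercept-only null under Zellner's g-prior with fixed g has a well-known closed form in terms of R^2. A minimal sketch of that linear-model special case (the function name is ours, not the package's API):

```python
import numpy as np

def g_prior_log_bf(X, y, g):
    """Log Bayes factor of the linear model with design X (plus intercept)
    against the intercept-only null, under Zellner's g-prior with fixed g:
    BF = (1 + g)^{(n - 1 - p)/2} * (1 + g (1 - R^2))^{-(n - 1)/2}."""
    n, p = X.shape
    Z = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    rss = float(np.sum((y - Z @ beta) ** 2))
    tss = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - rss / tss
    return 0.5 * (n - 1 - p) * np.log1p(g) - 0.5 * (n - 1) * np.log1p(g * (1.0 - r2))
```

Taking g = n gives the unit-information choice; the mixtures of g-priors discussed above replace the fixed g with a prior on g (here, on 1/(1 + g)) and integrate it out.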
Modeling Transport Mode Decisions Using Hierarchical Binary Spatial Regression Models with Cluster Effects
This work is motivated by a mobility study conducted in the city of Munich, Germany. The variable of interest is a binary response indicating whether public transport has been utilized or not. One of the central questions is to identify areas of low/high utilization of public transport after adjusting for explanatory factors such as trip, individual, and household attributes. The goal is to develop flexible statistical models for a binary response with covariate, spatial, and cluster effects. One approach for modeling spatial effects is the use of Markov Random Fields (MRF). A modification of a class of MRF models with proper joint distributions introduced by Pettitt et al. (2002) is developed. This modification has the desirable property of containing the intrinsic MRF in the limit while still allowing for efficient spatial parameter updates in Markov Chain Monte Carlo (MCMC) algorithms. In addition to spatial effects, cluster effects are taken into consideration. Group and individual approaches for modeling these effects are suggested: the first models heterogeneity between clusters, while the second models heterogeneity within clusters. A naive approach to including individual cluster effects results in an unidentifiable model; it is shown how an appropriate reparametrization gives identifiable parameters. This provides a new approach for modeling heterogeneity within clusters. For hierarchical spatial binary regression models with individual cluster effects, two MCMC algorithms for parameter estimation are developed. The first is based on a direct evaluation of the likelihood. The second is based on the representation of binary responses with Gaussian latent variables through a threshold mechanism, which is particularly useful for probit models. Simulation results show satisfactory behavior of the MCMC algorithms developed. Finally, the proposed model classes are applied to the mobility study and the results are interpreted
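The threshold representation of binary responses via Gaussian latent variables is the classical Albert and Chib (1993) data augmentation. A minimal sketch for a plain Bayesian probit model, omitting the spatial MRF and cluster effects that are the substance of the paper (a N(0, tau2 I) prior on the coefficients is our simplifying assumption):

```python
import numpy as np
from scipy.stats import truncnorm

def probit_gibbs(X, y, n_iter=400, tau2=100.0, seed=0):
    """Albert-Chib data augmentation for Bayesian probit regression:
    y_i = 1{z_i > 0}, z_i ~ N(x_i' beta, 1), beta ~ N(0, tau2 * I)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    V = np.linalg.inv(X.T @ X + np.eye(p) / tau2)  # posterior cov of beta | z
    L = np.linalg.cholesky(V)
    beta = np.zeros(p)
    draws = np.empty((n_iter, p))
    for it in range(n_iter):
        mu = X @ beta
        # Sample latent z from normals truncated to the half-line fixed by y.
        lo = np.where(y == 1, -mu, -np.inf)
        hi = np.where(y == 1, np.inf, -mu)
        z = mu + truncnorm.rvs(lo, hi, random_state=rng)
        # Conjugate Gaussian update for beta given the latent z.
        m = V @ (X.T @ z)
        beta = m + L @ rng.standard_normal(p)
        draws[it] = beta
    return draws
```

Given z, the beta update is an ordinary conjugate Gaussian regression step, which is what makes the probit case particularly convenient for this augmentation.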
Exact Dimensionality Selection for Bayesian PCA
We present a Bayesian model selection approach to estimate the intrinsic
dimensionality of a high-dimensional dataset. To this end, we introduce a novel
formulation of the probabilistic principal component analysis model based on a
normal-gamma prior distribution. In this context, we exhibit a closed-form
expression of the marginal likelihood which allows us to infer an optimal number
of components. We also propose a heuristic based on the expected shape of the
marginal likelihood curve in order to choose the hyperparameters. In
non-asymptotic frameworks, we show on simulated data that this exact
dimensionality selection approach is competitive with both Bayesian and
frequentist state-of-the-art methods
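As a rough stand-in for such a criterion (not the paper's exact normal-gamma marginal likelihood), one can score each candidate dimensionality by the maximized PPCA log-likelihood of Tipping and Bishop plus a BIC penalty; the parameter count below is the usual PPCA count up to rotation of the loadings, and is our assumption:

```python
import numpy as np

def ppca_loglik(evals, k, n):
    """Maximized PPCA log-likelihood (Tipping & Bishop) given the
    descending eigenvalues of the sample covariance, k retained components."""
    d = len(evals)
    sigma2 = evals[k:].mean() if k < d else 0.0  # ML noise variance
    return -0.5 * n * (d * np.log(2 * np.pi)
                       + np.sum(np.log(evals[:k]))
                       + (d - k) * np.log(sigma2)
                       + d)

def select_dim_bic(X):
    """Pick k minimizing BIC over k = 1..d-1 (a toy stand-in for an
    exact marginal-likelihood criterion)."""
    n, d = X.shape
    Xc = X - X.mean(axis=0)
    evals = np.linalg.eigvalsh(Xc.T @ Xc / n)[::-1]  # descending order
    best_k, best_bic = None, np.inf
    for k in range(1, d):
        # Free parameters: mean (d) + loadings up to rotation + noise variance.
        m = d + d * k - k * (k - 1) // 2 + 1
        b = -2.0 * ppca_loglik(evals, k, n) + m * np.log(n)
        if b < best_bic:
            best_k, best_bic = k, b
    return best_k
```

When the signal eigenvalues are well separated from the noise floor, this recovers the intrinsic dimensionality; the Bayesian approaches above are aimed precisely at the non-asymptotic regimes where such penalized criteria become unreliable.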