Consistency of Bayes factor for nonnested model selection when the model dimension grows
Zellner's $g$-prior is a popular prior choice for model selection problems in the context of normal regression models. Wang and Sun [J. Statist. Plann. Inference 147 (2014) 95-105] recently adopted this prior and put a special hyper-prior on $g$, which results in a closed-form expression of the Bayes factor for nested linear model comparisons. They have shown that, under very general conditions, the Bayes factor is consistent when the two competing models are of order $O(n^\tau)$ for $\tau < 1$, and for $\tau = 1$ it is almost consistent except for a small inconsistency region around the null hypothesis. In this paper, we
study Bayes factor consistency for nonnested linear models with a growing
number of parameters. Some of the proposed results generalize those of the Bayes factor for nested linear models. Specifically, we compare the asymptotic behavior of the proposed Bayes factor with that of the intrinsic Bayes factor from the literature.
Comment: Published at http://dx.doi.org/10.3150/15-BEJ720 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
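As a quick orientation to the quantity being studied, here is a minimal sketch of a $g$-prior Bayes factor in the simplest setting: a fixed $g$ and a comparison of a normal linear model against the intercept-only null, using the standard closed form from Liang et al. [J. Amer. Statist. Assoc. 103 (2008) 410-423]. This is not the hyper-prior closed form of the paper above; the function name and the unit-information default $g = n$ are illustrative assumptions.

```python
import numpy as np

def gprior_log_bf_vs_null(y, X, g=None):
    """Log Bayes factor of a normal linear model (with intercept) against
    the intercept-only null under Zellner's g-prior with a fixed g.
    Illustrative only: the paper above instead places a hyper-prior on g,
    which yields a different closed form."""
    n, p = X.shape
    if g is None:
        g = float(n)  # unit-information choice (an assumption)
    Xc = X - X.mean(axis=0)           # center covariates: intercept drops out
    yc = y - y.mean()
    beta, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    rss = np.sum((yc - Xc @ beta) ** 2)
    r2 = 1.0 - rss / np.sum(yc ** 2)  # coefficient of determination
    # Liang et al. (2008): BF = (1+g)^{(n-p-1)/2} / (1+g(1-R^2))^{(n-1)/2}
    return 0.5 * (n - p - 1) * np.log1p(g) - 0.5 * (n - 1) * np.log1p(g * (1.0 - r2))
```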
Consistency of objective Bayes factors as the model dimension grows
In the class of normal regression models with a finite number of regressors,
and for a wide class of prior distributions, a Bayesian model selection
procedure based on the Bayes factor is consistent [Casella and Moreno J. Amer.
Statist. Assoc. 104 (2009) 1261--1271]. However, in models where the number of
parameters increases as the sample size increases, properties of the Bayes
factor are not totally understood. Here we study consistency of the Bayes
factors for nested normal linear models when the number of regressors increases
with the sample size. We pay attention to two successful tools for model
selection: the Schwarz [Ann. Statist. 6 (1978) 461--464] approximation to the Bayes
factor, and the Bayes factor for intrinsic priors [Berger and Pericchi J. Amer.
Statist. Assoc. 91 (1996) 109--122, Moreno, Bertolino and Racugno J. Amer.
Statist. Assoc. 93 (1998) 1451--1460]. We find that the Schwarz approximation and the Bayes factor for intrinsic priors are consistent when the rate of growth of the dimension of the bigger model is $O(n^b)$ for $b < 1$. When $b = 1$, the Schwarz approximation is always inconsistent under the alternative, while the Bayes factor for intrinsic priors is consistent except for a small
set of alternative models, which is characterized.
Comment: Published at http://dx.doi.org/10.1214/09-AOS754 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
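For reference, the Schwarz approximation discussed above replaces each marginal likelihood by its BIC-penalized maximum, so the log Bayes factor of model 1 over a nested model 0 is approximated by half the BIC difference. A minimal sketch for Gaussian linear models follows; the function and argument names are illustrative, not from the paper.

```python
import numpy as np

def schwarz_log_bf(rss0, p0, rss1, p1, n):
    """Schwarz (BIC) approximation to the log Bayes factor of model 1 over a
    nested model 0, for Gaussian linear models fitted to n observations.
    BIC = n*log(RSS/n) + p*log(n); log BF_10 ~= (BIC_0 - BIC_1) / 2."""
    bic0 = n * np.log(rss0 / n) + p0 * np.log(n)
    bic1 = n * np.log(rss1 / n) + p1 * np.log(n)
    return 0.5 * (bic0 - bic1)
```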
Make the most of your samples: Bayes factor estimators for high-dimensional models of sequence evolution
Background: Accurate model comparison requires extensive computation times, especially for parameter-rich models of sequence evolution. In the Bayesian framework, model selection is typically performed through the evaluation of a Bayes factor, the ratio of two marginal likelihoods (one for each model). Recently introduced techniques to estimate (log) marginal likelihoods, such as path sampling and stepping-stone sampling, offer increased accuracy over the traditional harmonic mean estimator at an increased computational cost. Most often, each model's marginal likelihood will be estimated individually, which leads the resulting Bayes factor to suffer from errors associated with each of these independent estimation processes.
Results: We here assess the original 'model-switch' path sampling approach for direct Bayes factor estimation in phylogenetics, as well as an extension that uses more samples to construct a direct path between two competing models, thereby eliminating the need to calculate each model's marginal likelihood independently. Further, we provide a competing Bayes factor estimator using an adaptation of the recently introduced stepping-stone sampling algorithm and set out to determine appropriate settings for accurately calculating such Bayes factors, with context-dependent evolutionary models as an example. While we show that modest efforts are required to roughly identify the increase in model fit, only drastically increased computation times ensure the accuracy needed to detect more subtle details of the evolutionary process.
Conclusions: We show that our adaptation of stepping-stone sampling for direct Bayes factor calculation outperforms the original path sampling approach as well as an extension that exploits more samples. Our proposed approach for Bayes factor estimation also has preferable statistical properties over the use of individual marginal likelihood estimates for both models under comparison. Assuming a sigmoid function to determine the path between two competing models, we provide evidence that a single well-chosen sigmoid shape value requires less computational effort to approximate the true value of the (log) Bayes factor compared to the original approach. We show that the (log) Bayes factors calculated using path sampling and stepping-stone sampling differ drastically from those estimated using either of the harmonic mean estimators, supporting earlier claims that the latter systematically overestimate the performance of high-dimensional models, which we show can lead to erroneous conclusions. Based on our results, we argue that highly accurate estimation of differences in model fit for high-dimensional models requires much more computational effort than suggested in recent studies on marginal likelihood estimation.
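For concreteness, here is a hedged sketch of the generic stepping-stone estimator of a single log marginal likelihood [Xie et al. Syst. Biol. 60 (2011) 150-160], on which the direct Bayes factor variant above builds. The two callables are assumed to be supplied by the user (e.g., via MCMC under the tempered posterior), and all names are illustrative.

```python
import numpy as np

def stepping_stone_log_ml(log_lik, sample_power_posterior, betas, m=1000):
    """Generic stepping-stone estimate of a log marginal likelihood.
    log_lik(theta): log likelihood of one draw; sample_power_posterior(beta, m):
    m draws from the posterior tempered as likelihood**beta (assumed to be
    provided by the user). betas must run from 0.0 (prior) to 1.0 (posterior)."""
    log_ml = 0.0
    for b_prev, b_next in zip(betas[:-1], betas[1:]):
        draws = sample_power_posterior(b_prev, m)
        # importance weights L(theta)^(b_next - b_prev), kept on the log scale
        log_w = (b_next - b_prev) * np.array([log_lik(t) for t in draws])
        log_ml += np.logaddexp.reduce(log_w) - np.log(m)  # log-mean-exp per stone
    return log_ml
```

A Bayes factor then follows as the difference of two such estimates; the 'model-switch' variants assessed above instead place the path directly between the two competing models, so only one run is needed.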
Bayes factors and the geometry of discrete hierarchical loglinear models
A standard tool for model selection in a Bayesian framework is the Bayes factor, which compares the marginal likelihoods of the data under two different models. In this paper, we consider the class of hierarchical loglinear models for discrete data given in the form of a contingency table with multinomial sampling.
with multinomial sampling. We assume that the Diaconis-Ylvisaker conjugate
prior is the prior distribution on the loglinear parameters and the uniform is
the prior distribution on the space of models. Under these conditions, the
Bayes factor between two models is a function of their prior and posterior
normalizing constants. These constants are functions of the prior hyperparameters, which can be interpreted respectively as the marginal counts and the total count $\alpha$ of a fictive contingency table.
We study the behaviour of the Bayes factor when $\alpha$ tends to zero. In this study, two mathematical objects play a most important role. They are, first, the interior $C$ of the convex hull $\overline{C}$ of the support of the multinomial distribution for a given hierarchical loglinear model, together with its faces, and second, the characteristic function $\mathbb{J}_C$ of this convex set $C$.
We show that, when $\alpha$ tends to 0, if the data lies on a face $F_i$ of $\overline{C}_i$ of dimension $k_i$, $i = 1, 2$, the Bayes factor behaves like $\alpha^{k_1 - k_2}$. This implies in particular that when the data is in $C_1$ and in $C_2$, i.e. when $k_i$ equals the dimension of model $i$, the sparser model is favored, thus confirming the idea of Bayesian regularization.
Comment: 37 pages
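To make the normalizing-constant view concrete: for the saturated model, the Diaconis-Ylvisaker conjugate prior reduces to a Dirichlet on the cell probabilities, and the marginal likelihood is a ratio of posterior to prior normalizing constants. The sketch below covers only this special case (general hierarchical models need the model-specific constants from the paper); the uniform split of the total count $\alpha$ across cells and all names are assumptions.

```python
import numpy as np
from scipy.special import gammaln

def log_marginal_saturated(counts, alpha):
    """Log marginal likelihood (up to the multinomial coefficient, which
    cancels in any Bayes factor) of a contingency table under a Dirichlet
    prior, i.e. the log ratio of posterior to prior normalizing constants.
    counts: flattened cell counts; alpha: total fictive count, split
    uniformly over the cells here (an assumption for illustration)."""
    counts = np.asarray(counts, dtype=float)
    a = np.full(counts.size, alpha / counts.size)

    def log_b(v):
        # log of the Dirichlet normalizing constant B(v)
        return gammaln(v).sum() - gammaln(v.sum())

    return log_b(a + counts) - log_b(a)
```

The log Bayes factor between two models is then the difference of their log marginals, and driving $\alpha$ toward 0 exhibits the regularization behaviour described above.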