17,022 research outputs found

    Consistency of Bayes factor for nonnested model selection when the model dimension grows

    Zellner's $g$-prior is a popular prior choice for model selection in the context of normal regression models. Wang and Sun [J. Statist. Plann. Inference 147 (2014) 95-105] recently adopted this prior with a special hyper-prior for $g$, which results in a closed-form expression of the Bayes factor for nested linear model comparisons. They showed that, under very general conditions, the Bayes factor is consistent when the two competing models are of order $O(n^{\tau})$ for $\tau<1$, and that for $\tau=1$ it is almost consistent, except for a small inconsistency region around the null hypothesis. In this paper, we study Bayes factor consistency for nonnested linear models with a growing number of parameters. Some of the proposed results generalize those for the Bayes factor in the case of nested linear models. Specifically, we compare the asymptotic behavior of the proposed Bayes factor with that of the intrinsic Bayes factor in the literature.

    Comment: Published at http://dx.doi.org/10.3150/15-BEJ720 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
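Wang and Sun place a hyper-prior on $g$, but the mechanics are easiest to see with $g$ held fixed: for a normal linear model with $p$ regressors and coefficient of determination $R^2$, the Bayes factor against the intercept-only null model has the well-known closed form $(1+g)^{(n-1-p)/2}\,[1+g(1-R^2)]^{-(n-1)/2}$. A minimal sketch of that fixed-$g$ formula (the numeric inputs are illustrative, not from the paper):

```python
import math

def zellner_bf(n, p, r2, g):
    """Bayes factor for a p-regressor normal linear model vs. the
    intercept-only null model under Zellner's g-prior with fixed g.
    r2 is the model's coefficient of determination."""
    log_bf = (0.5 * (n - 1 - p) * math.log1p(g)
              - 0.5 * (n - 1) * math.log1p(g * (1.0 - r2)))
    return math.exp(log_bf)

# A model that explains substantial variance is favored over the null;
# one that explains none is penalized for its extra parameters.
print(zellner_bf(n=100, p=3, r2=0.4, g=100))  # > 1: evidence for the model
print(zellner_bf(n=100, p=3, r2=0.0, g=100))  # < 1: evidence for the null
```

Note how the penalty factor $(1+g)^{(n-1-p)/2}$ grows with $n$, which is what makes the dimension-growth regimes $\tau<1$ versus $\tau=1$ delicate.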

    Consistency of objective Bayes factors as the model dimension grows

    In the class of normal regression models with a finite number of regressors, and for a wide class of prior distributions, a Bayesian model selection procedure based on the Bayes factor is consistent [Casella and Moreno, J. Amer. Statist. Assoc. 104 (2009) 1261-1271]. However, in models where the number of parameters increases with the sample size, the properties of the Bayes factor are not fully understood. Here we study consistency of Bayes factors for nested normal linear models when the number of regressors increases with the sample size. We pay attention to two successful tools for model selection: the Schwarz approximation to the Bayes factor [Schwarz, Ann. Statist. 6 (1978) 461-464], and the Bayes factor for intrinsic priors [Berger and Pericchi, J. Amer. Statist. Assoc. 91 (1996) 109-122; Moreno, Bertolino and Racugno, J. Amer. Statist. Assoc. 93 (1998) 1451-1460]. We find that the Schwarz approximation and the Bayes factor for intrinsic priors are consistent when the rate of growth of the dimension of the bigger model is $O(n^b)$ for $b<1$. When $b=1$, the Schwarz approximation is always inconsistent under the alternative, while the Bayes factor for intrinsic priors is consistent except for a small set of alternative models, which is characterized.

    Comment: Published at http://dx.doi.org/10.1214/09-AOS754 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
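The Schwarz approximation replaces each marginal likelihood with the BIC-penalized maximized likelihood, so the log Bayes factor is approximated by half the difference in BIC values. A short sketch with made-up fitted log-likelihoods (the numbers are illustrative, not from the paper):

```python
import math

def bic(loglik_hat, k, n):
    """Schwarz criterion: -2 * (maximized log-likelihood) + k * log(n)."""
    return -2.0 * loglik_hat + k * math.log(n)

def approx_log_bf(loglik1, k1, loglik0, k0, n):
    """Schwarz approximation to log BF of model 1 vs. model 0:
    log BF ~ (BIC_0 - BIC_1) / 2."""
    return 0.5 * (bic(loglik0, k0, n) - bic(loglik1, k1, n))

# Hypothetical fitted log-likelihoods for a nested pair of models:
# the larger model (5 parameters) fits better, but pays a log(n) penalty
# per extra parameter.
print(approx_log_bf(loglik1=-120.0, k1=5, loglik0=-130.0, k0=2, n=200))
```

The $k\log n$ penalty is exactly what breaks down when $k$ itself grows at rate $O(n^b)$ with $b=1$, the regime where the abstract reports that this approximation becomes inconsistent under the alternative.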

    Make the most of your samples: Bayes factor estimators for high-dimensional models of sequence evolution

    Background: Accurate model comparison requires extensive computation time, especially for parameter-rich models of sequence evolution. In the Bayesian framework, model selection is typically performed through the evaluation of a Bayes factor, the ratio of two marginal likelihoods (one for each model). Recently introduced techniques to estimate (log) marginal likelihoods, such as path sampling and stepping-stone sampling, offer increased accuracy over the traditional harmonic mean estimator at an increased computational cost. Most often, each model's marginal likelihood is estimated individually, so the resulting Bayes factor accumulates the errors of two independent estimation processes. Results: We here assess the original 'model-switch' path sampling approach for direct Bayes factor estimation in phylogenetics, as well as an extension that uses more samples to construct a direct path between two competing models, thereby eliminating the need to calculate each model's marginal likelihood independently. Further, we provide a competing Bayes factor estimator using an adaptation of the recently introduced stepping-stone sampling algorithm, and set out to determine appropriate settings for accurately calculating such Bayes factors, with context-dependent evolutionary models as an example. While we show that modest efforts suffice to roughly identify the increase in model fit, only drastically increased computation times ensure the accuracy needed to detect more subtle details of the evolutionary process. Conclusions: We show that our adaptation of stepping-stone sampling for direct Bayes factor calculation outperforms the original path sampling approach as well as an extension that exploits more samples. Our proposed approach for Bayes factor estimation also has preferable statistical properties over the use of individual marginal likelihood estimates for both models under comparison.
Assuming a sigmoid function to determine the path between two competing models, we provide evidence that a single well-chosen sigmoid shape value requires less computational effort to approximate the true value of the (log) Bayes factor compared to the original approach. We show that the (log) Bayes factors calculated using path sampling and stepping-stone sampling differ drastically from those estimated using either of the harmonic mean estimators, supporting earlier claims that the latter systematically overestimate the performance of high-dimensional models, which we show can lead to erroneous conclusions. Based on our results, we argue that highly accurate estimation of differences in model fit for high-dimensional models requires much more computational effort than suggested in recent studies on marginal likelihood estimation.
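The stepping-stone idea can be illustrated on a deliberately tiny conjugate model where both the power posteriors and the true marginal likelihood are available in closed form: a single observation $y \sim N(\mu, 1)$ with prior $\mu \sim N(0, 1)$. The model, step schedule, and sample sizes below are our choices for illustration, not the paper's phylogenetic setting, and we estimate one marginal likelihood rather than a direct Bayes factor path between two models:

```python
import math
import random

random.seed(1)
y = 1.3  # single observation; likelihood y ~ N(mu, 1), prior mu ~ N(0, 1)

def loglik(mu):
    return -0.5 * math.log(2 * math.pi) - 0.5 * (y - mu) ** 2

def sample_power_posterior(beta, size):
    # For this conjugate toy model the power posterior (likelihood^beta
    # times prior) is Gaussian with precision beta + 1 and mean
    # beta * y / (beta + 1), so we can sample it exactly.
    prec = beta + 1.0
    mean = beta * y / prec
    sd = 1.0 / math.sqrt(prec)
    return [random.gauss(mean, sd) for _ in range(size)]

def stepping_stone(K=30, S=4000):
    betas = [(k / K) ** 3 for k in range(K + 1)]  # concentrate steps near 0
    log_z = 0.0
    for bk, bk1 in zip(betas, betas[1:]):
        draws = sample_power_posterior(bk, S)
        # log-mean-exp of (bk1 - bk) * loglik over the draws, for stability
        terms = [(bk1 - bk) * loglik(mu) for mu in draws]
        m = max(terms)
        log_z += m + math.log(sum(math.exp(t - m) for t in terms) / S)
    return log_z

est = stepping_stone()
exact = -0.5 * math.log(2 * math.pi * 2.0) - y ** 2 / 4.0  # y ~ N(0, 2)
print(est, exact)
```

Chaining many small power increments is what keeps each importance-sampling ratio well-behaved; a direct Bayes factor estimator, as assessed in the abstract, builds an analogous path between two models instead of between prior and posterior.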

    Bayes factors and the geometry of discrete hierarchical loglinear models

    A standard tool for model selection in a Bayesian framework is the Bayes factor, which compares the marginal likelihoods of the data under two given different models. In this paper, we consider the class of hierarchical loglinear models for discrete data given in the form of a contingency table with multinomial sampling. We assume that the Diaconis-Ylvisaker conjugate prior is the prior distribution on the loglinear parameters and the uniform is the prior distribution on the space of models. Under these conditions, the Bayes factor between two models is a function of their prior and posterior normalizing constants. These constants are functions of the hyperparameters $(m,\alpha)$, which can be interpreted respectively as the marginal counts and the total count of a fictive contingency table. We study the behaviour of the Bayes factor when $\alpha$ tends to zero. In this study, two mathematical objects play a most important role: first, the interior $C$ of the convex hull $\bar{C}$ of the support of the multinomial distribution for a given hierarchical loglinear model, together with its faces, and second, the characteristic function $\mathbb{J}_C$ of this convex set $C$. We show that, when $\alpha$ tends to 0, if the data lie on a face $F_i$ of $\bar{C_i}$, $i=1,2$, of dimension $k_i$, the Bayes factor behaves like $\alpha^{k_1-k_2}$. This implies in particular that when the data are in $C_1$ and in $C_2$, i.e. when $k_i$ equals the dimension of model $J_i$, the sparser model is favored, thus confirming the idea of Bayesian regularization.

    Comment: 37 pages
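The paper works with Diaconis-Ylvisaker priors on the loglinear parameters; a simplified numerical analogue uses symmetric Dirichlet-multinomial marginal likelihoods on a 2x2 table, comparing the saturated model (dimension 3) against independence (dimension 2). When all cells are positive, the log Bayes factor then behaves like $\log\alpha$ plus a constant as $\alpha \to 0$, matching the $\alpha^{k_1-k_2}$ rate with $k_1-k_2=1$ and favoring the sparser model. The way the total prior count $\alpha$ is split over cells and margins below is our choice for illustration:

```python
import math

def log_dm(counts, alpha_each):
    """Log Dirichlet-multinomial integral: the marginal probability of the
    counts under a symmetric Dirichlet(alpha_each, ...) prior on the cell
    probabilities (multinomial coefficient omitted; it cancels in the BF)."""
    A, N = alpha_each * len(counts), sum(counts)
    out = math.lgamma(A) - math.lgamma(A + N)
    for n in counts:
        out += math.lgamma(alpha_each + n) - math.lgamma(alpha_each)
    return out

def log_bf_saturated_vs_independence(table, alpha):
    """2x2 table; total prior count alpha spread uniformly over the four
    cells (alpha/4 each) and over the row/column margins (alpha/2 each)."""
    cells = [n for row in table for n in row]
    rows = [sum(row) for row in table]
    cols = [sum(col) for col in zip(*table)]
    log_m_sat = log_dm(cells, alpha / 4.0)
    log_m_ind = log_dm(rows, alpha / 2.0) + log_dm(cols, alpha / 2.0)
    return log_m_sat - log_m_ind

table = [[20, 5], [4, 21]]  # strong row-column association
for a in (1.0, 0.1, 0.01):
    print(a, log_bf_saturated_vs_independence(table, a))
```

Even with strongly associated data, shrinking $\alpha$ drives the log Bayes factor down by roughly $\log\alpha$, illustrating the Bayesian-regularization effect the abstract describes.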