
    Training samples in objective Bayesian model selection

    Central to several objective approaches to Bayesian model selection is the use of training samples (subsets of the data), so as to allow utilization of improper objective priors. The most common prescription for choosing training samples is to choose them to be as small as possible, subject to yielding proper posteriors; these are called minimal training samples. When data can vary widely in terms of either information content or impact on the improper priors, use of minimal training samples can be inadequate. Important examples include certain cases of discrete data, the presence of censored observations, and certain situations involving linear models and explanatory variables. Such situations require more sophisticated methods of choosing training samples. A variety of such methods are developed in this paper, and successfully applied in challenging situations.
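
    For context, one standard way training samples enter objective model selection is through the arithmetic intrinsic Bayes factor; the sketch below states that background construction (notation assumed here), not the specific training-sample choices proposed in this paper.

```latex
% Arithmetic intrinsic Bayes factor (standard background construction; notation assumed here).
% B^N_{21}(x): Bayes factor of M_2 vs. M_1 under the improper objective priors, full data x.
% x(l), l = 1, ..., L: training samples, conventionally chosen minimal so that the
% partial posteriors they induce are proper.
\[
  B^{\mathrm{AI}}_{21} \;=\; B^{N}_{21}(x)\,\cdot\,\frac{1}{L}\sum_{l=1}^{L} B^{N}_{12}\bigl(x(l)\bigr)
\]
```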

    Comparison Between Bayesian and Frequentist Tail Probability Estimates

    In this paper, we investigate the reasons that the Bayesian estimator of the tail probability is always higher than the frequentist estimator. Sufficient conditions for this phenomenon are established both by using Jensen's Inequality and by looking at Taylor series approximations, both of which point to the convexity of the distribution function.
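
    A one-line version of the Jensen's-inequality direction of the argument, under assumptions labelled in the comments (convexity of the tail probability in the parameter, and a plug-in frequentist estimator at the posterior mean):

```latex
% Sketch of the Jensen's-inequality direction (assumptions: the tail probability
% S(c \mid \theta) = \Pr(X > c \mid \theta) is convex in \theta, and the frequentist
% estimator plugs in \hat\theta = E[\theta \mid x]).
\[
  \hat p_{B} \;=\; E_{\pi(\theta \mid x)}\bigl[S(c \mid \theta)\bigr]
  \;\ge\; S\bigl(c \mid E[\theta \mid x]\bigr) \;=\; \hat p_{F}
\]
```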

    Quick Anomaly Detection by the Newcomb--Benford Law, with Applications to Electoral Processes Data from the USA, Puerto Rico and Venezuela

    A simple and quick general test to screen for numerical anomalies is presented. It can be applied, for example, to electoral processes, both electronic and manual. It uses vote counts in officially published voting units, which are typically widely available and institutionally backed. The test examines the frequencies of digits in vote counts and rests on the First Digit (NBL1) and Second Digit (NBL2) Newcomb--Benford Law, and on a novel generalization of the law under restrictions on the maximum number of voters per unit (RNBL2). We apply the test to the 2004 USA presidential elections, the Puerto Rico (1996, 2000 and 2004) governor elections, the 2004 Venezuelan presidential recall referendum (RRP) and the previous 2000 Venezuelan presidential election. The NBL2 is compellingly rejected only in the Venezuelan referendum and only for electronic voting units. Our original suggestion on the RRP (Pericchi and Torres, 2004) was criticized by The Carter Center report (2005). Acknowledging this, Mebane (2006) and The Economist (US) (2007) presented voting models and case studies in favor of NBL2. Further evidence is presented here. Moreover, under the RNBL2, Mebane's voting models are valid under wider conditions. The adequacy of the law is assessed through Bayes Factors (and corrections of p-values) instead of significance testing, since for large sample sizes and fixed α levels the null hypothesis is over-rejected. Our tests are extremely simple and can become a standard screening that a fair electoral process should pass. Comment: Published at http://dx.doi.org/10.1214/09-STS296 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
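
    As a rough illustration of the screening idea only (not the paper's RNBL2 generalization or its Bayes-factor calibration), the sketch below computes the NBL2 expected second-digit frequencies and a simple chi-square discrepancy for a set of vote counts; the function names and the example counts are assumptions.

```python
import math
from collections import Counter

def nbl2_probs():
    """Expected Second-Digit Newcomb-Benford (NBL2) probabilities for digits 0-9."""
    return [sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))
            for d in range(10)]

def screen_nbl2(counts):
    """Quick screen: observed second-digit frequencies of vote counts vs. NBL2.

    Returns a Pearson chi-square discrepancy and the number of usable counts;
    the paper itself argues for Bayes factors / calibrated p-values rather than
    fixed-alpha significance testing.
    """
    digits = [int(str(c)[1]) for c in counts if c >= 10]  # second digit needs >= 2 digits
    n = len(digits)
    observed = Counter(digits)
    expected = nbl2_probs()
    chi2 = sum((observed.get(d, 0) - n * expected[d]) ** 2 / (n * expected[d])
               for d in range(10))
    return chi2, n

# hypothetical vote counts per voting unit (illustrative only)
votes = [482, 1210, 97, 356, 2204, 178, 911, 640, 1023, 87]
print(screen_nbl2(votes))
```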

    The matrix-F prior for estimating and testing covariance matrices

    The matrix-F distribution is presented as a prior for covariance matrices, as an alternative to the conjugate inverted Wishart distribution. A special case of the univariate F distribution for a variance parameter is equivalent to a half-t distribution for a standard deviation, which is becoming increasingly popular in the Bayesian literature. The matrix-F distribution can be conveniently modeled as a Wishart mixture of Wishart or inverse Wishart distributions, which allows straightforward implementation in a Gibbs sampler. By mixing the covariance matrix of a multivariate normal distribution with a matrix-F distribution, a multivariate horseshoe-type prior is obtained which is useful for modeling sparse signals. Furthermore, it is shown that the intrinsic prior for testing covariance matrices in non-hierarchical models has a matrix-F distribution. This intrinsic prior is also useful for testing inequality-constrained hypotheses on variances. Finally, it is shown through simulation that the matrix-variate F distribution has good frequentist properties as a prior for the random-effects covariance matrix in generalized linear mixed models.
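
    A minimal Monte Carlo sketch of drawing covariance matrices from a matrix-F prior via one version of the mixture representation mentioned above; the exact degrees-of-freedom convention used below is an assumption and should be checked against the paper's parameterization before use.

```python
import numpy as np
from scipy.stats import wishart, invwishart

def sample_matrix_f(nu, delta, B, size=1):
    """Draw k x k covariance matrices from (one parameterization of) a matrix-F prior.

    Mixture construction (an assumption; verify against the paper's definition):
      Psi ~ Wishart(nu, B), then Sigma | Psi ~ inverse-Wishart(delta + k - 1, Psi).
    """
    k = B.shape[0]
    draws = []
    for _ in range(size):
        psi = wishart.rvs(df=nu, scale=B)
        sigma = invwishart.rvs(df=delta + k - 1, scale=psi)
        draws.append(sigma)
    return np.array(draws)

# Example: 2x2 covariance draws with an identity scale matrix B (illustrative values)
print(sample_matrix_f(nu=3, delta=2, B=np.eye(2), size=3)[0])
```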

    A Robust Bayesian Dynamic Linear Model for Latin-American Economic Time Series: "The Mexico and Puerto Rico Cases"

    The traditional time series methodology requires at least a preliminary transformation of the data to achieve stationarity. Robust Bayesian Dynamic Models (RBDMs), on the other hand, do not assume a regular pattern or stability of the underlying system and can include points of structural break. In this paper we use RBDMs in order to account for possible outliers and structural breaks in Latin-American economic time series. We work with important economic time series from Puerto Rico and Mexico. We show, using a random walk model, how RBDMs can be applied to detect historic changes in the inflation of Mexico. We also model the Consumer Price Index (CPI), the Economic Activity Index (EAI) and the total number of employments (TNE) time series in Puerto Rico using local linear trend and seasonal RBDMs with observational and state variances. The results illustrate how the model accounts for the structural breaks during the historic recession periods in Puerto Rico.
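
    For orientation, the sketch below gives the Kalman filter recursions of a plain (non-robust) local level DLM; the paper's RBDMs additionally accommodate outliers and structural breaks, which this baseline does not. The variances V and W and the example series are assumptions.

```python
import numpy as np

def local_level_filter(y, V, W, m0=0.0, C0=1e7):
    """Kalman filter for a simple local level DLM:
        y_t  = mu_t + v_t,      v_t ~ N(0, V)
        mu_t = mu_{t-1} + w_t,  w_t ~ N(0, W)
    Illustrative baseline only, not the paper's robust variant.
    """
    m, C = m0, C0
    means, variances = [], []
    for yt in y:
        R = C + W              # prior variance of mu_t
        Q = R + V              # one-step-ahead forecast variance
        A = R / Q              # Kalman gain
        m = m + A * (yt - m)   # filtered mean
        C = (1 - A) * R        # filtered variance
        means.append(m)
        variances.append(C)
    return np.array(means), np.array(variances)

# hypothetical monthly index with a level shift halfway through
y = np.concatenate([100 + np.random.normal(0, 1, 50),
                    110 + np.random.normal(0, 1, 50)])
m, C = local_level_filter(y, V=1.0, W=0.5)
print(m[-1], C[-1])
```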

    Comparing Gaussian graphical models with the posterior predictive distribution and Bayesian model selection

    Gaussian graphical models are commonly used to characterize conditional (in)dependence structures (i.e., partial correlation networks) of psychological constructs. Recently, attention has shifted from estimating single networks to comparing networks from various subpopulations. The focus is primarily to detect differences or demonstrate replicability. We introduce two novel Bayesian methods for comparing networks that explicitly address these aims. The first is based on the posterior predictive distribution, with a symmetric version of Kullback-Leibler divergence as the discrepancy measure, that tests differences between two (or more) multivariate normal distributions. The second approach makes use of Bayesian model comparison, with the Bayes factor, and allows for gaining evidence for invariant network structures. This overcomes limitations of current approaches in the literature that use classical hypothesis testing, where it is only possible to determine whether groups are significantly different from each other. With simulation we show the posterior predictive method is approximately calibrated under the null hypothesis (alpha = .05) and has more power to detect differences than alternative approaches. We then examine the sample sizes necessary for detecting invariant network structures with Bayesian hypothesis testing, in addition to how this is influenced by the choice of prior distribution. The methods are applied to posttraumatic stress disorder symptoms that were measured in four groups. We end by summarizing our major contribution, that is, proposing two novel methods for comparing Gaussian graphical models (GGMs), which extends beyond the social-behavioral sciences. The methods have been implemented in the R package BGGM.

    Translational Abstract: Gaussian graphical models are becoming popular in the social-behavioral sciences. Recently, attention has shifted from estimating single networks to those from various subpopulations (e.g., males vs. females). We introduce Bayesian methodology for comparing networks estimated from any number of groups. The first approach is based on the posterior predictive distribution and allows for determining whether networks are different from one another. This is ideal for testing the null hypothesis of group equality, say, in the context of testing for network replicability (or lack thereof). The second approach is based on Bayesian hypothesis testing and allows for gaining evidence for network invariances or equality of partial correlations for any number of groups. This is ideal for focusing on specific aspects of the network, such as individual partial correlations. In a series of simulations and illustrative examples we demonstrate the utility of the proposed methodology for comparing Gaussian graphical models. The methods have been implemented in the R package BGGM.
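
    The authors' implementation is in the R package BGGM. Purely to illustrate the kind of discrepancy measure involved, the sketch below computes a symmetrized Kullback-Leibler divergence between two zero-mean multivariate normals summarized by their covariance matrices; the function names and example matrices are assumptions, not the package's API.

```python
import numpy as np

def kl_mvn_zero_mean(S0, S1):
    """KL( N(0, S0) || N(0, S1) ) for zero-mean multivariate normals."""
    k = S0.shape[0]
    S1_inv = np.linalg.inv(S1)
    return 0.5 * (np.trace(S1_inv @ S0) - k
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

def symmetric_kl(S0, S1):
    """Symmetrized KL divergence between two groups' (implied) covariance matrices,
    used here as a stand-in for the paper's posterior predictive discrepancy."""
    return kl_mvn_zero_mean(S0, S1) + kl_mvn_zero_mean(S1, S0)

# hypothetical covariance matrices for two groups
S_a = np.array([[1.0, 0.3], [0.3, 1.0]])
S_b = np.array([[1.0, 0.1], [0.1, 1.0]])
print(symmetric_kl(S_a, S_b))
```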

    Prior-based Bayesian information criterion

    We present a new approach to model selection and Bayes factor determination, based on Laplace expansions (as in BIC), which we call Prior-based Bayes Information Criterion (PBIC). In this approach, the Laplace expansion is only done with the likelihood function, and then a suitable prior distribution is chosen to allow exact computation of the (approximate) marginal likelihood arising from the Laplace approximation and the prior. The result is a closed-form expression similar to BIC, but now involves a term arising from the prior distribution (which BIC ignores) and also incorporates the idea that different parameters can have different effective sample sizes (whereas BIC only allows one overall sample size n). We also consider a modification of PBIC which is more favourable to complex models.
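
    For background, the standard Laplace expansion that BIC starts from is sketched below; PBIC's closed-form expression, which retains the prior term and uses parameter-specific effective sample sizes, is given in the paper itself and is not reproduced here.

```latex
% Laplace expansion behind BIC (standard background; notation assumed here).
% L(\theta): likelihood, \pi(\theta): prior, \hat\theta: MLE,
% \hat I: observed information matrix, p = \dim(\theta), n: sample size.
\[
  m(y) \;=\; \int L(\theta)\,\pi(\theta)\,d\theta
  \;\approx\; L(\hat\theta)\,\pi(\hat\theta)\,(2\pi)^{p/2}\,\bigl|\hat I\bigr|^{-1/2},
  \qquad
  \mathrm{BIC} \;=\; -2\log L(\hat\theta) + p\log n
\]
```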