Gibbs Variable Selection using BUGS
In this paper we discuss and present in detail the implementation of Gibbs variable selection as defined by Dellaportas et al. (2000, 2002) using the BUGS software (Spiegelhalter et al., 1996a,b,c). The specification of the likelihood, prior and pseudo-prior distributions of the parameters, as well as the prior model probabilities, is described in detail. Guidance is also provided for the calculation of the posterior probabilities within the BUGS environment when the number of models is limited. We illustrate the application of this methodology in a variety of problems, including linear regression, log-linear and binomial response models.
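A minimal self-contained sketch of the idea in Python rather than BUGS (Gaussian linear model with known noise variance, two covariates, equal prior model odds; the slab and pseudo-prior settings are illustrative, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y depends on the first covariate only; the second is noise.
n = 200
X = rng.normal(size=(n, 2))
y = 2.0 * X[:, 0] + rng.normal(size=n)

sigma2 = 1.0        # noise variance, assumed known here
tau2 = 10.0         # slab (actual prior) variance for included coefficients
pm, pv = 0.0, 0.01  # pseudo-prior mean/variance for excluded coefficients

beta = np.zeros(2)
gamma = np.ones(2, dtype=int)
incl = np.zeros(2)
n_iter, burn = 3000, 500

for it in range(n_iter):
    for j in range(2):
        # Partial residual with term j removed from the current fit.
        r = y - X @ (beta * gamma) + X[:, j] * beta[j] * gamma[j]
        if gamma[j] == 1:
            # Full conditional for an included coefficient.
            prec = X[:, j] @ X[:, j] / sigma2 + 1.0 / tau2
            mean = (X[:, j] @ r / sigma2) / prec
            beta[j] = rng.normal(mean, np.sqrt(1.0 / prec))
        else:
            # Excluded coefficients are refreshed from the pseudo-prior.
            beta[j] = rng.normal(pm, np.sqrt(pv))
        # Full conditional for gamma_j: likelihood ratio times the ratio of
        # the actual prior (slab) to the pseudo-prior, at the current beta_j.
        ll1 = -0.5 * np.sum((r - X[:, j] * beta[j]) ** 2) / sigma2
        ll0 = -0.5 * np.sum(r ** 2) / sigma2
        lp1 = -0.5 * beta[j] ** 2 / tau2 - 0.5 * np.log(tau2)
        lp0 = -0.5 * (beta[j] - pm) ** 2 / pv - 0.5 * np.log(pv)
        logit = np.clip((ll1 + lp1) - (ll0 + lp0), -30.0, 30.0)
        gamma[j] = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))
    if it >= burn:
        incl += gamma

probs = incl / (n_iter - burn)
print(probs)  # posterior inclusion probabilities
```

The pseudo-prior matters only for mixing: it keeps excluded coefficients in a region where re-inclusion proposals are plausible, without affecting the posterior of interest.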
Thermodynamic assessment of probability distribution divergencies and Bayesian model comparison
Within the path sampling framework, we show that probability distribution
divergences, such as the Chernoff information, can be estimated via
thermodynamic integration. The Boltzmann-Gibbs distribution pertaining to
different Hamiltonians is implemented to derive tempered transitions along the
path, linking the distributions of interest at the endpoints. Under this
perspective, a geometric approach is feasible, which prompts intuition and
facilitates tuning the error sources. Additionally, there are direct
applications in Bayesian model evaluation. Existing marginal likelihood and
Bayes factor estimators are reviewed here along with their stepping-stone
sampling analogues. New estimators are presented and the use of compound paths
is introduced.
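As a toy check of the thermodynamic integration identity, take two one-dimensional Gaussians as endpoints: the geometric path between them can be sampled exactly, and the log ratio of normalizing constants is known in closed form (illustrative settings, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Unnormalised endpoint densities: p0 ~ N(0, s0^2), p1 ~ N(0, s1^2).
s0, s1 = 1.0, 2.0

# Geometric path q_t(x) ∝ p0(x)^(1-t) * p1(x)^t is again Gaussian here,
# with variance v_t = 1 / ((1-t)/s0^2 + t/s1^2), so it can be sampled exactly.
ts = np.linspace(0.0, 1.0, 41)
means = []
for t in ts:
    v_t = 1.0 / ((1.0 - t) / s0**2 + t / s1**2)
    x = rng.normal(0.0, np.sqrt(v_t), size=20_000)
    # Derivative of the log unnormalised path density along t:
    # log p1*(x) - log p0*(x) for the geometric path.
    u = -0.5 * x**2 * (1.0 / s1**2 - 1.0 / s0**2)
    means.append(u.mean())

# Thermodynamic integration: log(Z1/Z0) = ∫_0^1 E_t[u] dt (trapezoid rule).
means = np.array(means)
est = float(np.sum((means[1:] + means[:-1]) / 2.0 * np.diff(ts)))
print(est, np.log(s1 / s0))  # estimate vs exact value log(2)
```

Stepping-stone sampling would replace the trapezoid average with a telescoping product of importance-sampling ratios over the same tempered sequence.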
Bivariate Poisson and Diagonal Inflated Bivariate Poisson Regression Models in R
In this paper we present an R package called bivpois for maximum likelihood estimation of the parameters of bivariate and diagonal inflated bivariate Poisson regression models. An Expectation-Maximization (EM) algorithm is implemented. Inflated models allow for modelling both over-dispersion (or under-dispersion) and negative correlation, and are thus appropriate for a wide range of applications. Extensions of the algorithms to several other models are also discussed. Detailed guidance and worked implementations on simulated and real data sets using the bivpois package are provided.
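The underlying bivariate Poisson model is built by trivariate reduction; a quick Python check of that construction (bivpois itself is an R package, so this illustrates only the model, not the package's API):

```python
import numpy as np

rng = np.random.default_rng(0)

# Trivariate reduction: X = A + C, Y = B + C with independent Poissons.
# Then X ~ Poisson(l1 + l0), Y ~ Poisson(l2 + l0) and Cov(X, Y) = l0 >= 0;
# diagonal inflation is what relaxes this to allow negative correlation
# and under-dispersion.
l1, l2, l0 = 2.0, 1.5, 0.8
n = 200_000
a = rng.poisson(l1, n)
b = rng.poisson(l2, n)
c = rng.poisson(l0, n)
x, y = a + c, b + c

cxy = np.cov(x, y)[0, 1]
print(x.mean(), y.mean(), cxy)  # ≈ 2.8, 2.3, 0.8
```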
Power-Expected-Posterior Priors for Variable Selection in Gaussian Linear Models
In the context of the expected-posterior prior (EPP) approach to Bayesian
variable selection in linear models, we combine ideas from power-prior and
unit-information-prior methodologies to simultaneously produce a
minimally-informative prior and diminish the effect of training samples. The
result is that in practice our power-expected-posterior (PEP) methodology is
sufficiently insensitive to the size n* of the training sample, due to PEP's
unit-information construction, that one may take n* equal to the full-data
sample size n and dispense with training samples altogether. In this paper we
focus on Gaussian linear models and develop our method under two different
baseline prior choices: the independence Jeffreys (or reference) prior,
yielding the J-PEP posterior, and the Zellner g-prior, leading to Z-PEP. We
find that, under the reference baseline prior, the asymptotics of PEP Bayes
factors are equivalent to those of Schwarz's BIC criterion, ensuring
consistency of the PEP approach to model selection. We compare the performance
of our method, in simulation studies and a real example involving prediction of
air-pollutant concentrations from meteorological covariates, with that of a
variety of previously-defined variants on Bayes factors for objective variable
selection. Our prior, due to its unit-information structure, leads to a
variable-selection procedure that (1) is systematically more parsimonious than
the basic EPP with minimal training sample, while sacrificing no desirable
performance characteristics to achieve this parsimony; (2) is robust to the
size of the training sample, thus enjoying the advantages described above
arising from the avoidance of training samples altogether; and (3) identifies
maximum-a-posteriori models that achieve good out-of-sample predictive
performance.
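The BIC connection can be made tangible with the standard Schwarz approximation, under which a BIC difference maps to an approximate Bayes factor (a generic illustration of that asymptotic link, not the PEP prior itself):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)

def bic(cols, y):
    # Gaussian linear model BIC (up to an additive constant shared by models).
    Xd = np.column_stack([np.ones(len(y))] + cols)
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    rss = float(np.sum((y - Xd @ beta) ** 2))
    return len(y) * np.log(rss / len(y)) + Xd.shape[1] * np.log(len(y))

b_full = bic([x], y)  # model with the covariate
b_null = bic([], y)   # intercept-only model
# Schwarz approximation: BF(full vs null) ≈ exp(-(BIC_full - BIC_null)/2).
bf = np.exp(-0.5 * (b_full - b_null))
print(b_full < b_null, bf)
```

Consistency here means that, as n grows, such BIC-style differences select the true model with probability tending to one; the abstract's claim is that J-PEP Bayes factors share this behaviour.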
Bayesian variable selection using cost-adjusted BIC, with application to cost-effective measurement of quality of health care
In the field of quality of health care measurement, one approach to assessing
patient sickness at admission involves a logistic regression of mortality
within 30 days of admission on a fairly large number of sickness indicators (on
the order of 100) to construct a sickness scale, employing classical variable
selection methods to find an "optimal" subset of 10–20 indicators. Such
"benefit-only" methods ignore the considerable differences among the sickness
indicators in cost of data collection, an issue that is crucial when admission
sickness is used to drive programs (now implemented or under consideration in
several countries, including the U.S. and U.K.) that attempt to identify
substandard hospitals by comparing observed and expected mortality rates (given
admission sickness). When both data-collection cost and accuracy of prediction
of 30-day mortality are considered, a large variable-selection problem arises
in which costly variables that do not predict well enough should be omitted
from the final scale. In this paper (a) we develop a method for solving this
problem based on posterior model odds, arising from a prior distribution that
(1) accounts for the cost of each variable and (2) results in a set of
posterior model probabilities that corresponds to a generalized cost-adjusted
version of the Bayesian information criterion (BIC), and (b) we compare this
method with a decision-theoretic cost-benefit approach based on maximizing
expected utility. We use reversible-jump Markov chain Monte Carlo (RJMCMC)
methods to search the model space, and we check the stability of our findings
with two variants of the MCMC model composition (MC³) algorithm.
Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org); http://dx.doi.org/10.1214/08-AOAS207
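Schematically, a cost-adjusted criterion trades predictive fit against data-collection cost. The toy version below adds a made-up penalty λ·cost to an ordinary Gaussian BIC and searches all subsets; the paper's actual prior-based criterion and its logistic-regression setting differ:

```python
from itertools import combinations
import numpy as np

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 3))
# x0 helps a little but is expensive to collect; x1 is strong and cheap;
# x2 is pure noise. (Gaussian response for simplicity; the paper models
# 30-day mortality with logistic regression.)
y = 0.25 * X[:, 0] + X[:, 1] + rng.normal(size=n)

cost = np.array([50.0, 1.0, 1.0])  # hypothetical data-collection costs
lam = 3.0                          # hypothetical cost weight

def bic(idx):
    Xd = np.column_stack([np.ones(n)] + [X[:, j] for j in idx])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    rss = float(np.sum((y - Xd @ beta) ** 2))
    return n * np.log(rss / n) + Xd.shape[1] * np.log(n)

subsets = [s for r in range(4) for s in combinations(range(3), r)]
best_plain = min(subsets, key=bic)
best_cost = min(subsets, key=lambda s: bic(s) + lam * cost[list(s)].sum())
print(best_plain, best_cost)  # the cost-adjusted score drops the expensive x0
```

Exhaustive search is only feasible for a handful of variables; with ~100 sickness indicators, the stochastic model-space search (RJMCMC) described in the abstract becomes necessary.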
Prior distributions for objective Bayesian analysis
We provide a review of prior distributions for objective Bayesian analysis. We start by examining some foundational issues and then organize our exposition into priors for: i) estimation or prediction; ii) model selection; iii) high-dimensional models. With regard to i), we present some basic notions, and then move to more recent contributions on discrete parameter spaces, hierarchical models, nonparametric models, and penalizing-complexity priors. Point ii) is the focus of this paper: it discusses principles for objective Bayesian model comparison, and singles out some major concepts for building priors, which are subsequently illustrated in some detail for the classic problem of variable selection in normal linear models. We also present some recent contributions in the area of objective priors on model space. With regard to point iii), we only provide a short summary of some default priors for high-dimensional models, a rapidly growing area of research.
Concentration of personal and household crimes in England and Wales
Crime is disproportionately concentrated in a few areas. Though this is long established, there remains uncertainty about the reasons for variation in the concentration of similar crimes (repeats) or different crimes (multiples). Wholly neglected have been composite crimes, in which more than one crime type coincides as part of a single event. The research reported here disentangles area crime concentration into repeats, multiples and composite crimes. The results are based on estimated bivariate zero-inflated Poisson regression models with a covariance structure, which explicitly account for crime rarity and crime concentration. The implications of the results for criminological theorizing, and as a possible basis for more equitable police funding, are discussed.
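The zero-inflated Poisson ingredient handles crime rarity by mixing a point mass at zero with a Poisson count. A one-dimensional sketch of why this captures both excess zeros and overdispersion (parameters are illustrative, not fitted values):

```python
import numpy as np

rng = np.random.default_rng(0)

# Zero inflation: with probability p_zero the count is a structural zero,
# otherwise it is Poisson(lam).
p_zero, lam = 0.3, 2.0
n = 100_000
structural = rng.random(n) < p_zero
counts = np.where(structural, 0, rng.poisson(lam, n))

p0 = (counts == 0).mean()         # observed zero fraction
p0_pois = np.exp(-counts.mean())  # zero prob. of a Poisson with the same mean
print(p0, p0_pois, counts.var() > counts.mean())
```

The bivariate version used in the paper couples two such count processes through a covariance structure, which is what separates repeats from multiples and composites.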