161 research outputs found

    Using Hierarchical Centering to Facilitate a Reversible Jump MCMC Algorithm for Random Effects Models

    Get PDF
    The first author was supported by a studentship jointly funded by the University of St Andrews and EPSRC, through the National Centre for Statistical Ecology (EPSRC grant EP/C522702/1), with subsequent funding from EPSRC/NERC grant EP/I000917/1. Hierarchical centering has been described as a reparameterization method applicable to random effects models. It has been shown to improve the mixing of models in the context of Markov chain Monte Carlo (MCMC) methods. A hierarchical centering approach is proposed for reversible jump MCMC (RJMCMC) chains which builds upon the hierarchical centering methods for MCMC chains and uses them to reparameterize models in an RJMCMC algorithm. Although these methods may be applicable to models with other error distributions, the case is described for a log-linear Poisson model where the expected value includes fixed effect covariates and a random effect for which normality is assumed with a zero mean and unknown standard deviation. For the proposed RJMCMC algorithm including hierarchical centering, the models are reparameterized by modelling the mean of the random effect coefficients as a function of the intercept and one or more of the available fixed effect covariates, depending on the model. The method is appropriate when fixed-effect covariates are constant within random effect groups. This changes the dynamics of the RJMCMC algorithm and improves model mixing. The methods are applied to a case study of point transects of indigo buntings where, without hierarchical centering, the RJMCMC algorithm had poor mixing and the estimated posterior distribution depended on the starting model. With hierarchical centering, on the other hand, the chain moved freely over model and parameter space. These results are confirmed with a simulation study. Hence, the proposed methods should be considered as a regular strategy for implementing models with random effects in RJMCMC algorithms; they facilitate convergence of these algorithms and help avoid false inference on model parameters.
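    The reparameterization described above can be illustrated with a toy sketch (assumed parameter values, not the paper's fitted model): instead of zero-mean random effects added to a fixed-effect linear predictor, the hierarchically centered form models the random-effect coefficients directly around the fixed-effect mean, leaving the likelihood unchanged while decorrelating the sampler's parameter blocks.

```python
import numpy as np

rng = np.random.default_rng(0)

n_groups, beta0, beta1, sigma = 5, 1.0, 0.5, 0.3
x = rng.normal(size=n_groups)          # covariate constant within each group
b = rng.normal(0.0, sigma, n_groups)   # zero-mean random effects

# Standard parameterization: log-mean = beta0 + beta1*x_g + b_g,
# with b_g ~ N(0, sigma^2).
eta_standard = beta0 + beta1 * x + b

# Hierarchically centered parameterization: the random-effect coefficients
# eta_g are modelled directly with mean beta0 + beta1*x_g, i.e.
# eta_g ~ N(beta0 + beta1*x_g, sigma^2). Written with the same noise draws:
eta_centered = (beta0 + beta1 * x) + b

# Both forms induce the same Poisson log-mean; the centered one reduces
# posterior correlation between the intercept and the random effects,
# which is what improves (RJ)MCMC mixing.
assert np.allclose(eta_standard, eta_centered)
y = rng.poisson(np.exp(eta_centered))  # simulated counts
```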

    Bayesian hierarchical modelling of continuous non-negative longitudinal data with a spike at zero: an application to a study of birds visiting gardens in winter

    Get PDF
    The development of methods for dealing with continuous data with a spike at zero has lagged behind those for overdispersed or zero-inflated count data. We consider longitudinal ecological data corresponding to an annual average of 26 weekly maximum counts of birds, which are hence effectively continuous, bounded below by zero, but also with a discrete mass at zero. We develop a Bayesian hierarchical Tweedie regression model that can directly accommodate the excess number of zeros common to this type of data, whilst accounting for both spatial and temporal correlation. Implementation of the model is conducted in a Markov chain Monte Carlo (MCMC) framework, using reversible jump MCMC to explore uncertainty across both parameter and model spaces. This regression modelling framework is very flexible and removes the need to make strong assumptions about mean-variance relationships a priori. It can also directly account for the spike at zero, whilst being easily applicable to other types of data and other model formulations. Whilst a correlative study such as this cannot prove causation, our results suggest that an increase in an avian predator may have led to an overall decrease in the number of one of its prey species visiting garden feeding stations in the United Kingdom. This may reflect a change in behaviour of house sparrows to avoid feeding stations frequented by sparrowhawks, or a reduction in house sparrow population size as a result of the sparrowhawk increase.
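    For intuition, a Tweedie variable with power parameter 1 < p < 2 can be simulated from its compound Poisson-gamma representation, which makes the spike at zero explicit. The following is a minimal sketch with arbitrary illustrative parameters, not the paper's fitted model:

```python
import numpy as np

def rtweedie(mu, phi, p, size, rng):
    """Simulate Tweedie(mu, phi, p) for 1 < p < 2 via its compound
    Poisson-gamma representation: Y is a sum of N gamma terms with
    N ~ Poisson(lam), so Y has a point mass exp(-lam) at exactly zero
    and is continuous on (0, inf) otherwise."""
    lam = mu ** (2 - p) / (phi * (2 - p))   # Poisson rate
    alpha = (2 - p) / (p - 1)               # gamma shape
    scale = phi * (p - 1) * mu ** (p - 1)   # gamma scale
    n = rng.poisson(lam, size)              # number of gamma summands
    return np.array([rng.gamma(alpha, scale, k).sum() if k else 0.0
                     for k in n])

rng = np.random.default_rng(1)
y = rtweedie(mu=2.0, phi=1.5, p=1.5, size=50_000, rng=rng)
prop_zero = (y == 0).mean()   # close to exp(-lam), the spike at zero
```

The mean of the draws is mu and the variance is phi * mu**p, which is the mean-variance relationship the Tweedie family encodes.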

    Hierarchical models for semi-competing risks data with application to quality of end-of-life care for pancreatic cancer

    Full text link
    Readmission following discharge from an initial hospitalization is a key marker of quality of health care in the United States. For the most part, readmission has been used to study quality of care for patients with acute health conditions, such as pneumonia and heart failure, with analyses typically based on a logistic-Normal generalized linear mixed model. Applying this model to the study of readmission among patients with increasingly prevalent advanced health conditions such as pancreatic cancer is problematic, however, because it ignores death as a competing risk. A more appropriate analysis is to embed such studies within the semi-competing risks framework. To our knowledge, however, no comprehensive statistical methods have been developed for cluster-correlated semi-competing risks data. In this paper we propose a novel hierarchical modeling framework for the analysis of cluster-correlated semi-competing risks data. The framework permits parametric or non-parametric specifications for a range of model components, including baseline hazard functions and distributions for key random effects, giving analysts substantial flexibility as they consider their own analyses. Estimation and inference are performed within the Bayesian paradigm since it facilitates the straightforward characterization of (posterior) uncertainty for all model parameters, including hospital-specific random effects. The proposed framework is used to study the risk of readmission among 5,298 Medicare beneficiaries diagnosed with pancreatic cancer at 112 hospitals in the six New England states between 2000 and 2009, specifically to investigate the role of patient-level risk factors and to characterize variation in risk across hospitals that is not explained by differences in patient case-mix.
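    As a rough sketch of the data structure (not the proposed estimation framework), the following simulates cluster-correlated semi-competing risks data with exponential hazards and a shared lognormal hospital frailty; the hazard and frailty values are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy illness-death ("semi-competing risks") simulation: readmission is the
# non-terminal event and death the terminal event, so death can censor
# readmission but not vice versa. A lognormal hospital-level frailty
# multiplies both hazards and induces within-hospital correlation.
n_hosp, n_per = 50, 40
h_readm, h_death = 0.04, 0.02                  # assumed baseline daily hazards
frailty = np.exp(rng.normal(0.0, 0.3, n_hosp)) # hospital-specific frailties

readm_frac = np.empty(n_hosp)
for j in range(n_hosp):
    t_readm = rng.exponential(1.0 / (h_readm * frailty[j]), n_per)
    t_death = rng.exponential(1.0 / (h_death * frailty[j]), n_per)
    # Readmission is observed only if it occurs before death:
    readm_frac[j] = (t_readm < t_death).mean()
```

Because the frailty scales both hazards equally here, the expected fraction readmitted before death is h_readm / (h_readm + h_death) in every hospital; richer models let the frailty act differently on each transition.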

    A Bayesian Multivariate Functional Dynamic Linear Model

    Full text link
    We present a Bayesian approach for modeling multivariate, dependent functional data. To account for the three dominant structural features in the data (functional, time-dependent, and multivariate components), we extend hierarchical dynamic linear models for multivariate time series to the functional data setting. We also develop Bayesian spline theory in a more general constrained optimization framework. The proposed methods identify a time-invariant functional basis for the functional observations, which is smooth and interpretable, and can be made common across multivariate observations for additional information sharing. The Bayesian framework permits joint estimation of the model parameters, provides exact inference (up to MCMC error) on specific parameters, and allows generalized dependence structures. Sampling from the posterior distribution is accomplished with an efficient Gibbs sampling algorithm. We illustrate the proposed framework with two applications: (1) multi-economy yield curve data from the recent global recession, and (2) local field potential brain signals in rats, for which we develop a multivariate functional time series approach for multivariate time-frequency analysis. Supplementary materials, including R code and the multi-economy yield curve data, are available online.
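    The dynamic linear model backbone can be illustrated with a minimal univariate local-level DLM and its Kalman filter; the paper's model is multivariate and functional, so this is only a sketch of the simplest special case, with assumed variances:

```python
import numpy as np

def kalman_filter(y, sigma_obs2, sigma_state2, m0=0.0, c0=1e6):
    """Kalman filter for the local-level DLM
        y_t     = theta_t + v_t,        v_t ~ N(0, sigma_obs2)
        theta_t = theta_{t-1} + w_t,    w_t ~ N(0, sigma_state2),
    returning the filtered means and variances of theta_t."""
    m, c = m0, c0
    means, variances = [], []
    for yt in y:
        r = c + sigma_state2          # one-step-ahead state variance
        k = r / (r + sigma_obs2)      # Kalman gain
        m = m + k * (yt - m)          # filtered mean
        c = (1 - k) * r               # filtered variance
        means.append(m)
        variances.append(c)
    return np.array(means), np.array(variances)

rng = np.random.default_rng(3)
theta = np.cumsum(rng.normal(0, 0.1, 200))   # latent random-walk state
y = theta + rng.normal(0, 0.5, 200)          # noisy observations
m, c = kalman_filter(y, sigma_obs2=0.25, sigma_state2=0.01)
```

In the paper's setting the scalar state is replaced by basis coefficients of the functional observations, evolving over time with cross-series dependence.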

    Scalable Importance Tempering and Bayesian Variable Selection

    Get PDF
    We propose a Monte Carlo algorithm to sample from high dimensional probability distributions that combines Markov chain Monte Carlo and importance sampling. We provide a careful theoretical analysis, including guarantees on robustness to high dimensionality, explicit comparison with standard Markov chain Monte Carlo methods, and illustrations of the potential improvements in efficiency. Simple and concrete intuition is provided for when the novel scheme is expected to outperform standard schemes. When applied to Bayesian variable-selection problems, the novel algorithm is orders of magnitude more efficient than available alternative sampling schemes and enables fast and reliable fully Bayesian inferences with tens of thousands of regressors.
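    The core idea behind importance tempering can be sketched in a few lines: run MCMC on a flattened (tempered) version of the target, which traverses modes easily, then correct the bias with self-normalized importance weights. The bimodal toy target and tuning constants below are illustrative assumptions, not the authors' scheme in full:

```python
import numpy as np

def log_target(x):
    # Unnormalized two-component normal mixture at -3 and +3 (true mean 0).
    return np.logaddexp(-0.5 * (x + 3) ** 2, -0.5 * (x - 3) ** 2)

def importance_tempered_mean(beta=0.5, n_iter=50_000, prop_sd=3.0, seed=4):
    """Random-walk Metropolis on the tempered target pi^beta (flatter, so
    the chain crosses between modes), then self-normalized importance
    reweighting with w proportional to pi / pi^beta = pi^(1 - beta)."""
    rng = np.random.default_rng(seed)
    x, lt = 0.0, log_target(0.0)
    xs = np.empty(n_iter)
    for i in range(n_iter):
        prop = x + prop_sd * rng.normal()
        lt_prop = log_target(prop)
        if np.log(rng.uniform()) < beta * (lt_prop - lt):  # tempered MH ratio
            x, lt = prop, lt_prop
        xs[i] = x
    logw = (1 - beta) * log_target(xs)
    w = np.exp(logw - logw.max())          # stabilized importance weights
    return np.sum(w * xs) / np.sum(w)      # weighted estimate of E[X]

est = importance_tempered_mean()
```

A plain random-walk chain on the untempered target would tend to get stuck in one mode; the tempered chain visits both, and the weights restore the correct target expectation.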

    Efficient Bayesian inference for partially observed stochastic epidemics and a new class of semi-parametric time series models

    Get PDF
    This thesis is divided into two distinct parts. In the first part we are concerned with developing new statistical methodology for drawing Bayesian inference for partially observed stochastic epidemic models. In the second part, we develop a novel methodology for constructing a wide class of semi-parametric time series models. First, we introduce a general framework for the heterogeneously mixing stochastic epidemic models (HMSE) and we also review some of the existing methods of statistical inference for epidemic models. The performance of a variety of centered Markov chain Monte Carlo (MCMC) algorithms is studied. It is found that as the number of infected individuals increases, the performance of these algorithms deteriorates. We then develop a variety of centered, non-centered and partially non-centered reparameterisations. We show that partially non-centered reparameterisations often offer more efficient MCMC algorithms than the centered ones. The methodology developed for drawing efficient Bayesian inference for HMSE is then applied to the 2001 UK Foot-and-Mouth disease outbreak in Cumbria. Unlike other existing modelling approaches, we model stochastically the infectious period of each farm, assuming that the infection date of each farm is typically unknown. Due to the high dimensionality of the problem, standard MCMC algorithms are inefficient. Therefore, a partially non-centered algorithm is applied for the purpose of obtaining reliable estimates for the model's parameters of interest. In addition, we discuss similarities and differences of our findings in comparison to other results in the literature. The main purpose of the second part of this thesis is to develop a novel class of semi-parametric time series models. We are interested in constructing models for which we can specify in advance the marginal distribution of the observations and then build the dependence structure of the observations around them. 
    First, we review current work concerning modelling time series with fixed non-Gaussian margins and various correlation structures. Then, we introduce a stochastic process which we term a latent branching tree (LBT). The LBT enables us to allow for a rich variety of correlation structures. Apart from discussing in detail the tree's properties, we also show how Bayesian inference can be carried out via MCMC methods. Various MCMC strategies are discussed, including non-centered parameterisations. It is found that non-centered algorithms significantly improve the mixing of some of the algorithms based on centered reparameterisations. Finally, we present an application of this class of models to a real dataset on genome scheme data.
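    The difference between centered and non-centered parameterisations can be seen in a small prior simulation: when the group-level standard deviation is small, the centered coordinates are almost perfectly correlated with the mean, while the non-centered ("whitened") coordinates are independent of it. A minimal sketch with assumed hyperparameter values:

```python
import numpy as np

rng = np.random.default_rng(5)
n, sigma = 100_000, 0.1

# Centered parameterisation: theta | mu ~ N(mu, sigma^2). When sigma is
# small, theta and mu are almost perfectly correlated a priori, which is
# what makes centered one-at-a-time MCMC updates mix slowly.
mu = rng.normal(size=n)
theta = rng.normal(mu, sigma)
corr_centered = np.corrcoef(theta, mu)[0, 1]

# Non-centered parameterisation: write theta = mu + sigma * theta_tilde
# with theta_tilde ~ N(0, 1) independent of mu, so the sampler updates
# a-priori-uncorrelated quantities and moves freely.
theta_tilde = (theta - mu) / sigma
corr_noncentered = np.corrcoef(theta_tilde, mu)[0, 1]
```

Partially non-centered schemes interpolate between the two forms, which is what gives them an edge when neither extreme is uniformly best.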

    Clustering Partition Models for Discrete Structures with Applications in Geographical Epidemiology

    Get PDF
    This thesis is concerned with the analysis of data for a finite set of spatially structured units. For example, irregular structures, like political maps, are considered as well as regular lattices. The main field of application is geographical epidemiology. In this thesis a prior model for use within a hierarchical Bayesian framework is developed, and a theoretical basis is given. The proposed partition model combines the units under investigation into clusters, and allows for the estimation of parameters on the basis of local information. Special emphasis is on spatially adaptive smoothing of the data that retains possible edges in the estimated surface. Information about the existence of such edges is extracted from the data. The investigation of different data types supports the suitability of the model for a wide range of applications. The model seems to be very flexible and shows the desired smoothing behavior. In comparison to commonly used Markov random field models the proposed model has some advantages. Depending on the quality of the data, either both models yield similar results, or the proposed model provides clearer structure in the estimates and simplifies the interpretation of the results.
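    The edge-preserving idea can be sketched in one dimension: kernel smoothing blurs a sharp jump, whereas estimating one level per cluster of a partition retains it. Here the partition is fixed by hand for illustration; in the thesis it is inferred from the data:

```python
import numpy as np

rng = np.random.default_rng(6)

# A 1D "map" of 40 units with a sharp edge between two clusters of risk.
truth = np.concatenate([np.full(20, 1.0), np.full(20, 3.0)])
y = truth + rng.normal(0, 0.3, 40)

# Global kernel smoothing averages across the edge and blurs it ...
kernel = np.ones(5) / 5
smooth = np.convolve(y, kernel, mode="same")

# ... whereas a partition model estimates one level per cluster, using only
# local information, so the jump at unit 20 is retained.
partition = [np.arange(20), np.arange(20, 40)]
piecewise = np.empty(40)
for cluster in partition:
    piecewise[cluster] = y[cluster].mean()

edge_jump_smooth = smooth[20] - smooth[19]       # attenuated jump
edge_jump_piecewise = piecewise[20] - piecewise[19]  # close to the true jump of 2
```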

    Bayesian Regularization and Model Choice in Structured Additive Regression

    Get PDF