Reporting and Interpretation in Genome-Wide Association Studies
In the context of genome-wide association studies we critique a number of methods that have been suggested for flagging associations for further investigation. The p-value is by far the most commonly used measure, but it requires careful calibration when the a priori probability of an association is small, and it discards information by not considering the power associated with each test. The q-value is a frequentist method by which the false discovery rate (FDR) may be controlled. We advocate the use of the Bayes factor as a summary of the information in the data with respect to the comparison of the null and alternative hypotheses, and describe a recently proposed approach to the calculation of the Bayes factor that is easily implemented. The combination of data across studies is straightforward using the Bayes factor approach, as are power calculations. The Bayes factor and the q-value provide complementary information and, when used in addition to the p-value, may reduce the number of reported findings that are subsequently not reproduced.
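One widely used, easily implemented approximation of this kind — not necessarily the exact method the abstract refers to — computes an asymptotic Bayes factor directly from an effect estimate and its standard error, placing a normal prior on the effect under the alternative. A minimal sketch, in which the function name and the prior variance W are illustrative choices, not values from the paper:

```python
import math

def approx_bayes_factor(beta_hat, se, W=0.21**2):
    """Asymptotic Bayes factor comparing H0: beta = 0 with H1: beta ~ N(0, W).

    Under standard asymptotics beta_hat ~ N(beta, V) with V = se**2, so the
    marginal law of beta_hat is N(0, V) under H0 and N(0, V + W) under H1.
    The ratio of these two densities is BF_01; small values favour the
    alternative.  W here is a hypothetical prior variance, for illustration.
    """
    V = se ** 2
    z = beta_hat / se
    return math.sqrt((V + W) / V) * math.exp(-0.5 * z * z * W / (V + W))

# A strong signal (z = 5) gives a small BF_01: evidence against the null.
bf_strong = approx_bayes_factor(0.5, 0.1)
# A null signal (z = 0) gives BF_01 > 1: the data favour the null.
bf_null = approx_bayes_factor(0.0, 0.1)
```

Unlike a bare p-value, this quantity depends on both the estimate and its precision, so it implicitly accounts for the power of each test.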
Restricted Covariance Priors with Applications in Spatial Statistics
We present a Bayesian model for area-level count data that uses Gaussian
random effects with a novel type of G-Wishart prior on the inverse
variance--covariance matrix. Specifically, we introduce a new distribution
called the truncated G-Wishart distribution that has support over precision
matrices that lead to positive associations between the random effects of
neighboring regions while preserving conditional independence of
non-neighboring regions. We describe Markov chain Monte Carlo sampling
algorithms for the truncated G-Wishart prior in a disease mapping context and
compare our results to Bayesian hierarchical models based on intrinsic
autoregression priors. A simulation study illustrates that using the truncated
G-Wishart prior improves over the intrinsic autoregressive priors when there
are discontinuities in the disease risk surface. The new model is applied to an
analysis of cancer incidence data in Washington State.
Comment: Published at http://dx.doi.org/10.1214/14-BA927 in Bayesian Analysis (http://projecteuclid.org/euclid.ba) by the International Society for Bayesian Analysis (http://bayesian.org/).
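The support condition described above can be stated directly on the precision matrix Omega: zero entries between non-neighbouring regions encode conditional independence, while off-diagonal entries between neighbours are constrained in sign, since the partial correlation of regions i and j is -Omega_ij / sqrt(Omega_ii * Omega_jj), so non-positive precision entries yield non-negative partial correlations. A minimal sketch of a membership check, assuming a weak-inequality truncation (the helper name and tolerance are illustrative):

```python
import numpy as np

def in_truncated_support(Omega, adjacency, tol=1e-10):
    """Check whether a precision matrix satisfies a truncated-G-Wishart-style
    support condition: positive definite, zero precision between
    non-neighbours (conditional independence), and non-positive precision
    between neighbours (non-negative partial correlation)."""
    Omega = np.asarray(Omega, dtype=float)
    A = np.asarray(adjacency, dtype=bool)
    if np.min(np.linalg.eigvalsh(Omega)) <= tol:
        return False  # not positive definite
    n = Omega.shape[0]
    for i in range(n):
        for j in range(i + 1, n):
            if A[i, j] and Omega[i, j] > tol:
                return False  # neighbours: need Omega_ij <= 0
            if not A[i, j] and abs(Omega[i, j]) > tol:
                return False  # non-neighbours: need Omega_ij == 0
    return True

# Three regions on a path graph 1-2-3 (regions 1 and 3 not adjacent).
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
Omega_ok = np.array([[2.0, -0.5, 0.0], [-0.5, 2.0, -0.5], [0.0, -0.5, 2.0]])
Omega_bad = np.array([[2.0, 0.5, 0.0], [0.5, 2.0, -0.5], [0.0, -0.5, 2.0]])
```

Omega_bad fails because its positive entry between neighbours 1 and 2 would imply a negative association between adjacent regions.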
A linear noise approximation for stochastic epidemic models fit to partially observed incidence counts
Stochastic epidemic models (SEMs) fit to incidence data are critical to
elucidating outbreak dynamics, shaping response strategies, and preparing for
future epidemics. SEMs typically represent counts of individuals in discrete
infection states using Markov jump processes (MJPs), but are computationally
challenging as imperfect surveillance, lack of subject-level information, and
temporal coarseness of the data obscure the true epidemic. Analytic integration
over the latent epidemic process is impossible, and integration via Markov
chain Monte Carlo (MCMC) is cumbersome due to the dimensionality and
discreteness of the latent state space. Simulation-based computational
approaches can address the intractability of the MJP likelihood, but are
numerically fragile and prohibitively expensive for complex models. A linear
noise approximation (LNA) that approximates the MJP transition density with a
Gaussian density has been explored for analyzing prevalence data in
large-population settings, but requires modification for analyzing incidence
counts without assuming that the data are normally distributed. We demonstrate
how to reparameterize SEMs to appropriately analyze incidence data, and fold
the LNA into a data augmentation MCMC framework that statistically outperforms
deterministic methods and computationally outperforms simulation-based methods. Our
framework is computationally robust when the model dynamics are complex and
applies to a broad class of SEMs. We evaluate our method in simulations that
reflect Ebola, influenza, and SARS-CoV-2 dynamics, and apply our method to
national surveillance counts from the 2013--2015 West Africa Ebola outbreak.
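As background, the Gaussian transition density of the LNA is driven by a pair of coupled ODEs for the mean and covariance of the latent state. The sketch below integrates these moment equations with an Euler scheme for a plain SIR model in the prevalence parameterization — not the incidence reparameterization the abstract describes — and the function name, step size, and parameter values are illustrative:

```python
import numpy as np

def lna_sir_moments(beta, gamma, N, S0, I0, t_max, dt=0.01):
    """Euler integration of the LNA mean/covariance ODEs for an SIR model.

    State x = (S, I).  The LNA approximates the Markov jump process
    transition law by a Gaussian with mean m and covariance V, where
    dm/dt = S_inf a(m)  and  dV/dt = F V + V F' + S_inf diag(a(m)) S_inf',
    with a(m) the reaction hazards and F the Jacobian of the drift.
    """
    m = np.array([S0, I0], dtype=float)
    V = np.zeros((2, 2))
    # Stoichiometry: columns are infection (S-1, I+1) and recovery (I-1).
    S_inf = np.array([[-1.0, 0.0], [1.0, -1.0]])
    t = 0.0
    while t < t_max:
        S, I = m
        rates = np.array([beta * S * I / N, gamma * I])  # reaction hazards
        F = np.array([[-beta * I / N, -beta * S / N],
                      [beta * I / N, beta * S / N - gamma]])  # drift Jacobian
        Q = S_inf @ np.diag(rates) @ S_inf.T                 # diffusion matrix
        m = m + dt * (S_inf @ rates)
        V = V + dt * (F @ V + V @ F.T + Q)
        t += dt
    return m, V
```

The mean ODE recovers the familiar deterministic SIR trajectory; the covariance ODE is what lets the LNA quantify stochastic fluctuations around it.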
Efficient data augmentation for fitting stochastic epidemic models to prevalence data
Stochastic epidemic models describe the dynamics of an epidemic as a disease
spreads through a population. Typically, only a fraction of cases are observed
at a set of discrete times. The absence of complete information about the time
evolution of an epidemic gives rise to a complicated latent variable problem in
which the state space size of the epidemic grows large as the population size
increases. This makes analytically integrating over the missing data infeasible
for populations of even moderate size. We present a data augmentation Markov
chain Monte Carlo (MCMC) framework for Bayesian estimation of stochastic
epidemic model parameters, in which measurements are augmented with
subject-level disease histories. In our MCMC algorithm, we propose each new
subject-level path, conditional on the data, using a time-inhomogeneous
continuous-time Markov process with rates determined by the infection histories
of other individuals. The method is general, and may be applied, with minimal
modifications, to a broad class of stochastic epidemic models. We present our
algorithm in the context of multiple stochastic epidemic models in which the
data are binomially sampled prevalence counts, and apply our method to data
from an outbreak of influenza in a British boarding school.
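The observation model — binomially sampled prevalence at discrete times — can be made concrete with a short simulation: a Gillespie realization of a stochastic SIR epidemic, with each currently infectious individual detected independently with probability rho at each observation time. This is a generic sketch of that data-generating mechanism, not the authors' code; names and parameter values are illustrative.

```python
import random

def gillespie_sir_prevalence(beta, gamma, N, I0, t_obs, rho, seed=42):
    """Simulate a stochastic SIR epidemic (Gillespie algorithm) and return
    binomially sampled prevalence counts: at each observation time t in
    t_obs, the observed count is Binomial(I(t), rho)."""
    rng = random.Random(seed)
    S, I = N - I0, I0
    t, idx, obs = 0.0, 0, []
    times = list(t_obs)
    while idx < len(times):
        rate_inf = beta * S * I / N
        rate_rec = gamma * I
        total = rate_inf + rate_rec
        # Time of the next event (infinite once the epidemic has died out).
        t_next = t + rng.expovariate(total) if total > 0 else float("inf")
        # Record any observations that fall before the next event; the
        # prevalence I is constant on (t, t_next).
        while idx < len(times) and times[idx] < t_next:
            obs.append(sum(rng.random() < rho for _ in range(I)))
            idx += 1
        if t_next == float("inf"):
            break
        t = t_next
        if rng.random() < rate_inf / total:
            S, I = S - 1, I + 1  # infection event
        else:
            I -= 1               # recovery event
    return obs
```

The latent-variable difficulty the abstract describes is visible here: the likelihood of the observed counts requires integrating over every event path consistent with them, which is what the proposed data augmentation handles.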