28 research outputs found

Bayesian model search and multilevel inference for SNP association studies

Author: A. Wilson
Edwin S. Iversen
Joellen M. Schildkraut
Merlise A. Clyde
Scott C. Schmidler
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2010

Technological advances in genotyping have given rise to hypothesis-based association studies of increasing scope. As a result, the scientific hypotheses addressed by these studies have become more complex and more difficult to address using existing analytic methodologies. Obstacles to analysis include inference in the face of multiple comparisons, complications arising from correlations among the SNPs (single nucleotide polymorphisms), the choice of their genetic parametrization, and missing data. In this paper we present an efficient Bayesian model search strategy that searches over the space of genetic markers and their genetic parametrization. The resulting method for Multilevel Inference of SNP Associations, MISA, allows computation of multilevel posterior probabilities and Bayes factors at the global, gene and SNP level, with the prior distribution on SNP inclusion in the model providing an intrinsic multiplicity correction. We use simulated data sets to characterize MISA's statistical power, and show that MISA has higher power to detect association than standard procedures. Using data from the North Carolina Ovarian Cancer Study (NCOCS), MISA identifies variants that were not identified by standard methods and have been externally "validated" in independent studies. We examine sensitivity of the NCOCS results to prior choice and method for imputing missing data. MISA is available in an R package on CRAN.

Comment: Published at http://dx.doi.org/10.1214/09-AOAS322 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

CiteSeerX

Crossref

An Adaptive Interacting Wang-Landau Algorithm for Automatic Density Exploration

While statisticians are well accustomed to performing exploratory analysis in the modeling stage of an analysis, the notion of conducting preliminary general-purpose exploratory analysis in the Monte Carlo stage (or more generally, the model-fitting stage) of an analysis is an area which we feel deserves much further attention. Towards this aim, this paper proposes a general-purpose algorithm for automatic density exploration. The proposed exploration algorithm combines and expands upon components from various adaptive Markov chain Monte Carlo methods, with the Wang-Landau algorithm at its heart. Additionally, the algorithm is run on interacting parallel chains, a feature that both decreases computational cost and stabilizes the algorithm, improving its ability to explore the density. Performance is studied in several applications. Through a Bayesian variable selection example, the authors demonstrate the convergence gains obtained with interacting chains. The ability of the algorithm's adaptive proposal to induce mode-jumping is illustrated through a trimodal density and a Bayesian mixture modeling application. Lastly, through a 2D Ising model, the authors demonstrate the ability of the algorithm to overcome the high correlations encountered in spatial models.

Comment: 33 pages, 20 figures (the supplementary materials are included as appendices)

arXiv.org e-Print Archive

Base de publications de l'université Paris-Dauphine

Crossref

INRIA a CCSD electronic archive server

Oxford University Research Archive

HAL-Polytechnique

Oskar Bordeaux

AHRQ series on complex intervention systematic reviews-paper 5: advanced analytic methods.

Author: Beretvas S Natasha
Devine Emily E Beth
Fonnesbeck Chris
Fu Rongwei
Kahwati Leila
Kelly Michael P
Morton Sally C
Murad M Hassan
Myers Evan
Noyes Jane
Petticrew Mark
Pigott Terri
Sanders-Schmidler Gillian D
Umscheid Craig A
Publication venue: J Clin Epidemiol
Publication date: 01/10/2017

BACKGROUND AND OBJECTIVE: Advanced analytic methods for synthesizing evidence about complex interventions continue to be developed. In this paper, we emphasize that the specific research question posed in the review should be used as a guide for choosing the appropriate analytic method. METHODS: We present advanced analytic approaches that address four common questions that guide reviews of complex interventions: (1) How effective is the intervention? (2) For whom does the intervention work and in what contexts? (3) What happens when the intervention is implemented? and (4) What decisions are possible given the results of the synthesis? CONCLUSION: The analytic approaches presented in this paper are particularly useful when each primary study differs in components, mechanisms of action, context, implementation, timing, and many other domains.

Crossref

LSHTM Research Online

Apollo (Cambridge)

Bangor University Research Portal

Bayesian Segmentation of Protein Secondary Structure

Author: Asai K.
Douglas L. Brutlag
Heringa J.
Jun S. Liu
Krylov D.
Scott C. Schmidler
Solovyev V.V.
Publication venue: 'Mary Ann Liebert Inc'

Crossref

MCMC implementation for Bayesian hidden semi-Markov models with illustrative applications

Author: A. Gelman
A.K. Jardine
C. Jouyaux
C. Yau
C.P. Robert
D. Gamerman
D.J. Spiegelhalter
E. Bellone
G. Celeux
H. Kozumi
J. Bulla
J. Sansom
J.-M. Marin
J.D. Ferguson
J.P. Hughes
L. Rabiner
L.E. Baum
M. Dewar
M. Dong
M. Stephens
P. Fearnhead
P.A. Devijver
R Development Core Team
S. Chib
S. Guha
S. Richardson
S. Scott
S. Tokdar
S.-Z. Yu
S.C. Schmidler
S.E. Levinson
T. Economou
T. Rydén
Theodoros Economou
Trevor C. Bailey
W.R. Gilks
Y. Guedon
Y. Kleiner
Zoran Kapelan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/06/2014

Copyright © Springer 2013. The final publication is available at Springer via http://dx.doi.org/10.1007/s11222-013-9399-z

Hidden Markov models (HMMs) are flexible, well-established models useful in a diverse range of applications. However, one potential limitation of such models lies in their inability to explicitly structure the holding times of each hidden state. Hidden semi-Markov models (HSMMs) are more useful in the latter respect as they incorporate additional temporal structure by explicit modelling of the holding times. However, HSMMs have generally received less attention in the literature, mainly due to their intensive computational requirements. Here a Bayesian implementation of HSMMs is presented. Recursive algorithms are proposed in conjunction with Metropolis-Hastings in such a way as to avoid sampling from the distribution of the hidden state sequence in the MCMC sampler. This provides a computationally tractable estimation framework for HSMMs, avoiding the limitations associated with the conventional EM algorithm regarding model flexibility. Performance of the proposed implementation is demonstrated through simulation experiments as well as an illustrative application relating to recurrent failures in a network of underground water pipes, where random effects are also included in the HSMM to allow for pipe heterogeneity.

Crossref

Open Research Exeter

Intergenic and Genic Sequence Lengths Have Opposite Relationships with Respect to Gene Expression

Author: A Bar-Even
A Gondor
A Taddei
AE Vinogradov
AO Urrutia
Borislav Iordanov
C Seoighe
CE Nelson
CI Castillo-Davis
D Walther
DL Mace
E Eisenberg
F Chiaromonte
F Mignone
Gil Bohrer
H Le Hir
J Colinas
JC Pinheiro
JR Newman
JS Mattick
Juan Valcarcel
Juliette Colinas
JY Lee
K Birnbaum
M Gaszner
MP Levesque
MQ Zhang
Philip N. Benfey
S Cai
Scott C. Schmidler
SR Searle
T Nawy
XY Ren
Publication venue: Public Library of Science
Publication date: 01/01/2008

Eukaryotic genomes are mostly composed of noncoding DNA whose role is still poorly understood. Studies in several organisms have shown correlations between the length of the intergenic and genic sequences of a gene and the expression of its corresponding mRNA transcript. Some studies have found a positive relationship between intergenic sequence length and expression diversity between tissues, and concluded that genes under greater regulatory control require more regulatory information in their intergenic sequences. Other reports found a negative relationship between expression level and gene length, and the interpretation was that there is selection pressure for highly expressed genes to remain small. However, a correlation between gene sequence length and expression diversity, opposite to that observed for intergenic sequences, has also been reported, and to date there is no testable explanation for this observation. To shed light on these varied and sometimes conflicting results, we performed a thorough study of the relationships between sequence length and gene expression using cell-type (tissue) specific microarray data in Arabidopsis thaliana. We measured median gene expression across tissues (expression level), expression variability between tissues (expression pattern uniformity), and expression variability between replicates (expression noise). We found that intergenic (upstream and downstream) and genic (coding and noncoding) sequences have generally opposite relationships with respect to expression, whether it is tissue variability, median, or expression noise. To explain these results we propose a model in which the lengths of the intergenic and genic sequences have opposite effects on the ability of the transcribed region of the gene to be epigenetically regulated for differential expression. These findings could shed light on the role and influence of noncoding sequences on gene expression.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

DukeSpace