51 research outputs found

Conditions for rapid mixing of parallel and simulated tempering on multimodal distributions

Author: Huber Mark
Schmidler Scott C.
Woodard Dawn B.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2009

We give conditions under which a Markov chain constructed via parallel or simulated tempering is guaranteed to be rapidly mixing, which are applicable to a wide range of multimodal distributions arising in Bayesian statistical inference and statistical mechanics. We provide lower bounds on the spectral gaps of parallel and simulated tempering. These bounds imply a single set of sufficient conditions for rapid mixing of both techniques. A direct consequence of our results is rapid mixing of parallel and simulated tempering for several normal mixture models, and for the mean-field Ising model. Comment: Published at http://dx.doi.org/10.1214/08-AAP555 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

CiteSeerX

DukeSpace

Bayesian model search and multilevel inference for SNP association studies

Author: A. Wilson
Edwin S. Iversen
Joellen M. Schildkraut
Merlise A. Clyde
Scott C. Schmidler
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2010

Technological advances in genotyping have given rise to hypothesis-based association studies of increasing scope. As a result, the scientific hypotheses addressed by these studies have become more complex and more difficult to address using existing analytic methodologies. Obstacles to analysis include inference in the face of multiple comparisons, complications arising from correlations among the SNPs (single nucleotide polymorphisms), choice of their genetic parametrization and missing data. In this paper we present an efficient Bayesian model search strategy that searches over the space of genetic markers and their genetic parametrization. The resulting method for Multilevel Inference of SNP Associations, MISA, allows computation of multilevel posterior probabilities and Bayes factors at the global, gene and SNP level, with the prior distribution on SNP inclusion in the model providing an intrinsic multiplicity correction. We use simulated data sets to characterize MISA's statistical power, and show that MISA has higher power to detect association than standard procedures. Using data from the North Carolina Ovarian Cancer Study (NCOCS), MISA identifies variants that were not identified by standard methods and have been externally "validated" in independent studies. We examine sensitivity of the NCOCS results to prior choice and method for imputing missing data. MISA is available in an R package on CRAN. Comment: Published at http://dx.doi.org/10.1214/09-AOAS322 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

CiteSeerX

Crossref

Bayesian Segmentation of Protein Secondary Structure

Author: Asai K.
Douglas L. Brutlag
Heringa J.
Jun S. Liu
Krylov D.
Scott C. Schmidler
Solovyev V.V.
Publication venue: 'Mary Ann Liebert Inc'

Crossref

MCMC implementation for Bayesian hidden semi-Markov models with illustrative applications

Author: A. Gelman
A.K. Jardine
C. Jouyaux
C. Yau
C.P. Robert
D. Gamerman
D.J. Spiegelhalter
E. Bellone
G. Celeux
H. Kozumi
J. Bulla
J. Sansom
J.-M. Marin
J.D. Ferguson
J.P. Hughes
L. Rabiner
L.E. Baum
M. Dewar
M. Dong
M. Stephens
P. Fearnhead
P.A. Devijver
R Development Core Team
S. Chib
S. Guha
S. Richardson
S. Scott
S. Tokdar
S.-Z. Yu
S.C. Schmidler
S.E. Levinson
T. Economou
T. Rydén
Theodoros Economou
Trevor C. Bailey
W.R. Gilks
Y. Guedon
Y. Kleiner
Zoran Kapelan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/06/2014

Copyright © Springer 2013. The final publication is available at Springer via http://dx.doi.org/10.1007/s11222-013-9399-z. Hidden Markov models (HMMs) are flexible, well-established models useful in a diverse range of applications. However, one potential limitation of such models lies in their inability to explicitly structure the holding times of each hidden state. Hidden semi-Markov models (HSMMs) are more useful in the latter respect as they incorporate additional temporal structure by explicit modelling of the holding times. However, HSMMs have generally received less attention in the literature, mainly due to their intensive computational requirements. Here a Bayesian implementation of HSMMs is presented. Recursive algorithms are proposed in conjunction with Metropolis-Hastings in such a way as to avoid sampling from the distribution of the hidden state sequence in the MCMC sampler. This provides a computationally tractable estimation framework for HSMMs, avoiding the limitations associated with the conventional EM algorithm regarding model flexibility. Performance of the proposed implementation is demonstrated through simulation experiments as well as an illustrative application relating to recurrent failures in a network of underground water pipes, where random effects are also included in the HSMM to allow for pipe heterogeneity.

Crossref

Open Research Exeter

Intergenic and Genic Sequence Lengths Have Opposite Relationships with Respect to Gene Expression

Author: A Bar-Even
A Gondor
A Taddei
AE Vinogradov
AO Urrutia
Borislav Iordanov
C Seoighe
CE Nelson
CI Castillo-Davis
D Walther
DL Mace
E Eisenberg
F Chiaromonte
F Mignone
Gil Bohrer
H Le Hir
J Colinas
JC Pinheiro
JR Newman
JS Mattick
Juan Valcarcel
Juliette Colinas
JY Lee
K Birnbaum
M Gaszner
MP Levesque
MQ Zhang
Philip N. Benfey
S Cai
Scott C. Schmidler
SR Searle
T Nawy
XY Ren
Publication venue: Public Library of Science
Publication date: 01/01/2008

Eukaryotic genomes are mostly composed of noncoding DNA whose role is still poorly understood. Studies in several organisms have shown correlations between the length of the intergenic and genic sequences of a gene and the expression of its corresponding mRNA transcript. Some studies have found a positive relationship between intergenic sequence length and expression diversity between tissues, and concluded that genes under greater regulatory control require more regulatory information in their intergenic sequences. Other reports found a negative relationship between expression level and gene length, and the interpretation was that there is selection pressure for highly expressed genes to remain small. However, a correlation between gene sequence length and expression diversity, opposite to that observed for intergenic sequences, has also been reported, and to date there is no testable explanation for this observation. To shed light on these varied and sometimes conflicting results, we performed a thorough study of the relationships between sequence length and gene expression using cell-type (tissue) specific microarray data in Arabidopsis thaliana. We measured median gene expression across tissues (expression level), expression variability between tissues (expression pattern uniformity), and expression variability between replicates (expression noise). We found that intergenic (upstream and downstream) and genic (coding and noncoding) sequences have generally opposite relationships with respect to expression, whether it is tissue variability, median, or expression noise. To explain these results we propose a model in which the lengths of the intergenic and genic sequences have opposite effects on the ability of the transcribed region of the gene to be epigenetically regulated for differential expression. These findings could shed light on the role and influence of noncoding sequences on gene expression.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

DukeSpace