36 research outputs found
Bayesian analysis of variable-order, reversible Markov chains
We define a conjugate prior for the reversible Markov chain of order r. The
prior arises from a partially exchangeable reinforced random walk, in the same
way that the Beta distribution arises from the exchangeable Pólya urn. An
extension to variable-order Markov chains is also derived. We show the utility
of this prior in testing the order and estimating the parameters of a
reversible Markov model.
Comment: Published at http://dx.doi.org/10.1214/10-AOS857 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
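The link between the Pólya urn and the Beta distribution mentioned in this abstract can be illustrated with a short simulation. This is a minimal sketch of the classical two-colour urn only, not the paper's reinforced random walk; the function names, the seed, and the run sizes are illustrative assumptions. Starting from a red and b black balls, each drawn ball is returned with one extra ball of the same colour, and the long-run fraction of red balls is Beta(a, b) distributed, with mean a / (a + b).

```python
import random


def polya_urn_fraction(a, b, n_draws, rng):
    """Run a two-colour Polya urn and return the final fraction of red balls."""
    red, black = a, b
    for _ in range(n_draws):
        # Draw a ball with probability proportional to the current counts,
        # then return it together with one extra ball of the same colour.
        if rng.random() < red / (red + black):
            red += 1
        else:
            black += 1
    return red / (red + black)


def simulate(a, b, n_draws=200, n_runs=2000, seed=0):
    """Repeat the urn experiment n_runs times with a fixed seed (illustrative sizes)."""
    rng = random.Random(seed)
    return [polya_urn_fraction(a, b, n_draws, rng) for _ in range(n_runs)]


if __name__ == "__main__":
    fractions = simulate(2, 1)
    mean = sum(fractions) / len(fractions)
    # The empirical mean should be close to the Beta(2, 1) mean of 2/3.
    print(f"empirical mean {mean:.3f}, Beta(2, 1) mean {2 / 3:.3f}")
```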
Bayesian nonparametric analysis of reversible Markov chains
We introduce a three-parameter random walk with reinforcement, called the
(θ, α, β) scheme, which generalizes the linearly edge reinforced
random walk to uncountable spaces. The parameter β smoothly tunes the
(θ, α, β) scheme between this edge reinforced random walk and the
classical exchangeable two-parameter Hoppe urn scheme, while the parameters
α and θ modulate how many states are typically visited. Resorting
to de Finetti's theorem for Markov chains, we use the (θ, α, β)
scheme to define a nonparametric prior for Bayesian analysis of reversible
Markov chains. The prior is applied in Bayesian nonparametric inference for
species sampling problems with data generated from a reversible Markov chain
with an unknown transition kernel. As a real example, we analyze data from
molecular dynamics simulations of protein folding.
Comment: Published at http://dx.doi.org/10.1214/13-AOS1102 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
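For readers unfamiliar with the linearly edge reinforced random walk that this abstract generalizes, the following is a minimal sketch on a small finite graph. It is not the three-parameter scheme of the paper: the graph, the initial weights, the unit reinforcement, and the name `edge_reinforced_walk` are all assumptions chosen for illustration. The walker crosses an incident edge with probability proportional to its current weight, and each crossing increases that edge's weight by one.

```python
import random


def edge_reinforced_walk(initial_weights, start, n_steps, rng):
    """Simulate a linearly edge reinforced random walk on an undirected graph.

    initial_weights maps frozenset({u, v}) to a positive weight. The input is
    copied, so the caller's dictionary is left unchanged. Assumes every
    reachable state has at least one incident edge.
    """
    weights = dict(initial_weights)
    path = [start]
    current = start
    for _ in range(n_steps):
        # Edges touching the current state, and their current weights.
        incident = [edge for edge in weights if current in edge]
        chosen = rng.choices(incident, weights=[weights[e] for e in incident], k=1)[0]
        # Linear reinforcement: the crossed edge gains one unit of weight.
        weights[chosen] += 1.0
        # Move to the other endpoint (stay put if the edge is a self-loop).
        others = chosen - {current}
        current = next(iter(others)) if others else current
        path.append(current)
    return path, weights


if __name__ == "__main__":
    triangle = {
        frozenset({0, 1}): 1.0,
        frozenset({1, 2}): 1.0,
        frozenset({0, 2}): 1.0,
    }
    path, final_weights = edge_reinforced_walk(triangle, 0, 20, random.Random(1))
    print(path)
    print({tuple(sorted(edge)): w for edge, w in final_weights.items()})
```

Because the weights live on undirected edges, the reinforcement is symmetric in the two directions of travel, which is the feature that connects this kind of walk to reversible chains.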
Bayesian Regularization of the Length of Memory in Reversible Sequences
Summary
Variable order Markov chains have been used to model discrete sequential data in a variety of fields. A host of methods exist to estimate the history-dependent lengths of memory which characterize these models and to predict new sequences. In several applications, the data-generating mechanism is known to be reversible, but combining this information with the procedures mentioned is far from trivial. We introduce a Bayesian analysis for reversible dynamics, which takes into account uncertainty in the lengths of memory. The model proposed is applied to the analysis of molecular dynamics simulations and compared with several popular algorithms.
SF is supported by the European Research Council through grant StG N-BNP 306406, LT has been supported by the Claudia Adams Barr Program in Innovative Cancer Research and SB received funding from the Stein Fellowship.
This is the author accepted manuscript. The final version is available from Wiley via http://dx.doi.org/10.1111/rssb.1214
Tanimoto Random Features for Scalable Molecular Machine Learning
The Tanimoto coefficient is commonly used to measure the similarity between
molecules represented as discrete fingerprints, either as a distance metric or
a positive definite kernel. While many kernel methods can be accelerated using
random feature approximations, at present there is a lack of such
approximations for the Tanimoto kernel. In this paper we propose two kinds of
novel random features to allow this kernel to scale to large datasets, and in
the process discover a novel extension of the kernel to real-valued vectors. We
theoretically characterize these random features, and provide error bounds on
the spectral norm of the Gram matrix. Experimentally, we show that these random
features are effective at approximating the Tanimoto coefficient of real-world
datasets and are useful for molecular property prediction and optimization
tasks.
Comment: Camera-ready version presented at NeurIPS 2023. Updates include:
notation changes, better description of features in section 4, updated
experiments, link to cod
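As background for this abstract, the Tanimoto coefficient itself can be computed in a few lines. The sketch below shows the standard exact coefficient on binary fingerprints, not the random-feature approximations or the real-valued extension proposed in the paper; the function names and the convention of returning 1.0 for two empty fingerprints are assumptions made for illustration.

```python
def tanimoto_sets(x, y):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    x, y = set(x), set(y)
    union = len(x | y)
    if union == 0:
        return 1.0  # assumed convention: two empty fingerprints are identical
    return len(x & y) / union


def tanimoto_vectors(x, y):
    """Same coefficient for equal-length 0/1 vectors, written with dot products."""
    dot = sum(a * b for a, b in zip(x, y))
    denom = sum(a * a for a in x) + sum(b * b for b in y) - dot
    if denom == 0:
        return 1.0  # assumed convention, matching tanimoto_sets
    return dot / denom


if __name__ == "__main__":
    # Two toy fingerprints sharing bits 2 and 3 out of four distinct on-bits.
    print(tanimoto_sets({1, 2, 3}, {2, 3, 4}))                    # 0.5
    print(tanimoto_vectors([0, 1, 1, 1, 0], [0, 0, 1, 1, 1]))    # 0.5
```

Building a full Gram matrix this way needs one evaluation per pair of molecules, so the cost grows quadratically with the dataset size, which is the scaling problem that random feature approximations are meant to ease.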
Bayesian Nonparametric Ordination for the Analysis of Microbial Communities.
Human microbiome studies use sequencing technologies to measure the abundance of bacterial species or Operational Taxonomic Units (OTUs) in samples of biological material. Typically the data are organized in contingency tables with OTU counts across heterogeneous biological samples. In the microbial ecology community, ordination methods are frequently used to investigate latent factors or clusters that capture and describe variations of OTU counts across biological samples. It remains important to evaluate how uncertainty in estimates of each biological sample's microbial distribution propagates to ordination analyses, including visualization of clusters and projections of biological samples on low dimensional spaces. We propose a Bayesian analysis for dependent distributions to endow frequently used ordinations with estimates of uncertainty. A Bayesian nonparametric prior for dependent normalized random measures is constructed, which is marginally equivalent to the normalized generalized Gamma process, a well-known prior for nonparametric analyses. In our prior, the dependence and similarity between microbial distributions is represented by latent factors that concentrate in a low dimensional space. We use a shrinkage prior to tune the dimensionality of the latent factors. The resulting posterior samples of model parameters can be used to evaluate uncertainty in analyses routinely applied in microbiome studies. Specifically, by combining them with multivariate data analysis techniques we can visualize credible regions in ecological ordination plots. The characteristics of the proposed model are illustrated through a simulation study and applications in two microbiome datasets.
B. Ren is supported by National Science Foundation under Grant No. DMS-1042785. S. Favaro is supported by the European Research Council (ERC) through StG N-BNP 306406. L. Trippa has been supported by the Claudia Adams Barr Program in Innovative Basic Cancer Research. S. Holmes was supported by the NIH grant R01AI112401
The effects of releasing early results from ongoing clinical trials.
Most trials do not release interim summaries on efficacy and toxicity of the experimental treatments being tested, with this information only released to the public after the trial has ended. While early release of clinical trial data to physicians and patients can inform enrollment decision making, it may also affect key operating characteristics of the trial, statistical validity and trial duration. We investigate the public release of early efficacy and toxicity results, during ongoing clinical studies, to better inform patients about their enrollment options. We use simulation models of phase II glioblastoma (GBM) clinical trials in which early efficacy and toxicity estimates are periodically released according to a pre-specified protocol. Patients can use the reported interim efficacy and toxicity information, with the support of physicians, to decide which trial to enroll in. We describe potential effects on various operating characteristics, including the study duration, selection bias and power
Preclinical evaluation of (S)-[18F]GE387, a novel 18-kDa translocator protein (TSPO) PET radioligand with low binding sensitivity to human polymorphism rs6971.
Funder: Herchel Smith Fellowship programme
PURPOSE: Positron emission tomography (PET) studies with radioligands for 18-kDa translocator protein (TSPO) have been instrumental in increasing our understanding of the complex role neuroinflammation plays in disorders affecting the brain. However, (R)-[11C]PK11195, the first and most widely used TSPO radioligand, has limitations, while the next-generation TSPO radioligands have suffered from high interindividual variability in binding due to a genetic polymorphism in the TSPO gene (rs6971). Herein, we present the biological evaluation of the two enantiomers of [18F]GE387, which we have previously shown to have low sensitivity to this polymorphism.
METHODS: Dynamic PET scans were conducted in male Wistar rats and female rhesus macaques to investigate the in vivo behaviour of (S)-[18F]GE387 and (R)-[18F]GE387. The specific binding of (S)-[18F]GE387 to TSPO was investigated by pre-treatment with (R)-PK11195. (S)-[18F]GE387 was further evaluated in a rat model of lipopolysaccharide (LPS)-induced neuroinflammation. Sensitivity to polymorphism of (S)-GE387 was evaluated in genotyped human brain tissue.
RESULTS: (S)-[18F]GE387 and (R)-[18F]GE387 entered the brain in both rats and rhesus macaques. (R)-PK11195 blocked the uptake of (S)-[18F]GE387 in healthy olfactory bulb and peripheral tissues constitutively expressing TSPO. A 2.7-fold higher uptake of (S)-[18F]GE387 was found in the inflamed striatum of LPS-treated rodents. In genotyped human brain tissue, (S)-GE387 was shown to bind similarly in low affinity binders (LABs) and high affinity binders (HABs) with a LAB to HAB ratio of 1.8.
CONCLUSION: We established that (S)-[18F]GE387 has favourable kinetics in healthy rats and non-human primates and that it can distinguish inflamed from normal brain regions in the LPS model of neuroinflammation. Crucially, we have reconfirmed its low sensitivity to the TSPO polymorphism on genotyped human brain tissue. Based on these factors, we conclude that (S)-[18F]GE387 warrants further evaluation with studies on human subjects to assess its suitability as a TSPO PET radioligand for assessing neuroinflammation