36 research outputs found

    Bayesian analysis of variable-order, reversible Markov chains

    We define a conjugate prior for the reversible Markov chain of order r. The prior arises from a partially exchangeable reinforced random walk, in the same way that the Beta distribution arises from the exchangeable Pólya urn. An extension to variable-order Markov chains is also derived. We show the utility of this prior in testing the order and estimating the parameters of a reversible Markov model. Comment: Published at http://dx.doi.org/10.1214/10-AOS857 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Bayesian nonparametric analysis of reversible Markov chains

    We introduce a three-parameter random walk with reinforcement, called the (θ,α,β) scheme, which generalizes the linearly edge-reinforced random walk to uncountable spaces. The parameter β smoothly tunes the (θ,α,β) scheme between this edge-reinforced random walk and the classical exchangeable two-parameter Hoppe urn scheme, while the parameters α and θ modulate how many states are typically visited. Resorting to de Finetti's theorem for Markov chains, we use the (θ,α,β) scheme to define a nonparametric prior for Bayesian analysis of reversible Markov chains. The prior is applied in Bayesian nonparametric inference for species sampling problems with data generated from a reversible Markov chain with an unknown transition kernel. As a real example, we analyze data from molecular dynamics simulations of protein folding. Comment: Published at http://dx.doi.org/10.1214/13-AOS1102 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
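For readers unfamiliar with the linearly edge-reinforced random walk mentioned in the abstract above, the following is a minimal sketch of the classical finite-graph version only. It is not the (θ,α,β) scheme from the paper. The function name `edge_reinforced_walk`, the unit reinforcement increment, and the triangle graph are assumptions chosen for illustration.

```python
import random

def edge_reinforced_walk(weights, start, n_steps, seed=0):
    """Simulate a linearly edge-reinforced random walk on an undirected graph.

    weights: dict mapping (u, v) edge tuples (u != v) to positive initial weights.
    Returns the visited path and the final edge weights.
    """
    rng = random.Random(seed)
    w = dict(weights)  # copy so the caller's initial weights are not modified
    path = [start]
    current = start
    for _ in range(n_steps):
        # Edges touching the current vertex.
        incident = [e for e in w if current in e]
        # Pick an incident edge with probability proportional to its weight.
        edge = rng.choices(incident, weights=[w[e] for e in incident])[0]
        # Reinforce the traversed edge by one unit (illustrative assumption).
        w[edge] += 1.0
        # Move to the other endpoint of the chosen edge.
        u, v = edge
        current = v if current == u else u
        path.append(current)
    return path, w

# Hypothetical example: a triangle with equal initial weights.
triangle = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0}
path, final_weights = edge_reinforced_walk(triangle, "a", 50)
```

Because each traversal increases the weight of the edge just crossed, frequently used edges become more likely to be used again, which is the reinforcement mechanism the abstract refers to.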

    Bayesian Regularization of the Length of Memory in Reversible Sequences

    Variable order Markov chains have been used to model discrete sequential data in a variety of fields. A host of methods exist to estimate the history-dependent lengths of memory which characterize these models and to predict new sequences. In several applications, the data-generating mechanism is known to be reversible, but combining this information with the procedures mentioned is far from trivial. We introduce a Bayesian analysis for reversible dynamics, which takes into account uncertainty in the lengths of memory. The model proposed is applied to the analysis of molecular dynamics simulations and compared with several popular algorithms. SF is supported by the European Research Council through grant StG N-BNP 306406, LT has been supported by the Claudia Adams Barr Program in Innovative Cancer Research and SB received funding from the Stein Fellowship. This is the author accepted manuscript. The final version is available from Wiley via http://dx.doi.org/10.1111/rssb.1214

    Tanimoto Random Features for Scalable Molecular Machine Learning

    The Tanimoto coefficient is commonly used to measure the similarity between molecules represented as discrete fingerprints, either as a distance metric or a positive definite kernel. While many kernel methods can be accelerated using random feature approximations, at present there is a lack of such approximations for the Tanimoto kernel. In this paper we propose two kinds of novel random features to allow this kernel to scale to large datasets, and in the process discover a novel extension of the kernel to real-valued vectors. We theoretically characterize these random features, and provide error bounds on the spectral norm of the Gram matrix. Experimentally, we show that these random features are effective at approximating the Tanimoto coefficient of real-world datasets and are useful for molecular property prediction and optimization tasks. Comment: Camera-ready version presented at NeurIPS 2023. Updates include: notation changes, better description of features in section 4, updated experiments, link to cod
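As background for the abstract above, the exact Tanimoto coefficient of two binary fingerprints is the number of shared "on" bits divided by the number of bits set in either fingerprint. The sketch below computes only this exact coefficient, not the random-feature approximation proposed in the paper. The function name `tanimoto`, the set-of-indices representation, and the value returned for two empty fingerprints are assumptions made for illustration.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as collections of 'on' bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    if union == 0:
        # Convention assumed here: two empty fingerprints are treated as identical.
        return 1.0
    return len(a & b) / union

# Hypothetical fingerprints: 2 shared bits out of 5 distinct bits.
fp1 = {1, 4, 7, 9}
fp2 = {1, 4, 8}
similarity = tanimoto(fp1, fp2)  # 2 / 5 = 0.4
```

Computing this for every pair in a dataset of n molecules requires on the order of n² evaluations, which is the scaling cost that random feature approximations aim to reduce.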

    Bayesian Nonparametric Ordination for the Analysis of Microbial Communities.

    Human microbiome studies use sequencing technologies to measure the abundance of bacterial species or Operational Taxonomic Units (OTUs) in samples of biological material. Typically the data are organized in contingency tables with OTU counts across heterogeneous biological samples. In the microbial ecology community, ordination methods are frequently used to investigate latent factors or clusters that capture and describe variations of OTU counts across biological samples. It remains important to evaluate how uncertainty in estimates of each biological sample's microbial distribution propagates to ordination analyses, including visualization of clusters and projections of biological samples on low dimensional spaces. We propose a Bayesian analysis for dependent distributions to endow frequently used ordinations with estimates of uncertainty. A Bayesian nonparametric prior for dependent normalized random measures is constructed, which is marginally equivalent to the normalized generalized Gamma process, a well-known prior for nonparametric analyses. In our prior, the dependence and similarity between microbial distributions is represented by latent factors that concentrate in a low dimensional space. We use a shrinkage prior to tune the dimensionality of the latent factors. The resulting posterior samples of model parameters can be used to evaluate uncertainty in analyses routinely applied in microbiome studies. Specifically, by combining them with multivariate data analysis techniques we can visualize credible regions in ecological ordination plots. The characteristics of the proposed model are illustrated through a simulation study and applications in two microbiome datasets. B. Ren is supported by National Science Foundation under Grant No. DMS-1042785. S. Favaro is supported by the European Research Council (ERC) through StG N-BNP 306406. L. Trippa has been supported by the Claudia Adams Barr Program in Innovative Basic Cancer Research. S. Holmes was supported by the NIH grant R01AI112401

    Preclinical evaluation of (S)-[18F]GE387, a novel 18-kDa translocator protein (TSPO) PET radioligand with low binding sensitivity to human polymorphism rs6971.

    Funder: Herchel Smith Fellowship programme. PURPOSE: Positron emission tomography (PET) studies with radioligands for 18-kDa translocator protein (TSPO) have been instrumental in increasing our understanding of the complex role neuroinflammation plays in disorders affecting the brain. However, (R)-[11C]PK11195, the first and most widely used TSPO radioligand, has limitations, while the next-generation TSPO radioligands have suffered from high interindividual variability in binding due to a genetic polymorphism in the TSPO gene (rs6971). Herein, we present the biological evaluation of the two enantiomers of [18F]GE387, which we have previously shown to have low sensitivity to this polymorphism. METHODS: Dynamic PET scans were conducted in male Wistar rats and female rhesus macaques to investigate the in vivo behaviour of (S)-[18F]GE387 and (R)-[18F]GE387. The specific binding of (S)-[18F]GE387 to TSPO was investigated by pre-treatment with (R)-PK11195. (S)-[18F]GE387 was further evaluated in a rat model of lipopolysaccharide (LPS)-induced neuroinflammation. Sensitivity to polymorphism of (S)-GE387 was evaluated in genotyped human brain tissue. RESULTS: (S)-[18F]GE387 and (R)-[18F]GE387 entered the brain in both rats and rhesus macaques. (R)-PK11195 blocked the uptake of (S)-[18F]GE387 in healthy olfactory bulb and peripheral tissues constitutively expressing TSPO. A 2.7-fold higher uptake of (S)-[18F]GE387 was found in the inflamed striatum of LPS-treated rodents. In genotyped human brain tissue, (S)-GE387 was shown to bind similarly in low affinity binders (LABs) and high affinity binders (HABs) with a LAB to HAB ratio of 1.8. CONCLUSION: We established that (S)-[18F]GE387 has favourable kinetics in healthy rats and non-human primates and that it can distinguish inflamed from normal brain regions in the LPS model of neuroinflammation. Crucially, we have reconfirmed its low sensitivity to the TSPO polymorphism on genotyped human brain tissue. Based on these factors, we conclude that (S)-[18F]GE387 warrants further evaluation with studies on human subjects to assess its suitability as a TSPO PET radioligand for assessing neuroinflammation