90 research outputs found

    Particle MCMC algorithms and architectures for accelerating inference in state-space models.

    Particle Markov Chain Monte Carlo (pMCMC) is a stochastic algorithm designed to generate samples from a probability distribution, when the density of the distribution does not admit a closed form expression. pMCMC is most commonly used to sample from the Bayesian posterior distribution in State-Space Models (SSMs), a class of probabilistic models used in numerous scientific applications. Nevertheless, this task is prohibitive when dealing with complex SSMs with massive data, due to the high computational cost of pMCMC and its poor performance when the posterior exhibits multi-modality. This paper aims to address both issues by: 1) proposing a novel pMCMC algorithm (denoted ppMCMC), which uses multiple Markov chains (instead of the one used by pMCMC) to improve sampling efficiency for multi-modal posteriors; and 2) introducing custom, parallel hardware architectures, which are tailored for pMCMC and ppMCMC. The architectures are implemented on Field Programmable Gate Arrays (FPGAs), a type of hardware accelerator with massive parallelization capabilities. The new algorithm and the two FPGA architectures are evaluated using a large-scale case study from genetics. Results indicate that ppMCMC achieves 1.96x higher sampling efficiency than pMCMC when using sequential CPU implementations. The FPGA architecture of pMCMC is 12.1x and 10.1x faster than state-of-the-art, parallel CPU and GPU implementations of pMCMC and up to 53x more energy efficient; the FPGA architecture of ppMCMC increases these speedups to 34.9x and 41.8x respectively and is 173x more power efficient, bringing previously intractable SSM-based data analyses within reach. The authors would like to thank the Wellcome Trust (Grant reference 097816/Z/11/A) and the EPSRC (Grant reference EP/I012036/1) for the financial support given to this research project.

    High-throughput multivariable Mendelian randomization analysis prioritizes apolipoprotein B as key lipid risk factor for coronary artery disease.

    BACKGROUND: Genetic variants can be used to prioritize risk factors as potential therapeutic targets via Mendelian randomization (MR). An agnostic statistical framework using Bayesian model averaging (MR-BMA) can disentangle the causal role of correlated risk factors with shared genetic predictors. Here, our objective is to identify lipoprotein measures as mediators between lipid-associated genetic variants and coronary artery disease (CAD) for the purpose of detecting therapeutic targets for CAD. METHODS: As risk factors we consider 30 lipoprotein measures and metabolites derived from a high-throughput metabolomics study including 24 925 participants. We fit multivariable MR models of genetic associations with CAD estimated in 453 595 participants (including 113 937 cases) regressed on genetic associations with the risk factors. MR-BMA assigns to each combination of risk factors a model score quantifying how well the genetic associations with CAD are explained. Risk factors are ranked by their marginal score and selected using false-discovery rate (FDR) criteria. We perform supplementary and sensitivity analyses varying the dataset for genetic associations with CAD. RESULTS: In the main analysis, the top combination of risk factors ranked by the model score contains apolipoprotein B (ApoB) only. ApoB is also the highest ranked risk factor with respect to the marginal score (FDR <0.005). Additionally, ApoB is selected in all sensitivity analyses. No other measure of cholesterol or triglyceride is consistently selected otherwise. CONCLUSIONS: Our agnostic genetic investigation prioritizes ApoB across all datasets considered, suggesting that ApoB, representing the total number of hepatic-derived lipoprotein particles, is the primary lipid determinant of CAD

    EPISPOT: An epigenome-driven approach for detecting and interpreting hotspots in molecular QTL studies.

    We present EPISPOT, a fully joint framework which exploits large panels of epigenetic annotations as variant-level information to enhance molecular quantitative trait locus (QTL) mapping. Thanks to a purpose-built Bayesian inferential algorithm, EPISPOT accommodates functional information for both cis and trans actions, including QTL hotspot effects. It effectively couples simultaneous QTL analysis of thousands of genetic variants and molecular traits with hypothesis-free selection of biologically interpretable annotations which directly contribute to the QTL effects. This unified, epigenome-aided learning boosts statistical power and sheds light on the regulatory basis of the uncovered hits; EPISPOT therefore marks an essential step toward improving the challenging detection and functional interpretation of trans-acting genetic variants and hotspots. We illustrate the advantages of EPISPOT in simulations emulating real-data conditions and in a monocyte expression QTL study, which confirms known hotspots and finds other signals, as well as plausible mechanisms of action. In particular, by highlighting the role of monocyte DNase-I sensitivity sites from >150 epigenetic annotations, we clarify the mediation effects and cell-type specificity of major hotspots close to the lysozyme gene. Our approach forgoes the daunting and underpowered task of one-annotation-at-a-time enrichment analyses for prioritizing cis and trans QTL hits and is tailored to any transcriptomic, proteomic, or metabolomic QTL problem. By enabling principled epigenome-driven QTL mapping transcriptome-wide, EPISPOT helps progress toward a better functional understanding of genetic regulation

    A computationally efficient Bayesian seemingly unrelated regressions model for high‐dimensional quantitative trait loci discovery

    Funder: Victorian Government’s Operational Infrastructure Support Program. Abstract: Our work is motivated by the search for metabolite quantitative trait loci (QTL) in a cohort of more than 5000 people. There are 158 metabolites measured by NMR spectroscopy in the 31‐year follow‐up of the Northern Finland Birth Cohort 1966 (NFBC66). These metabolites, as with many multivariate phenotypes produced by high‐throughput biomarker technology, exhibit strong correlation structures. Existing approaches for combining such data with genetic variants for multivariate QTL analysis generally ignore phenotypic correlations or make restrictive assumptions about the associations between phenotypes and genetic loci. We present a computationally efficient Bayesian seemingly unrelated regressions model for high‐dimensional data, with cell‐sparse variable selection and sparse graphical structure for covariance selection. Cell sparsity allows different phenotype responses to be associated with different genetic predictors and the graphical structure is used to represent the conditional dependencies between phenotype variables. To achieve feasible computation of the large model space, we exploit a factorisation of the covariance matrix. Applying the model to the NFBC66 data with 9000 directly genotyped single nucleotide polymorphisms, we are able to simultaneously estimate genotype–phenotype associations and the residual dependence structure among the metabolites. The R package BayesSUR with full documentation is available at https://cran.r-project.org/web/packages/BayesSUR

    MT-HESS: an efficient Bayesian approach for simultaneous association detection in OMICS datasets, with application to eQTL mapping in multiple tissues.

    MOTIVATION: Analysing the joint association between a large set of responses and predictors is a fundamental statistical task in integrative genomics, exemplified by numerous expression Quantitative Trait Loci (eQTL) studies. Of particular interest are the so-called 'hotspots', important genetic variants that regulate the expression of many genes. Recently, attention has focussed on whether eQTLs are common to several tissues, cell-types or, more generally, conditions or whether they are specific to a particular condition. RESULTS: We have implemented MT-HESS, a Bayesian hierarchical model that analyses the association between a large set of predictors, e.g. SNPs, and many responses, e.g. gene expression, in multiple tissues, cells or conditions. Our Bayesian sparse regression algorithm goes beyond 'one-at-a-time' association tests between SNPs and responses and uses a fully multivariate model search across all linear combinations of SNPs, coupled with a model of the correlation between condition/tissue-specific responses. In addition, we use a hierarchical structure to leverage shared information across different genes, thus improving the detection of hotspots. We show the increase of power resulting from our new approach in an extensive simulation study. Our analysis of two case studies highlights new hotspots that would remain undetected by standard approaches and shows how greater prediction power can be achieved when several tissues are jointly considered. AVAILABILITY AND IMPLEMENTATION: C++ source code and documentation including compilation instructions are available under GNU licence at http://www.mrc-bsu.cam.ac.uk/software/

    Leptin-Mediated Changes in the Human Metabolome.

    CONTEXT: While severe obesity due to congenital leptin deficiency is rare, studies in patients before and after treatment with leptin can provide unique insights into the role that leptin plays in metabolic and endocrine function. OBJECTIVE: The aim of this study was to characterize changes in peripheral metabolism in people with congenital leptin deficiency undergoing leptin replacement therapy, and to investigate the extent to which these changes are explained by reduced caloric intake. DESIGN: Ultrahigh performance liquid chromatography-tandem mass spectroscopy (UPLC-MS/MS) was used to measure 661 metabolites in 6 severely obese people with congenital leptin deficiency before, and within 1 month after, treatment with recombinant leptin. Data were analyzed using unsupervised and hypothesis-driven computational approaches and compared with data from a study of acute caloric restriction in healthy volunteers. RESULTS: Leptin replacement was associated with class-wide increased levels of fatty acids and acylcarnitines and decreased phospholipids, consistent with enhanced lipolysis and fatty acid oxidation. Primary and secondary bile acids increased after leptin treatment. Comparable changes were observed after acute caloric restriction. Branched-chain amino acids and steroid metabolites decreased after leptin, but not after acute caloric restriction. Individuals with severe obesity due to leptin deficiency and other genetic obesity syndromes shared a metabolomic signature associated with increased BMI. CONCLUSION: Leptin replacement was associated with changes in lipolysis and substrate utilization that were consistent with negative energy balance. However, leptin's effects on branched-chain amino acids and steroid metabolites were independent of reduced caloric intake and require further exploration

    Multi-response Mendelian randomization: Identification of shared and distinct exposures for multimorbidity and multiple related disease outcomes

    The existing framework of Mendelian randomization (MR) infers the causal effect of one or multiple exposures on one single outcome. It is not designed to jointly model multiple outcomes, as would be necessary to detect causes of more than one outcome and would be relevant to model multimorbidity or other related disease outcomes. Here, we introduce multi-response Mendelian randomization (MR2), an MR method specifically designed for multiple outcomes to identify exposures that cause more than one outcome or, conversely, exposures that exert their effect on distinct responses. MR2 uses a sparse Bayesian Gaussian copula regression framework to detect causal effects while estimating the residual correlation between summary-level outcomes, i.e., the correlation that cannot be explained by the exposures, and vice versa. We show both theoretically and in a comprehensive simulation study how unmeasured shared pleiotropy induces residual correlation between outcomes irrespective of sample overlap. We also reveal how non-genetic factors that affect more than one outcome contribute to their correlation. We demonstrate that by accounting for residual correlation, MR2 has higher power to detect shared exposures causing more than one outcome. It also provides more accurate causal effect estimates than existing methods that ignore the dependence between related responses. Finally, we illustrate how MR2 detects shared and distinct causal exposures for five cardiovascular diseases in two applications considering cardiometabolic and lipidomic exposures and uncovers residual correlation between summary-level outcomes reflecting known relationships between cardiovascular diseases

    Functionally Conserved Noncoding Regulators of Cardiomyocyte Proliferation and Regeneration in Mouse and Human

    BACKGROUND: The adult mammalian heart has little regenerative capacity after myocardial infarction (MI), whereas neonatal mouse heart regenerates without scarring or dysfunction. However, the underlying pathways are poorly defined. We sought to derive insights into the pathways regulating neonatal development of the mouse heart and cardiac regeneration post-MI. METHODS AND RESULTS: Total RNA-seq of mouse heart through the first 10 days of postnatal life (referred to as P3, P5, P10) revealed a previously unobserved transition in microRNA (miRNA) expression between P3 and P5 associated specifically with altered expression of protein-coding genes on the focal adhesion pathway and cessation of cardiomyocyte cell division. We found profound changes in the coding and noncoding transcriptome after neonatal MI, with evidence of essentially complete healing by P10. Over two-thirds of each of the messenger RNAs, long noncoding RNAs, and miRNAs that were differentially expressed in the post-MI heart were differentially expressed during normal postnatal development, suggesting a common regulatory pathway for normal cardiac development and post-MI cardiac regeneration. We selected exemplars of miRNAs implicated in our data set as regulators of cardiomyocyte proliferation. Several of these showed evidence of a functional influence on mouse cardiomyocyte cell division. In addition, a subset of these miRNAs, miR-144-3p, miR-195a-5p, miR-451a, and miR-6240, showed evidence of functional conservation in human cardiomyocytes. CONCLUSIONS: The sets of messenger RNAs, miRNAs, and long noncoding RNAs that we report here merit further investigation as gatekeepers of cell division in the postnatal heart and as targets for extension of the period of cardiac regeneration beyond the neonatal period. Leducq Foundation funding via the Transatlantic Network of Excellence (Grant 11CVD01), the British Heart Foundation funding via the Imperial College Centre of Research Excellence and the Imperial Cardiovascular Regenerative Medicine Centre RM/13/1/30157.