28 research outputs found

    A framework for parameter estimation and model selection from experimental data in systems biology using approximate Bayesian computation.

    As modeling becomes a more widespread practice in the life sciences and biomedical sciences, researchers need reliable tools to calibrate models against ever more complex and detailed data. Here we present an approximate Bayesian computation (ABC) framework and software environment, ABC-SysBio, which is a Python package that runs on Linux and Mac OS X systems and that enables parameter estimation and model selection in the Bayesian formalism by using sequential Monte Carlo (SMC) approaches. We outline the underlying rationale, discuss the computational and practical issues and provide detailed guidance as to how the important tasks of parameter inference and model selection can be performed in practice. Unlike other available packages, ABC-SysBio is highly suited for investigating, in particular, the challenging problem of fitting stochastic models to data. In order to demonstrate the use of ABC-SysBio, in this protocol we postulate the existence of an imaginary reaction network composed of seven interrelated biological reactions (involving a specific mRNA, the protein it encodes and a post-translationally modified version of the protein), a network that is defined by two files containing 'observed' data that we provide as supplementary information. In the first part of the PROCEDURE, ABC-SysBio is used to infer the parameters of this system, whereas in the second part we use ABC-SysBio's relevant functionality to discriminate between two different reaction network models, one of them being the 'true' one. Although computationally expensive, the additional insights gained in the Bayesian formalism more than make up for this cost, especially in complex problems.
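
    The core idea of ABC described in this abstract can be sketched with a plain rejection sampler, which is simpler than the sequential Monte Carlo scheme the package uses. The sketch below does not use the ABC-SysBio API; the pure-death model, the uniform prior, the Euclidean distance and all function names are illustrative assumptions.

```python
import math
import random


def simulate_decay(k, n0, times, rng):
    """Simulate a stochastic pure-death process (hypothetical toy model).

    Between consecutive observation times each surviving molecule
    independently survives with probability exp(-k * dt).
    """
    counts, n, t_prev = [], n0, 0.0
    for t in times:
        p_survive = math.exp(-k * (t - t_prev))
        n = sum(1 for _ in range(n) if rng.random() < p_survive)
        counts.append(n)
        t_prev = t
    return counts


def distance(a, b):
    """Euclidean distance between two trajectories of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def abc_rejection(observed, n0, times, epsilon, n_accept, rng,
                  prior_max=2.0, max_draws=200_000):
    """Keep prior draws whose simulated data lie within epsilon of the data."""
    accepted, draws = [], 0
    while len(accepted) < n_accept and draws < max_draws:
        draws += 1
        k = rng.uniform(0.0, prior_max)  # assumed prior: Uniform(0, prior_max)
        simulated = simulate_decay(k, n0, times, rng)
        if distance(observed, simulated) <= epsilon:
            accepted.append(k)
    return accepted


rng = random.Random(1)
times = [1.0, 2.0, 3.0, 4.0]
# Synthetic "observed" data generated with a known rate of 0.5.
observed = simulate_decay(0.5, 100, times, rng)
posterior = abc_rejection(observed, 100, times, epsilon=15.0,
                          n_accept=200, rng=rng)
posterior_mean = sum(posterior) / len(posterior)
print(f"accepted {len(posterior)} draws, posterior mean k = {posterior_mean:.3f}")
```

    Shrinking epsilon tightens the approximation to the posterior at the cost of more rejected simulations; SMC-based ABC addresses that cost by lowering the tolerance gradually over a sequence of populations.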

    Implicating genes, pleiotropy, and sexual dimorphism at blood lipid loci through multi-ancestry meta-analysis.

    BACKGROUND: Genetic variants within nearly 1000 loci are known to contribute to modulation of blood lipid levels. However, the biological pathways underlying these associations are frequently unknown, limiting understanding of these findings and hindering downstream translational efforts such as drug target discovery. RESULTS: To expand our understanding of the underlying biological pathways and mechanisms controlling blood lipid levels, we leverage a large multi-ancestry meta-analysis (N = 1,654,960) of blood lipids to prioritize putative causal genes for 2286 lipid associations using six gene prediction approaches. Using phenome-wide association (PheWAS) scans, we identify relationships of genetically predicted lipid levels to other diseases and conditions. We confirm known pleiotropic associations with cardiovascular phenotypes and determine novel associations, notably with cholelithiasis risk. We perform sex-stratified GWAS meta-analysis of lipid levels and show that 3-5% of autosomal lipid-associated loci demonstrate sex-biased effects. Finally, we report 21 novel lipid loci identified on the X chromosome. Many of the sex-biased autosomal and X chromosome lipid loci show pleiotropic associations with sex hormones, emphasizing the role of hormone regulation in lipid metabolism. CONCLUSIONS: Taken together, our findings provide insights into the biological mechanisms through which associated variants lead to altered lipid levels and potentially cardiovascular disease risk.

    A saturated map of common genetic variants associated with human height

    Common single-nucleotide polymorphisms (SNPs) are predicted to collectively explain 40-50% of phenotypic variation in human height, but identifying the specific variants and associated regions requires huge sample sizes(1). Here, using data from a genome-wide association study of 5.4 million individuals of diverse ancestries, we show that 12,111 independent SNPs that are significantly associated with height account for nearly all of the common SNP-based heritability. These SNPs are clustered within 7,209 non-overlapping genomic segments with a mean size of around 90 kb, covering about 21% of the genome. The density of independent associations varies across the genome and the regions of increased density are enriched for biologically relevant genes. In out-of-sample estimation and prediction, the 12,111 SNPs (or all SNPs in the HapMap 3 panel(2)) account for 40% (45%) of phenotypic variance in populations of European ancestry but only around 10-20% (14-24%) in populations of other ancestries. Effect sizes, associated regions and gene prioritization are similar across ancestries, indicating that reduced prediction accuracy is likely to be explained by linkage disequilibrium and differences in allele frequency within associated regions. Finally, we show that the relevant biological pathways are detectable with smaller sample sizes than are needed to implicate causal genes and variants. Overall, this study provides a comprehensive map of specific genomic regions that contain the vast majority of common height-associated variants. Although this map is saturated for populations of European ancestry, further research is needed to achieve equivalent saturation in other ancestries. A large genome-wide association study of more than 5 million individuals reveals that 12,111 single-nucleotide polymorphisms account for nearly all the heritability of height attributable to common genetic variants.

    A saturated map of common genetic variants associated with human height.

    Common single-nucleotide polymorphisms (SNPs) are predicted to collectively explain 40-50% of phenotypic variation in human height, but identifying the specific variants and associated regions requires huge sample sizes(1). Here, using data from a genome-wide association study of 5.4 million individuals of diverse ancestries, we show that 12,111 independent SNPs that are significantly associated with height account for nearly all of the common SNP-based heritability. These SNPs are clustered within 7,209 non-overlapping genomic segments with a mean size of around 90 kb, covering about 21% of the genome. The density of independent associations varies across the genome and the regions of increased density are enriched for biologically relevant genes. In out-of-sample estimation and prediction, the 12,111 SNPs (or all SNPs in the HapMap 3 panel(2)) account for 40% (45%) of phenotypic variance in populations of European ancestry but only around 10-20% (14-24%) in populations of other ancestries. Effect sizes, associated regions and gene prioritization are similar across ancestries, indicating that reduced prediction accuracy is likely to be explained by linkage disequilibrium and differences in allele frequency within associated regions. Finally, we show that the relevant biological pathways are detectable with smaller sample sizes than are needed to implicate causal genes and variants. Overall, this study provides a comprehensive map of specific genomic regions that contain the vast majority of common height-associated variants. Although this map is saturated for populations of European ancestry, further research is needed to achieve equivalent saturation in other ancestries.

    Delayed acceptance particle MCMC for exact inference in stochastic kinetic models

    Recently proposed particle MCMC methods provide a flexible way of performing Bayesian inference for parameters governing stochastic kinetic models defined as Markov (jump) processes (MJPs). Each iteration of the scheme requires an estimate of the marginal likelihood calculated from the output of a sequential Monte Carlo scheme (also known as a particle filter). Consequently, the method can be extremely computationally intensive. We therefore aim to avoid most instances of the expensive likelihood calculation through use of a fast approximation. We consider two approximations: the chemical Langevin equation diffusion approximation (CLE) and the linear noise approximation (LNA). Either an estimate of the marginal likelihood under the CLE, or the tractable marginal likelihood under the LNA can be used to calculate a first-stage acceptance probability. Only if a proposal is accepted under the approximation do we then run a sequential Monte Carlo scheme to compute an estimate of the marginal likelihood under the true MJP and construct a second-stage acceptance probability that permits exact (simulation-based) inference for the MJP. We therefore avoid expensive calculations for proposals that are likely to be rejected. We illustrate the method by considering inference for parameters governing a Lotka-Volterra system, a model of gene expression and a simple epidemic process. Comment: Statistics and Computing (to appear)
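
    The two-stage accept/reject logic described in this abstract can be sketched on a toy problem. This is a minimal illustration and not the authors' implementation: the "expensive" target here is a deterministic N(1, 1) log-density standing in for the particle-filter likelihood estimate, the "cheap" surrogate is a deliberately mismatched Gaussian standing in for the CLE or LNA, the proposal is a symmetric random walk, and all function names are hypothetical.

```python
import math
import random


def log_target_expensive(theta):
    """Stand-in for the costly log-posterior (assumed N(1, 1), unnormalised)."""
    return -0.5 * (theta - 1.0) ** 2


def log_target_cheap(theta):
    """Stand-in for a fast approximation (assumed N(0.8, 1.2^2), unnormalised)."""
    return -0.5 * ((theta - 0.8) / 1.2) ** 2


def log_uniform(rng):
    """Log of a Uniform(0, 1] draw; avoids log(0)."""
    return math.log(1.0 - rng.random())


def delayed_acceptance_mh(n_iter, step, rng, theta0=0.0):
    """Random-walk Metropolis with a cheap screening stage."""
    theta = theta0
    cheap_cur = log_target_cheap(theta)
    exp_cur = log_target_expensive(theta)
    n_expensive = 1  # counts evaluations of the expensive target
    samples = []
    for _ in range(n_iter):
        prop = theta + rng.gauss(0.0, step)  # symmetric proposal
        cheap_prop = log_target_cheap(prop)
        # Stage 1: screen the proposal using only the cheap approximation.
        if log_uniform(rng) < cheap_prop - cheap_cur:
            # Stage 2: evaluate the expensive target and correct for the
            # approximation error so the chain still targets the exact density.
            exp_prop = log_target_expensive(prop)
            n_expensive += 1
            log_a2 = (exp_prop - exp_cur) - (cheap_prop - cheap_cur)
            if log_uniform(rng) < log_a2:
                theta, cheap_cur, exp_cur = prop, cheap_prop, exp_prop
        samples.append(theta)
    return samples, n_expensive


samples, n_expensive = delayed_acceptance_mh(20_000, 1.5, random.Random(0))
mean = sum(samples) / len(samples)
print(f"mean = {mean:.3f}, expensive evaluations = {n_expensive} of {len(samples)}")
```

    Proposals rejected at the first stage never touch the expensive target, which is where the saving comes from. In the particle MCMC setting the stored value for the current state would be a noisy likelihood estimate rather than an exact density, a detail this toy omits.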

    Efficient particle MCMC for exact inference in stochastic biochemical network models through approximation of expensive likelihoods

    Recently proposed particle MCMC methods provide a flexible way of performing Bayesian inference for parameters governing stochastic kinetic models defined as Markov (jump) processes (MJPs). Each iteration of the scheme requires an estimate of the marginal likelihood calculated from the output of a sequential Monte Carlo scheme (also known as a particle filter). Consequently, the method can be extremely computationally intensive. We therefore aim to avoid most instances of the expensive likelihood calculation through use of a fast approximation. We consider two approximations: the chemical Langevin equation diffusion approximation (CLE) and the linear noise approximation (LNA). Either an estimate of the marginal likelihood under the CLE, or the tractable marginal likelihood under the LNA can be used to calculate a first-stage acceptance probability. Only if a proposal is accepted under the approximation do we then run a sequential Monte Carlo scheme to compute an estimate of the marginal likelihood under the true MJP and construct a second-stage acceptance probability that permits exact (simulation-based) inference for the MJP. We therefore avoid expensive calculations for proposals that are likely to be rejected. We illustrate the method by considering inference for parameters governing a Lotka–Volterra system, a model of gene expression and a simple epidemic process.

    The coordination between train traffic controllers and train drivers: a distributed cognition perspective on railway

    Although there has long been a call for a holistic systems perspective to better understand real work in the complex domain of railway traffic, prior research has not strongly emphasised the socio-technical perspective. In operational railway traffic, the successful planning and execution of the traffic are the product of the socio-technical system comprising both train drivers and traffic controllers. This paper presents a study inspired by cognitive ethnography with the aim of characterising the coordinating activities that are conducted by train traffic controllers and train drivers in the work practices of the socio-technical system of Swedish railway. The theoretical framework of distributed cognition (DCog) is used as a conceptual and analytical tool to make sense of the complex railway domain and the best practices as they are developed and performed “in the wild”. The analysis reveals a pattern of collaboration and coordination of actions among the workers, and we introduce the concept of enacted actionable practices as a key concern for understanding how successfully executed railway traffic emerges as a property of the socio-technical system. The implications for future railway research are briefly discussed.