26 research outputs found

    Multiscale Modeling in Systems Biology : Methods and Perspectives

    No full text
    In recent decades, mathematical and computational models have become ubiquitous in the field of systems biology. In particular, the multiscale nature of biological processes makes the design and simulation of such models challenging. In this thesis we offer a perspective on available methods to study and simulate such models and on how they can be combined to handle biological processes evolving at different scales. The contribution of this thesis is threefold. First, we introduce Orchestral, a modular multiscale framework for simulating multicellular models. By decoupling intracellular chemical kinetics, cell-cell signaling, and cellular mechanics by means of operator splitting, it is able to combine existing software into one massively parallel simulation. Its modular structure makes it easy to replace its components, e.g. to adjust the level of modeling detail. We demonstrate the scalability of our framework both on high-performance clusters and in a cloud environment. We then explore how center-based models can be used to study cellular mechanics in biological tissues. We show how modeling and numerical choices can affect the results of a simulation and, if these errors are not monitored properly, mislead modelers into incorrect biological conclusions. We then propose CBMOS, a Python framework specifically designed for the numerical study of such models. Finally, we study how spatial details in intracellular chemical kinetics can be efficiently approximated in a multiscale compartment-based model. We evaluate how this model compares to two alternatives in terms of accuracy and computational cost. We then propose a computational pipeline to study and compare such models in the context of Bayesian parameter inference and illustrate its usage in three case studies.
    https://doi.org/10.33063/diva-442412
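
    The operator-splitting coupling mentioned above can be illustrated with a short, self-contained sketch: each process is advanced independently over one time step, which approximates the fully coupled dynamics to first order in the step size. The right-hand sides below are toy stand-ins, not Orchestral's actual components.

        import numpy as np

        # First-order (Lie) operator splitting on a toy system
        # du/dt = f_mech(u) + f_chem(u), advancing each term separately.
        def f_mech(u):               # stand-in for cellular mechanics
            return -0.5 * u

        def f_chem(u):               # stand-in for intracellular kinetics
            return 1.0 - u

        def split_step(u, dt):
            u = u + dt * f_mech(u)       # advance process 1 alone
            return u + dt * f_chem(u)    # then process 2, from the updated state

        u, t, dt = np.array([1.0]), 0.0, 0.01
        while t < 1.0:
            u = split_step(u, dt)
            t += dt
        print(u)   # close to the fully coupled solution, up to O(dt) splitting error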

    CBMOS : a GPU-enabled Python framework for the numerical study of center-based models

    No full text
    Background: Cell-based models are becoming increasingly popular for applications in developmental biology. However, the impact of numerical choices on the accuracy and efficiency of simulating these models is rarely tested meticulously. Without concrete studies to differentiate between solid model conclusions and numerical artifacts, modelers are at risk of being misled by the results of their experiments. Most cell-based modeling frameworks offer a feature-rich environment, providing a wide range of biological components, but are less suitable for numerical studies. There is thus a need for software specifically targeted at this use case. Results: We present CBMOS, a Python framework for the simulation of center-based or cell-centered models. In contrast to other implementations, CBMOS focuses on facilitating the numerical study of center-based models by providing access to multiple ordinary differential equation solvers and force functions through a flexible, user-friendly interface, and by enabling rapid testing through graphics processing unit (GPU) acceleration. We showcase its potential by illustrating two common workflows: (1) comparing the numerical properties of two solvers within a Jupyter notebook and (2) measuring average wall times of both solvers on a high-performance computing cluster. More specifically, we confirm that although the backward Euler method allows for larger time step sizes than the commonly used forward Euler method at moderate accuracy levels, its additional computational cost, due to being an implicit method, prohibits its use for practical test cases. Conclusions: CBMOS is a flexible, easy-to-use Python implementation of the center-based model, exposing both basic model assumptions and numerical components to the user. It is available on GitHub and PyPI under an MIT license. CBMOS allows for fast prototyping on a central processing unit (CPU) for small systems through the use of NumPy. Using CuPy on a GPU, cell populations of up to 10,000 cells can be simulated within a few seconds. As such, it substantially lowers the time investment for any modeler to check the crucial assumption that model conclusions are independent of numerical issues.
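
    The first workflow, comparing an explicit and an implicit solver, can be sketched in a few lines. The example below is not the CBMOS API; it hand-codes a one-dimensional two-cell relaxation after division under a linear spring force, with illustrative parameter values, to show what such a solver comparison looks like.

        import numpy as np
        from scipy.optimize import fsolve

        mu, s = 5.7, 1.0                 # spring stiffness and rest length (illustrative)

        def rhs(x):
            r = x[1] - x[0]              # distance between the two cell centers
            f = mu * (r - s)             # negative (repulsive) while the cells overlap
            return np.array([f, -f])     # the overlapping cells are pushed apart

        def forward_euler(x, dt):
            return x + dt * rhs(x)

        def backward_euler(x, dt):
            # implicit step: solve x_new = x + dt * rhs(x_new)
            return fsolve(lambda y: y - x - dt * rhs(y), x)

        x_fe = x_be = np.array([0.0, 0.3])   # overlapping cells just after division
        for _ in range(200):
            x_fe = forward_euler(x_fe, 0.01)
            x_be = backward_euler(x_be, 0.01)
        print(x_fe, x_be)                # both relax towards separation s = 1.0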

    A multiscale compartment-based model of stochastic gene regulatory networks using hitting-time analysis

    No full text
    Spatial stochastic models of single-cell kinetics are capable of capturing both fluctuations in molecular numbers and the spatial dependencies of the key steps of intracellular regulatory networks. The spatial stochastic model can be simulated both on a detailed microscopic level using particle tracking and on a mesoscopic level using the reaction–diffusion master equation. However, despite substantial progress on simulation efficiency for spatial models in recent years, the computational cost quickly becomes prohibitively expensive for tasks that require repeated simulation of thousands or millions of realizations of the model. This limits the use of spatial models in applications such as multicellular simulations, likelihood-free parameter inference, and robustness analysis. Further approximation of the spatial dynamics is needed to accelerate such computational engineering tasks. We here propose a multiscale model in which a compartment-based model approximates a detailed spatial stochastic model. The compartment model is constructed via a first-exit time analysis on the spatial model, thus capturing critical spatial aspects of the fine-grained simulations at a cost close to that of the simple well-mixed model. We apply the multiscale model to a canonical model of negative-feedback gene regulation, assess its accuracy over a range of parameters, and demonstrate that the approximation can yield substantial speedups for likelihood-free parameter inference.
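
    The core idea can be illustrated with a toy example: a jump rate between compartments is set from a mean first-exit (hitting) time on the spatial domain, and the resulting compartment model is simulated with a standard Gillespie algorithm. The rates below are illustrative stand-ins, not the paper's actual derivation; the classical result used is that a Brownian particle started at the center of a sphere of radius R with diffusion constant D exits after a mean time R^2/(6D).

        import numpy as np

        rng = np.random.default_rng(0)
        R, D = 1.0, 0.1
        k_exit = 6 * D / R**2          # inner -> outer rate from the mean hitting time
        k_enter = 0.5 * k_exit         # return rate (hypothetical value)

        state, t, t_end = 0, 0.0, 100.0   # 0 = inner compartment, 1 = outer
        times, states = [0.0], [0]
        while t < t_end:
            rate = k_exit if state == 0 else k_enter
            t += rng.exponential(1.0 / rate)   # Gillespie: exponential waiting time
            state = 1 - state                  # jump to the other compartment
            times.append(t); states.append(state)

        dwell = np.diff(times) * (np.array(states)[:-1] == 0)
        print(f"time spent in inner compartment: {dwell.sum():.1f} of {t:.1f}")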

    Systematic comparison of modeling fidelity levels and parameter inference settings applied to negative feedback gene regulation

    No full text
    Quantitative stochastic models of gene regulatory networks are important tools for studying cellular regulation. Such models can be formulated at many different levels of fidelity. A practical challenge is to determine which model fidelity to use in order to get accurate and representative results. The choice is important, because models of successively higher fidelity come at a rapidly increasing computational cost. In some situations, the level of detail is clearly motivated by the question under study. In many situations, however, several model options could qualitatively agree with available data, depending on the amount of data and the nature of the observations. Here, an important distinction is whether we are interested in inferring the true (but unknown) physical parameters of the model or whether it is sufficient to be able to capture and explain the available data. The situation becomes complicated from a computational perspective because inference needs to be approximate. Most often it is based on likelihood-free Approximate Bayesian Computation (ABC), and determining which summary statistics to use, as well as how much data is needed to reach the desired level of accuracy, are difficult tasks. Ultimately, all of these aspects (the model fidelity, the available data, and the numerical choices for inference) interact in a complex manner. In this paper we develop a computational pipeline designed to systematically evaluate inference accuracy for a wide range of known true parameters. We then use it to explore inference settings for negative-feedback gene regulation. In particular, we compare a detailed spatial stochastic model, a coarse-grained compartment-based multiscale model, and the standard well-mixed model, across several data scenarios and for multiple numerical options for parameter inference. Practically speaking, this pipeline can be used as a preliminary step to guide modelers prior to gathering experimental data. By training Gaussian processes to approximate the distance function values, we are able to substantially reduce the computational cost of running the pipeline.
    The last two authors share last authorship.
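
    The inference step at the heart of the pipeline, ABC rejection sampling, can be sketched as follows. The simulator and summary statistic are toy stand-ins (a Poisson model rather than a gene regulatory network), and the prior, tolerance, and sample counts are illustrative; the sketch shows only the accept/reject mechanics that the pipeline evaluates across models and data scenarios.

        import numpy as np

        rng = np.random.default_rng(1)

        def simulate(theta, n=200):      # stand-in for a stochastic gene-regulation model
            return rng.poisson(theta, size=n)

        def summary(x):                  # the choice of summary statistics matters
            return np.array([x.mean(), x.var()])

        s_obs = summary(simulate(10.0))  # synthetic "observed" data, true theta = 10

        accepted = []
        for _ in range(5000):
            theta = rng.uniform(0.0, 30.0)                        # draw from the prior
            dist = np.linalg.norm(summary(simulate(theta)) - s_obs)
            if dist < 1.0:                                        # tolerance epsilon
                accepted.append(theta)
        print(f"ABC posterior mean {np.mean(accepted):.2f} from {len(accepted)} samples")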

    Impact of Force Function Formulations on the Numerical Simulation of Centre-Based Models.

    No full text
    Centre-based or cell-centre models are a framework for the computational study of multicellular systems with widespread use in cancer modelling and computational developmental biology. At the core of these models are the numerical method used to update cell positions and the force functions that encode the pairwise mechanical interactions of cells. For the latter, there are multiple choices that could potentially affect both the biological behaviour captured and the robustness and efficiency of the simulation. For example, available open-source software implementations of centre-based models rely on different force functions for their default behaviour, and it is not straightforward for a modeller to know whether these are interchangeable. Our study addresses this problem and contributes to the understanding of the potential and limitations of three popular force functions from a numerical perspective. We show empirically that choosing the force parameters such that the relaxation time for two cells after cell division is consistent between different force functions results in good agreement of the population radius of a two-dimensional monolayer relaxing mechanically after intense cell proliferation. Furthermore, we report that numerical stability is not sufficient to prevent unphysical cell trajectories following cell division, and consequently, that too large a time step can cause geometrical differences at the population level.
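
    To make the calibration step concrete, the sketch below measures the two-cell relaxation time after division for two force-function shapes, the quantity used here to put different force functions on a comparable footing. The functional forms and parameter values are illustrative choices, not necessarily those used in the study.

        import numpy as np

        s, r0 = 1.0, 0.3                      # rest length and initial separation

        def linear(r, mu=5.0):                # linear spring force
            return mu * (r - s)

        def cubic(r, mu=50.0, r_max=1.5):     # cubic force, zero beyond range r_max
            return mu * (r - r_max)**2 * (r - s) if r < r_max else 0.0

        def relaxation_time(force, dt=1e-3, tol=0.01):
            r, t = r0, 0.0
            while abs(r - s) > tol * s:       # until within 1% of the rest length
                r += dt * (-2.0) * force(r)   # relative coordinate; both cells move
                t += dt
            return t

        for f in (linear, cubic):
            print(f.__name__, relaxation_time(f))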

    Illustration of the computational experiments performed.

    No full text
    For the different data scenarios and for each combination of distance metric and amount of data, we execute the pipeline using all 256 synthetic data sets as observed data, and for each of the three models.