    Neural Continuous-Time Markov Models

    Continuous-time Markov chains are used to model stochastic systems where transitions can occur at irregular times, e.g., birth-death processes, chemical reaction networks, population dynamics, and gene regulatory networks. We develop a method to learn a continuous-time Markov chain's transition rate functions from fully observed time series. In contrast with existing methods, our method allows for transition rates to depend nonlinearly on both state variables and external covariates. The Gillespie algorithm is used to generate trajectories of stochastic systems where propensity functions (reaction rates) are known. Our method can be viewed as the inverse: given trajectories of a stochastic reaction network, we generate estimates of the propensity functions. While previous methods used linear or log-linear methods to link transition rates to covariates, we use neural networks, increasing the capacity and potential accuracy of learned models. In the chemical context, this enables the method to learn propensity functions from non-mass-action kinetics. We test our method with synthetic data generated from a variety of systems with known transition rates. We show that our method learns these transition rates with considerably more accuracy than log-linear methods, in terms of mean absolute error between ground truth and predicted transition rates. We also demonstrate an application of our methods to open-loop control of a continuous-time Markov chain. Comment: 8 pages, 6 figures
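
The forward/inverse relationship described in this abstract can be illustrated with a small sketch. It is not the paper's neural-network method: it assumes a hypothetical birth-death system (constant birth rate `k_birth`, per-capita death rate `k_death`), simulates it with the Gillespie algorithm, and then recovers the two constants from the fully observed trajectory with the standard maximum-likelihood estimates (event count divided by exposure). All function and variable names are illustrative.

```python
import random

def gillespie_birth_death(k_birth, k_death, n0, t_end, rng):
    """Simulate 0 -> X (propensity k_birth) and X -> 0 (propensity k_death * n)."""
    t, n = 0.0, n0
    times, states = [t], [n]
    while True:
        a_birth = k_birth
        a_death = k_death * n
        a_total = a_birth + a_death
        # Waiting time to the next event is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t >= t_end:
            break
        # Pick the event with probability proportional to its propensity.
        if rng.random() * a_total < a_birth:
            n += 1
        else:
            n -= 1
        times.append(t)
        states.append(n)
    return times, states

def estimate_rates(times, states, t_end):
    """Inverse problem for this toy model: estimate both rate constants."""
    births = deaths = 0
    exposure = 0.0  # integral of n(t) dt over the observation window
    for i in range(len(times)):
        t_next = times[i + 1] if i + 1 < len(times) else t_end
        exposure += states[i] * (t_next - times[i])
        if i + 1 < len(times):
            if states[i + 1] > states[i]:
                births += 1
            else:
                deaths += 1
    # Births occur at a constant rate; deaths at a rate proportional to n.
    return births / t_end, deaths / exposure

rng = random.Random(0)  # fixed seed so the run is reproducible
times, states = gillespie_birth_death(10.0, 0.1, 0, 2000.0, rng)
k_hat, g_hat = estimate_rates(times, states, 2000.0)
print(k_hat, g_hat)  # estimates of the assumed true values 10.0 and 0.1
```

The closed-form estimates work here only because the propensities are assumed to be linear in known features; the abstract's point is that a neural network can replace that assumption when the dependence on state and covariates is nonlinear.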

    Towards efficient siRNA delivery and gene silencing kinetics on the single cell level

    RNA interference (RNAi) is a natural, sequence-specific mechanism of post-transcriptional gene regulation mediated by short, double-stranded RNA fragments, e.g. small interfering RNAs (siRNAs). Despite its high therapeutic potential, the safe and efficient systemic delivery of siRNAs into a large number of diseased cells to trigger therapeutic gene knockdown remains challenging. Moreover, novel quantitative methods for assessing the activity of siRNA-based therapeutic agents quickly and precisely are needed. In this work, we first developed folate-targeted monomolecular nucleic acid/lipid particles (FolA-mNALPs), formed using a microfluidic-based method, and studied their functionality with regard to prospective use as an siRNA delivery agent. Second, we quantify the single-cell kinetics of siRNA-mediated gene silencing using micro-patterned cell cultivation substrates combined with time-lapse fluorescence microscopy (live-cell imaging on single-cell arrays, LISCA). In particular, we demonstrate that microfluidic self-assembly combined with rational design of the lipid formulation results in nanoparticles of small size and narrow size distribution that on average contain a single siRNA molecule covered with a single lipid bilayer (mNALP). We investigate the stability of folate-functionalized mNALPs in biological fluids and their biological performance in terms of cellular internalisation and silencing efficiency. Their small size, efficient targeting, and demonstrated silencing capability following facilitated endosomal release make mNALPs a promising system for the future development of an in vivo siRNA delivery agent. Furthermore, using LISCA we investigate the magnitude of siRNA-induced mRNA degradation. By mathematical modelling of gene expression and fitting of expression time courses, we obtain the population distributions of the model's rate constants, including single-cell mRNA degradation rate constants.
The expression time courses are obtained by monitoring the dynamic changes in single-cell fluorescence intensities of reporter proteins (eGFP target and CayRFP reference). The resulting kinetic parameters allow us to quantify the silencing efficiency as the relative fold-change in mRNA degradation rate constants, to identify the subpopulations of cells affected by siRNA activity and, by analysing correlations between the kinetic parameters of CayRFP and eGFP expression, to draw inferences about the properties of mRNA delivery and expression kinetics. The presented approach allows precise quantification of the activity of siRNA-based therapeutics in an accurate and fast (<30 h) manner, based on the analysis of time-independent kinetic parameters describing the silencing process.
RNA interference (RNAi) is a natural mechanism of post-transcriptional gene regulation in eukaryotic cells. RNAi can target specific genes and offers high flexibility in the choice of the targeted mRNA sequence regions. These two characteristics make RNAi a versatile tool for investigating gene function and a possible therapeutic for a wide variety of diseases. In this work, a) a microfluidics-based method for improved self-assembly of monomolecular nucleic acid/lipid particles (mNALPs) was developed with a view to their possible future use as an siRNA carrier, and b) single-cell responses to siRNA-induced gene silencing were investigated. In particular, we determine the optimal parameters for the self-assembly of mNALPs and investigate their stability in biological fluids and their performance with respect to cell-type-specific internalisation and silencing efficiency in in vitro cell experiments.
Furthermore, we use live-cell video microscopy on micro-structured substrates ("live-cell imaging on single-cell arrays", LISCA) to investigate the relative change in mRNA degradation rate constants induced by siRNA activity. The strength of siRNA-induced mRNA degradation can be assessed through the mathematical model of gene expression and by fitting the fluorescence time courses obtained from the dynamic changes in the single-cell fluorescence intensities of the reporter proteins. This procedure yields the population distribution of the rate constants associated with the model. It allows us to quantify the efficiency of gene silencing as the relative change in the mRNA degradation rate constants and, in addition, to identify subpopulations of cells that are not affected by siRNA activity. Moreover, analysis of the correlations between the kinetic parameters of CayRFP and eGFP expression permits inferences about the properties of mRNA delivery and expression kinetics. Their nanoscale size, stability, specific targeting, and the demonstrated specific silencing of a gene make mNALPs a promising basis for a future in vivo siRNA transfer system. In addition, we present the microscopy-based method LISCA, which allows precise quantification of the activity of siRNA-based therapeutics. It provides, accurately and quickly (< 30 h), time-dependent kinetic parameters describing the gene silencing process.

    Learning differential equation models from stochastic agent-based model simulations

    Agent-based models provide a flexible framework that is frequently used for modelling many biological systems, including cell migration, molecular dynamics, ecology, and epidemiology. Analysis of the model dynamics can be challenging due to their inherent stochasticity and heavy computational requirements. Common approaches to the analysis of agent-based models include extensive Monte Carlo simulation of the model or the derivation of coarse-grained differential equation models to predict the expected or averaged output from the agent-based model. Both of these approaches have limitations, however, as extensive computation of complex agent-based models may be infeasible, and coarse-grained differential equation models can fail to accurately describe model dynamics in certain parameter regimes. We propose that methods from the equation learning field provide a promising, novel, and unifying approach for agent-based model analysis. Equation learning is a recent field of research from data science that aims to infer differential equation models directly from data. We use this tutorial to review how methods from equation learning can be used to learn differential equation models from agent-based model simulations. We demonstrate that this framework is easy to use, requires few model simulations, and accurately predicts model dynamics in parameter regions where coarse-grained differential equation models fail to do so. We highlight these advantages through several case studies involving two agent-based models that are broadly applicable to biological phenomena: a birth-death-migration model commonly used to explore cell biology experiments and a susceptible-infected-recovered model of infectious disease spread.
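
The idea of inferring a differential equation directly from data can be sketched in a few lines. This is a minimal illustration, not the tutorial's procedure: it assumes noise-free samples from a logistic growth curve stand in for averaged agent-based model output, assumes a two-term candidate library {C, C^2}, and fits the coefficients by ordinary least squares on finite-difference derivatives. The names `logistic` and `learn_quadratic_ode` are hypothetical.

```python
import math

def logistic(t, r, K, c0):
    """Closed-form solution of dC/dt = r*C*(1 - C/K), used as synthetic data."""
    return K / (1.0 + (K - c0) / c0 * math.exp(-r * t))

def learn_quadratic_ode(ts, cs):
    """Fit dC/dt ~ a*C + b*C**2 by least squares over the library {C, C^2}."""
    s11 = s12 = s22 = b1 = b2 = 0.0
    for i in range(1, len(ts) - 1):
        # Central finite difference approximates the time derivative.
        dcdt = (cs[i + 1] - cs[i - 1]) / (ts[i + 1] - ts[i - 1])
        c = cs[i]
        # Accumulate the 2x2 normal equations for the two library terms.
        s11 += c * c
        s12 += c ** 3
        s22 += c ** 4
        b1 += c * dcdt
        b2 += c * c * dcdt
    det = s11 * s22 - s12 * s12
    a = (b1 * s22 - b2 * s12) / det
    b = (s11 * b2 - s12 * b1) / det
    return a, b

r_true, K_true = 0.5, 1.0  # assumed ground-truth parameters
ts = [0.1 * i for i in range(201)]
cs = [logistic(t, r_true, K_true, 0.05) for t in ts]
a_hat, b_hat = learn_quadratic_ode(ts, cs)
print(a_hat, b_hat)  # should lie close to r_true and -r_true / K_true
```

With stochastic simulation output in place of the analytic curve, the derivative estimate becomes noisy, so some form of smoothing or averaging over repeated runs would likely be needed before the regression step.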

    Stochastic spatial modelling of DNA methylation patterns and moment-based parameter estimation

    In the first part of this thesis, we introduce and analyze spatial stochastic models for DNA methylation, an epigenetic mark with an important role in development. The underlying mechanisms controlling methylation are only partly understood. Several mechanistic models of enzyme activities responsible for methylation have been proposed. Here, we extend existing hidden Markov models (HMMs) for DNA methylation by describing the occurrence of spatial methylation patterns with stochastic automata networks. We perform numerical analysis of the HMMs applied to (non-)hairpin bisulfite sequencing KO data and accurately predict the wild-type data from these results. We find evidence that the activities of Dnmt3a/b responsible for de novo methylation depend on the left but not on the right CpG neighbors. The second part focuses on parameter estimation in chemical reaction networks (CRNs). We propose a generalized method of moments (GMM) approach for inferring the parameters of CRNs based on a sophisticated matching of the statistical moments of the stochastic model and the sample moments of population snapshot data. The proposed parameter estimation method exploits recently developed moment-based approximations and provides estimators with desirable statistical properties when many samples are available. The GMM provides accurate and fast estimations of unknown parameters of CRNs. The accuracy increases and the variance decreases when higher-order moments are considered.
In the first part of the thesis, we analyze spatial stochastic models of DNA methylation, an important epigenetic marker in development. The underlying mechanisms of methylation are not yet fully understood. Mechanistic models describe the activity of the methylation enzymes. We extend existing hidden Markov models (HMMs) for DNA methylation with a stochastic automata network description of spatial methylation patterns.
We perform a numerical analysis of the HMMs on bisulfite-sequenced KO data sets and use the results to successfully predict the wild-type data. Our results indicate that the activities of Dnmt3a/b, which are predominantly responsible for de novo methylation, depend only on the methylation status of the left CpG neighbor and not on that of the right one. The second part deals with parameter estimation in chemical reaction networks (CRNs). We introduce a generalized method of moments (GMM) that skillfully matches the statistical moments of the stochastic model to the sample moments. The GMM here uses recently developed moment-based approximations and yields estimators with desirable statistical properties when enough samples are available, giving fast and accurate estimates of the unknown parameters in CRNs. Higher-order moments increase the accuracy of the estimator while the variance decreases.

    Numerical Methods for the Chemical Master Equation

    The dynamics of biochemical networks can be described by a Markov jump process on a high-dimensional state space, with the corresponding probability distribution being the solution of the Chemical Master Equation (CME). In this thesis, adaptive wavelet methods for the time-dependent and stationary CME, as well as for the approximation of committor probabilities, are devised. The methods are illustrated on multi-dimensional models with metastable solutions and large state spaces.
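
To make the CME concrete, the following sketch solves it for a one-species toy model. It does not use the adaptive wavelet methods of the thesis: it assumes a hypothetical birth-death network (birth propensity `k_birth`, death propensity `k_death * n`), truncates the state space at 30 molecules with a reflecting boundary, and integrates the resulting linear ODE system with explicit Euler steps. For this model the stationary solution is known to be Poisson with mean `k_birth / k_death`, which gives a check on the result.

```python
import math

def cme_rhs(p, k_birth, k_death):
    """Right-hand side dp/dt of the CME on the truncated states 0..len(p)-1."""
    n_max = len(p) - 1
    dp = [0.0] * len(p)
    for n in range(n_max + 1):
        # No birth out of the last state, so probability is conserved.
        birth_out = k_birth if n < n_max else 0.0
        death_out = k_death * n
        dp[n] -= (birth_out + death_out) * p[n]
        if n < n_max:
            dp[n + 1] += birth_out * p[n]
        if n > 0:
            dp[n - 1] += death_out * p[n]
    return dp

def solve_cme(p0, k_birth, k_death, t_end, dt):
    """Explicit Euler time stepping; dt must be small relative to 1/max propensity."""
    p = list(p0)
    for _ in range(int(round(t_end / dt))):
        dp = cme_rhs(p, k_birth, k_death)
        p = [pi + dt * di for pi, di in zip(p, dp)]
    return p

p0 = [1.0] + [0.0] * 30  # start with zero molecules
p = solve_cme(p0, 2.0, 1.0, 20.0, 0.001)
poisson = [math.exp(-2.0) * 2.0 ** n / math.factorial(n) for n in range(31)]
print(max(abs(a - b) for a, b in zip(p, poisson)))  # deviation from Poisson(2)
```

The state vector here has 31 entries; with S species and the same truncation per species it would have 31^S entries, which is the growth in state-space size that motivates adaptive or compressed representations.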

    Design, Analysis, And Computational Methods For Engineering Synthetic Biological Networks

    This thesis advances our understanding of three important aspects of biological systems engineering: analysis, design, and computational methods. First, biological circuit design is necessary to engineer biological systems that behave consistently and follow our design specifications. We contribute by formulating and solving novel problems in stochastic biological circuit design. Second, computational methods for solving biological systems are often limited by the nonlinearity and high dimensionality of the system’s dynamics. This problem is particularly extreme for the parameter identification of stochastic, nonlinear systems. Thus, we develop a method for parameter identification that relies on data-driven stochastic model reduction. Finally, biological system analysis encompasses understanding the stability, performance, and robustness of these systems, which is critical for their implementation. We analyze a sequestration feedback motif for implementing biological control. First, we discuss biological circuit design for the stationary and the transient distributional responses of stochastic biochemical systems. Noise is often indispensable to key cellular activities, such as gene expression, necessitating the use of stochastic models to capture their dynamics. The chemical master equation is a commonly used stochastic model that describes how the probability distribution of a chemically reacting system varies with time. Here we design the distributional response of these stochastic models by formulating and solving it as a constrained optimization problem. Second, we analyze the stability and the performance of a biological controller implemented by a sequestration feedback network motif. Sequestration feedback networks have been implemented in synthetic biology using an array of biological parts. However, their properties of stability and performance are poorly understood. 
We provide insight into the stability and performance of sequestration feedback networks. Additionally, we provide guidelines for the implementation of sequestration feedback networks. Third, we develop computational methods for the parameter identification of stochastic models of biochemical reaction networks. It is often not possible to find analytic solutions to problems where the dynamics of the underlying biological circuit are stochastic, nonlinear or both. Stochastic models are often challenging due to their high dimensionality and their nonlinearity, which further limits the availability of analytical tools. To address these challenges, we develop a computational method for data-driven stochastic model reduction and we use it to perform parameter identification. Last, we provide concluding remarks and future research directions.

    Integrating discrete stochastic models with single-cell and single-molecule experiments

    2019 Summer. Includes bibliographical references. Modern biological experiments can capture the behaviors of single biomolecules within single cells. Much like Robert Brown looking at pollen grains in water, experimentalists have noticed that individual cells that are genetically identical behave seemingly randomly in the way they carry out their most basic functions. The field of stochastic single-cell biology has focused on developing mathematical and computational tools to understand how cells try to buffer or even make use of such fluctuations, and the technologies to measure such fluctuations have vastly improved in recent years. This dissertation is focused on developing new methods to analyze modern single-cell and single-molecule biological data with discrete stochastic models of the underlying processes, such as stochastic gene expression and single-mRNA translation. The methods developed here emphasize a strong link between model and experiment to help understand, design, and eventually control biological systems at the single-cell level.

    Computational Tools for Large-Scale Linear Systems

    While the theoretical analysis of linear dynamical systems with finite state spaces is a mature topic, in situations where the underlying model has a large number of dimensions, modelers must turn to computational tools to better visualize and analyze the dynamic behavior of interest. In these situations, we are confronted with the Curse of Dimensionality: computational and storage complexity grows exponentially in the number of dimensions. This doctoral project focuses on two main classes of large-scale linear systems which arise in systems biology. The Chemical Master Equation (CME) is a Fokker-Planck equation which describes the evolution of the probability mass function of a countable state space Markov process. Each state of the CME is labelled with an ordered S-tuple corresponding to one configuration of a well-mixed chemical system, where S is the number of distinct chemical species of interest. Even in cases where one only considers a projection of the CME to a finite subset of the states, one still must contend with the Curse of Dimensionality: the computational complexity grows exponentially in the number of chemical species. This dissertation describes a computational methodology for efficient solution of the CME which, in the best cases, will scale linearly in the number of chemical species. The second main class of high-dimensional problems requiring computational tools is that of coupled linear reaction-diffusion equations. For this class of models, we focus primarily on the computation of certain high-dimensional matrices which describe in a quantitative sense the input-to-state and state-to-output relationships. We describe algorithms for extracting useful information stored in these matrices and use this information to efficiently compute both reduced-order models and open-loop control laws for steering the full system.
A key feature of this approach is that the method is completely simulation- and experiment-free. In fact, in our numerical experiments, the computation of a reduced model or open-loop control law is an order of magnitude faster on a laptop than simulation of the full system on a 32-core node of a high-performance cluster. In both projects, the enabling computational technology is the recently proposed Tensor Train (TT) structured low-parametric representation of high-dimensional data. The TT format effectively exploits the low-rank structure of the "unfolding matrices" for compression and computational efficiency. Formally, the computational complexity of basic TT arithmetic scales linearly in the number of dimensions, potentially circumventing the curse of dimensionality. To demonstrate the effectiveness of this approach, we performed numerous numerical experiments whose results are reported here.