23 research outputs found

    Probabilistic Regression and Anomaly Detection for Latency Assessment in Mobile Radio Networks

    This thesis provides a thorough examination and empirical results on the use of machine learning for predicting latency in mobile radio networks, specifically emphasizing probabilistic regression and anomaly detection tasks. After an ML-aided selection of the Key Performance Indicators that most influence the latency, different models are compared for both probabilistic regression and anomaly detection. Such models present network designers with a valuable instrument to explore the correlations that exist between particular network Key Performance Indicators and latency.

    Charting the landscape of stochastic gene expression models using queueing theory

    Stochastic models of gene expression are typically formulated using the chemical master equation, which can be solved exactly or approximately using a repertoire of analytical methods. Here, we provide a tutorial review of an alternative approach based on queueing theory that has rarely been used in the literature of gene expression. We discuss the interpretation of six types of infinite server queues from the angle of stochastic single-cell biology and provide analytical expressions for the stationary and non-stationary distributions and/or moments of mRNA/protein numbers, and bounds on the Fano factor. This approach may enable the solution of complex models which have hitherto evaded analytical solution. Comment: 24 pages, 6 figures
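
As an illustration of the queueing view, the simplest infinite server queue (M/M/infinity) can be read as constitutive gene expression: arrivals are transcription events and each mRNA is "served" (degraded) independently. The sketch below is not taken from the review. It assumes a constant transcription rate `k_tx` and a per-molecule degradation rate `k_deg` (both hypothetical names) and recovers the stationary distribution from detailed balance, which should be Poisson with mean `k_tx / k_deg` and Fano factor 1.

```python
def stationary_mrna_distribution(k_tx, k_deg, n_max=200):
    """Stationary distribution of an M/M/infinity queue, truncated at n_max.

    Detailed balance for the birth-death chain gives
    p(n + 1) = p(n) * k_tx / (k_deg * (n + 1)).
    """
    weights = [1.0]
    for n in range(n_max):
        weights.append(weights[-1] * k_tx / (k_deg * (n + 1)))
    total = sum(weights)
    return [w / total for w in weights]


def mean_and_fano(p):
    """Mean and Fano factor (variance / mean) of a distribution on 0..len(p)-1."""
    mean = sum(n * pn for n, pn in enumerate(p))
    second_moment = sum(n * n * pn for n, pn in enumerate(p))
    variance = second_moment - mean ** 2
    return mean, variance / mean


# Assumed illustrative rates: 10 transcripts per unit time, degradation rate 0.5.
p = stationary_mrna_distribution(k_tx=10.0, k_deg=0.5)
mean, fano = mean_and_fano(p)
print(mean, fano)
```

The truncation at `n_max` is harmless here because the Poisson tail beyond 200 is negligible for a mean of 20; a larger mean would need a larger cutoff.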

    A Modelling Framework for Estimating the Risk of Importation of a Novel Disease

    Sequential Monte Carlo (SMC) methods are vital in fitting models, without a tractable likelihood, to data. When combined with Markov Chain Monte Carlo, SMC allows for full posterior distributions of states and parameters to be estimated. However, for many problems, these methods can be prohibitively computationally expensive. One such class of models with intractable likelihoods is continuous-time Branching Processes (CTBPs). In this thesis, we leverage the unique properties of CTBPs to derive a method that approximates the results of standard SMC methods, with a significant reduction in computation time. We find that under certain conditions the method we have developed can produce highly accurate results in orders of magnitude less time than standard SMC methods. Continuous-time Branching Processes are often used for epidemic modelling, particularly in the early phases of an outbreak. In light of the COVID-19 pandemic, CTBPs have been used in metapopulation models, where agents are partitioned into subpopulations (usually states or countries) that interact through immigration. In this thesis, we build upon existing work in this area, with a focus on estimating disease importation risk. We show how applying our method to this problem can allow for joint estimation of the parameters mediating disease spread and unobserved cases. Specifically, the speed improvement given by our method allows for full posterior distributions for states, parameters and importation risk to be derived. Furthermore, we find that the increase in speed also allows more parameters to be estimated. Consequently, each subpopulation can have its own parameters. As a result, hierarchical modelling can be employed, meaning that parameter estimates from one subpopulation can inform the estimates of others.
We find hierarchical modelling to be vital in estimating importation risk, particularly for counties with low observation probability. Thesis (MPhil) -- University of Adelaide, School of Computer and Mathematical Sciences, 202
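
To make the link between branching processes and importation risk concrete, here is a minimal sketch that is not the thesis's method. It assumes the simplest Markovian CTBP, a linear birth-death process in which each case infects at rate `birth_rate` and is removed at rate `death_rate` (hypothetical names). For that process, the lineage started by one case dies out with probability min(1, death_rate / birth_rate), and independent imported cases multiply these probabilities.

```python
def major_outbreak_probability(birth_rate, death_rate, n_imported):
    """Probability that at least one of n_imported independent lineages survives.

    Assumes a linear birth-death branching process (an illustrative
    simplification, not the metapopulation model of the thesis).
    """
    if n_imported <= 0:
        return 0.0
    if birth_rate <= death_rate:
        # Critical or subcritical process: extinction is certain.
        return 0.0
    extinction_one = death_rate / birth_rate
    return 1.0 - extinction_one ** n_imported


# Assumed illustrative rates: each case infects at rate 2 and recovers at rate 1.
for n in (1, 3, 10):
    print(n, major_outbreak_probability(2.0, 1.0, n))
```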

    Numerical methods and hypoexponential approximations for gamma distributed delay differential equations

    Gamma distributed delay differential equations (DDEs) arise naturally in many modelling applications. However, appropriate numerical methods for generic gamma distributed DDEs have not previously been implemented. Modellers have therefore resorted to approximating the gamma distribution with an Erlang distribution and using the linear chain technique to derive an equivalent system of ordinary differential equations (ODEs). In this work, we address the lack of appropriate numerical tools for gamma distributed DDEs in two ways. First, we develop a functional continuous Runge–Kutta (FCRK) method to numerically integrate the gamma distributed DDE without resorting to Erlang approximation. We prove the fourth-order convergence of the FCRK method and perform numerical tests to demonstrate the accuracy of the new numerical method. Nevertheless, FCRK methods for infinite delay DDEs are not widely available in existing scientific software packages. As an alternative approach to solving gamma distributed DDEs, we also derive a hypoexponential approximation of the gamma distributed DDE. This hypoexponential approach is a more accurate approximation of the true gamma distributed DDE than the common Erlang approximation but, like the Erlang approximation, can be formulated as a system of ODEs and solved numerically using standard ODE software. Using our FCRK method to provide reference solutions, we show that the common Erlang approximation may produce solutions that are qualitatively different from the underlying gamma distributed DDE. However, the proposed hypoexponential approximations do not have this limitation. Finally, we apply our hypoexponential approximations to perform statistical inference on synthetic epidemiological data to illustrate the utility of the hypoexponential approximation
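
The linear chain technique mentioned above can be sketched for the Erlang case as follows. This is an illustration under stated assumptions, not the paper's FCRK or hypoexponential method. For an Erlang kernel with integer shape `k` and rate `a`, the delayed quantity z(t), the integral of x(t - s) against the kernel, equals the last state of a chain y_1' = a (x - y_1), y_j' = a (y_{j-1} - y_j). The check uses the input x(t) = exp(r t) on the whole real line, for which z(t) = (a / (a + r))^k exp(r t); the function names and the hand-written RK4 integrator are hypothetical.

```python
import math


def linear_chain_rhs(t, y, x_of_t, a):
    """Right-hand side of the linear chain ODEs for an Erlang(len(y), a) kernel."""
    dy = [a * (x_of_t(t) - y[0])]
    for j in range(1, len(y)):
        dy.append(a * (y[j - 1] - y[j]))
    return dy


def rk4(rhs, y0, t0, t1, n_steps):
    """Classical fourth-order Runge-Kutta integration with a fixed step."""
    h = (t1 - t0) / n_steps
    t, y = t0, list(y0)
    for _ in range(n_steps):
        s1 = rhs(t, y)
        s2 = rhs(t + h / 2, [yi + h / 2 * si for yi, si in zip(y, s1)])
        s3 = rhs(t + h / 2, [yi + h / 2 * si for yi, si in zip(y, s2)])
        s4 = rhs(t + h, [yi + h * si for yi, si in zip(y, s3)])
        y = [yi + h / 6 * (p + 2 * q + 2 * u + v)
             for yi, p, q, u, v in zip(y, s1, s2, s3, s4)]
        t += h
    return y


# Assumed illustrative values: rate a = 2, shape k = 3, input exp(-0.5 t).
a, k, r = 2.0, 3, -0.5
c = a / (a + r)
y0 = [c ** (j + 1) for j in range(k)]  # chain states consistent with the infinite history
y_end = rk4(lambda t, y: linear_chain_rhs(t, y, lambda s: math.exp(r * s), a),
            y0, 0.0, 1.0, 1000)
exact = c ** k * math.exp(r * 1.0)
print(y_end[-1], exact)
```

In a full model, `x_of_t` would be replaced by the model state itself, so the chain and the model equations form one closed ODE system that standard solvers can handle.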

    Mathematical Modelling and Nonstandard Schemes for the Corona Virus Pandemic

    The programs used in this Master's thesis: https://git.uni-wuppertal.de/1449563/covid-19-modelling/-/tree/master/PROGRAM

    Stochastic multiscale models of cell behaviour


    Rate-limiting Steps in Transcription Initiation are Key Regulatory Mechanisms of Escherichia coli Gene Expression Dynamics

    In all living organisms, the “blueprints of life” are documented in the genetic material. This material is composed of genes, which are regions of DNA coding for proteins. To produce proteins, cells read the information on the DNA with the help of molecular machines, such as RNAp holoenzymes and σ factors. Proteins carry out the cellular functions required for survival and, as such, cells deal with challenging environments by adjusting their gene expression pattern. For this, cells constantly perform decision-making processes of whether or not to actively express a protein, based on intracellular and environmental cues. In Escherichia coli, gene expression is mostly regulated at the stage of transcription initiation. Although most of its regulatory molecules have been identified, the dynamics and regulation of this step remain elusive. Due to a limited number of specific regulatory molecules in the cells, the stochastic fluctuations of these molecular numbers can result in a sizeable temporal change in the numbers of transcription outputs (RNA and proteins) and have consequences on the phenotype of the cells. To understand the dynamics of this process, one should study the activity of the gene by tracking mRNA and protein production events at a detailed level. Recent advancements in single-molecule detection techniques have been used to image and track individually labeled fluorescent macromolecules of living cells. This allows investigating the intermolecular dynamics under any given condition. In this thesis, by using in vivo, single-RNA time-lapse microscopy techniques along with stochastic modelling techniques, we studied the kinetics of multi-rate limiting steps in the transcription process of multiple promoters, in various conditions.
Specifically, first, we established a novel method of dissecting transcription in Escherichia coli that combines state-of-the-art microscopy measurements and model fitting techniques to construct detailed models of the rate-limiting steps governing the in vivo transcription initiation of a synthetic Lac-ara-1 promoter. After that, we estimated the duration of the closed and open complex formation, accounting for the rate of reversibility of the first step. From this, we also estimated the duration of periods of promoter inactivity, from which we were able to determine the contribution from each step to the distribution of intervals between consecutive RNA productions in individual cells. Second, using the above method, we studied the σ factor selective mechanisms for indirect regulation of promoters whose transcription is primarily initiated by RNAp holoenzymes carrying σ70. From the analysis, we concluded that, in E. coli, a promoter’s responsiveness to indirect regulation by σ factor competition is determined by its sequence-dependent, dynamically regulated multi-step initiation kinetics. Third, we investigated the effects of extrinsic noise, arising from cell-to-cell variability in cellular components, on the single-cell distribution of RNA numbers, in the context of cell lineages. For this, first, we used stochastic models to predict the variability in the numbers of molecules involved in upstream processes. The models account for the intake of inducers from the environment, which acts as a transient source of variability in RNA production numbers, as well as for the variability in the numbers of molecular species controlling transcription of an active promoter, which acts as a constant source of variability in RNA numbers. From measurement analysis, we demonstrated the existence of lineage-to-lineage variability in gene activation times and mean transcription rates.
Finally, we provided evidence that this can be explained by differences in the kinetics of the rate-limiting steps in transcription and of the induction scheme, from which it is possible to conclude that these variabilities differ between promoters and inducers used. Fourth, we studied how the multi-rate limiting steps in the transcription initiation are capable of tuning the asymmetry and tailedness of the distribution of time intervals between consecutive RNA production events in individual cells. For this, first, we considered a stochastic model of transcription initiation and predicted that the asymmetry and tailedness in the distribution of intervals between consecutive RNA production events can differ by tuning the rate-limiting steps in transcription. Second, we validated the model with measurements from single-molecule RNA microscopy of transcription kinetics of multiple promoters in multiple conditions. Finally, from our results, we concluded that the skewness and kurtosis in RNA and protein production kinetics are subject to regulation by the kinetics of the steps in transcription initiation and affect the single-cell distributions of RNAs and, thus, proteins. We further showed that this regulation can significantly affect the probability that RNA and protein numbers cross specific thresholds. Overall, the studies conducted in this thesis are expected to contribute to a better understanding of the dynamic process of bacterial gene expression. The advanced data and image analysis techniques and novel stochastic modeling approaches that we developed during the course of these studies will allow studying in detail the in vivo regulation of multi-rate limiting steps of transcription initiation of any given promoter. In addition, tuning the kinetics of the rate-limiting steps in the transcription initiation, as done here, should allow engineering new promoters with predefined RNA and, thus, protein production dynamics in Escherichia coli.
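
How the rate-limiting steps shape the interval distribution can be sketched with a deliberately simplified model, which is not the thesis's fitted model. The sketch assumes each RNA production requires a sequence of independent, irreversible, exponentially distributed steps (so the reversibility of closed complex formation is ignored). The interval is then a sum of exponentials, and its cumulants follow from the step rates: the n-th cumulant is (n - 1)! times the sum of rate^(-n). The function name is hypothetical.

```python
def interval_shape(rates):
    """Mean, variance, skewness and excess kurtosis of a sum of independent
    exponential steps with the given rates (assumed irreversible steps)."""
    k1 = sum(1.0 / r for r in rates)             # mean
    k2 = sum(1.0 / r ** 2 for r in rates)        # variance
    k3 = 2.0 * sum(1.0 / r ** 3 for r in rates)  # third cumulant
    k4 = 6.0 * sum(1.0 / r ** 4 for r in rates)  # fourth cumulant
    return k1, k2, k3 / k2 ** 1.5, k4 / k2 ** 2


# Same mean interval (1 time unit), different step structure (assumed rates).
print(interval_shape([1.0]))        # one rate-limiting step
print(interval_shape([2.0, 2.0]))   # two equally slow steps
print(interval_shape([1.25, 5.0]))  # one slow step and one fast step
```

With the mean held fixed, spreading the waiting time over more comparable steps lowers both skewness and excess kurtosis, which is the kind of tuning of asymmetry and tailedness the abstract describes.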

    Maximum likelihood estimation of species trees and anomaly zone detection using ranked gene trees

    A phylogenetic tree represents the evolutionary relationships among a set of organisms. Gene trees can be used to reconstruct phylogenetic trees. The methods in this dissertation focus on the gene tree topologies with emphasis on ranked gene tree topologies. A ranked tree depicts the order in which nodes appear in the tree together with topological relationships among gene lineages. One challenge that arises during phylogenetic inference is the existence of anomaly zones, the regions of branch-length space in the species tree that can produce gene trees that have topologies differing from the species tree topology but are more probable than the gene tree matching the species tree. In this work, we show how the parameters of a constant-rate birth-death process used to simulate species trees affect the probability that the species tree lies in the anomaly zone. We prove that the probability that a species tree is in an anomaly zone approaches 1 as the number of species and the birth rate go to infinity in a pure birth process. We propose a heuristic approach to infer whether species trees lie in the different types of anomaly zones when it is intractable to compute the entire distribution of gene tree topologies. In this dissertation, we develop the first maximum likelihood (ML) method that infers a species tree from the ranked gene trees. We introduce the software PRANC, which can compute the probabilities of ranked gene trees under the coalescent process and infer an ML species tree. We propose methods to estimate a starting tree to be able to locate the ML species tree quickly. To illustrate the methods proposed, we analyze two experimental studies of skinks and gibbons.