
    The Reasonable Effectiveness of Randomness in Scalable and Integrative Gene Regulatory Network Inference and Beyond

    Gene regulation is orchestrated by a vast number of molecules, including transcription factors and co-factors, chromatin regulators, as well as epigenetic mechanisms, and it has been shown that transcriptional misregulation, e.g., caused by mutations in regulatory sequences, is responsible for a plethora of diseases, including cancer and developmental or neurological disorders. As a consequence, decoding the architecture of gene regulatory networks has become one of the most important tasks in modern (computational) biology. However, to advance our understanding of the mechanisms involved in the transcriptional apparatus, we need scalable approaches that can deal with the increasing number of large-scale, high-resolution biological datasets. In particular, such approaches need to be capable of efficiently integrating and exploiting the biological and technological heterogeneity of such datasets in order to best infer the underlying, highly dynamic regulatory networks, often in the absence of sufficient ground truth data for model training or testing. With respect to scalability, randomized approaches have proven to be a promising alternative to deterministic methods in computational biology. As an example, one of the top-performing algorithms in a community challenge on gene regulatory network inference from transcriptomic data is based on a random forest regression model. In this concise survey, we aim to highlight how randomized methods may serve as a highly valuable tool, in particular as increasing amounts of large-scale biological experiments and datasets are being collected. Given the complexity and interdisciplinary nature of the gene regulatory network inference problem, we hope our survey may be helpful to both computational and biological scientists. It is our aim to provide a starting point for a dialogue about the concepts, benefits, and caveats of the toolbox of randomized methods, since unravelling the intricate web of highly dynamic regulatory events will be one fundamental step in understanding the mechanisms of life and eventually developing efficient therapies to treat and cure diseases.

    Identification of Nonlinear State-Space Systems from Heterogeneous Datasets

    This paper proposes a new method to identify nonlinear state-space systems from heterogeneous datasets. The method is described in the context of identifying biochemical/gene networks (i.e., identifying both reaction dynamics and kinetic parameters) from experimental data. Simultaneous integration of various datasets has the potential to yield better performance for system identification. Data collected experimentally typically vary depending on the specific experimental setup and conditions. Heterogeneous data are usually obtained through (a) replicate measurements from the same biological system or (b) application of different experimental conditions such as changes/perturbations in biological inductions, temperature, gene knock-out, gene over-expression, etc. We formulate the identification problem using a Bayesian learning framework that makes use of “sparse group” priors to allow inference of the sparsest model that can explain the whole set of observed, heterogeneous data. To enable scaling up to a large number of features, the resulting non-convex optimisation problem is relaxed to a re-weighted Group Lasso problem using a convex-concave procedure. As an illustrative example of the effectiveness of our method, we use it to identify a genetic oscillator (generalised eight-species repressilator). Through this example we show that our algorithm outperforms Group Lasso when the number of experiments is increased, even when each single time-series dataset is short. We additionally assess the robustness of our algorithm against noise by varying the intensity of process noise and measurement noise.
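As a rough illustration of what a re-weighted Group Lasso step can look like, the sketch below applies block soft-thresholding to groups of coefficients and then updates the group weights. It is not the paper's algorithm: the reweighting rule `1 / (norm + eps)`, the function names, and the toy numbers are all assumptions chosen for illustration, and a full solver would alternate this shrinkage step with gradient steps on a data-fit term.

```python
import math

def group_norm(values):
    """Euclidean norm of one coefficient group."""
    return math.sqrt(sum(v * v for v in values))

def weighted_group_soft_threshold(groups, weights, lam):
    """Proximal step for lam * sum_g weights[g] * ||groups[g]||_2.

    Each group is shrunk towards zero as a block; a group whose norm is
    at most lam * weight is set to zero entirely.
    """
    shrunk = []
    for values, weight in zip(groups, weights):
        norm = group_norm(values)
        threshold = lam * weight
        if norm <= threshold:
            shrunk.append([0.0] * len(values))
        else:
            scale = 1.0 - threshold / norm
            shrunk.append([scale * v for v in values])
    return shrunk

def update_weights(groups, eps=1e-3):
    """Hypothetical reweighting rule: small groups get larger penalties."""
    return [1.0 / (group_norm(values) + eps) for values in groups]

# Toy coefficients: one strong group and one weak group (made-up numbers).
groups = [[3.0, 4.0], [0.1, 0.1]]
weights = [1.0, 1.0]
for _ in range(3):
    groups = weighted_group_soft_threshold(groups, weights, lam=0.5)
    weights = update_weights(groups)
```

In this toy run the weak group is zeroed as a whole while the strong group is only mildly shrunk, which is the group-level sparsity that the abstract's “sparse group” priors aim for.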

    Data-driven modelling of biological multi-scale processes

    Biological processes involve a variety of spatial and temporal scales. A holistic understanding of many biological processes therefore requires multi-scale models which capture the relevant properties on all these scales. In this manuscript we review mathematical modelling approaches used to describe the individual spatial scales and how they are integrated into holistic models. We discuss the relation between spatial and temporal scales and its implications for multi-scale modelling. Based upon this overview of state-of-the-art modelling approaches, we formulate key challenges in mathematical and computational modelling of biological multi-scale and multi-physics processes. In particular, we consider the availability of analysis tools for multi-scale models and model-based multi-scale data integration. We provide a compact review of methods for model-based data integration and model-based hypothesis testing. Furthermore, novel approaches and recent trends are discussed, including computation time reduction using reduced order and surrogate models, which contribute to the solution of inference problems. We conclude the manuscript by providing a few ideas for the development of tailored multi-scale inference methods. Comment: This manuscript will appear in the Journal of Coupled Systems and Multiscale Dynamics (American Scientific Publishers).

    Propagation of kinetic uncertainties through a canonical topology of the TLR4 signaling network in different regions of biochemical reaction space

    Background: Signal transduction networks represent the information processing systems that dictate which dynamical regimes of biochemical activity can be accessible to a cell under certain circumstances. One of the major concerns in molecular systems biology is centered on the elucidation of the robustness properties and information processing capabilities of signal transduction networks. Achieving this goal requires the establishment of causal relations between the design principle of biochemical reaction systems and their emergent dynamical behaviors. Methods: In this study, efforts were focused on the construction of a relatively well-informed, deterministic, non-linear dynamic model, accounting for reaction mechanisms grounded on standard mass action and Hill saturation kinetics, of the canonical reaction topology underlying Toll-like receptor 4 (TLR4)-mediated signaling events. This signaling mechanism has been shown to be deployed in macrophages during a relatively short time window in response to lipopolysaccharide (LPS) stimulation, which leads to a rapidly mounted innate immune response. An extensive computational exploration of the biochemical reaction space inhabited by this signal transduction network was performed via local and global perturbation strategies. Importantly, a broad spectrum of biologically plausible dynamical regimes accessible to the network in widely scattered regions of parameter space was reconstructed computationally. Additionally, experimentally reported transcriptional readouts of target pro-inflammatory genes, which are actively modulated by the network in response to LPS stimulation, were also simulated. This was done with the main goal of carrying out an unbiased statistical assessment of the intrinsic robustness properties of this canonical reaction topology. Results: Our simulation results provide convincing numerical evidence supporting the idea that a canonical reaction mechanism of the TLR4 signaling network is capable of performing information processing in a robust manner, a functional property that is independent of the signaling task required to be executed. Nevertheless, it was found that the robust performance of the network is not solely determined by its design principle (topology), but may depend heavily on the network's current position in biochemical reaction space. Ultimately, our results enabled the identification of key rate-limiting steps which most effectively control the performance of the system under diverse dynamical regimes. Conclusions: Overall, our in silico study suggests that biologically relevant and non-intuitive aspects of the general behavior of a complex biomolecular network can be elucidated only when taking into account a wide spectrum of dynamical regimes attainable by the system. Most importantly, this strategy provides the means for a suitable assessment of the inherent variational constraints imposed by the structure of the system when systematically probing its parameter space.
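To make the combination of Hill saturation kinetics, mass action kinetics, and local parameter perturbation more concrete, here is a minimal sketch on a toy one-species model. It is not the TLR4 model from the study: the equation dy/dt = k_act * hill(signal) - k_deg * y, the parameter names, and all numerical values are assumptions for illustration, and only a local (one-parameter-at-a-time) perturbation is shown.

```python
def hill(x, k_half, n):
    """Hill saturation term: fraction of the maximal rate at input level x."""
    return x ** n / (k_half ** n + x ** n)

def simulate(k_act, k_deg, k_half=1.0, n=2.0, signal=2.0, dt=0.01, steps=5000):
    """Forward-Euler integration of dy/dt = k_act * hill(signal) - k_deg * y."""
    y = 0.0
    for _ in range(steps):
        production = k_act * hill(signal, k_half, n)  # Hill-type activation
        degradation = k_deg * y                       # first-order mass action
        y += dt * (production - degradation)
    return y

def local_sensitivity(param_name, base, rel_step=0.01):
    """Normalised sensitivity (relative change in y per relative change in p),
    estimated with a one-sided finite difference."""
    perturbed = dict(base)
    perturbed[param_name] = base[param_name] * (1.0 + rel_step)
    y0 = simulate(**base)
    y1 = simulate(**perturbed)
    return ((y1 - y0) / y0) / rel_step

# Hypothetical baseline parameters (made-up values).
base = {"k_act": 1.0, "k_deg": 0.5}
sensitivities = {name: local_sensitivity(name, base) for name in base}
```

With these toy values the output rises in proportion to `k_act` and falls roughly in inverse proportion to `k_deg`. A global strategy would instead sample many parameter sets across wide ranges and repeat the simulation for each.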