1,717 research outputs found

    Integrating Quantitative Knowledge into a Qualitative Gene Regulatory Network

    Despite recent improvements in molecular techniques, biological knowledge remains incomplete. Any theorizing about living systems is therefore necessarily based on heterogeneous and partial information. Much current research has focused successfully on the qualitative behaviors of macromolecular networks. Nonetheless, such research cannot take into account available quantitative information such as time-series variations in protein concentration. The present work proposes a probabilistic modeling framework that integrates both kinds of information. Average-case analysis methods are used in combination with Markov chains to link qualitative information about transcriptional regulations to quantitative information about protein concentrations. The approach is illustrated by modeling the carbon starvation response in Escherichia coli. It accurately predicts the quantitative time-series evolution of several protein concentrations using only knowledge of discrete gene interactions and a small number of quantitative observations on a single protein concentration. From this, the modeling technique also derives a ranking of interactions with respect to their importance during the experiment considered. This ranking is confirmed by the literature. Our method is novel chiefly in that it allows (i) a hybrid model that integrates both a qualitative discrete model and quantitative data to be built, even from a small amount of quantitative information, (ii) new quantitative predictions to be derived, (iii) the robustness and relevance of interactions with respect to phenotypic criteria to be precisely quantified, and (iv) the key features of the model to be extracted and used as guidance for designing future experiments.

    Modeling and control of genetic regulatory networks


    Optimal Bayesian Transfer Learning for Classification and Regression

    Machine learning methods and algorithms that assume independent and identically distributed (i.i.d.) data are not applicable to massive data collected from different sources or by various technologies, where heterogeneity of data is inevitable. In such scenarios, where we are far from simple homogeneous and unimodal distributions, we should address the data heterogeneity carefully in order to take full advantage of data coming from different sources. In this dissertation we study two main sources of data heterogeneity, time and domain. We address time by modeling the dynamics of data, and the domain difference by transfer learning. Gene expression data have been used for many years for phenotype classification, for instance, classification of healthy versus cancerous tissues or classification of various types of diseases. The traditional methods use static gene expression data measured at one time point. We propose to take into account the dynamics of gene interactions through time, which can be governed by gene regulatory networks (GRN), and to design the classifiers using gene expression trajectories instead of static data. Thanks to recent advanced sequencing technologies such as single-cell sequencing, we are now able to look inside a single cell and capture the dynamics of gene expression. As a result, we design optimal classifiers using single-cell gene expression trajectories, whose dynamics are modeled via Boolean networks with perturbation (BNp). We solve this problem using both expectation maximization (EM) and a Bayesian framework, and show the efficacy of these methods over classification via bulk RNA-Seq data. Transfer learning (TL) has recently attracted significant research attention, as it simultaneously learns from different source domains, which have plenty of labeled data, and transfers the relevant knowledge to the target domain, which has limited labeled data, to improve prediction performance. We study transfer learning from a novel Bayesian viewpoint. Transfer learning applies when we do not have enough data in our target domain to train the machine learning algorithms well but have a good amount of data in other relevant source domains. The probability distributions of the source and target domains might be totally different, but they share some knowledge underlying the similar tasks between the domains and are related to each other in some sense. The ultimate goal of transfer learning is to determine how related the domains are and then transfer to the target domain the knowledge that can help improve the classification task in the data-poor target domain. Negative transfer is the most critical issue in transfer learning and happens when the TL algorithm is not able to detect that the source domain is not related to the target domain for a specific task. To address these issues on a solid theoretical basis, we propose a novel Bayesian transfer learning framework, in which the source and target domains are related through the joint prior distribution of the model parameters. Modeling the joint prior densities enables a better understanding of the transferability between domains. Using this idea, we propose optimal Bayesian transfer learning (OBTL) for both continuous and count data as well as optimal Bayesian transfer regression (OBTR), which are able to optimally transfer the relevant knowledge from a data-rich source domain to a data-poor target domain, thereby improving the classification accuracy in the target domain with limited data.
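As an illustration of the Boolean networks with perturbation (BNp) mentioned in the abstract above, the following is a minimal sketch only, not the dissertation's implementation. It assumes one common formulation in which the next state is the network function applied to the current state, after which each gene is flipped independently with probability `p`. The names `bnp_step`, `bnp_trajectory`, and `toy_update`, and the two-gene rules, are hypothetical.

```python
import random
from typing import Callable, List, Tuple

State = Tuple[int, ...]


def bnp_step(state: State, update: Callable[[State], State],
             p: float, rng: random.Random) -> State:
    """One BNp transition: apply the Boolean rules, then flip each gene
    independently with probability p (assumed perturbation model)."""
    nxt = update(state)
    return tuple(bit ^ (1 if rng.random() < p else 0) for bit in nxt)


def bnp_trajectory(initial: State, update: Callable[[State], State],
                   p: float, steps: int, seed: int = 0) -> List[State]:
    """Simulate a gene expression trajectory of the given number of steps."""
    rng = random.Random(seed)
    trajectory = [initial]
    for _ in range(steps):
        trajectory.append(bnp_step(trajectory[-1], update, p, rng))
    return trajectory


def toy_update(state: State) -> State:
    """Hypothetical two-gene network: A is repressed by B, B is activated by A."""
    a, b = state
    return (1 - b, a)


if __name__ == "__main__":
    # A small perturbation probability occasionally knocks the system
    # off its deterministic cycle.
    for s in bnp_trajectory((1, 0), toy_update, p=0.05, steps=10):
        print(s)
```

With `p = 0` the sketch reduces to an ordinary deterministic Boolean network, which is a convenient sanity check before fitting anything to trajectories.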

    An Evaluation of Methods for Inferring Boolean Networks from Time-Series Data

    Regulatory networks play a central role in cellular behavior and decision making. Learning these regulatory networks is a major task in biology, and devising computational methods and mathematical models for this task is a major endeavor in bioinformatics. Boolean networks have been used extensively for modeling regulatory networks. In this model, the state of each gene can be either "on" or "off", and the next state of a gene is updated, synchronously or asynchronously, according to a Boolean rule that is applied to the current state of the entire system. Inferring a Boolean network from a set of experimental data entails two main steps: first, the experimental time-series data are discretized into Boolean trajectories, and then a Boolean network is learned from these Boolean trajectories. In this paper, we consider three methods for data discretization, including a new one we propose, and three methods for learning Boolean networks, and study the performance of all nine possible combinations on four regulatory systems of varying dynamical complexity. We find that employing the right combination of methods for data discretization and network learning results in Boolean networks that capture the dynamics well and provide predictive power. Our findings are in contrast to a recent survey that placed Boolean networks on the low end of the "faithfulness to biological reality" and "ability to model dynamics" spectra. Further, contrary to the common argument in favor of Boolean networks, we find that a relatively large number of time points in the time-series data is required to learn good Boolean networks for certain data sets. Last but not least, while methods have been proposed for inferring Boolean networks, as discussed above, publicly available implementations thereof are still missing. Here, we make our implementation of the methods publicly available as open source at http://bioinfo.cs.rice.edu/
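The two ingredients described in the abstract above, discretizing a time series into Boolean values and updating all genes synchronously, can be sketched as follows. This is a hedged illustration and not one of the methods evaluated in the paper: the mean-threshold rule in `discretize`, the function names, and the two-gene rule set are all assumptions chosen for brevity.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]
Rules = Dict[str, Callable[[State], bool]]


def discretize(series: List[float]) -> List[bool]:
    """Binarize one gene's time series by thresholding at its mean
    (an assumed, deliberately simple discretization rule)."""
    threshold = sum(series) / len(series)
    return [value > threshold for value in series]


def synchronous_step(state: State, rules: Rules) -> State:
    """Compute every gene's next value from the same current state."""
    return {gene: rule(state) for gene, rule in rules.items()}


def simulate(initial: State, rules: Rules, steps: int) -> List[State]:
    """Return the Boolean trajectory starting at `initial`."""
    trajectory = [initial]
    for _ in range(steps):
        trajectory.append(synchronous_step(trajectory[-1], rules))
    return trajectory


# Hypothetical two-gene system: A is repressed by B, B is activated by A.
toy_rules: Rules = {
    "A": lambda s: not s["B"],
    "B": lambda s: s["A"],
}

if __name__ == "__main__":
    print(discretize([0.1, 0.2, 0.9, 1.0]))
    for state in simulate({"A": True, "B": False}, toy_rules, 4):
        print(state)
```

Under synchronous updating, this toy system cycles through all four states and returns to its starting state after four steps. An asynchronous scheme would instead update one gene at a time and can produce different trajectories from the same rules.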

    A self-organized model for cell-differentiation based on variations of molecular decay rates

    Systemic properties of living cells are the result of molecular dynamics governed by so-called genetic regulatory networks (GRN). These networks capture all possible features of cells and are responsible for the immense levels of adaptation characteristic of living systems. At any point in time only small subsets of these networks are active. Any active subset of the GRN leads to the expression of particular sets of molecules (expression modes). The subsets of active networks change over time, leading to the observed complex dynamics of expression patterns. Understanding these dynamics is becoming increasingly important in systems biology and medicine. While the importance of transcription rates and catalytic interactions has been widely recognized in modeling genetic regulatory systems, the understanding of the role of degradation of biochemical agents (mRNA, protein) in regulatory dynamics remains limited. Recent experimental data suggest that there exists a functional relation between mRNA and protein decay rates and expression modes. In this paper we propose a model for the dynamics of successions of sequences of active subnetworks of the GRN. The model is able to reproduce key characteristics of molecular dynamics, including homeostasis, multi-stability, periodic dynamics, alternating activity, differentiability, and self-organized critical dynamics. Moreover, the model offers a natural explanation of the mechanism behind the relation between decay rates and expression modes. The model explains recent experimental observations that decay rates (or turnovers) vary between differentiated tissue classes at a general systemic level, and highlights the role of intracellular decay rate control mechanisms in cell differentiation. Comment: 16 pages, 5 figures