
    Intervention in Context-Sensitive Probabilistic Boolean Networks Revisited

    An approximate representation for the state space of a context-sensitive probabilistic Boolean network has previously been proposed and utilized to devise therapeutic intervention strategies. Whereas the full state of a context-sensitive probabilistic Boolean network is specified by an ordered pair composed of a network context and a gene-activity profile, this approximate representation collapses the state space onto the gene-activity profiles alone. This reduction yields an approximate transition probability matrix, absent of context, for the Markov chain associated with the context-sensitive probabilistic Boolean network. As with many approximation methods, a price must be paid for using a reduced model representation, namely, some loss of optimality relative to using the full state space. This paper examines the effects on intervention performance caused by the reduction with respect to various values of the model parameters. This task is performed using a new derivation for the transition probability matrix of the context-sensitive probabilistic Boolean network. This expression of the transition probability distributions is in concert with the original definition of the context-sensitive probabilistic Boolean network. The performance of optimal and approximate therapeutic strategies is compared for both synthetic networks and a real case study. It is observed that the approximate representation describes the dynamics of the context-sensitive probabilistic Boolean network through an instantaneously random probabilistic Boolean network with similar parameters.
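The collapse onto gene-activity profiles can be illustrated with a toy calculation (my own construction with made-up numbers, not the paper's derivation): averaging the per-context GAP-transition matrices, weighted by assumed context probabilities, yields the approximate context-free transition matrix.

```python
import numpy as np

# Hypothetical toy setting: 2 contexts, 2 genes -> 4 gene-activity
# profiles (GAPs). P_full[c] is the deterministic GAP-transition matrix
# under context c; q[c] is an assumed probability of context c.
P_full = np.array([
    [[1, 0, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 1, 0, 0]],
    [[0, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]],
], dtype=float)
q = np.array([0.7, 0.3])

# Approximate (context-free) transition matrix: average the per-context
# dynamics weighted by the context probabilities, collapsing the state
# space from (context, GAP) pairs onto GAPs alone.
P_approx = np.tensordot(q, P_full, axes=1)

# Rows remain stochastic after the collapse.
assert np.allclose(P_approx.sum(axis=1), 1.0)
```

The resulting `P_approx` is exactly the transition matrix of an instantaneously random PBN with the same constituent networks and selection probabilities, which is the equivalence the abstract points to.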

    On optimal control policy for Probabilistic Boolean Network: a state reduction approach

    BACKGROUND: The Probabilistic Boolean Network (PBN) is a popular model for studying genetic regulatory networks. An important and practical problem is to find the optimal control policy for a PBN so as to prevent the network from entering undesirable states. A number of studies have used dynamic-programming-based (DP) methods. However, due to the high computational complexity of PBNs, the DP method is computationally inefficient for large networks. It is therefore natural to seek approximation methods. RESULTS: Inspired by state reduction strategies, we consider using dynamic programming in conjunction with a state reduction approach to reduce the computational cost of the DP method. Numerical examples are given to demonstrate both the effectiveness and the efficiency of the proposed method. CONCLUSIONS: Finding the optimal control policy for PBNs is meaningful. The proposed problem has been shown to be Σ^p_2-hard. By taking the state reduction approach into consideration, the proposed method can speed up the dynamic-programming-based algorithm. In particular, the proposed method is effective for larger networks.
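The DP formulation behind such control policies can be sketched as discounted value iteration on a small toy chain (illustrative numbers of my own, not from the paper): each action selects a transition matrix, undesirable states carry a high cost, and the policy picks the cheaper action per state.

```python
import numpy as np

# Toy 4-state PBN-style chain. Action 0 = no intervention; action 1 =
# intervene (hypothetical effect: steer toward states 0 and 1, at an
# extra treatment cost). States 2 and 3 are the undesirable ones.
P = np.array([  # P[a] = transition matrix under action a
    [[0.6, 0.2, 0.1, 0.1],
     [0.1, 0.6, 0.2, 0.1],
     [0.1, 0.1, 0.6, 0.2],
     [0.2, 0.1, 0.1, 0.6]],
    [[0.7, 0.2, 0.05, 0.05],
     [0.2, 0.7, 0.05, 0.05],
     [0.4, 0.4, 0.1, 0.1],
     [0.4, 0.4, 0.1, 0.1]],
])
cost = np.array([[0, 0, 5, 5],    # per-state cost without intervention
                 [1, 1, 6, 6]])   # intervening adds a treatment cost of 1

gamma = 0.9
V = np.zeros(4)
for _ in range(500):               # value iteration to (near) convergence
    Q = cost + gamma * P @ V       # Q[a, s]: expected cost of action a in s
    V = Q.min(axis=0)
policy = Q.argmin(axis=0)          # optimal intervention decision per state
```

State reduction, as the abstract describes, attacks the fact that `P` has 2^n rows for an n-gene network, so this exact sweep over all states quickly becomes infeasible.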

    Reconstructing gene-regulatory networks from time series, knock-out data, and prior knowledge

    BACKGROUND: Cellular processes are controlled by gene-regulatory networks. Several computational methods are currently used to learn the structure of gene-regulatory networks from data. This study focusses on time series gene expression and gene knock-out data in order to identify the underlying network structure. We compare the performance of different network reconstruction methods using synthetic data generated from an ensemble of reference networks. Data requirements as well as optimal experiments for the reconstruction of gene-regulatory networks are investigated. Additionally, the impact of prior knowledge on network reconstruction as well as the effect of unobserved cellular processes is studied. RESULTS: We identify linear Gaussian dynamic Bayesian networks and variable selection based on F-statistics as suitable methods for the reconstruction of gene-regulatory networks from time series data. Commonly used discrete dynamic Bayesian networks perform worse, a result attributable to the inevitable information loss caused by discretizing expression data. It is shown that short time series generated under transcription factor knock-out are optimal experiments for revealing the structure of gene-regulatory networks. Relative to the level of observational noise, we give estimates for the amount of gene expression data required to accurately reconstruct gene-regulatory networks. The benefit of using prior knowledge within a Bayesian learning framework is found to be limited to conditions of small gene expression data size. Unobserved processes, like protein-protein interactions, induce dependencies between gene expression levels similar to direct transcriptional regulation. We show that these dependencies cannot be distinguished from transcription-factor-mediated gene regulation on the basis of gene expression data alone.
CONCLUSION: Currently available data size and data quality make the reconstruction of gene networks from gene expression data a challenge. In this study, we identify an optimal type of experiment, requirements on gene expression data quality and size, and appropriate reconstruction methods for reverse-engineering gene-regulatory networks from time series data.
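The F-statistic-based variable selection mentioned in the results can be sketched as follows (a minimal construction of my own, assuming a simple one-regressor linear fit of each target gene's next-step expression on each candidate regulator's current expression):

```python
import numpy as np

# Synthetic time series: 4 genes, 50 time points; by construction,
# gene 1 drives gene 0 with a one-step lag (assumed toy dynamics).
rng = np.random.default_rng(0)
T, n_genes = 50, 4
X = rng.normal(size=(T, n_genes))
X[1:, 0] = 0.9 * X[:-1, 1] + 0.1 * rng.normal(size=T - 1)

def f_scores(X, target):
    """F-statistic of each candidate regulator for the given target gene."""
    y = X[1:, target]                  # next-step expression of the target
    scores = []
    for j in range(X.shape[1]):
        x = X[:-1, j]                  # current expression of candidate j
        r = np.corrcoef(x, y)[0, 1]
        n = len(y)
        # F-statistic for a single-regressor linear model
        scores.append(r**2 * (n - 2) / (1 - r**2))
    return np.array(scores)

scores = f_scores(X, target=0)
# The true regulator (gene 1) should receive by far the largest score.
```

Thresholding these scores (e.g. against an F-distribution quantile) then selects the putative regulators of each gene; the paper's comparison concerns how well such selection recovers the reference network structure.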

    Inference of a Probabilistic Boolean Network from a Single Observed Temporal Sequence

    The inference of gene regulatory networks is a key issue for genomic signal processing. This paper addresses the inference of probabilistic Boolean networks (PBNs) from observed temporal sequences of network states. Since a PBN is composed of a finite number of Boolean networks, a basic observation is that the characteristics of a single Boolean network without perturbation may be determined by its pairwise transitions. Because the network function is fixed and there are no perturbations, a given state will always be followed by a unique state at the succeeding time point. Thus, a transition counting matrix compiled over a data sequence will be sparse and contain only one nonzero entry per row. If the network also has perturbations, with small perturbation probability, then the transition counting matrix would have some insignificant nonzero entries replacing some (or all) of the zeros. If a data sequence is sufficiently long to adequately populate the matrix, then determination of the functions and inputs underlying the model is straightforward. The difficulty comes when the transition counting matrix consists of data derived from more than one Boolean network. We address the PBN inference procedure in several steps: (1) separate the data sequence into "pure" subsequences corresponding to constituent Boolean networks; (2) given a subsequence, infer a Boolean network; and (3) infer the probabilities of perturbation, the probability of there being a switch between constituent Boolean networks, and the selection probabilities governing which network is to be selected given a switch. Capturing the full dynamic behavior of probabilistic Boolean networks, be they binary or multivalued, will require the use of temporal data, and a great deal of it. This should not be surprising given the complexity of the model and the number of parameters, both transitional and static, that must be estimated. 
In addition to providing an inference algorithm, this paper demonstrates that the data requirement is much smaller if one does not wish to infer the switching, perturbation, and selection probabilities, and that constituent-network connectivity can be discovered with decent accuracy for relatively small time-course sequences.
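The transition counting matrix at the heart of the procedure is straightforward to build; here is a minimal sketch for a hypothetical 2-gene (4-state) observed sequence of my own invention:

```python
import numpy as np

# Observed state sequence from a deterministic Boolean network without
# perturbation (toy data): each state always maps to the same successor.
sequence = [0, 2, 3, 1, 0, 2, 3, 1, 0, 2]
n_states = 4

# Count observed one-step transitions.
C = np.zeros((n_states, n_states), dtype=int)
for s, t in zip(sequence, sequence[1:]):
    C[s, t] += 1

# As described above: without perturbation, every visited state has a
# unique successor, so each row of C holds at most one nonzero entry.
assert all((row > 0).sum() <= 1 for row in C)
```

With rare perturbations, small extra counts would appear off the dominant pattern; with data from several constituent networks mixed together, rows gain multiple comparable entries, which is exactly why the paper first separates the sequence into "pure" subsequences.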

    Optimal Bayesian Transfer Learning for Classification and Regression

    Machine learning methods and algorithms working under the assumption of identically and independently distributed (i.i.d.) data are not applicable when dealing with massive data collected from different sources or by various technologies, where heterogeneity of data is inevitable. In such scenarios, where we are far from simple homogeneous and uni-modal distributions, we should address the data heterogeneity in a smart way in order to take best advantage of data coming from different sources. In this dissertation we study two main sources of data heterogeneity, time and domain. We address time by modeling the dynamics of data, and the domain difference by transfer learning. Gene expression data have been used for many years for phenotype classification, for instance, classification of healthy versus cancerous tissues or classification of various types of diseases. Traditional methods use static gene expression data measured at one time point. We propose to take into account the dynamics of gene interactions through time, which can be governed by gene regulatory networks (GRN), and to design the classifiers using gene expression trajectories instead of static data. Thanks to recent advanced sequencing technologies such as single-cell sequencing, we are now able to look inside a single cell and capture the dynamics of gene expression. As a result, we design optimal classifiers using single-cell gene expression trajectories, whose dynamics are modeled via Boolean networks with perturbation (BNp). We solve this problem using both expectation maximization (EM) and a Bayesian framework, and show the great efficacy of these methods over classification via bulk RNA-Seq data. Transfer learning (TL) has recently attracted significant research attention, as it simultaneously learns from different source domains, which have plenty of labeled data, and transfers the relevant knowledge to the target domain with limited labeled data to improve the prediction performance. 
We study transfer learning from a novel Bayesian viewpoint. Transfer learning applies when we do not have enough data in our target domain to train machine learning algorithms well but have a good amount of data in other relevant source domains. The probability distributions of the source and target domains might be totally different, but they share some knowledge underlying the similar tasks between the domains and are related to each other in some sense. The ultimate goal of transfer learning is to find the amount of relatedness between the domains and then transfer the relevant knowledge to the target domain, which can help improve the classification task in the data-poor target domain. Negative transfer is the most critical issue in transfer learning and happens when the TL algorithm is not able to detect that the source domain is not related to the target domain for a specific task. To address all these issues with a solid theoretical backbone, we propose a novel transfer learning method based on a Bayesian framework, where the source and target domains are related through the joint prior distribution of the model parameters. The modeling of joint prior densities enables a better understanding of the transferability between domains. Using this idea, we propose optimal Bayesian transfer learning (OBTL) for both continuous and count data, as well as optimal Bayesian transfer regression (OBTR), which are able to optimally transfer the relevant knowledge from a data-rich source domain to a data-poor target domain, thereby improving the classification accuracy in the target domain with limited data.
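The joint-prior mechanism can be illustrated with a deliberately simple conjugate-Gaussian toy model (my own construction, not the OBTL derivation): the source and target means share a joint Gaussian prior with correlation rho, so data from a data-rich source domain tightens the posterior on the data-poor target parameter.

```python
import numpy as np

# Toy setup: one scalar mean per domain, known noise sigma; the source
# domain has many samples, the target very few (all values assumed).
theta_true, sigma = 2.0, 1.0
n_s, n_t = 200, 3
rng = np.random.default_rng(1)
xbar = np.array([rng.normal(theta_true, sigma / np.sqrt(n_s)),   # source mean
                 rng.normal(theta_true, sigma / np.sqrt(n_t))])  # target mean

def target_posterior(rho):
    """Posterior mean and variance of the target-domain parameter."""
    prior_cov = np.array([[1.0, rho], [rho, 1.0]])  # joint prior couples domains
    data_prec = np.diag([n_s / sigma**2, n_t / sigma**2])
    post_cov = np.linalg.inv(np.linalg.inv(prior_cov) + data_prec)
    post_mean = post_cov @ data_prec @ xbar
    return post_mean[1], post_cov[1, 1]

# Relatedness (rho > 0) lets source data shrink target uncertainty;
# rho = 0 recovers the no-transfer posterior.
_, var_transfer = target_posterior(0.9)
_, var_no_transfer = target_posterior(0.0)
assert var_transfer < var_no_transfer
```

The same comparison also shows why detecting unrelated domains matters: with rho forced high while the true parameters differ, the source data would pull the target posterior toward the wrong value, which is the negative-transfer failure mode the abstract describes.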
