20 research outputs found

    INFORMATION THEORETIC APPROACHES TOWARDS REGULATORY NETWORK INFERENCE

    In spite of many past efforts, inference or reverse engineering of regulatory networks from microarray data remains an unsolved problem in systems biology. Such regulatory networks play a critical role in cellular function and organization and are of interest in a variety of fields, including the study of disease and ecotoxicology. This dissertation proposes information theoretic methods/algorithms for inferring regulatory networks from microarray data. Most of the algorithms proposed in this dissertation can be implemented on both time series and multifactorial microarray data sets. The work proposed here infers regulatory networks considering the following six factors: (i) computational efficiency to infer genome-scale networks, (ii) incorporation of prior biological knowledge, (iii) choosing the optimal network that minimizes the joint network entropy, (iv) impact of higher order structures (specifically 3-node structures) on network inference, (v) effects of the time sensitivity of regulatory interactions, and (vi) exploiting the benefits of existing/proposed metrics and algorithms for reverse engineering using the concept of consensus of consensus networks. Specifically, this dissertation presents an approach towards incorporating knock-out data sets. The proposed method for incorporating knock-out data sets is flexible so that it can be easily adapted in existing/new approaches. While most information theoretic approaches infer networks based on pair-wise interactions, this dissertation discusses inference methods that score edges from complex structures. A new inference method for building consensus networks based on networks inferred by multiple popular information theoretic approaches is also proposed here. For time-series datasets, new information theoretic metrics were proposed that consider the time-lags of regulatory interactions estimated from microarray datasets.
Finally, based on the scores predicted for each possible edge in the network, a probabilistic minimum description length based approach was proposed to identify the optimal network (minimizing the joint network entropy). Comparative analyses on in-silico and/or real time data sets have shown that the proposed algorithms achieve better inference accuracy and/or higher computational efficiency than other state-of-the-art schemes such as ARACNE, CLR and Relevance Networks. Most of the methods proposed in this dissertation are generalized and can be easily incorporated into new methods/algorithms for network inference.

    A Novel Gene Network Inference Algorithm Using Predictive Minimum Description Length Approach

    Background: Reverse engineering of gene regulatory networks using information theory models has received much attention due to its simplicity, low computational cost, and capability of inferring large networks. One of the major problems with information theory models is determining the threshold that defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of the model length and the data encoding length. A user-specified fine tuning parameter is used as a control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we propose a new inference algorithm that incorporates mutual information (MI), conditional mutual information (CMI) and the predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes, and the PMDL principle attempts to determine the best MI threshold without the need for a user-specified fine tuning parameter. Results: The performance of the proposed algorithm was evaluated using both synthetic time series data sets and a biological time series data set for the yeast Saccharomyces cerevisiae. The benchmark quantities precision and recall were used as performance measures. The results show that the proposed algorithm produced fewer false edges and significantly improved the precision compared with the existing algorithm. For further analysis, the performance of the algorithms was observed over different sizes of data. Conclusions: We have proposed a new algorithm that implements the PMDL principle for inferring gene regulatory networks from time series DNA microarray data and that eliminates the need for a fine tuning parameter.
The evaluation results obtained from both synthetic and actual biological data sets show that the PMDL principle is effective in determining the MI threshold and that the developed algorithm improves the precision of gene regulatory network inference. Based on the sensitivity analysis of all tested cases, an optimal CMI threshold value has been identified. Finally, it was observed that the performance of the algorithms saturates at a certain threshold of data size.
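
The thresholding problem and the precision/recall evaluation mentioned in this abstract can be illustrated with a minimal sketch. It is not the PMDL algorithm: it uses a plain plug-in MI estimate on already-discretized expression profiles and a fixed, hand-picked threshold, which is exactly the user-chosen quantity the abstract says PMDL aims to avoid. The function names (`mutual_information`, `infer_edges`, `precision_recall`) and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Plug-in MI estimate (in bits) between two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # Sum p(a,b) * log2( p(a,b) / (p(a) p(b)) ) over observed pairs only.
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def infer_edges(profiles, threshold):
    """Return undirected edges (g, h), g < h, whose pairwise MI exceeds the threshold."""
    genes = sorted(profiles)
    return {
        (g, h)
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
        if mutual_information(profiles[g], profiles[h]) > threshold
    }


def precision_recall(predicted, truth):
    """Precision and recall of a predicted edge set against a reference edge set."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy, already-discretized profiles (assumed data, for illustration only).
profiles = {
    "A": [0, 0, 1, 1, 0, 0, 1, 1],
    "B": [0, 0, 1, 1, 0, 0, 1, 1],
    "C": [0, 1, 0, 1, 0, 1, 0, 1],
}
edges = infer_edges(profiles, threshold=0.5)
precision, recall = precision_recall(edges, {("A", "B"), ("B", "C")})
```

Raising or lowering `threshold` trades false edges against missed edges, which is why the choice of threshold matters for the reported precision.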

    NRL-Regulated Transcriptome Dynamics of Developing Rod Photoreceptors

    Summary: Gene regulatory networks (GRNs) guiding differentiation of cell types and cell assemblies in the nervous system are poorly understood because of inherent complexities and interdependence of signaling pathways. Here, we report transcriptome dynamics of differentiating rod photoreceptors in the mammalian retina. Given that the transcription factor NRL determines rod cell fate, we performed expression profiling of developing NRL-positive (rods) and NRL-negative (S-cone-like) mouse photoreceptors. We identified a large-scale, sharp transition in the transcriptome landscape between postnatal days 6 and 10 concordant with rod morphogenesis. Rod-specific temporal DNA methylation corroborated gene expression patterns. De novo assembly and alternative splicing analyses revealed previously unannotated rod-enriched transcripts and the role of NRL in transcript maturation. Furthermore, we defined the relationship of NRL with other transcriptional regulators and downstream cognate effectors. Our studies provide the framework for comprehensive system-level analysis of the GRN underlying the development of a single sensory neuron, the rod photoreceptor.

    Multiple Environmental Stressors Induce Complex Transcriptomic Responses Indicative of Phenotypic Outcomes in Western Fence Lizard

    Background: The health and resilience of species in natural environments is increasingly challenged by complex anthropogenic stressor combinations including climate change, habitat encroachment, and chemical contamination. To better understand impacts of these stressors, we examined the individual- and combined-stressor impacts of malaria infection, food limitation, and 2,4,6-trinitrotoluene (TNT) exposures on gene expression in livers of Western fence lizards (WFL, Sceloporus occidentalis) using custom WFL transcriptome-based microarrays. Results: Computational analysis including annotation enrichment and correlation analysis identified putative functional mechanisms linking transcript expression and toxicological phenotypes. TNT exposure increased transcript expression for genes involved in erythropoiesis, potentially in response to TNT-induced anemia and/or methemoglobinemia, and caused dose-specific effects on genes involved in lipid and overall energy metabolism consistent with a hormesis response of growth stimulation at low doses and adverse decreases in lizard growth at high doses. Functional enrichment results were indicative of inhibited potential for lipid mobilization and catabolism in TNT exposures, which corresponded with increased inguinal fat weights and was suggestive of a decreased overall energy budget. Malaria infection elicited enriched expression of multiple immune-related functions likely corresponding to increased white blood cell (WBC) counts. Food limitation alone enriched functions related to cellular energy production and decreased expression of immune responses consistent with a decrease in WBC levels. Conclusions: Despite these findings, the lizards demonstrated immune resilience to malaria infection under food limitation, with transcriptional results indicating a fully competent immune response to malaria, even under bio-energetic constraints.
Interestingly, both TNT and malaria individually increased transcriptional expression of immune-related genes and increased overall WBC concentrations in blood, responses that were retained in the TNT x malaria combined exposure. The results demonstrate complex and sometimes unexpected responses to multiple stressors, where the lizards displayed remarkable resiliency to the stressor combinations investigated.

    Time lagged information theoretic approaches to the reverse engineering of gene regulatory networks

    Background: A number of models and algorithms have been proposed in the past for gene regulatory network (GRN) inference; however, none of them address the effects of the size of time-series microarray expression data in terms of the number of time-points. In this paper, we study this problem by analyzing the behaviour of three algorithms based on information theory and dynamic Bayesian network (DBN) models. These algorithms were implemented on different sizes of data generated by synthetic networks. Experiments show that the inference accuracy of these algorithms reaches a saturation point after a specific data size, brought about by a saturation in the pair-wise mutual information (MI) metric; hence there is a theoretical limit on the inference accuracy of information theory based schemes that depends on the number of time points of microarray data used to infer GRNs. This illustrates that MI might not be the best metric to use for GRN inference algorithms. To circumvent the limitations of the MI metric, we introduce a new method of computing time lags between any pair of genes and present the pair-wise time lagged Mutual Information (TLMI) and time lagged Conditional Mutual Information (TLCMI) metrics. Next, we use these new metrics to propose novel GRN inference schemes that provide higher inference accuracy based on the precision and recall parameters. Results: It was observed that beyond a certain number of time-points (i.e., a specific size) of microarray data, the performance of the algorithms measured in terms of the recall-to-precision ratio saturated due to the saturation in the calculated pair-wise MI metric with increasing data size. The proposed algorithms were compared to existing approaches on four different biological networks. The resulting networks were evaluated based on the benchmark precision and recall metrics, and the results favour our approach.
Conclusions: To alleviate the effects of data size on information theory based GRN inference algorithms, novel time lag based information theoretic approaches to infer gene regulatory networks have been proposed. The results show that the time lags of regulatory effects between any pair of genes play an important role in GRN inference schemes.
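
The idea of a time lagged MI score can be sketched as follows. The abstract does not say how the time lag between two genes is computed, so this sketch makes a loud assumption: it simply tries each candidate lag and keeps the one with the highest plug-in MI between the regulator at time t and the target at time t + lag. The function names and toy series are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Plug-in MI estimate (in bits) between two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def time_lagged_mi(regulator, target, lag):
    """MI between regulator[t] and target[t + lag] over the overlapping time points."""
    if len(regulator) != len(target):
        raise ValueError("series must have the same number of time points")
    if lag < 0 or lag >= len(regulator):
        raise ValueError("lag must satisfy 0 <= lag < number of time points")
    overlap = len(regulator) - lag
    return mutual_information(regulator[:overlap], target[lag:])


def best_lag(regulator, target, max_lag):
    """ASSUMPTION: choose the lag in 0..max_lag that maximizes the lagged MI."""
    scores = {lag: time_lagged_mi(regulator, target, lag) for lag in range(max_lag + 1)}
    lag = max(scores, key=scores.get)
    return lag, scores[lag]


# Toy discretized series: the target copies the regulator two time points later.
regulator = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
target = [0, 0] + regulator[:-2]
lag, score = best_lag(regulator, target, max_lag=3)
```

Note that each extra unit of lag removes one time point from the overlap, so short series leave few samples for the MI estimate at large lags.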

    Poster: Gene Regulatory Network Inference Using Time Lagged Context Likelihood of Relatedness

    In our previous work, we have shown that time lags can be incorporated in information theory based metrics to further improve the efficiency of gene regulatory network inference. In particular, we have studied the mutual information metric where we found that mutual information saturates after a certain data size. We also proposed the time lagged mutual information metric and showed that the accuracy of inference algorithms using time lagged mutual information was better. Scalability of the proposed algorithm was an issue in our previous work. CLR is one of the popular algorithms which can infer very large networks. In this poster, we propose a time lagged version of the CLR algorithm. © 2011 IEEE
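
For readers unfamiliar with CLR, the scoring step can be sketched as below, assuming a symmetric matrix of pairwise MI values is already available. Each MI value is turned into a z-score against the background of each gene's other MI values, negative z-scores are clipped to zero, and the two are combined as the square root of the sum of squares. How the poster combines this with time lags is not stated in the abstract; one would presumably feed time lagged MI values into the same scoring, but that is an assumption. Excluding the diagonal from the background and the name `clr_scores` are also choices made for this sketch.

```python
from math import sqrt
from statistics import mean, pstdev


def clr_scores(mi):
    """CLR-style scores from a symmetric matrix (list of lists) of pairwise MI values."""
    n = len(mi)
    # Background mean and standard deviation per gene, ignoring the diagonal.
    background = []
    for i in range(n):
        row = [mi[i][j] for j in range(n) if j != i]
        background.append((mean(row), pstdev(row)))

    def z(i, j):
        """Clipped z-score of mi[i][j] against gene i's background."""
        mu, sigma = background[i]
        if sigma == 0:
            return 0.0
        return max(0.0, (mi[i][j] - mu) / sigma)

    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            scores[i][j] = scores[j][i] = sqrt(z(i, j) ** 2 + z(j, i) ** 2)
    return scores


# Toy MI matrix: genes 0 and 1 share far more information than any other pair.
mi = [
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]
scores = clr_scores(mi)
```

Because every score is relative to each gene's own background, a moderately high MI value stands out for a gene whose other MI values are low, and is discounted for a gene that has high MI with everything.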

    Genome Scale Inference of Transcriptional Regulatory Networks Using Mutual Information On Complex Interactions

    Inferring the genetic network architecture in cells is of great importance to biologists, as it can lead to an understanding of the cell signaling and metabolic dynamics underlying cellular processes, the onset of diseases, and potential discoveries in drug development. The focus today has shifted to genome scale inference approaches using information theoretic metrics such as mutual information over the gene expression data. In this paper, we propose two classes of inference algorithms using scoring schemes on complex interactions, which are primarily based on information theoretic metrics. The central idea is to go beyond pair-wise interactions and utilize more complex structures between any node (gene or transcription factor) and its possible multiple regulators (only transcription factors). While this increases the network inference complexity over pair-wise interaction based approaches, it achieves much higher accuracy. We restricted the complex interactions considered in this paper to 3-node structures (any node and its two regulators) to keep our schemes scalable to genome-scale inference and yet achieve higher accuracy than other state of the art approaches. Detailed performance analyses based on benchmark precision and recall metrics over the known Escherichia coli transcriptional regulatory network indicated that the accuracy of the proposed algorithms (sCoIn, aCoIn and its variants) is consistently higher than that of popular algorithms such as context likelihood of relatedness (CLR), relevance networks (RN) and GEneNetwork Inference with Ensemble of trees (GENIE3).

    Principles of Genomic Robustness Inspire Fault-Tolerant WSN Topologies: A Network Science Based Case Study

    Wireless sensor networks (WSNs) are frameworks for modern pervasive computing infrastructures, and are often subject to operational difficulties, such as the inability to effectively mitigate signal noise or sensor failure. Natural systems, such as gene regulatory networks (GRNs), participate in similar information transport and are often subject to similar operational disruptions (noise, damage, etc.). Moreover, they self-adapt to maintain system function under adverse conditions. Using a PBN-type model valid in the operational and functional overlap between GRNs and WSNs, we study how attractors in the GRN (the target state of an evolving network) behave under selective gene or sensor failure. For larger networks, attractors are robust, in the sense that gene failures (or selective sensor failures in the WSN) conditionally increase their total number; the distance between initial states and their attractors (interpreted as the end-to-end packet delay) simultaneously decreases. Moreover, the number of attractors is conserved if the receiving sensor returns packets to the transmitting node; however, the distance to the attractors increases under similar conditions and sensor failures. Interpreting network state-transitions as packet transmission scenarios may allow for trade-offs between network topology and attractor robustness to be exploited to design novel fault-tolerant routing protocols, or other damage-mitigation strategies. © 2011 IEEE