
    A temporal precedence based clustering method for gene expression microarray data

    Background: Time-course microarray experiments can produce useful data which can help in understanding the underlying dynamics of the system. Clustering is an important stage in microarray data analysis where the data is grouped together according to certain characteristics. The majority of clustering techniques are based on distance or visual similarity measures which may not be suitable for clustering of temporal microarray data where the sequential nature of time is important. We present a Granger causality based technique to cluster temporal microarray gene expression data, which measures the interdependence between two time-series by statistically testing if one time-series can be used for forecasting the other time-series or not. Results: A gene-association matrix is constructed by testing temporal relationships between pairs of genes using the Granger causality test. The association matrix is further analyzed using a graph-theoretic technique to detect highly connected components representing interesting biological modules. We test our approach on synthesized datasets and real biological datasets obtained for Arabidopsis thaliana. We show the effectiveness of our approach by analyzing the results using the existing biological literature. We also report interesting structural properties of the association network commonly desired in any biological system. Conclusions: Our experiments on synthesized and real microarray datasets show that our approach produces encouraging results. The method is simple in implementation and is statistically traceable at each step. The method can produce sets of functionally related genes which can be further used for reverse-engineering of gene circuits
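
The pairwise test described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`solve`, `rss`, `granger_f`), the lag of 1, and the synthetic series are all assumptions made for illustration. It compares a regression of y on its own past with one that also includes the past of x, and reports the F statistic for the improvement.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def rss(X, y):
    """Residual sum of squares of the least-squares fit y ~ X."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def granger_f(x, y, lag=1):
    """F statistic for the hypothesis 'x Granger-causes y' using `lag` past values."""
    rows = range(lag, len(y))
    target = [y[t] for t in rows]
    # Restricted model: intercept plus y's own past.
    restricted = [[1.0] + [y[t - k] for k in range(1, lag + 1)] for t in rows]
    # Full model: restricted model plus the past of x.
    full = [r + [x[t - k] for k in range(1, lag + 1)] for r, t in zip(restricted, rows)]
    rss_r, rss_f = rss(restricted, target), rss(full, target)
    dof = len(target) - (2 * lag + 1)
    return ((rss_r - rss_f) / lag) / (rss_f / dof)

# Synthetic example (assumed data): y follows x with a one-step delay.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * random.gauss(0, 1) for t in range(1, 200)]
f_xy = granger_f(x, y)  # large: the past of x helps forecast y
f_yx = granger_f(y, x)  # small: the past of y does not help forecast x
```

Thresholding such statistics for every ordered gene pair would give one possible gene-association matrix. The threshold and the subsequent graph analysis are left out here.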

    Inferring causal relations from multivariate time series : a fast method for large-scale gene expression data

    Various multivariate time series analysis techniques have been developed with the aim of inferring causal relations between time series. Previously, these techniques have proved their effectiveness on economic and neurophysiological data, which normally consist of hundreds of samples. However, in their applications to gene regulatory inference, the small sample size of gene expression time series poses an obstacle. In this paper, we describe some of the most commonly used multivariate inference techniques and show the potential challenge related to gene expression analysis. In response, we propose a directed partial correlation (DPC) algorithm as an efficient and effective solution to causal/regulatory relations inference on small sample gene expression data. Comparative evaluations on the existing techniques and the proposed method are presented. To draw reliable conclusions, a comprehensive benchmarking on data sets of various setups is essential. Three experiments are designed to assess these methods in a coherent manner. Detailed analysis of experimental results not only reveals good accuracy of the proposed DPC method in large-scale prediction, but also gives much insight into all methods under evaluation

    Identifying interactions in the time and frequency domains in local and global networks : a Granger causality approach

    Background: Reverse-engineering approaches such as Bayesian network inference, ordinary differential equations (ODEs) and information theory are widely applied to deriving causal relationships among different elements such as genes, proteins, metabolites, neurons, brain areas and so on, based upon multi-dimensional spatial and temporal data. There are several well-established reverse-engineering approaches to explore causal relationships in a dynamic network, such as ordinary differential equations (ODE), Bayesian networks, information theory and Granger Causality. Results: Here we focused on Granger causality both in the time and frequency domain and in local and global networks, and applied our approach to experimental data (genes and proteins). For a small gene network, Granger causality outperformed all the other three approaches mentioned above. A global protein network of 812 proteins was reconstructed, using a novel approach. The obtained results fitted well with known experimental findings and predicted many experimentally testable results. In addition to interactions in the time domain, interactions in the frequency domain were also recovered. Conclusions: The results on the proteomic data and gene data confirm that Granger causality is a simple and accurate approach to recover the network structure. Our approach is general and can be easily applied to other types of temporal data

    From gene-expressions to pathways

    Rapid advancements in experimental techniques have benefited molecular biology in many ways. The experiments once considered impossible due to the lack of resources can now be performed with relative ease in an acceptable time-span; monitoring simultaneous expressions of thousands of genes at a given time point is one of them. Microarray technology is the most popular method in biological sciences to observe the simultaneous expression levels of a large number of genes. The large amount of data produced by a microarray experiment requires considerable computational analysis before some biologically meaningful hypothesis can be drawn. In contrast to a single time-point microarray experiment, the temporal microarray experiments enable us to understand the dynamics of the underlying system. Such information, if properly utilized, can provide vital clues about the structure and functioning of the system under study. This dissertation introduces some new computational techniques to process temporal microarray data. We focus on three broad stages of microarray data analysis - normalization, clustering and inference of gene-regulatory networks. We explain our methods using various synthesized datasets and a real biological dataset, produced in-house, to monitor the leaf senescence process in Arabidopsis thaliana

    Quantitative Analysis of the Effective Functional Structure in Yeast Glycolysis

    Yeast glycolysis is considered the prototype of dissipative biochemical oscillators. In cellular conditions, under a sinusoidal source of glucose, the activity of glycolytic enzymes can display either periodic, quasiperiodic or chaotic behavior. In order to quantify the functional connectivity for the glycolytic enzymes in dissipative conditions we have analyzed different catalytic patterns using the non-linear statistical tool of Transfer Entropy. The data were obtained by means of a yeast glycolytic model formed by three delay differential equations where the enzymatic speed functions of the irreversible stages have been explicitly considered. These enzymatic activity functions were previously modeled and tested experimentally by other groups. In agreement with experimental conditions, the studied time series corresponded to a quasi-periodic route to chaos. The results of the analysis are three-fold: first, in addition to the classical topological structure characterized by the specific location of enzymes, substrates, products and feedback regulatory metabolites, an effective functional structure emerges in the modeled glycolytic system, which is dynamical and characterized by notable variations of the functional interactions. Second, the dynamical structure exhibits a metabolic invariant which constrains the functional attributes of the enzymes. Finally, in accordance with the classical biochemical studies, our numerical analysis reveals in a quantitative manner that the enzyme phosphofructokinase is the key core of the metabolic system, behaving for all conditions as the main source of the effective causal flows in yeast glycolysis.
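
Transfer entropy, the measure named in the abstract above, can be sketched for discrete sequences as follows. This is an illustrative plug-in estimator with a history length of 1, not the procedure used in the paper: the function name `transfer_entropy` and the binary toy data are assumptions, and continuous enzyme activities would first need to be discretized.

```python
from collections import Counter
from math import log2
import random

def transfer_entropy(x, y):
    """Plug-in estimate of TE(X -> Y) in bits for discrete sequences, history length 1."""
    # Each triple is (y_{t+1}, y_t, x_t).
    triples = list(zip(y[1:], y[:-1], x[:-1]))
    n = len(triples)
    c_full = Counter(triples)
    c_past = Counter((yp, xp) for _, yp, xp in triples)   # (y_t, x_t)
    c_self = Counter((yn, yp) for yn, yp, _ in triples)   # (y_{t+1}, y_t)
    c_y = Counter(yp for _, yp, _ in triples)             # y_t
    te = 0.0
    for (yn, yp, xp), c in c_full.items():
        p_joint = c / n                        # p(y_{t+1}, y_t, x_t)
        p_given_both = c / c_past[(yp, xp)]    # p(y_{t+1} | y_t, x_t)
        p_given_self = c_self[(yn, yp)] / c_y[yp]  # p(y_{t+1} | y_t)
        te += p_joint * log2(p_given_both / p_given_self)
    return te

# Toy example (assumed data): y copies x with a one-step delay.
random.seed(1)
x = [random.randint(0, 1) for _ in range(2000)]
y = [0] + x[:-1]
te_xy = transfer_entropy(x, y)  # close to 1 bit: x fully determines the next y
te_yx = transfer_entropy(y, x)  # close to 0: y carries no news about the next x
```

The asymmetry between the two directions is what lets the measure be read as a directed, effective connectivity.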

    Quantitative inference of dynamic regulatory pathways via microarray data

    BACKGROUND: The cellular signaling pathway (network) is one of the main topics of organismic investigations. The intracellular interactions between genes in a signaling pathway are considered as the foundation of functional genomics. Thus, which genes influence each other, and how strongly, through transcriptional binding or physical interactions are essential problems. Under the synchronous measures of gene expression via a microarray chip, an amount of dynamic information is embedded and remains to be discovered. Using a systematically dynamic modeling approach, we explore the causal relationship among genes in cellular signaling pathways from a systems biology perspective. RESULTS: In this study, a second-order dynamic model is developed to describe the regulatory mechanism of a target gene from the upstream causality point of view. From the expression profile and dynamic model of a target gene, we can estimate its upstream regulatory function. According to this upstream regulatory function, we deduce the upstream regulatory genes with their regulatory abilities and activation delays, and then link up a regulatory pathway. Iteratively, these regulatory genes are considered as target genes to trace back their upstream regulatory genes. We can then extend the regulatory pathway (or network) to the genome-wide scale. In short, we can infer the genetic regulatory pathways from gene-expression profiles quantitatively, which can confirm some doubted paths or seek some unknown paths in a regulatory pathway (network). Finally, the proposed approach is validated by randomly reshuffling the time order of microarray data. CONCLUSION: We focus our algorithm on the inference of the regulatory abilities of the identified causal genes and of the delay before they regulate the downstream genes. With this information, a regulatory pathway can be built up using microarray data. In the present study, two signaling pathways, i.e. the circadian regulatory pathway in Arabidopsis thaliana and the metabolic shift pathway from fermentation to respiration in the yeast Saccharomyces cerevisiae, are reconstructed using microarray data to evaluate the performance of our proposed method. In the circadian regulatory pathway, we identified mainly the interactions between the biological clock and the photoperiodic genes consistent with the known regulatory mechanisms. We also discovered the less well-known regulations between cryptochrome and phytochrome. In the metabolic shift pathway, the causal relationship of enzymatic genes could be detected properly

    Modeling and identification of gene regulatory networks: A Granger causality approach

    It is of increasing interest in systems biology to discover gene regulatory networks (GRNs) from time-series genomic data, i.e., to explore the interactions among a large number of genes and gene products over time. Currently, one common approach is based on Granger causality, which models the time-series genomic data as a vector autoregressive (VAR) process and estimates the GRNs from the VAR coefficient matrix. The main challenge for identification of VAR models is the high dimensionality of genes and limited number of time points, which results in statistically inefficient solutions and high computational complexity. Therefore, fast and efficient variable selection techniques are highly desirable. In this paper, an introductory review of identification methods and variable selection techniques for VAR models in learning the GRNs will be presented. Furthermore, a dynamic VAR (DVAR) model, which accounts for dynamic GRNs changing with time during the experimental cycle, and its identification methods are introduced. © 2010 IEEE. The 9th International Conference on Machine Learning and Cybernetics (ICMLC 2010), Qingdao, China, 11-14 July 2010. In Proceedings of the 9th ICMLC, 2010, v. 6, p. 3073-307
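
For orientation, the VAR formulation mentioned in the abstract above can be written out as a short worked form. The notation is assumed here rather than taken from the paper: $x_t \in \mathbb{R}^n$ holds the expression of $n$ genes at time $t$, $p$ is the model order, and $T$ is the number of time points.

```latex
% VAR(p) model of n gene-expression series
x_t = \sum_{k=1}^{p} A_k \, x_{t-k} + \varepsilon_t ,
\qquad A_k \in \mathbb{R}^{n \times n}

% Gene j Granger-causes gene i when some lagged coefficient is nonzero
j \rightarrow i \iff \exists\, k \in \{1,\dots,p\} : (A_k)_{ij} \neq 0

% Least-squares estimate, with Y = [x_{p+1}, \dots, x_T] and
% Z the matrix whose columns stack (x_{t-1}, \dots, x_{t-p})
\hat{A} = [\hat{A}_1, \dots, \hat{A}_p] = Y Z^{\top} (Z Z^{\top})^{-1}
```

The inverse exists only if $Z Z^{\top}$ has full rank, which requires at least $np$ usable time points ($T - p \ge np$). With thousands of genes and a handful of samples this fails, which is why the abstract stresses variable selection.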

    Meta-analysis on gene regulatory networks discovered by pairwise Granger causality

    Identifying regulatory genes partaking in disease development is important to medical advances. Since gene expression data of multiple experiments exist, combining results from multiple gene regulatory network discoveries offers higher sensitivity and specificity. However, data for multiple experiments on the same problem may not possess the same set of genes, and hence many existing combining methods are not applicable. In this paper, we approach this problem using a number of meta-analysis methods and compare their performances. Simulation results show that vote counting is outperformed by methods belonging to the Fisher's chi-square (FCS) family, of which the FCS test is the best. Applying the FCS test to the real human HeLa cell-cycle dataset, degree distributions of the combined network are obtained and compared with previous works. Consulting the BioGRID database reveals the biological relevance of gene regulatory networks discovered using the proposed method.
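
The basic Fisher's chi-square combination referred to in the abstract above can be sketched in a few lines. This shows only the classical test for independent p-values, not the paper's full procedure: the function name `fisher_combine` is an assumption, and the handling of genes missing from some experiments is not covered. Under the null hypothesis, the statistic $-2\sum_i \ln p_i$ follows a chi-square distribution with $2k$ degrees of freedom, whose tail probability has a closed form for even degrees of freedom.

```python
from math import exp, log

def fisher_combine(p_values):
    """Combine k independent p-values with Fisher's method.

    Returns (statistic, combined_p), where the statistic is -2 * sum(ln p_i)
    and combined_p is the upper tail of a chi-square with 2k degrees of freedom.
    """
    k = len(p_values)
    stat = -2.0 * sum(log(p) for p in p_values)
    half = stat / 2.0
    # Closed-form survival function for even degrees of freedom 2k:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return stat, exp(-half) * total

# Two experiments each giving weak evidence (p = 0.05) for the same edge.
stat, combined = fisher_combine([0.05, 0.05])
```

Unlike vote counting, which only tallies how many experiments pass a significance threshold, this combination uses the magnitude of every p-value.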