8 research outputs found

    Motif-directed network component analysis for regulatory network inference

    Background: Network Component Analysis (NCA) has shown its effectiveness in discovering regulators and inferring transcription factor activities (TFAs) when both microarray data and ChIP-on-chip data are available. However, an NCA scheme is not applicable to many biological studies because the available topology information is limited, for example when ChIP-on-chip data are lacking. We propose a new approach, motif-directed NCA (mNCA), to integrate motif information and gene expression data to infer regulatory networks. Results: We develop motif-directed NCA (mNCA) to incorporate motif information into NCA for regulatory network inference. While motif information is readily available from knowledge databases, it is a "noisy" source of network topology information containing many false positives. To overcome this problem, we develop a stability analysis procedure embedded in mNCA to resolve the inconsistency between motif information and gene expression data, and to enable the identification of stable TFAs. The mNCA approach has been applied to a time course microarray data set of muscle regeneration. The experimental results show that the inferred TFAs are not only numerically stable but also biologically relevant to the muscle differentiation process. In particular, several inferred TFAs, such as those of MyoD, myogenin and YY1, are well supported by biological experiments. Conclusion: A novel computational approach, mNCA, has been developed to integrate motif information and gene expression data for regulatory network reconstruction. Specifically, motif analysis is used to obtain the initial network topology, and stability analysis is developed and applied with mNCA to extract stable TFAs. Experimental results on muscle regeneration microarray data have demonstrated that mNCA is a practical and reliable computational method for regulatory network inference and pathway discovery.

    A new optimization algorithm for network component analysis based on convex programming

    Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2009, p. 509-512, Paper no. 2203. Network component analysis (NCA) has been established as a promising tool for reconstructing gene regulatory networks from microarray data. NCA is a method that can resolve the problem of blind source separation when the mixing matrix instead has a known sparse structure despite the correlation among the source signals. The original NCA algorithm relies on alternating least squares (ALS) and suffers from local convergence as well as slow convergence. In this paper, we develop new and more robust NCA algorithms by incorporating additional signal constraints. In particular, we introduce the biologically sound constraints that all nonzero entries in the connectivity network are positive. Our new approach formulates a convex optimization problem which can be solved efficiently and effectively by fast convex programming algorithms. We verify the effectiveness and robustness of our new approach using simulations and gene regulatory network reconstruction from experimental yeast cell cycle microarray data. ©2009 IEEE.
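
    The alternating least squares idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual NCA factorization E ≈ A·P, where E is the gene-by-sample expression matrix, A is the gene-by-TF connectivity matrix restricted to a known sparsity pattern Z, and P holds the TF activities. The function name `nca_als`, its parameters, and the synthetic data are hypothetical and chosen only for illustration; this is the basic ALS scheme, not the convex formulation proposed in the paper.

```python
import numpy as np


def nca_als(E, Z, n_iter=200, seed=0):
    """Alternating least squares for E ~ A @ P with A confined to the pattern Z.

    E : (n_genes, n_samples) expression matrix
    Z : (n_genes, n_tfs) boolean mask, True where a TF may regulate a gene
    """
    rng = np.random.default_rng(seed)
    Z = np.asarray(Z, dtype=bool)
    n_genes = Z.shape[0]
    # Random start that already respects the sparsity pattern.
    A = np.where(Z, rng.standard_normal(Z.shape), 0.0)
    for _ in range(n_iter):
        # P-step: best TF activities for the current connectivity matrix.
        P = np.linalg.lstsq(A, E, rcond=None)[0]
        # A-step: refit each gene using only the TFs allowed by Z.
        for i in range(n_genes):
            idx = np.flatnonzero(Z[i])
            if idx.size:
                A[i, idx] = np.linalg.lstsq(P[idx].T, E[i], rcond=None)[0]
    # Final P-step so that the returned P matches the returned A.
    P = np.linalg.lstsq(A, E, rcond=None)[0]
    return A, P


# Small synthetic example (all values are made up for illustration).
rng = np.random.default_rng(1)
Z_demo = np.array([[1, 0], [0, 1], [1, 1], [1, 0], [0, 1]], dtype=bool)
A_true = np.where(Z_demo, rng.uniform(0.5, 2.0, Z_demo.shape), 0.0)
P_true = rng.standard_normal((2, 8))
E_demo = A_true @ P_true
A_hat, P_hat = nca_als(E_demo, Z_demo)
residual = np.linalg.norm(E_demo - A_hat @ P_hat)
```

    Each step solves an exact least-squares subproblem, so the fitting error never increases, but the result depends on the random start, which is the local-convergence issue the abstract refers to. The recovered A and P are also only determined up to a scaling of each TF.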

    Bioinformatics research in the Asia Pacific: a 2007 update

    We provide a 2007 update on bioinformatics research in the Asia-Pacific from the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation, set up in 1998. From 2002, APBioNet has organized the first International Conference on Bioinformatics (InCoB), bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2007 Conference was organized as the 6th annual conference of the Asia-Pacific Bioinformatics Network, on Aug. 27–30, 2007 in Hong Kong, following a series of successful events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea) and New Delhi (India). Besides the scientific meeting in Hong Kong, the satellite events organized are a pre-conference training workshop in Hanoi, Vietnam and a post-conference workshop in Nansha, China. This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. We have organized the papers into thematic areas, highlighting the growing contribution of research excellence from this region to global bioinformatics endeavours.

    Reconstructing genome-wide regulatory network of E. coli using transcriptome data and predicted transcription factor activities

    Background: Gene regulatory networks play essential roles in living organisms to control growth, keep internal metabolism running and respond to external environmental changes. Understanding the connections and the activity levels of regulators is important for research on gene regulatory networks. While relevance score based algorithms that reconstruct gene regulatory networks from transcriptome data can infer genome-wide gene regulatory networks, they are unfortunately prone to false positive results. Transcription factor activities (TFAs) quantitatively reflect the ability of the transcription factor to regulate target genes. However, classic relevance score based gene regulatory network reconstruction algorithms use models that do not include the TFA layer, thus missing a key regulatory element. Results: This work integrates TFA prediction algorithms with relevance score based network reconstruction algorithms to reconstruct gene regulatory networks with improved accuracy over classic relevance score based algorithms. This method is called Gene expression and Transcription factor activity based Relevance Network (GTRNetwork). Different combinations of TFA prediction algorithms and relevance score functions have been applied to find the most efficient combination. When the integrated GTRNetwork method was applied to E. coli data, the reconstructed genome-wide gene regulatory network predicted 381 new regulatory links. This reconstructed gene regulatory network, including the predicted new regulatory links, shows promising biological significance. Many of the new links are verified by known TF binding site information, and many other links can be verified from the literature and databases such as EcoCyc. The reconstructed gene regulatory network is applied to a recent transcriptome analysis of E. coli during isobutanol stress. In addition to the 16 significantly changed TFAs detected in the original paper, another 7 significantly changed TFAs have been detected by using our reconstructed network. Conclusions: The GTRNetwork algorithm introduces the hidden TFA layer into classic relevance score based gene regulatory network reconstruction processes. Integrating the TFA biological information with regulatory network reconstruction algorithms significantly improves the detection of new links and reduces the rate of false positives. The application of GTRNetwork to E. coli transcriptome data gives a set of potential regulatory links with promising biological significance for isobutanol stress and other conditions.
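
    A relevance score computed on the TFA layer can be sketched as follows. The abstract does not say which score functions were used, so this example assumes one simple choice, the absolute Pearson correlation between each TF's activity profile and each gene's expression profile, followed by a threshold. The function name `relevance_links`, the threshold value, and the toy data are hypothetical and are not taken from GTRNetwork.

```python
import numpy as np


def relevance_links(tfa, expr, threshold=0.8):
    """Score TF-gene pairs by |Pearson r| between TFA and expression profiles.

    tfa  : (n_tfs, n_samples) inferred transcription factor activities
    expr : (n_genes, n_samples) gene expression values
    Assumes no profile is constant (a constant row would divide by zero).
    """
    # Center each profile, then scale it to unit length.
    tfa_c = tfa - tfa.mean(axis=1, keepdims=True)
    expr_c = expr - expr.mean(axis=1, keepdims=True)
    tfa_n = tfa_c / np.linalg.norm(tfa_c, axis=1, keepdims=True)
    expr_n = expr_c / np.linalg.norm(expr_c, axis=1, keepdims=True)
    # Dot products of unit vectors are Pearson correlations.
    scores = np.abs(tfa_n @ expr_n.T)  # shape (n_tfs, n_genes)
    links = [(int(t), int(g)) for t, g in np.argwhere(scores >= threshold)]
    return scores, links


# Toy data (made up): TF 0 tracks gene 0, TF 1 tracks gene 1.
tfa_demo = np.array([[0.0, 1.0, 2.0, 3.0],
                     [1.0, -1.0, 1.0, -1.0]])
expr_demo = np.array([[0.0, 2.0, 4.0, 6.0],
                      [3.0, 1.0, 3.0, 1.0],
                      [1.0, 0.0, 0.0, 1.0]])
scores_demo, links_demo = relevance_links(tfa_demo, expr_demo)
```

    Scoring against TFA profiles rather than against the expression of the TF-encoding gene is the part that corresponds to the hidden TFA layer described above; the correlation measure itself is interchangeable.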

    Efficient and Robust Algorithms for Statistical Inference in Gene Regulatory Networks

    Inferring gene regulatory networks (GRNs) is of profound importance in the field of computational biology and bioinformatics. Understanding the gene-gene and gene-transcription factor (TF) interactions has the potential to provide insight into the complex biological processes taking place in cells. High-throughput genomic and proteomic technologies have enabled the collection of large amounts of data in order to quantify gene expression and map DNA-protein interactions. This dissertation investigates the problem of network component analysis (NCA), which estimates the transcription factor activities (TFAs) and gene-TF interactions by making use of gene expression and ChIP-chip data. Closed-form solutions are provided for estimation of the TF-gene connectivity matrix, which yields an advantage over the existing state-of-the-art methods in terms of lower computational complexity and higher consistency. We present an iterative reweighted ℓ2 norm based algorithm to infer the network connectivity when the prior knowledge about the connections is incomplete. We present an NCA algorithm which has the ability to counteract the presence of outliers in the gene expression data and is therefore more robust. Closed-form solutions are derived for the estimation of TFAs and TF-gene interactions, and the resulting algorithm is comparable to the fastest algorithms proposed so far, with the additional advantages of robustness to outliers and higher reliability in the TFA estimation. Finally, we look at the inference of gene regulatory networks, which essentially reduces to the estimation of only the gene-gene interactions. Gene networks are known to be sparse, and therefore an inference algorithm is proposed which imposes a sparsity constraint while estimating the connectivity matrix. The online estimation lowers the computational complexity and provides superior performance in terms of accuracy and scalability. This dissertation presents gene regulatory network inference algorithms which provide computationally efficient solutions in some very crucial scenarios, give an advantage over the existing algorithms, and therefore provide a means to better understand the underlying cellular network. Hence, it serves as a building block in the accurate estimation of gene regulatory networks, which will pave the way for finding cures to genetic diseases.

    TFA inference: Using mathematical modeling of gene expression data to infer the activity of transcription factors

    Transcription factors (TFs) are a set of proteins that play a key role in the information processing system that enables a cell to respond to changes in internal and external state. By binding near a gene in a cell’s DNA, a TF can influence that gene’s expression level, triggering the appropriate increase or decrease in production levels of proteins that are needed to handle stressors like a change in nutrient availability or damage to the cell’s internal structures. Transcription factor activity (TFA) is a measure of how much effect a TF has on its target genes in a given sample of cells. TFA depends on several factors including expression of the gene that encodes the TF, the TF’s access to genes, and how much of the TF protein has the modifications needed to activate it. Because there are so many molecular factors influencing TF activity, there is no one assay that can measure TFA directly. In this dissertation, we build on previous work in TFA inference that uses the measurable output of cell signaling pathways – gene expression levels – to infer TFA values and to utilize these inferred values to better understand the roles of individual TFs within gene regulatory systems. First, we applied TFA inference to microarray data on the well-studied Saccharomyces cerevisiae (baker’s yeast) in order to define systematic, objective accuracy metrics. With these metrics, we explore the robustness of TFA inference to changes in the studied organism, the type of data input, and the optimization approach. Finally, we optimize the TFA inference algorithm to study RNA-seq data from a pathogenic yeast, Cryptococcus neoformans, to analyze the signaling pathway involved in its capsule formation response to environmental stress, a major factor of its virulence in humans.

    Understand biological regulatory systems using computational models: Reconstruction, Analysis and Integration

    Biological regulatory systems are complex and involve many types of interactions, including transcriptional regulation, protein interactions, and metabolic reactions, to ensure the regulation of biological organisms. These regulations form complex networks and play important roles in enabling living organisms to adapt to the environment, control the rate of growth, and develop different phenotypes according to their life cycle and the surrounding environment. Many of the mechanisms and interactions of these networks are still not clear. Although a better understanding of the regulatory systems is very important for biological research and engineering, systematically reconstructing, analyzing and integrating the complex regulatory systems remains challenging. First, a novel method to reconstruct gene regulatory networks (GRNs) was developed, implemented, tested, and applied to experimental data. This method introduced a hidden transcription factor activity (TFA) layer to the conventional GRN reconstruction methods. The testing results showed significantly improved network reconstruction precision and recall compared to conventional methods. The application to E. coli transcriptome experimental data demonstrated the potential biological significance of the reconstructed network. Next, a three-level framework was developed to analyze TFAs and GRNs under different experimental conditions. The first level analyzes TFA patterns of individual transcription factors. The second level uses enrichment tests and summarizes TFA behaviors by groups and their properties. The third level identifies key TFs of each experimental condition using a network based analysis approach on the effective regulatory network (ERN), a newly proposed differential regulatory network model between experimental conditions. This analysis framework extends traditional transcriptome data analysis to the TFA and GRN level. The application to E. coli data showed that analyzing transcriptome data at the TFA and GRN level is biologically meaningful and useful. Finally, a comprehensive regulation-focused system model for E. coli was constructed by integrating transcriptional regulatory networks, protein interaction networks, metabolic reaction networks, and all other related regulations. Statistical tests and network property analysis of this constructed network revealed the connection between biological functions and the special network properties of the constructed network. Simulations of the regulatory signal response of this constructed network verified its biological meaningfulness.