3,241 research outputs found

    How to identify essential genes from molecular networks?

    Background: The prediction of essential genes from molecular networks is a way to test the understanding of essentiality in the context of what is known about the network. However, current knowledge of molecular network structures is still incomplete, and consequently the strategies aimed at predicting essential genes are prone to uncertain predictions. We propose that simultaneously evaluating different network structures and different algorithms representing gene essentiality (centrality measures) may identify essential genes in networks in a reliable fashion. Results: By simultaneously analyzing 16 different centrality measures on 18 different reconstructed metabolic networks for Saccharomyces cerevisiae, we show that no single centrality measure identifies essential genes from these networks in a statistically significant way; however, the combination of at least 2 centrality measures achieves a reliable prediction of most but not all of the essential genes. No improvement is achieved in the prediction of essential genes when 3 or 4 centrality measures are combined. Conclusion: The method reported here describes a reliable procedure to predict essential genes from molecular networks. Our results show that essential genes may be predicted only by combining centrality measures, revealing the complex nature of the function of essential genes
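The idea of combining centrality measures can be illustrated with a minimal sketch. The abstract does not say which measures were used or how they were combined, so everything here is an assumption made for illustration: the two measures (degree and closeness), the combination rule (intersection of the top-k nodes under each measure), the toy network, and the gene names `g1` to `g5`.

```python
from collections import deque


def degree_centrality(graph):
    """Number of direct neighbours of each node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def closeness_centrality(graph):
    """Reachable nodes divided by the sum of shortest-path distances (BFS)."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def top_k(scores, k):
    """The k highest-scoring nodes; ties are broken by node name."""
    ranked = sorted(scores, key=lambda n: (-scores[n], n))
    return set(ranked[:k])


def combined_candidates(graph, k):
    """Assumed combination rule: keep nodes ranked in the top k by BOTH measures."""
    by_degree = top_k(degree_centrality(graph), k)
    by_closeness = top_k(closeness_centrality(graph), k)
    return by_degree & by_closeness


# Hypothetical undirected toy network (adjacency sets), not taken from the paper.
toy_network = {
    "g1": {"g2", "g3", "g4"},
    "g2": {"g1", "g3"},
    "g3": {"g1", "g2", "g5"},
    "g4": {"g1"},
    "g5": {"g3"},
}

candidates = combined_candidates(toy_network, k=2)
print(sorted(candidates))
```

Requiring agreement between two measures is only one possible way to combine them; rank averaging or a trained classifier over several centrality scores would be equally plausible readings of the abstract.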

    Automation of gene assignments to metabolic pathways using high-throughput expression data

    BACKGROUND: Accurate assignment of genes to pathways is essential in order to understand the functional role of genes and to map the existing pathways in a given genome. Existing algorithms predict pathways by extrapolating experimental data in one organism to other organisms for which such data are not available. However, current systems assign all genes that belong to a specific EC family to all the pathways that contain the corresponding enzymatic reaction, and thus introduce ambiguity. RESULTS: Here we describe an algorithm for assignment of genes to cellular pathways that addresses this problem by selectively assigning specific genes to pathways. Our algorithm uses the set of experimentally elucidated metabolic pathways from MetaCyc, together with statistical models of enzyme families and expression data, to assign genes to enzyme families and pathways by optimizing correlated co-expression while minimizing conflicts due to shared assignments among pathways. Our algorithm also identifies alternative ("backup") genes and addresses the multi-domain nature of proteins. We apply our model to assign genes to pathways in the yeast genome and compare the results with those for genes that were assigned experimentally. Our assignments are consistent with the experimentally verified assignments and reflect characteristic properties of cellular pathways. CONCLUSION: We present an algorithm for automatic assignment of genes to metabolic pathways. The algorithm utilizes expression data and reduces the ambiguity that characterizes assignments that are based only on EC numbers

    Functional Partitioning of Yeast Co-Expression Networks after Genome Duplication

    Several species of yeast, including the baker's yeast Saccharomyces cerevisiae, underwent a genome duplication roughly 100 million years ago. We analyze genetic networks whose members were involved in this duplication. Many networks show detectable redundancy and strong asymmetry in their interactions. For networks of co-expressed genes, we find evidence for network partitioning whereby the paralogs appear to have formed two relatively independent subnetworks from the ancestral network. We simulate the degeneration of networks after duplication and find that a model wherein the rate of interaction loss depends on the “neighborliness” of the interacting genes produces networks with parameters similar to those seen in the real partitioned networks. We propose that the rationalization of network structure through the loss of pair-wise gene interactions after genome duplication provides a mechanism for the creation of semi-independent daughter networks through the division of ancestral functions between these daughter networks

    Stability of Metabolic Correlations under Changing Environmental Conditions in Escherichia coli – A Systems Approach

    Background: Biological systems adapt to changing environments by reorganizing their cellular and physiological program, with metabolites representing one important response level. Different stresses lead to both conserved and specific responses on the metabolite level, which should be reflected in the underlying metabolic network. Methodology/Principal Findings: Starting from experimental data obtained by a GC-MS based high-throughput metabolic profiling technology, we here develop an approach that: (1) extracts network representations from metabolic condition-dependent data by using pairwise correlations, (2) determines the sets of stable and condition-dependent correlations based on a combination of statistical significance and homogeneity tests, and (3) can identify metabolites related to the stress response, which goes beyond simple observations about the changes of metabolic concentrations. The approach was tested with Escherichia coli as a model organism observed under four different environmental stress conditions (cold stress, heat stress, oxidative stress, lactose diauxie) and control unperturbed conditions. By constructing the stable network component, which displays a scale-free topology and small-world characteristics, we demonstrated that: (1) metabolite hubs in the reconstructed correlation networks are significantly enriched for those contained in biochemical networks such as EcoCyc, (2) particular components of the stable network are enriched for functionally related biochemical pathways, and (3) independently of the response scale, based on their importance in the reorganization of the correlation network, a set of metabolites can be identified which represent hypothetical candidates for adjusting to a stress-specific response. Conclusions/Significance: Network-based tools allowed the identification of stress-dependent and general metabolic correlation networks. This correlation-network-based approach does not rely on major changes in concentration to identify metabolites important for stress adaptation, but rather on the changes in network properties with respect to metabolites. This should represent a useful complementary technique in addition to more classical approaches
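Steps (1) and (2) of this approach can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' procedure: a fixed threshold on the absolute Pearson correlation stands in for the statistical significance and homogeneity tests described in the abstract, "stable" is taken to mean "present in every condition", and the metabolite names and concentration values are invented.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def correlation_edges(profiles, threshold):
    """Edges between metabolite pairs whose |r| reaches the (assumed) threshold."""
    edges = set()
    for a, b in combinations(sorted(profiles), 2):
        if abs(pearson(profiles[a], profiles[b])) >= threshold:
            edges.add((a, b))
    return edges


def split_edges(conditions, threshold):
    """Return (stable, condition_dependent) edge sets across all conditions."""
    edge_sets = [correlation_edges(p, threshold) for p in conditions.values()]
    stable = set.intersection(*edge_sets)
    condition_dependent = set.union(*edge_sets) - stable
    return stable, condition_dependent


# Invented concentration profiles: condition -> metabolite -> measurements.
conditions = {
    "control": {
        "met_A": [1.0, 2.0, 3.0, 4.0],
        "met_B": [2.0, 4.0, 6.0, 8.5],
        "met_C": [4.0, 1.0, 3.0, 2.0],
    },
    "heat": {
        "met_A": [1.0, 2.0, 3.0, 4.0],
        "met_B": [1.0, 3.0, 5.0, 7.0],
        "met_C": [1.0, 2.0, 3.0, 4.2],
    },
}

stable, condition_dependent = split_edges(conditions, threshold=0.9)
print("stable:", sorted(stable))
print("condition-dependent:", sorted(condition_dependent))
```

In this toy data the `met_A`/`met_B` correlation holds in both conditions, while the correlations involving `met_C` appear only under the hypothetical heat condition, which is the kind of distinction the abstract draws between the stable and the condition-dependent network components.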

    Yeast Biological Networks Unfold the Interplay of Antioxidants, Genome and Phenotype, and Reveal a Novel Regulator of the Oxidative Stress Response

    Background: Identifying causative biological networks associated with relevant phenotypes is essential in the field of systems biology. We used ferulic acid (FA) as a model antioxidant to characterize the global expression programs triggered by this small molecule and decipher the transcriptional network controlling the phenotypic adaptation of the yeast Saccharomyces cerevisiae. Methodology/Principal Findings: By employing a strict cutoff value during gene expression data analysis, 106 genes were found to be involved in the cell response to FA, independent of aerobic or anaerobic conditions. Network analysis of the system guided us to a key target node, the FMP43 protein, that when deleted resulted in marked acceleration of cellular growth (~15% in both minimal and rich media). To extend our findings to human cells and identify proteins that could serve as drug targets, we replaced the yeast FMP43 protein with its human ortholog BRP44 in the genetic background of the yeast strain Δfmp43. The conservation of the two proteins was phenotypically evident, with BRP44 restoring the normal specific growth rate of the wild type. We also applied homology modeling to predict the 3D structure of the FMP43 and BRP44 proteins. The binding sites in the homology models of FMP43 and BRP44 were computationally predicted, and further docking studies were performed using FA as the ligand. The docking studies demonstrated the affinity of FA towards both FMP43 and BRP44. Conclusions: This study proposes a hypothesis on the mechanisms yeast employs to respond to antioxidant molecules, while demonstrating how phenome and metabolome yeast data can serve as biomarkers for nutraceutical discovery and development. Additionally, we provide evidence for a putative therapeutic target, revealed by replacing the FMP43 protein with its human ortholog BRP44, a brain protein, and functionally characterizing the relevant mutant strain

    Network-based analysis of gene expression data

    The methods of molecular biology for the quantitative measurement of gene expression have undergone rapid development in the past two decades. High-throughput assays with microarray and RNA-seq technology now enable whole-genome studies in which several thousands of genes can be measured at a time. However, this has also imposed serious challenges on data storage and analysis, which are the subject of the young but rapidly developing field of computational biology. Explaining observations made on such a large scale requires suitable and accordingly scaled models of gene regulation. Detailed models, as available for single genes, need to be extended and assembled into larger networks of regulatory interactions between genes and gene products. Incorporation of such networks into methods for data analysis is crucial to identify molecular mechanisms that are drivers of the observed expression. As methods for this purpose emerge in parallel to each other and without knowing the standard of truth, results need to be critically checked in a competitive setup and in the context of the available rich literature corpus. This work is centered on and contributes to the following subjects, each of which represents an important and distinct research topic in the field of computational biology: (i) construction of realistic gene regulatory network models; (ii) detection of subnetworks that are significantly altered in the data under investigation; and (iii) systematic biological interpretation of detected subnetworks. For the construction of regulatory networks, I review existing methods with a focus on curation and inference approaches. I first describe how literature curation can be used to construct a regulatory network for a specific process, using the well-studied diauxic shift in yeast as an example. In particular, I address the question of how a detailed understanding, as available for the regulation of single genes, can be scaled up to the level of larger systems.
    I subsequently inspect methods for large-scale network inference, showing that they are significantly skewed towards master regulators. A recalibration strategy is introduced and applied, yielding an improved genome-wide regulatory network for yeast. To detect significantly altered subnetworks, I introduce GGEA as a method for network-based enrichment analysis. The key idea is to score regulatory interactions within functional gene sets for consistency with the observed expression. Compared to other recently published methods, GGEA yields results that consistently and coherently align expression changes with known regulation types and that are thus easier to explain. I also suggest and discuss several significant enhancements to the original method that improve its applicability, outcome and runtime. For the systematic detection and interpretation of subnetworks, I have developed the EnrichmentBrowser software package. It implements several state-of-the-art methods besides GGEA, and allows users to combine and explore results across methods. As part of the Bioconductor repository, the package provides unified access to the different methods and thus greatly simplifies usage for biologists. Extensions to this framework that support the automation of biological interpretation routines are also presented. In conclusion, this work contributes substantially to the research field of network-based analysis of gene expression data with respect to regulatory network construction, subnetwork detection, and their biological interpretation. This also includes recent developments as well as areas of ongoing research, which are discussed in the context of current and future questions arising from the new generation of genomic data