Identifying metabolic enzymes with multiple types of association evidence
BACKGROUND: Existing large-scale metabolic models of sequenced organisms commonly include enzymatic functions which cannot be attributed to any gene in that organism. Existing computational strategies for identifying such missing genes rely primarily on sequence homology to known enzyme-encoding genes. RESULTS: We present a novel method for identifying genes encoding a specific metabolic function based on the local structure of the metabolic network and multiple types of functional association evidence, including clustering of genes on the chromosome, similarity of phylogenetic profiles, gene expression, protein fusion events and others. Using E. coli and S. cerevisiae metabolic networks, we illustrate the predictive ability of each individual type of association evidence and show that significantly better predictions can be obtained based on the combination of all data. In this way our method is able to predict 60% of enzyme-encoding genes of E. coli metabolism within the top 10 (out of 3551) candidates for their enzymatic function, and as the top candidate in 43% of the cases. CONCLUSION: We illustrate that a combination of genome context and other functional association evidence is effective in predicting genes encoding metabolic enzymes. Our approach does not rely on direct sequence homology to known enzyme-encoding genes, and can be used in conjunction with traditional homology-based metabolic reconstruction methods. The method can also be used to target orphan metabolic activities.
How to identify essential genes from molecular networks?
BACKGROUND: The prediction of essential genes from molecular networks is a way to test the understanding of essentiality in the context of what is known about the network. However, current knowledge of molecular network structures is still incomplete, and consequently strategies for predicting essential genes are prone to uncertain predictions. We propose that simultaneously evaluating different network structures and different algorithms representing gene essentiality (centrality measures) may identify essential genes in networks in a reliable fashion. RESULTS: By simultaneously analyzing 16 different centrality measures on 18 different reconstructed metabolic networks for Saccharomyces cerevisiae, we show that no single centrality measure identifies essential genes from these networks in a statistically significant way; however, the combination of at least 2 centrality measures achieves a reliable prediction of most but not all of the essential genes. No improvement is achieved in the prediction of essential genes when 3 or 4 centrality measures are combined. CONCLUSION: The method reported here describes a reliable procedure to predict essential genes from molecular networks. Our results show that essential genes may be predicted only by combining centrality measures, revealing the complex nature of the function of essential genes.
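As a rough illustration of what combining centrality measures can look like, the minimal sketch below ranks the nodes of a toy undirected graph by degree and by closeness centrality and keeps only the nodes that appear in the top k of both rankings. The toy graph, the choice of these two measures, the intersection rule and all function names are assumptions made for illustration; the abstract does not state which measures or which combination rule were used.

```python
from collections import deque


def degree_centrality(graph):
    """Number of neighbours of each node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def closeness_centrality(graph):
    """(Number of reachable nodes) / (sum of shortest-path distances), via BFS."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def top_k(scores, k):
    """The k highest-scoring nodes; ties are broken by node name."""
    ranked = sorted(scores, key=lambda node: (-scores[node], node))
    return set(ranked[:k])


def combined_candidates(graph, k):
    """Nodes ranked in the top k by BOTH measures (assumed combination rule)."""
    by_degree = top_k(degree_centrality(graph), k)
    by_closeness = top_k(closeness_centrality(graph), k)
    return by_degree & by_closeness


# Hypothetical toy network (adjacency sets, undirected).
TOY_GRAPH = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B", "E"},
    "D": {"A"},
    "E": {"C"},
}

candidates = combined_candidates(TOY_GRAPH, k=2)
```

Requiring agreement between rankings is only one possible rule; a rank sum or a vote across more measures would fit the same skeleton.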
Automation of gene assignments to metabolic pathways using high-throughput expression data
BACKGROUND: Accurate assignment of genes to pathways is essential in order to understand the functional role of genes and to map the existing pathways in a given genome. Existing algorithms predict pathways by extrapolating experimental data in one organism to other organisms for which this data is not available. However, current systems assign all genes that belong to a specific EC family to all the pathways that contain the corresponding enzymatic reaction, and thus introduce ambiguity. RESULTS: Here we describe an algorithm for assignment of genes to cellular pathways that addresses this problem by selectively assigning specific genes to pathways. Our algorithm uses the set of experimentally elucidated metabolic pathways from MetaCyc, together with statistical models of enzyme families and expression data to assign genes to enzyme families and pathways by optimizing correlated co-expression, while minimizing conflicts due to shared assignments among pathways. Our algorithm also identifies alternative ("backup") genes and addresses the multi-domain nature of proteins. We apply our model to assign genes to pathways in the yeast genome and compare the results for genes that were assigned experimentally. Our assignments are consistent with the experimentally verified assignments and reflect characteristic properties of cellular pathways. CONCLUSION: We present an algorithm for automatic assignment of genes to metabolic pathways. The algorithm utilizes expression data and reduces the ambiguity that characterizes assignments that are based only on EC numbers.
Probabilistic methods in the analysis of protein interaction networks
Functional Partitioning of Yeast Co-Expression Networks after Genome Duplication
Several species of yeast, including the baker's yeast Saccharomyces cerevisiae, underwent a genome duplication roughly 100 million years ago. We analyze genetic networks whose members were involved in this duplication. Many networks show detectable redundancy and strong asymmetry in their interactions. For networks of co-expressed genes, we find evidence for network partitioning whereby the paralogs appear to have formed two relatively independent subnetworks from the ancestral network. We simulate the degeneration of networks after duplication and find that a model wherein the rate of interaction loss depends on the “neighborliness” of the interacting genes produces networks with parameters similar to those seen in the real partitioned networks. We propose that the rationalization of network structure through the loss of pair-wise gene interactions after genome duplication provides a mechanism for the creation of semi-independent daughter networks through the division of ancestral functions between these daughter networks
Stability of Metabolic Correlations under Changing Environmental Conditions in Escherichia coli – A Systems Approach
Background: Biological systems adapt to changing environments by reorganizing their cellular and physiological program with metabolites representing one important response level. Different stresses lead to both conserved and specific responses on the metabolite level which should be reflected in the underlying metabolic network. Methodology/Principal Findings: Starting from experimental data obtained by a GC-MS-based high-throughput metabolic profiling technology we here develop an approach that: (1) extracts network representations from metabolic condition-dependent data by using pairwise correlations, (2) determines the sets of stable and condition-dependent correlations based on a combination of statistical significance and homogeneity tests, and (3) can identify metabolites related to the stress response, which goes beyond simple observations about the changes of metabolic concentrations. The approach was tested with Escherichia coli as a model organism observed under four different environmental stress conditions (cold stress, heat stress, oxidative stress, lactose diauxie) and control unperturbed conditions. By constructing the stable network component, which displays a scale-free topology and small-world characteristics, we demonstrated that: (1) metabolite hubs in these reconstructed correlation networks are significantly enriched for those contained in biochemical networks such as EcoCyc, (2) particular components of the stable network are enriched for functionally related biochemical pathways, and (3) independently of the response scale, based on their importance in the reorganization of the correlation network a set of metabolites can be identified which represent hypothetical candidates for adjusting to a stress-specific response. Conclusions/Significance: Network-based tools allowed the identification of stress-dependent and general metabolic correlation networks.
This correlation-network-based approach does not rely on major changes in concentration to identify metabolites important for stress adaptation, but rather on the changes in network properties with respect to metabolites. This should represent a useful complementary technique in addition to more classical approaches
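The first two steps of the approach described above can be sketched roughly as follows. The sketch builds one correlation network per condition from pairwise Pearson correlations and then keeps the edges that are present with the same sign in every condition. The fixed correlation threshold is a crude stand-in, assumed here for brevity, for the statistical significance and homogeneity tests mentioned in the abstract; the metabolite names, measurements and function names are likewise hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def correlation_edges(profiles, threshold):
    """Map each metabolite pair with |r| >= threshold to the sign of r."""
    edges = {}
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if abs(r) >= threshold:
            edges[(a, b)] = 1 if r > 0 else -1
    return edges


def stable_edges(conditions, threshold):
    """Edges found in every condition with a consistent sign (assumed criterion)."""
    per_condition = [correlation_edges(p, threshold) for p in conditions.values()]
    shared = set(per_condition[0])
    for edges in per_condition[1:]:
        shared &= set(edges)
    return {e for e in shared if len({edges[e] for edges in per_condition}) == 1}


# Hypothetical measurements: condition -> metabolite -> values across samples.
TOY_CONDITIONS = {
    "control": {"glc": [1, 2, 3, 4], "pyr": [2, 4, 6, 8], "lac": [4, 3, 2, 1]},
    "heat": {"glc": [2, 3, 5, 7], "pyr": [1, 2, 4, 6], "lac": [1, 5, 2, 6]},
}

stable = stable_edges(TOY_CONDITIONS, threshold=0.9)
```

Edges that pass the threshold in only some conditions would be the condition-dependent counterpart; they are simply the per-condition edge sets minus the stable set.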
Yeast Biological Networks Unfold the Interplay of Antioxidants, Genome and Phenotype, and Reveal a Novel Regulator of the Oxidative Stress Response
Background
Identifying causative biological networks associated with relevant phenotypes is essential in the field of systems biology. We used ferulic acid (FA) as a model antioxidant to characterize the global expression programs triggered by this small molecule and decipher the transcriptional network controlling the phenotypic adaptation of the yeast Saccharomyces cerevisiae.
Methodology/Principal Findings
By employing a strict cutoff value during gene expression data analysis, 106 genes were found to be involved in the cell response to FA, independent of aerobic or anaerobic conditions. Network analysis of the system guided us to a key target node, the FMP43 protein, that when deleted resulted in marked acceleration of cellular growth (~15% in both minimal and rich media). To extend our findings to human cells and identify proteins that could serve as drug targets, we replaced the yeast FMP43 protein with its human ortholog BRP44 in the genetic background of the yeast strain Δfmp43. The conservation of the two proteins was phenotypically evident, with BRP44 restoring the normal specific growth rate of the wild type. We also applied homology modeling to predict the 3D structure of the FMP43 and BRP44 proteins. The binding sites in the homology models of FMP43 and BRP44 were computationally predicted, and further docking studies were performed using FA as the ligand. The docking studies demonstrated the affinity of FA towards both FMP43 and BRP44.
Conclusions
This study proposes a hypothesis on the mechanisms yeast employs to respond to antioxidant molecules, while demonstrating how phenome and metabolome yeast data can serve as biomarkers for nutraceutical discovery and development. Additionally, we provide evidence for a putative therapeutic target, revealed by replacing the FMP43 protein with its human ortholog BRP44, a brain protein, and functionally characterizing the relevant mutant strain
Network-based analysis of gene expression data
The methods of molecular biology for the quantitative measurement of gene expression have undergone rapid development in the past two decades. High-throughput assays based on microarray and RNA-seq technology now enable whole-genome studies in which several thousand genes can be measured at a time. However, this has also imposed serious challenges on data storage and analysis, which are the subject of the young but rapidly developing field of computational biology.
Explaining observations made on such a large scale requires suitable and accordingly scaled models of gene regulation. Detailed models, as available for single genes, need to be extended and assembled into larger networks of regulatory interactions between genes and gene products. Incorporating such networks into methods for data analysis is crucial for identifying the molecular mechanisms that drive the observed expression. As methods for this purpose emerge in parallel and without a known ground truth, results need to be critically checked in a competitive setup and in the context of the rich literature corpus available.
This work is centered on and contributes to the following subjects, each of which represents an important and distinct research topic in the field of computational biology: (i) construction of realistic gene regulatory network models; (ii) detection of subnetworks that are significantly altered in the data under investigation; and (iii) systematic biological interpretation of detected subnetworks.
For the construction of regulatory networks, I review existing methods with a focus on curation and inference approaches. I first describe how literature curation can be used to construct a regulatory network for a specific process, using the well-studied diauxic shift in yeast as an example. In particular, I address the question of how a detailed understanding, as available for the regulation of single genes, can be scaled up to the level of larger systems.
I subsequently inspect methods for large-scale network inference showing that they are significantly skewed towards master regulators.
A recalibration strategy is introduced and applied, yielding an improved genome-wide regulatory network for yeast.
To detect significantly altered subnetworks, I introduce GGEA as a method for network-based enrichment analysis. The key idea is to score regulatory interactions within functional gene sets for consistency with the observed expression. Compared to other recently published methods, GGEA yields results that consistently and coherently align expression changes with known regulation types and that are thus easier to explain. I also suggest and discuss several significant enhancements to the original method that improve its applicability, outcome and runtime.
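The idea of scoring regulatory interactions for consistency with observed expression can be illustrated with a deliberately simplified sketch. It is not GGEA's actual scoring scheme, which the abstract does not detail: here an edge simply counts as consistent when an activating link connects genes changing in the same direction or an inhibiting link connects genes changing in opposite directions, and a gene set is scored by the mean over its internal edges. The gene names, fold changes, edge encoding and function names are all hypothetical.

```python
def edge_consistency(regulator_fc, target_fc, effect):
    """Return +1 (consistent), -1 (inconsistent) or 0 (a gene is unchanged).

    effect is "+" for activation and "-" for inhibition (assumed encoding).
    """
    if regulator_fc == 0 or target_fc == 0:
        return 0
    same_direction = (regulator_fc > 0) == (target_fc > 0)
    expected_same = effect == "+"
    return 1 if same_direction == expected_same else -1


def gene_set_score(gene_set, network, fold_changes):
    """Mean edge consistency over regulatory edges lying inside the gene set."""
    scores = [
        edge_consistency(fold_changes[reg], fold_changes[tgt], effect)
        for reg, tgt, effect in network
        if reg in gene_set and tgt in gene_set
        and reg in fold_changes and tgt in fold_changes
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical regulatory network: (regulator, target, effect).
TOY_NETWORK = [
    ("TF1", "G1", "+"),
    ("TF1", "G2", "-"),
    ("TF2", "G3", "+"),
]

# Hypothetical log fold changes between two conditions.
TOY_FOLD_CHANGES = {"TF1": 1.5, "G1": 2.0, "G2": -1.0, "TF2": 0.8, "G3": -0.5}

score_all = gene_set_score({"TF1", "G1", "G2", "TF2", "G3"}, TOY_NETWORK, TOY_FOLD_CHANGES)
score_tf1 = gene_set_score({"TF1", "G1", "G2"}, TOY_NETWORK, TOY_FOLD_CHANGES)
```

A real enrichment analysis would additionally weight edges by the strength of the expression change and assess the set score against a null distribution, for example by permuting sample labels.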
For the systematic detection and interpretation of subnetworks, I have developed the EnrichmentBrowser software package. It implements several state-of-the-art methods besides GGEA, and allows users to combine and explore results across methods. As part of the Bioconductor repository, the package provides unified access to the different methods and thus greatly simplifies their use for biologists. Extensions to this framework that support the automation of biological interpretation routines are also presented.
In conclusion, this work contributes substantially to the research field of network-based analysis of gene expression data with respect to regulatory network construction, subnetwork detection, and their biological interpretation. This also includes recent developments as well as areas of ongoing research, which are discussed in the context of current and future questions arising from the new generation of genomic data.
A methodology for detecting the orthology signal in a PPI network at a functional complex level