218 research outputs found
Asymmetric relationships between proteins shape genome evolution
An investigation of metabolic networks in E. coli and S. cerevisiae reveals that asymmetric protein interactions affect gene expression, the relative effect of gene-knockouts and genome evolution
Co-Regulation of Metabolic Genes Is Better Explained by Flux Coupling Than by Network Distance
To what extent can modes of gene regulation be explained by systems-level properties of metabolic networks? Prior studies on co-regulation of metabolic genes have mainly focused on graph-theoretical features of metabolic networks and demonstrated a decreasing level of co-expression with increasing network distance, a naïve, but widely used, topological index. Others have suggested that static graph representations can poorly capture dynamic functional associations, e.g., in the form of dependence of metabolic fluxes across genes in the network. Here, we systematically tested the relative importance of metabolic flux coupling and network position on gene co-regulation, using a genome-scale metabolic model of Escherichia coli. After validating the computational method with empirical data on flux correlations, we confirm that genes coupled by their enzymatic fluxes not only show similar expression patterns, but also share transcriptional regulators and frequently reside in the same operon. In contrast, we demonstrate that network distance per se has relatively minor influence on gene co-regulation. Moreover, the type of flux coupling can explain refined properties of the regulatory network that are ignored by simple graph-theoretical indices. Our results underline the importance of studying functional states of cellular networks to define physiologically relevant associations between genes and should stimulate future developments of novel functional genomic tools
Correlation between sequence conservation and the genomic context after gene duplication
A key complication in comparative genomics for reliable gene function prediction is the existence of duplicated genes. To study the effect of gene duplication on function prediction, we analyze orthologs between pairs of genomes where in one genome the orthologous gene has duplicated after the speciation of the two genomes (i.e. inparalogs). For these duplicated genes we investigate whether the gene that is most similar on the sequence level is also the gene that has retained the ancestral gene-neighborhood. Although the majority of investigated cases show a consistent pattern between sequence similarity and gene-neighborhood conservation, a substantial fraction, 29–38%, is inconsistent. The observation of inconsistency is not the result of a chance outcome owing to a lack of divergence time between inparalogs, but rather it seems to be the result of a chance outcome caused by very similar rates of sequence evolution of both inparalogs relative to their ortholog. If one-to-one orthologous relationships are required, it is advisable to combine contextual information (i.e. gene-neighborhood in prokaryotes and co-expression in eukaryotes) with protein sequence information to predict the most probable functional equivalent ortholog in the presence of inparalogs
Accelerating the reconstruction of genome-scale metabolic networks
BACKGROUND: The genomic information of a species allows for the genome-scale reconstruction of its metabolic capacity. Such a metabolic reconstruction gives support to metabolic engineering, but also to integrative bioinformatics and visualization. Sequence-based automatic reconstructions require extensive manual curation, which can be very time-consuming. Therefore, we present a method to accelerate the time-consuming process of network reconstruction for a query species. The method exploits the availability of well-curated metabolic networks and uses high-resolution predictions of gene equivalency between species, allowing the transfer of gene-reaction associations from curated networks. RESULTS: We have evaluated the method using Lactococcus lactis IL1403, for which a genome-scale metabolic network was published recently. We recovered most of the gene-reaction associations (i.e. 74 – 85%) which are incorporated in the published network. Moreover, we predicted over 200 additional genes to be associated to reactions, including genes with unknown function, genes for transporters and genes with specific metabolic reactions, which are good candidates for an extension to the previously published network. In a comparison of our developed method with the well-established approach Pathologic, we predicted 186 additional genes to be associated to reactions. We also predicted a relatively high number of complete conserved protein complexes, which are derived from curated metabolic networks, illustrating the potential predictive power of our method for protein complexes. CONCLUSION: We show that our methodology can be applied to accelerate the reconstruction of genome-scale metabolic networks by taking optimal advantage of existing, manually curated networks. As orthology detection is the first step in the method, only the translated open reading frames (ORFs) of a newly sequenced genome are necessary to reconstruct a metabolic network. 
As more manually curated metabolic networks become available in the near future, the usefulness of our method in network prediction is likely to increase
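The transfer step described in this abstract can be illustrated with a minimal sketch. It assumes one-to-one gene equivalences are already available as a simple mapping; the function names, reaction IDs, and gene IDs below are hypothetical and are not taken from the published method.

```python
from collections import defaultdict


def transfer_associations(curated, equivalents):
    """Copy gene-reaction associations from a curated network to a query species.

    curated:     dict mapping reaction ID -> set of reference-species genes
    equivalents: dict mapping reference gene -> equivalent query gene
                 (assumed to come from an orthology-detection step)
    """
    transferred = defaultdict(set)
    for reaction, ref_genes in curated.items():
        for ref_gene in ref_genes:
            query_gene = equivalents.get(ref_gene)
            if query_gene is not None:  # skip genes without an equivalent
                transferred[reaction].add(query_gene)
    return dict(transferred)


def recovery_rate(predicted, published):
    """Fraction of published (gene, reaction) pairs present in the prediction."""
    pub = {(g, r) for r, genes in published.items() for g in genes}
    pred = {(g, r) for r, genes in predicted.items() for g in genes}
    return len(pub & pred) / len(pub) if pub else 0.0


# Hypothetical toy data, for illustration only.
curated = {"R_PGI": {"ref_pgi"}, "R_PFK": {"ref_pfkA", "ref_pfkB"}}
equivalents = {"ref_pgi": "q_0001", "ref_pfkA": "q_0002"}
published = {"R_PGI": {"q_0001"}, "R_PFK": {"q_0002", "q_0003"}}

predicted = transfer_associations(curated, equivalents)
rate = recovery_rate(predicted, published)
```

In this toy case two of the three published gene-reaction pairs are recovered. The recovery figure reported in the abstract comes from the authors' own evaluation, not from this sketch.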
PhyloPat: an updated version of the phylogenetic pattern database contains gene neighborhood
Phylogenetic patterns show the presence or absence of certain genes in a set of full genomes derived from different species. They can also be used to determine sets of genes that occur only in certain evolutionary branches. Previously, we presented a database named PhyloPat which allows the complete Ensembl gene database to be queried using phylogenetic patterns. Here, we describe an updated version of PhyloPat which can be queried by an improved web server. We used a single linkage clustering algorithm to create 241 697 phylogenetic lineages, using all the orthologies provided by Ensembl v49. PhyloPat offers the possibility of querying with binary phylogenetic patterns or regular expressions, or through a phylogenetic tree of the 39 included species. Users can also input a list of Ensembl, EMBL, EntrezGene or HGNC IDs to check which phylogenetic lineage any gene belongs to. A link to the FatiGO web interface has been incorporated in the HTML output. For each gene, the surrounding genes on the chromosome, color coded according to their phylogenetic lineage can be viewed, as well as FASTA files of the peptide sequences of each lineage. Furthermore, lists of omnipresent, polypresent, oligopresent and anticorrelating genes have been included. PhyloPat is freely available at http://www.cmbi.ru.nl/phylopat
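As a rough illustration of the ideas in this abstract, the sketch below groups genes into lineages by single linkage over orthology pairs, builds a binary presence/absence pattern per lineage, and queries the patterns with a regular expression. It is an assumption-laden toy, not PhyloPat's implementation: the gene names, species list, and function names are hypothetical.

```python
import re


def single_linkage(pairs):
    """Group genes into clusters: any two genes joined by a chain of
    orthology pairs end up in the same cluster (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for gene in list(parent):
        clusters.setdefault(find(gene), set()).add(gene)
    return list(clusters.values())


def pattern(cluster, species_of, species_order):
    """Binary pattern: '1' if the lineage has a gene in that species, else '0'."""
    present = {species_of[g] for g in cluster}
    return "".join("1" if s in present else "0" for s in species_order)


def query(patterns, regex):
    """Return names of lineages whose whole pattern matches the regex."""
    compiled = re.compile(regex)
    return [name for name, p in patterns.items() if compiled.fullmatch(p)]


# Hypothetical toy data, for illustration only.
species_order = ["human", "mouse", "zebrafish"]
species_of = {"h1": "human", "m1": "mouse", "z1": "zebrafish",
              "h2": "human", "m2": "mouse"}
pairs = [("h1", "m1"), ("m1", "z1"), ("h2", "m2")]

clusters = sorted(single_linkage(pairs), key=min)
patterns = {f"lineage_{i}": pattern(c, species_of, species_order)
            for i, c in enumerate(clusters)}

absent_in_zebrafish = query(patterns, "110")  # ['lineage_1']
present_in_human = query(patterns, "1.*")     # ['lineage_0', 'lineage_1']
```

Single linkage is permissive by design: one orthology pair is enough to merge two groups, so lineages can grow large when orthology assignments are noisy.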
Underground metabolism as a rich reservoir for pathway engineering
Motivation: Bioproduction of value-added compounds is frequently achieved by utilizing enzymes from other species. However, expression of such heterologous enzymes can be detrimental due to unexpected interactions within the host cell. Recently, an alternative strategy emerged, which relies on recruiting side activities of host enzymes to establish new biosynthetic pathways. Although such low-level ‘underground’ enzyme activities are prevalent, it remains poorly explored whether they may serve as an important reservoir for pathway engineering. Results: Here, we use genome-scale modeling to estimate the theoretical potential of underground reactions for engineering novel biosynthetic pathways in Escherichia coli. We found that biochemical reactions contributed by underground enzyme activities often enhance the in silico production of compounds with industrial importance, including several cases where underground activities are indispensable for production. Most of these new capabilities can be achieved by the addition of one or two underground reactions to the native network, suggesting that only a few side activities need to be enhanced during implementation. Remarkably, we find that the contribution of underground reactions to the production of value-added compounds is comparable to that of heterologous reactions, underscoring their biotechnological potential. Taken together, our genome-wide study demonstrates that exploiting underground enzyme activities could be a promising addition to the toolbox of industrial strain development
Automatically extracting functionally equivalent proteins from SwissProt
In summary, FOSTA provides an automated analysis of annotations in UniProtKB/Swiss-Prot that enables groups of proteins already annotated as functionally equivalent to be extracted. Our results demonstrate that the vast majority of UniProtKB/Swiss-Prot functional annotations are of high quality, and that FOSTA can interpret annotations successfully. Where FOSTA is not successful, we are able to highlight inconsistencies in UniProtKB/Swiss-Prot annotation. Most of these would have presented equal difficulties for manual interpretation of annotations. We discuss limitations and possible future extensions to FOSTA, and recommend changes to the UniProtKB/Swiss-Prot format that would facilitate text-mining of UniProtKB/Swiss-Prot
Network-level architecture and the evolutionary potential of underground metabolism
A central unresolved issue in evolutionary biology is how metabolic innovations emerge. Low-level enzymatic side activities are frequent and can potentially be recruited for new biochemical functions. However, the role of such underground reactions in adaptation toward novel environments has remained largely unknown and out of reach of computational predictions, not least because these issues demand analyses at the level of the entire metabolic network. Here, we provide a comprehensive computational model of the underground metabolism in Escherichia coli. Most underground reactions are not isolated and 45% of them can be fully wired into the existing network and form novel pathways that produce key precursors for cell growth. This observation allowed us to conduct an integrated genome-wide in silico and experimental survey to characterize the evolutionary potential of E. coli to adapt to hundreds of nutrient conditions. We revealed that underground reactions allow growth in new environments when their activity is increased. We estimate that at least ~20% of the underground reactions that can be connected to the existing network confer a fitness advantage under specific environments. Moreover, our results demonstrate that the genetic basis of evolutionary adaptations via underground metabolism is computationally predictable. The approach used here has potential for various application areas from bioengineering to medical genetics
Network-based prediction of metabolic enzymes' subcellular localization
Motivation: Revealing the subcellular localization of proteins within membrane-bound compartments is of major importance for inferring protein function. Though current high-throughput localization experiments provide valuable data, they are costly and time-consuming, and due to technical difficulties not readily applicable to many eukaryotes. Physical characteristics of proteins, such as sequence targeting signals and amino acid composition, are commonly used to predict subcellular localizations using computational approaches. Recently it was shown that protein–protein interaction (PPI) networks can be used to significantly improve the prediction accuracy of protein subcellular localization. However, as high-throughput PPI data depend on costly high-throughput experiments and are currently available for only a few organisms, the scope of such methods is still limited