19 research outputs found

    Genomic encyclopedia of sugar utilization pathways in the Shewanella genus

    Background: Carbohydrates are a primary source of carbon and energy for many bacteria. Accurate projection of known carbohydrate catabolic pathways across diverse bacteria with complete genomes constitutes a substantial challenge due to frequent variations in components of these pathways. To address a practically and fundamentally important challenge of reconstruction of carbohydrate utilization machinery in any microorganism directly from its genomic sequence, we combined a subsystems-based comparative genomic approach with experimental validation of selected bioinformatic predictions by a combination of biochemical, genetic and physiological experiments. Results: We applied this integrated approach to systematically map carbohydrate utilization pathways in 19 genomes from the Shewanella genus. The obtained genomic encyclopedia of sugar utilization includes ~170 protein families (mostly metabolic enzymes, transporters and transcriptional regulators) spanning 17 distinct pathways with a mosaic distribution across Shewanella species providing insights into their ecophysiology and adaptive evolution. Phenotypic assays revealed a remarkable consistency between predicted and observed phenotype, an ability to utilize an individual sugar as a sole source of carbon and energy, over the entire matrix of tested strains and sugars. Comparison of the reconstructed catabolic pathways with E. coli identified multiple differences that are manifested at various levels, from the presence or absence of certain sugar catabolic pathways, nonorthologous gene replacements and alternative biochemical routes to a different organization of transcription regulatory networks. Conclusions: The reconstructed sugar catabolome in Shewanella spp. includes 62 novel isofunctional families of enzymes, transporters, and regulators. In addition to improving our knowledge of genomics and functional organization of carbohydrate utilization in Shewanella, this study led to a substantial expansion of our current version of the Genomic Encyclopedia of Carbohydrate Utilization. A systematic and iterative application of this approach to multiple taxonomic groups of bacteria will further enhance it, creating a knowledge base adequate for the efficient analysis of any newly sequenced genome as well as of the emerging metagenomic data.

    The RAST Server: Rapid Annotations using Subsystems Technology

    Background: The number of prokaryotic genome sequences becoming available is growing steadily and is growing faster than our ability to accurately annotate them. Description: We describe a fully automated service for annotating bacterial and archaeal genomes. The service identifies protein-encoding, rRNA and tRNA genes, assigns functions to the genes, predicts which subsystems are represented in the genome, uses this information to reconstruct the metabolic network and makes the output easily downloadable for the user. In addition, the annotated genome can be browsed in an environment that supports comparative analysis with the annotated genomes maintained in the SEED environment. The service normally makes the annotated genome available within 12–24 hours of submission, but ultimately the quality of such a service will be judged in terms of accuracy, consistency, and completeness of the produced annotations. We summarize our attempts to address these issues and discuss plans for incrementally enhancing the service. Conclusion: By providing accurate, rapid annotation freely to the community we have created an important community resource. The service has now been utilized by over 120 external users annotating over 350 distinct genomes.

    The Subsystems Approach to Genome Annotation and its Use in the Project to Annotate 1000 Genomes

    The release of the 1000th complete microbial genome will occur in the next two to three years. In anticipation of this milestone, the Fellowship for Interpretation of Genomes (FIG) launched the Project to Annotate 1000 Genomes. The project is built around the principle that the key to improved accuracy in high-throughput annotation technology is to have experts annotate single subsystems over the complete collection of genomes, rather than having an annotation expert attempt to annotate all of the genes in a single genome. Using the subsystems approach, all of the genes implementing the subsystem are analyzed by an expert in that subsystem. An annotation environment was created where populated subsystems are curated and projected to new genomes. A portable notion of a populated subsystem was defined, and tools developed for exchanging and curating these objects. Tools were also developed to resolve conflicts between populated subsystems. The SEED is the first annotation environment that supports this model of annotation. Here, we describe the subsystem approach, and offer the first release of our growing library of populated subsystems. The initial release of data includes 180,177 distinct proteins with 2133 distinct functional roles. This data comes from 173 subsystems and 383 different organisms.

    The FGGY carbohydrate kinase family : insights into the evolution of functional specificities

    © The Author(s), 2011. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in PLoS Computational Biology 7 (2011): e1002318, doi:10.1371/journal.pcbi.1002318. Function diversification in large protein families is a major mechanism driving expansion of cellular networks, providing organisms with new metabolic capabilities and thus adding to their evolutionary success. However, our understanding of the evolutionary mechanisms of functional diversity in such families is very limited, which, among many other reasons, is due to the lack of functionally well-characterized sets of proteins. Here, using the FGGY carbohydrate kinase family as an example, we built a confidently annotated reference set (CARS) of proteins by propagating experimentally verified functional assignments to a limited number of homologous proteins that are supported by their genomic and functional contexts. Then, we analyzed, on both the phylogenetic and the molecular levels, the evolution of different functional specificities in this family. The results show that the different functions (substrate specificities) encoded by FGGY kinases have emerged only once in the evolutionary history following an apparently simple divergent evolutionary model. At the same time, on the molecular level, one isofunctional group (L-ribulokinase, AraB) evolved at least two independent solutions that employed distinct specificity-determining residues for the recognition of the same substrate (L-ribulose). Our analysis provides a detailed model of the evolution of the FGGY kinase family. It also shows that only combined molecular and phylogenetic approaches can help reconstruct a full picture of functional diversifications in such diverse families. This study was funded by NIH and DOE grants.

    Comparative Genomics and Experimental Characterization of N-acetylglucosamine Utilization Pathway of Shewanella oneidensis

    We used a comparative genomics approach implemented in the SEED annotation environment to reconstruct the chitin and GlcNAc utilization subsystem and regulatory network in most proteobacteria, including 11 species of Shewanella with completely sequenced genomes. Comparative analysis of candidate regulatory sites allowed us to characterize three different GlcNAc-specific regulons, NagC, NagR, and NagQ, in various proteobacteria and to tentatively assign a number of novel genes with specific functional roles, in particular new GlcNAc-related transport systems, to this subsystem. Genes SO3506 and SO3507, originally annotated as hypothetical in Shewanella oneidensis MR-1, were suggested to encode novel variants of GlcN-6-P deaminase and GlcNAc kinase, respectively. Reconstitution of the GlcNAc catabolic pathway in vitro using these purified recombinant proteins and GlcNAc-6-P deacetylase (SO3505) validated the entire pathway. Kinetic characterization of GlcN-6-P deaminase demonstrated that it is subject to allosteric activation by GlcNAc-6-P. Consistent with genomic data, all tested Shewanella strains except S. frigidimarina, which lacked representative genes for GlcNAc metabolism, were capable of utilizing GlcNAc as the sole source of carbon and energy. This study expands the range of carbon substrates utilized by Shewanella spp., unambiguously identifies several genes involved in chitin metabolism, and describes a novel variant of the classical three-step biochemical conversion of GlcNAc to fructose 6-phosphate first described in Escherichia coli.