    gcType : a high-quality type strain genome database for microbial phylogenetic and functional research

    Taxonomic and functional research of microorganisms has increasingly relied upon genome-based data and methods. As the depository of the Global Catalogue of Microorganisms (GCM) 10K prokaryotic type strain sequencing project, Global Catalogue of Type Strain (gcType) has published 1049 type strain genomes sequenced by the GCM 10K project which are preserved in global culture collections with a valid published status. Additionally, the information provided through gcType includes >12 000 publicly available type strain genome sequences from GenBank incorporated using quality control criteria and standard data annotation pipelines to form a high-quality reference database. This database integrates type strain sequences with their phenotypic information to facilitate phenotypic and genotypic analyses. Multiple formats of cross-genome searches and interactive interfaces have allowed extensive exploration of the database's resources. In this study, we describe web-based data analysis pipelines for genomic analyses and genome-based taxonomy, which could serve as a one-stop platform for the identification of prokaryotic species. The number of type strain genomes that are published will continue to increase as the GCM 10K project increases its collaboration with culture collections worldwide. Data of this project is shared with the International Nucleotide Sequence Database Collaboration. Access to gcType is free at http://gctype.wdcm.org/

    eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses

    eggNOG is a public database of orthology relationships, gene evolutionary histories and functional annotations. Here, we present version 5.0, featuring a major update of the underlying genome sets, which have been expanded to 4445 representative bacteria and 168 archaea derived from 25 038 genomes, as well as 477 eukaryotic organisms and 2502 viral proteomes that were selected for diversity and filtered by genome quality. In total, 4.4M orthologous groups (OGs) distributed across 379 taxonomic levels were computed together with their associated sequence alignments, phylogenies, HMM models and functional descriptors. Precomputed evolutionary analysis provides fine-grained resolution of duplication/speciation events within each OG. Our benchmarks show that, despite doubling the number of genomes, the quality of orthology assignments and functional annotations (80% coverage) has persisted without significant changes across this update. Finally, we improved eggNOG online services for fast functional annotation and orthology prediction of custom genomics or metagenomics datasets. All precomputed data are publicly available for downloading or via API queries at http://eggnog.embl.de

    A reservoir of 'historical' antibiotic resistance genes in remote pristine Antarctic soils

    Background: Soil bacteria naturally produce antibiotics as a competitive mechanism, with a concomitant evolution, and exchange by horizontal gene transfer, of a range of antibiotic resistance mechanisms. Surveys of bacterial resistance elements in edaphic systems have originated primarily from human-impacted environments, with relatively little information from remote and pristine environments, where the resistome may comprise the ancestral gene diversity. Methods: We used shotgun metagenomics to assess antibiotic resistance gene (ARG) distribution in 17 pristine and remote Antarctic surface soils within the undisturbed Mackay Glacier region. We also interrogated the phylogenetic placement of ARGs compared to environmental ARG sequences and tested for the presence of horizontal gene transfer elements flanking ARGs. Results: In total, 177 naturally occurring ARGs were identified, most of which encoded single or multi-drug efflux pumps. Resistance mechanisms for the inactivation of aminoglycosides, chloramphenicol and beta-lactam antibiotics were also common. Gram-negative bacteria harboured most ARGs (71%), with fewer genes from Gram-positive Actinobacteria and Bacilli (Firmicutes) (9%), reflecting the taxonomic composition of the soils. Strikingly, the abundance of ARGs per sample had a strong, negative correlation with species richness (r = -0.49, P < 0.05). This result, coupled with a lack of mobile genetic elements flanking ARGs, suggests that these genes are ancient acquisitions of horizontal transfer events. Conclusions: ARGs in these remote and uncontaminated soils most likely represent functionally efficient historical genes that have since been vertically inherited over generations. The historical ARGs in these pristine environments carry a strong phylogenetic signal and form a monophyletic group relative to ARGs from other similar environments.

    Principles of proteome allocation are revealed using proteomic data and genome-scale models

    Integrating omics data to refine or make context-specific models is an active field of constraint-based modeling. Proteomics now cover over 95% of the Escherichia coli proteome by mass. Genome-scale models of Metabolism and macromolecular Expression (ME) compute proteome allocation linked to metabolism and fitness. Using proteomics data, we formulated allocation constraints for key proteome sectors in the ME model. The resulting calibrated model effectively computed the "generalist" (wild-type) E. coli proteome and phenotype across diverse growth environments. Across 15 growth conditions, prediction errors for growth rate and metabolic fluxes were 69% and 14% lower, respectively. The sector-constrained ME model thus represents a generalist ME model reflecting both growth rate maximization and "hedging" against uncertain environments and stresses, as indicated by significant enrichment of these sectors for the general stress response sigma factor σS. Finally, the sector constraints represent a general formalism for integrating omics data from any experimental condition into constraint-based ME models. The constraints can be fine-grained (individual proteins) or coarse-grained (functionally-related protein groups) as demonstrated here. This flexible formalism provides an accessible approach for narrowing the gap between the complexity captured by omics data and governing principles of proteome allocation described by systems-level models

    Species-level functional profiling of metagenomes and metatranscriptomes

    Functional profiles of microbial communities are typically generated using comprehensive metagenomic or metatranscriptomic sequence read searches, which are time-consuming, prone to spurious mapping, and often limited to community-level quantification. We developed HUMAnN2, a tiered search strategy that enables fast, accurate, and species-resolved functional profiling of host-associated and environmental communities. HUMAnN2 identifies a community's known species, aligns reads to their pangenomes, performs translated search on unclassified reads, and finally quantifies gene families and pathways. Relative to pure translated search, HUMAnN2 is faster and produces more accurate gene family profiles. We applied HUMAnN2 to study clinal variation in marine metabolism, ecological contribution patterns among human microbiome pathways, variation in species' genomic versus transcriptional contributions, and strain profiling. Further, we introduce 'contributional diversity' to explain patterns of ecological assembly across different microbial community types

    Electron transport phosphorylation in rumen butyrivibrios: unprecedented ATP yield for glucose fermentation to butyrate

    From a genomic analysis of rumen butyrivibrios (Butyrivibrio and Pseudobutyrivibrio sp.), we have re-evaluated the contribution of electron transport phosphorylation (ETP) to ATP formation in this group. This group is unique in that most (76%) genomes were predicted to possess genes for both Ech and Rnf transmembrane ion pumps. These pumps act in concert with the NifJ and Bcd-Etf to form an electrochemical potential (ΔμH(+) and ΔμNa(+)), which drives ATP synthesis by ETP. Of the 62 total butyrivibrio genomes currently available from the Hungate 1000 project, all 62 were predicted to possess NifJ, which reduces oxidized ferredoxin (Fdox) during pyruvate conversion to acetyl-CoA. All 62 possessed all subunits of Bcd-Etf, which reduces Fdox and oxidizes reduced NAD during crotonyl-CoA reduction. Additionally, 61 genomes possessed all subunits of the Rnf, which generates ΔμH(+) or ΔμNa(+) from oxidation of reduced Fd (Fdred) and reduction of oxidized NAD. Further, 47 genomes possessed all six subunits of the Ech, which generates ΔμH(+) from oxidation of Fdred. For glucose fermentation to butyrate and H2, the electrochemical potential established should drive synthesis of ∼1.5 ATP by the F0F1-ATP synthase (possessed by all 62 genomes). The total yield is ∼4.5 ATP/glucose after accounting for three ATP formed by classic substrate-level phosphorylation, and it is one of the highest yields for any glucose fermentation. The yield was the same when unsaturated fatty acid bonds, not H(+), served as the electron acceptor (as during biohydrogenation). Possession of both Ech and Rnf had been previously documented in only a few sulfate-reducers, was rare in other rumen prokaryotic genomes in our analysis, and may confer an energetic advantage to rumen butyrivibrios. This unique energy conservation system might enhance the butyrivibrios' ability to overcome growth inhibition by unsaturated fatty acids, as postulated herein.