75 research outputs found

    Methods for large-scale Microbiome Analysis using MEGAN

    The ability of next-generation sequencers to produce enormous volumes of data at moderate cost has changed sequence-based research fields such as metagenomics and studies that estimate microbial diversity using the 16S rRNA gene. While early studies investigated relatively small samples in isolation, current studies target questions that require deeper sequencing of a larger number of samples. As a consequence, it is becoming increasingly difficult to perform the computational part of the analysis on a desktop computer. Even if the computationally intensive steps are outsourced to a more powerful environment, users still face datasets that outgrow the storage capacity of their home computers. This development conflicts with the policy of MEGAN, a widely accepted, powerful and user-friendly tool for metagenomics, of performing qualitative analysis on local data files. To overcome this limitation, we developed MEGANServer. MEGANServer allows bioinformaticians to keep data files on a server with sufficient resources. Furthermore, we extended MEGAN to communicate with MEGANServer, enabling researchers to perform their analysis on a home computer regardless of the actual data size. To manage the complexity introduced by the growing number of samples, the selection of datasets of interest is automated by metadata-based grouping. In addition, following the analysis strategy of 16S rRNA studies, datasets can be opened using different strategies, for instance as merged data, to provide deeper insight into taxonomic and/or functional distribution. Finally, because metagenomics and 16S rRNA studies are converging, we extended MEGAN to also handle sequences that stem from a targeted approach.
    More precisely, we have developed a pipeline that covers the entire workflow, starting from pre-processing and, in a final step, allowing qualitative analysis using MEGAN. For this, we took advantage of a novel aligner, MALT, which in combination with a placement algorithm recently introduced in MEGAN, the Majority Vote LCA, is not only capable of assigning more than 99% of reads to the correct genus but also lowers the rate of false positives to a value close to 0%. We believe that, with the additional use of the different data access strategies implemented in MEGANServer, MEGAN in combination with MALT and the Majority Vote algorithm is now fully capable of serving as a powerful, yet user-friendly analysis tool for 16S rRNA sequencing data.

    Since the introduction of second-generation sequencers, it has been possible to generate large amounts of data at low cost. Driven by this progress, study design is evolving in metagenomics as well as in 16S rRNA analysis. Whereas earlier studies analyzed individual and relatively small samples, today's biological questions can only be answered with a larger number of samples and deeper sequencing. One consequence of this development is that computationally intensive steps can no longer be run on home computers. Even offloading these steps to dedicated high-performance computers does not solve the problem of the enormous data volumes that must afterwards be copied back to the home computer for qualitative analysis. MEGAN, a widely used, powerful, yet user-friendly program, relies on locally stored datasets for the visual exploration of metagenomic data. Growing data volumes pose a problem for this approach. For this reason, MEGANServer was developed. MEGANServer allows datasets to be stored on high-performance computers and provides an interface through which users can access these data via MEGAN. In addition, further logic was implemented that makes it easier for users to find and compare datasets, extract data, and combine multiple datasets. This yields more detailed insight into the functional and taxonomic diversity of a sample. Because the fields of metagenomics and 16S rRNA studies are converging, MEGAN was extended to perform high-quality analyses of sequences from both areas. To this end, a pipeline was developed that begins with quality control and, in a final step, supports qualitative analysis and visual exploration in MEGAN. For this, the aligner MALT was combined with a newly developed taxonomic placement method (Majority Vote). With this methodology, the rate of correct taxonomic assignment can be raised above 99% without adversely affecting the false-positive rate.
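    The abstract names the Majority Vote LCA placement but does not describe how it works. As a rough illustration only, the sketch below shows one common reading of a majority-vote variant of the lowest-common-ancestor idea: a read is placed at the deepest taxon supported by more than a given fraction of its alignment hits, rather than at the ancestor shared by all hits. The function name `majority_vote_lca`, the 0.5 threshold, and the lineage representation are assumptions made for this example, not MEGAN's actual implementation.

```python
from collections import Counter


def majority_vote_lca(lineages, threshold=0.5):
    """Place a read at the deepest taxon backed by more than `threshold` of its hits.

    `lineages` holds one root-to-leaf taxon path per alignment hit.
    Hypothetical helper for illustration; not taken from MEGAN or MALT.
    """
    lineages = [tuple(lineage) for lineage in lineages]
    if not lineages:
        return ()
    placement = ()
    depth = 0
    while True:
        # Count the candidate taxa one level below the current placement.
        candidates = Counter(
            lineage[: depth + 1]
            for lineage in lineages
            if len(lineage) > depth and lineage[:depth] == placement
        )
        if not candidates:
            break
        prefix, votes = candidates.most_common(1)[0]
        # Descend only while a strict majority of all hits agrees.
        if votes / len(lineages) <= threshold:
            break
        placement = prefix
        depth += 1
    return placement


family = ("Bacteria", "Firmicutes", "Lachnospiraceae")
hits = [
    family + ("Anaerobutyricum",),
    family + ("Anaerobutyricum",),
    family + ("Roseburia",),
]
print(majority_vote_lca(hits))
```

    With two of the three hits agreeing on one genus, this sketch places the read at that genus, whereas a strict LCA over all hits would stop at the family level. With an even split between two genera, it falls back to the family.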

    Anaerobutyricum hallii promotes the functional depletion of a food carcinogen in diverse healthy fecal microbiota

    Introduction: Anaerobutyricum hallii is a human gut commensal that transforms the heterocyclic amine 2-amino-1-methyl-6-phenylimidazo[4,5-b]pyridine (PhIP), a carcinogen from cooked meat. The transformation mechanism involves the microbial production of acrolein from glycerol and its conjugation with PhIP, thus blocking its mutagenic potential. A potential cancer prevention strategy could therefore involve supplementing complex human microbial communities with metabolically competent bacteria such as A. hallii that can deplete PhIP. However, it has not been established how the proportion of A. hallii in diverse healthy human gut microbial communities relates to functional capacity for PhIP transformation and, moreover, how supplementing microbiomes with A. hallii affects this function.

    Methods: In this study, shotgun metagenomics was used to study taxonomic profiles, the abundance of glycerol/diol dehydratase (gdh)-harboring taxa, the proportion of resident A. hallii, and the reconstruction of A. hallii population genomes in the fecal samples of 20 healthy young adult donors. Furthermore, the influence of supplementing diluted fecal microbiota with 10⁶ cells/mL of A. hallii DSM 3353 was characterized.

    Results and discussion: Six microbiota were assigned to Bacteroides, nine to Prevotella, and five to Ruminococcus by enterotype-associated clustering. The total number of gdh copies in the 20 fecal microbiota, expressed per 10¹⁰ bacterial cells, ranged between 1.32 × 10⁸ and 1.15 × 10⁹. Eighteen out of the 20 donors were dominated by A. hallii, representing between 33% and 94% of the total gdh relative abundance of the samples. The microbiota with low A. hallii abundance (i.e., with a relative abundance < 1%) transformed less PhIP than the microbiota with high A. hallii abundance (i.e., with a relative abundance > 1%). Furthermore, supplementing the low-A. hallii-abundant microbiota with glycerol significantly increased the PhIP transformation capacity after 6 h while reducing total short-chain fatty acid (SCFA) levels, which is most likely due to acrolein production. Although acetate decreased in all microbiota with glycerol and with the combination of glycerol and A. hallii, butyrate production increased over time for most of the microbiomes. Thus, for a significant number of diverse healthy human fecal microbiomes, and especially those with little of the taxon to begin with, supplementing A. hallii increases PhIP transformation. These findings suggest the need to test in vivo whether supplementing microbiomes with A. hallii reduces PhIP exposure.

    NGS-pipe: a flexible, easily extendable, and highly configurable framework for NGS analysis

    Next-generation sequencing is now an established method in genomics, and massive amounts of sequencing data are being generated on a regular basis. Analysis of the sequencing data is typically performed by lab-specific in-house solutions, but the agreement of results from different facilities is often small. General standards for quality control, reproducibility, and documentation are missing. We developed NGS-pipe, a flexible, transparent, and easy-to-use framework for the design of pipelines to analyze whole-exome, whole-genome, and transcriptome sequencing data. NGS-pipe facilitates the harmonization of genomic data analysis by supporting quality control, documentation, reproducibility, parallelization, and easy adaptation to other NGS experiments. Availability: https://github.com/cbg-ethz/NGS-pipe

    Biosynthetic potential of the global ocean microbiome

    8 pages, 4 figures, supplementary information https://doi.org/10.1038/s41586-022-04862-3. This Article is contribution number 130 of Tara Oceans.

    Natural microbial communities are phylogenetically and metabolically diverse. In addition to underexplored organismal groups, this diversity encompasses a rich discovery potential for ecologically and biotechnologically relevant enzymes and biochemical compounds. However, studying this diversity to identify genomic pathways for the synthesis of such compounds and assigning them to their respective hosts remains challenging. The biosynthetic potential of microorganisms in the open ocean remains largely uncharted owing to limitations in the analysis of genome-resolved data at the global scale. Here we investigated the diversity and novelty of biosynthetic gene clusters in the ocean by integrating around 10,000 microbial genomes from cultivated and single cells with more than 25,000 newly reconstructed draft genomes from more than 1,000 seawater samples. These efforts revealed approximately 40,000 putative mostly new biosynthetic gene clusters, several of which were found in previously unsuspected phylogenetic groups. Among these groups, we identified a lineage rich in biosynthetic gene clusters (‘Candidatus Eudoremicrobiaceae’) that belongs to an uncultivated bacterial phylum and includes some of the most biosynthetically diverse microorganisms in this environment. From these, we characterized the phospeptin and pythonamide pathways, revealing cases of unusual bioactive compound structure and enzymology, respectively. Together, this research demonstrates how microbiomics-driven strategies can enable the investigation of previously undescribed enzymes and natural products in underexplored microbial groups and environments.

    This work was supported by funding from the ETH and the Helmut Horten Foundation; the Swiss National Science Foundation (SNSF) through project grants 205321_184955 to S.S., 205320_185077 to J.P. and the NCCR Microbiomes (51NF40_180575) to S.S.; by the Gordon and Betty Moore Foundation (https://doi.org/10.37807/GBMF9204) and the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 101000392 (MARBLES) to J.P.; by an ETH research grant ETH-21 18-2 to J.P.; and by the Peter and Traudl Engelhorn Foundation and by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. 897571 to C.C.F. S.L.R. was supported by an ETH Zurich postdoctoral fellowship 20-1 FEL-07. M.L., L.M.C. and G.Z. were supported by EMBL Core Funding and the German Research Foundation (DFG, Deutsche Forschungsgemeinschaft, project no. 395357507, SFB 1371 to G.Z.). M.B.S. was supported by the NSF grant OCE#1829831. C.B. was supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement Diatomic, no. 835067). S.G.A. was supported by the Spanish Ministry of Economy and Competitiveness (PID2020-116489RB-I00). M.K. and H.M. were funded by the SNSF grant 407540_167331 as part of the Swiss National Research Programme 75 ‘Big Data’. M.K., H.M. and A.K. are also partially funded by ETH core funding (to G. Rätsch). With the institutional support of the ‘Severo Ochoa Centre of Excellence’ accreditation (CEX2019-000928-S). Peer reviewed.

    mTAGs: taxonomic profiling using degenerate consensus reference sequences of ribosomal RNA gene

    mTAGs is a tool for the taxonomic profiling of metagenomes. It detects sequencing reads belonging to the small subunit of the ribosomal RNA (SSU-rRNA) gene and annotates them through alignment to full-length degenerate consensus SSU-rRNA reference sequences. The tool is capable of processing single-end and paired-end metagenomic reads, takes advantage of the information contained in any region of the SSU-rRNA gene and provides relative abundance profiles at multiple taxonomic ranks (Domain, Phylum, Class, Order, Family, Genus and OTUs defined at a 97% sequence identity cutoff).

    motu-tool/mOTUs 3.1.0


    Genome-resolved diversity and biosynthetic potential of the coral reef microbiome

    This repository hosts the supplementary data associated with the manuscript entitled "Genome-resolved diversity and biosynthetic potential of the coral reef microbiome".

    The Common Gut Microbe Eubacterium hallii also Contributes to Intestinal Propionate Formation

    Eubacterium hallii is considered an important microbe with regard to intestinal metabolic balance due to its ability to utilize glucose and the fermentation intermediates acetate and lactate to form butyrate and hydrogen. Recently, we observed that E. hallii is capable of metabolizing glycerol to 3-hydroxypropionaldehyde (3-HPA, reuterin), which has reported antimicrobial properties. The key enzyme for glycerol to 3-HPA conversion is the cobalamin-dependent glycerol/diol dehydratase PduCDE, which also utilizes 1,2-propanediol (1,2-PD) to form propionate. Therefore, our primary goal was to investigate glycerol to 3-HPA metabolism and 1,2-PD utilization by E. hallii along with its ability to produce cobalamin. We also investigated the relative abundance of E. hallii in stool of adults using 16S rRNA and pduCDE based gene screening to determine the contribution of E. hallii to intestinal propionate formation. We found that E. hallii utilizes glycerol to produce up to 9 mM 3-HPA but did not further metabolize 3-HPA to 1,3-propanediol. Utilization of 1,2-PD in the presence and absence of glucose led to the formation of propanal, propanol and propionate. E. hallii formed cobalamin and was detected in stool of 74% of adults using the 16S rRNA gene as marker gene (n = 325). Relative abundance of the E. hallii 16S rRNA gene ranged from 0 to 0.59% with a mean relative abundance of 0.044%. E. hallii PduCDE was detected in 63 to 81% of the metagenomes, depending on which subunit was investigated, alongside other taxa such as Ruminococcus obeum, R. gnavus, Flavonifractor plautii, Intestinimonas butyriciproducens, and Veillonella spp. In conclusion, we identified E. hallii as a common gut microbe with the ability to convert glycerol to 3-HPA, a step that requires the production of cobalamin, and to utilize 1,2-PD to form propionate. Our results, along with its ability to use a broad range of substrates, point to E. hallii as a key species within the intestinal trophic chain with the potential to strongly impact the metabolic balance as well as the gut microbiota/host homeostasis through the formation of different short-chain fatty acids. ISSN:1664-302

    mTAGs: taxonomic profiling using degenerate consensus reference sequences of ribosomal RNA genes

    Profiling the taxonomic composition of microbial communities commonly involves the classification of ribosomal RNA gene fragments. As a trade-off to maintain high classification accuracy, existing tools are typically limited to the genus level. Here, we present mTAGs, a taxonomic profiling tool that implements the alignment of metagenomic sequencing reads to degenerate consensus reference sequences of small subunit ribosomal RNA genes. It uses DNA fragments, that is, paired-end sequencing reads, as count units and provides relative abundance profiles at multiple taxonomic ranks, including operational taxonomic units based on a 97% sequence identity cutoff. At the genus rank, mTAGs outperformed other tools across several metrics, such as the F1 score by >11% across data from different environments, and achieved competitive (F1 score) or better results (Bray–Curtis dissimilarity) at the sub-genus level. ISSN:1367-4803 ISSN:1460-2059
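    The two metrics named in this abstract can be made concrete with a short sketch. The code below is a generic illustration of the standard definitions, assuming the F1 score is computed on sets of detected versus expected taxa and the Bray–Curtis dissimilarity on relative abundance profiles; the function names and toy profiles are hypothetical and are not taken from mTAGs or its benchmark.

```python
def f1_score(predicted, expected):
    """Harmonic mean of precision and recall over sets of taxon names."""
    predicted, expected = set(predicted), set(expected)
    true_positives = len(predicted & expected)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(expected)
    return 2 * precision * recall / (precision + recall)


def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two taxon -> abundance mappings."""
    taxa = set(profile_a) | set(profile_b)
    difference = sum(abs(profile_a.get(t, 0.0) - profile_b.get(t, 0.0)) for t in taxa)
    total = sum(profile_a.get(t, 0.0) + profile_b.get(t, 0.0) for t in taxa)
    return difference / total if total else 0.0


# Toy example with made-up genus labels and abundances.
expected_profile = {"genus_A": 0.5, "genus_B": 0.5}
predicted_profile = {"genus_A": 0.5, "genus_C": 0.5}

print(f1_score(predicted_profile, expected_profile))
print(bray_curtis(predicted_profile, expected_profile))
```

    In this toy case one of two predicted genera is correct, so precision, recall and F1 all equal 0.5, and half of the total abundance is placed on the wrong genus, giving a Bray–Curtis dissimilarity of 0.5.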