474 research outputs found

Evaluation of shotgun metagenomics sequence classification methods using in silico and in vitro simulated communities

Author: Fiona S. L. Brinkman
Michael A. Peabody
Raymond Lo
Thea Van Rossum
Publication venue: 'Springer Science and Business Media LLC'

Metagenomic Sequencing of an In Vitro-Simulated Microbial Community

Author: Aaron E. Darling
Francisco Rodriguez-Valera
Jenna L. Morgan
Jonathan A. Eisen
Publication venue: Public Library of Science
Publication date: 01/12/2009

Background: Microbial life dominates the earth, but many species are difficult or even impossible to study under laboratory conditions. Sequencing DNA directly from the environment, a technique commonly referred to as metagenomics, is an important tool for cataloging microbial life. This culture-independent approach involves collecting samples that include microbes in them, extracting DNA from the samples, and sequencing the DNA. A sample may contain many different microorganisms, macroorganisms, and even free-floating environmental DNA. A fundamental challenge in metagenomics has been estimating the abundance of organisms in a sample based on the frequency with which the organism's DNA was observed in reads generated via DNA sequencing.

Methodology/Principal Findings: We created mixtures of ten microbial species for which genome sequences are known. Each mixture contained an equal number of cells of each species. We then extracted DNA from the mixtures, sequenced the DNA, and measured the frequency with which genomic regions from each organism were observed in the sequenced DNA. We found that the observed frequency of reads mapping to each organism did not reflect the equal numbers of cells that were known to be included in each mixture. The relative organism abundances varied significantly depending on the DNA extraction and sequencing protocol utilized.

Conclusions/Significance: We describe a new data resource for measuring the accuracy of metagenomic binning methods, created by in vitro simulation of a metagenomic community. Our in vitro simulation can be used to complement previous in silico benchmark studies. In constructing a synthetic community and sequencing its metagenome, we encountered several sources of observation bias that likely affect most metagenomic experiments to date and present challenges for comparative metagenomic studies. DNA preparation methods have a particularly profound effect in our study, implying that samples prepared with different protocols are not suitable for comparative metagenomics.
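
As a rough illustration of the abundance-estimation problem described in this abstract, the sketch below converts per-organism read counts into relative abundance estimates, normalizing by genome length, and compares them with the equal proportions expected from an even mixture. The function name, organism labels, read counts, and genome lengths are hypothetical and are not taken from the study; this is not the authors' analysis pipeline.

```python
def relative_abundance(read_counts, genome_lengths):
    """Estimate relative cell abundance from mapped-read counts.

    Read counts are divided by genome length so that organisms with
    larger genomes are not over-counted. All inputs are illustrative.
    """
    coverage = {org: read_counts[org] / genome_lengths[org] for org in read_counts}
    total = sum(coverage.values())
    return {org: cov / total for org, cov in coverage.items()}


# Hypothetical values for a three-member even mixture (not data from the paper).
read_counts = {"org_a": 6000, "org_b": 1000, "org_c": 3000}
genome_lengths = {"org_a": 4_000_000, "org_b": 2_000_000, "org_c": 2_000_000}

estimated = relative_abundance(read_counts, genome_lengths)
expected = 1 / len(read_counts)  # equal numbers of cells per organism

for org, fraction in estimated.items():
    print(f"{org}: estimated {fraction:.2f}, expected {expected:.2f}")
```

A gap between the estimated and expected fractions in such a comparison is the kind of observation bias the abstract attributes to DNA extraction and sequencing protocols.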

Crossref

OPUS - University of Technology Sydney

Directory of Open Access Journals

PubMed Central

UNT Digital Library

MGmapper: Reference based mapping and taxonomy annotation of metagenomics sequence reads

Author: Aarestrup Frank Møller
Lukjancenko Oksana
Lund Ole
Petersen Thomas Nordahl
Sicheritz-Pontén Thomas
Sperotto Maria Maddalena
Thomsen Martin Christen Frølund
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2017

An increasing number of species and gene identification studies rely on next generation sequence analysis of either single isolate or metagenomics samples. Several methods are available to perform taxonomic annotations, and a previous metagenomics benchmark study has shown that a vast number of false positive species annotations are a problem unless thresholds or post-processing are applied to differentiate between correct and false annotations. MGmapper is a package to process raw next generation sequence data and perform reference based sequence assignment, followed by a post-processing analysis to produce reliable taxonomy annotation at species and strain level resolution. An in vitro bacterial mock community sample comprising 8 genera, 11 species and 12 strains was previously used to benchmark metagenomics classification methods. After applying a post-processing filter, we obtained 100% correct taxonomy assignments at species and genus level. Sensitivity and precision of 75% were obtained for strain level annotations. A comparison between MGmapper and Kraken at species level shows that MGmapper assigns taxonomy using 84.8% of the sequence reads, compared to 70.5% for Kraken, and both methods identified all species with no false positives. Extensive read count statistics are provided in plain text and Excel sheets for both rejected and accepted taxonomy annotations. The use of custom databases is possible for the command-line version of MGmapper, and the complete pipeline is freely available as a Bitbucket package (https://bitbucket.org/genomicepidemiology/mgmapper). A web version (https://cge.cbs.dtu.dk/services/MGmapper) provides the basic functionality for analysis of small fastq datasets.
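
The sensitivity and precision figures quoted in this abstract can be understood as set comparisons between predicted and expected taxa. The following is a minimal sketch under that assumption; the function name and strain labels are hypothetical, and it does not reproduce MGmapper's own post-processing or scoring.

```python
def precision_and_sensitivity(predicted, expected):
    """Compare predicted taxa against the known mock-community members.

    precision   = true positives / all predicted taxa
    sensitivity = true positives / all expected taxa
    """
    predicted, expected = set(predicted), set(expected)
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    sensitivity = true_positives / len(expected) if expected else 0.0
    return precision, sensitivity


# Hypothetical strain labels: three correct calls, one false positive, one miss.
expected_strains = {"strain_1", "strain_2", "strain_3", "strain_4"}
predicted_strains = {"strain_1", "strain_2", "strain_3", "strain_x"}

precision, sensitivity = precision_and_sensitivity(predicted_strains, expected_strains)
print(f"precision={precision:.2f}, sensitivity={sensitivity:.2f}")
```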

Crossref

Directory of Open Access Journals

Online Research Database In Technology

FigShare

PhylOTU: a high-throughput procedure quantifies microbial community diversity and resolves novel taxa from metagenomic data.

Author: Eisen Jonathan A
Green Jessica L
Kembel Steven W
Ladau Joshua
O'Dwyer James P
Pollard Katherine S
Riesenfeld Samantha J
Sharpton Thomas J
Publication venue: eScholarship, University of California
Publication date: 01/01/2011

Microbial diversity is typically characterized by clustering ribosomal RNA (SSU-rRNA) sequences into operational taxonomic units (OTUs). Targeted sequencing of environmental SSU-rRNA markers via PCR may fail to detect OTUs due to biases in priming and amplification. Analysis of shotgun sequenced environmental DNA, known as metagenomics, avoids amplification bias but generates fragmentary, non-overlapping sequence reads that cannot be clustered by existing OTU-finding methods. To circumvent these limitations, we developed PhylOTU, a computational workflow that identifies OTUs from metagenomic SSU-rRNA sequence data through the use of phylogenetic principles and probabilistic sequence profiles. Using simulated metagenomic data, we quantified the accuracy with which PhylOTU clusters reads into OTUs. Comparisons of PCR and shotgun sequenced SSU-rRNA markers derived from the global open ocean revealed that while PCR libraries identify more OTUs per sequenced residue, metagenomic libraries recover a greater taxonomic diversity of OTUs. In addition, we discover novel species, genera and families in the metagenomic libraries, including OTUs from phyla missed by analysis of PCR sequences. Taken together, these results suggest that PhylOTU enables characterization of part of the biosphere currently hidden from PCR-based surveys of diversity

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

mockrobiota: a Public Resource for Microbiome Bioinformatics Benchmarking.

Author: Arron Shiffer
Benjamin Wolfe
Corinne F. Maurice
J. Gregory Caporaso
Jai Ram Rideout
Josh D. Neufeld
Nicholas A. Bokulich
Peter J. Turnbaugh
Rachel J. Dutton
Rob Knight
William G. Mercurio
Publication venue: eScholarship, University of California
Publication date: 01/01/2016

Mock communities are an important tool for validating, optimizing, and comparing bioinformatics methods for microbial community analysis. We present mockrobiota, a public resource for sharing, validating, and documenting mock community data resources, available at http://caporaso-lab.github.io/mockrobiota/. The materials contained in mockrobiota include data set and sample metadata, expected composition data (taxonomy or gene annotations or reference sequences for mock community members), and links to raw data (e.g., raw sequence data) for each mock community data set. mockrobiota does not supply physical sample materials directly, but the data set metadata included for each mock community indicate whether physical sample materials are available. At the time of this writing, mockrobiota contains 11 mock community data sets with known species compositions, including bacterial, archaeal, and eukaryotic mock communities, analyzed by high-throughput marker gene sequencing. IMPORTANCE The availability of standard and public mock community data will facilitate ongoing method optimizations, comparisons across studies that share source data, and greater transparency and access and eliminate redundancy. These are also valuable resources for bioinformatics teaching and training. This dynamic resource is intended to expand and evolve to meet the changing needs of the omics community

Repository for Publications and Research Data

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Taxonomic classification method for metagenomics based on core protein families with Core-Kaiju

Author: Cosentino Lagomarsino Marco
Krogh Anders
Menzel Peter
Suweis Samir
Tovo Anna
Publication date: 01/01/2020

Characterizing species diversity and composition of bacteria hosted by biota is revolutionizing our understanding of the role of symbiotic interactions in ecosystems. Determining microbiome diversity implies the assignment of individual reads to taxa by comparison to reference databases. Although computational methods aimed at identifying microbial taxa are available, it is well known that inferences using different methods can vary widely depending on various biases. In this study, we first apply and compare different bioinformatics methods based on 16S ribosomal RNA gene and shotgun sequencing to three mock communities of bacteria, of which the compositions are known. We show that none of these methods can infer both the true number of taxa and their abundances. We thus propose a novel approach, named Core-Kaiju, which combines the power of shotgun metagenomics data with a more focused marker gene classification method similar to 16S, but based on emergent statistics of core protein domain families. We then test the proposed method on various mock communities and show that Core-Kaiju reliably predicts both number of taxa and abundances. Finally, we apply our method to human gut samples, showing how Core-Kaiju may give more accurate ecological characterization and a fresh view on real microbiomes.

AIR Universita degli studi di Milano

Copenhagen University Research Information System

Open Access Repository

Archivio istituzionale della ricerca - Università di Padova

Optimizing sequencing protocols for leaderboard metagenomics by combining long and short reads.

Author: Arthur Timothy D
Bankevich Anton
Boland Brigid S
Brennan Caitriona
Chang John T
Chen Feng
Conrad Douglas J
Dang Jason W
Dorrestein Pieter C
Fedarko Marcus
Gaffney James
Green Cliff
Humphrey Greg C
Jepsen Kristen
Khosroheidari Mahdieh
Knight Rob
Liyanage Marlon
Martino Cameron
Minich Jeremiah
Nurk Sergey
Pevzner Pavel A
Phelan Vanessa V
Quinn Robert A
Rana Tariq M
Salido Rodolfo A
Sandborn William J
Sanders Jon G
Sanders Karenina
Smarr Larry
Xu Zhenjiang Z
Zhu Qiyun
Publication venue: eScholarship, University of California
Publication date: 01/10/2019

As metagenomic studies move to increasing numbers of samples, communities like the human gut may benefit more from the assembly of abundant microbes in many samples, rather than the exhaustive assembly of fewer samples. We term this approach leaderboard metagenome sequencing. To explore protocol optimization for leaderboard metagenomics in real samples, we introduce a benchmark of library prep and sequencing using internal references generated by synthetic long-read technology, allowing us to evaluate high-throughput library preparation methods against gold-standard reference genomes derived from the samples themselves. We introduce a low-cost protocol for high-throughput library preparation and sequencing

eScholarship - University of California

Identifying accurate metagenome and amplicon software via a meta-analysis of sequence to taxonomy benchmarking studies

Author: Jenny L. Draper
Matthew B. Stott
Paul P. Gardner
Renee J. Watson
Robert D. Finn
Sergio E. Morales
Xochitl C. Morgan
Publication venue: 'PeerJ'
Publication date: 01/01/2019

Metagenomic and meta-barcode DNA sequencing has rapidly become a widely-used technique for investigating a range of questions, particularly related to health and environmental monitoring. There has also been a proliferation of bioinformatic tools for analysing metagenomic and amplicon datasets, which makes selecting adequate tools a significant challenge. A number of benchmark studies have been undertaken; however, these can present conflicting results. In order to address this issue we have applied a robust Z-score ranking procedure and a network meta-analysis method to identify software tools that are consistently accurate for mapping DNA sequences to taxonomic hierarchies. Based upon these results we have identified some tools and computational strategies that produce robust predictions
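
To make the idea of a robust Z-score ranking concrete, the sketch below uses one common definition: each value's deviation from the median, scaled by 1.4826 times the median absolute deviation (MAD). The abstract does not state the exact formula the authors used, so this definition, the function name, and the tool names and scores are all assumptions for illustration only.

```python
from statistics import median


def robust_z_scores(values):
    """Robust Z-scores: (x - median) / (1.4826 * MAD).

    The 1.4826 factor makes the MAD comparable to a standard deviation
    for normally distributed data. This is a common convention and is
    assumed here, not taken from the study.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; robust Z-scores are undefined")
    scale = 1.4826 * mad
    return [(v - med) / scale for v in values]


# Hypothetical accuracy scores for four tools in one benchmark.
scores = {"tool_a": 0.90, "tool_b": 0.80, "tool_c": 0.70, "tool_d": 0.20}

z = robust_z_scores(list(scores.values()))
z_by_tool = dict(zip(scores, z))
ranking = sorted(z_by_tool, key=z_by_tool.get, reverse=True)

for tool in ranking:
    print(f"{tool}: robust Z = {z_by_tool[tool]:+.2f}")
```

Because the median and MAD are insensitive to extreme values, a single poorly performing tool (here the hypothetical `tool_d`) does not distort the scores of the others as much as a mean-based Z-score would.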

Directory of Open Access Journals

Taxonomic classification of metagenomic sequences

Author: Gerlach Wolfgang
Publication venue: 'Verlag der Technischen Universitat Graz'
Publication date: 01/01/2012

Gerlach W. Taxonomic classification of metagenomic sequences. Bielefeld: Universität; 2012.

Bacteria, archaea and microeukaryotes can be found in almost every habitat present in nature, in particular in soil, sediments and sea water. They typically live in complex communities with different kinds of symbiotic associations, which include relationships with larger organisms like animals or plants. Examples are microbial communities in the gut or on the skin of animals and humans, or bacteria that live in symbiosis with plants. The vast majority of such microbes are unculturable and thus cannot be sequenced by means of traditional methods. The emerging discipline of metagenomics provides various in vivo and in silico tools to overcome this limitation. In particular, high-throughput sequencing techniques like 454 or Solexa-Illumina make it possible to explore those microbes by studying whole natural microbial communities and analysing their biological diversity as well as the underlying metabolic pathways. A current limitation of these technologies is that they can sequence only DNA fragments of a limited length. With this limitation it is usually not possible to recover complete microbial genomes. In addition, the DNA fragments are drawn randomly from the microbial communities and the exact species of origin is unknown. Over the past few years, different methods have been developed for the taxonomic and functional characterization of metagenomic shotgun sequences. However, the taxonomic classification of metagenomic sequences from novel species without close homologues in the biological sequence databases poses a challenge due to the high number of wrong taxonomic predictions on lower taxonomic ranks. In this thesis we present CARMA3, a novel method for the taxonomic classification of assembled and unassembled metagenomic sequences that has been adapted to work with both BLAST and HMMER3 homology searches. CARMA3 accepts protein-encoding DNA sequences, protein sequences, and 16S-rDNA sequences as input. In addition, we present WebCARMA, a web application for the analysis of protein-encoding DNA sequences with CARMA3 without the need for a local installation. We evaluate our novel method in different experiments using simulated and real shotgun metagenomes and show that CARMA3 makes fewer wrong taxonomic predictions (at the same sensitivity) than other BLAST-based methods. In the last experiment we show that even very short reads can, in principle, be used to describe the taxonomic content of a metagenome.

Publications at Bielefeld University

Future potential of metagenomics in clinical laboratories

Author: Cassidy Hayley
Couto Natacha
Peker Nilay
Rossen John W A
Schuele Leonard
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2021

INTRODUCTION: Rapid and sensitive diagnostic strategies are necessary for patient care and public health. Most of the current conventional microbiological assays detect only a restricted panel of pathogens at a time or require a microbe to be successfully cultured from a sample. Clinical metagenomics next-generation sequencing (mNGS) has the potential to unbiasedly detect all pathogens in a sample, increasing the sensitivity for detection and enabling the discovery of unknown infectious agents. AREAS COVERED: High expectations have been built around mNGS; however, this technique is far from widely available. This review highlights the advances and currently available options in terms of costs, turnaround time, sensitivity, specificity, validation, and reproducibility of mNGS as a diagnostic tool in clinical microbiology laboratories. EXPERT OPINION: The need for a novel diagnostic tool to increase the sensitivity of microbial diagnostics is clear. mNGS has the potential to revolutionise clinical microbiology. However, its role as a diagnostic tool has yet to be widely established, which is crucial for successfully implementing the technique. A clear definition of diagnostic algorithms that include mNGS is vital to show clinical utility. Similarly to real-time PCR, mNGS will one day become a vital tool in any testing algorithm

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen