Search CORE

2,057 research outputs found

UCHIME improves sensitivity and speed of chimera detection

Author: Clemente Jose C.
Edgar Robert C.
Haas Brian J.
Knight Rob
Quince Christopher
Publication venue: Oxford University Press
Publication date: 23/06/2011

Motivation: Chimeric DNA sequences often form during polymerase chain reaction amplification, especially when sequencing single regions (e.g. 16S rRNA or fungal Internal Transcribed Spacer) to assess diversity or compare populations. Undetected chimeras may be misinterpreted as novel species, causing inflated estimates of diversity and spurious inferences of differences between populations. Detection and removal of chimeras is therefore of critical importance in such experiments.

Crossref

PubMed Central

eScholarship - University of California

Enlighten

University of East Anglia digital repository

Table 2: Paired-end reads merging performance.

Author: Altschul
Burge
Caporaso
Cock
DeSantis
Eastlake
Edgar
Edgar
Edgar
Edgar
Fowler
Gailly
Gilbert
Gusfield
He
Hirschberg
Hubert
Human Microbiome Project Consortium
Karsenti
Li
Logares
MacCallum
Mahé
Masella
Myers
Needleman
Nichols
Quast
Rand
Rivest
Rockström
Rognes
Schirmer
Schloss
Schloss
Seward
Song
Steffen
Westcott
Zhang
Publication venue: 'PeerJ'
Publication date

Crossref

TaxMan : a server to trim rRNA reference databases and inspect taxonomic coverage

Author: Ashelford
B. W. Brandt
Clarridge
Cole
Crielaard
DeSantis
E. Zaura
Griffen
Huse
Huse
M. J. Bonder
Rice
S. M. Huse
Schloss
Schloss
Srinivasan
Zhou
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2012

© The Author(s), 2012. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Nucleic Acids Research 40 (2012): W82-W87, doi:10.1093/nar/gks418.

Amplicon sequencing of the hypervariable regions of the small subunit ribosomal RNA gene is a widely accepted method for identifying the members of complex bacterial communities. Several rRNA gene sequence reference databases can be used to assign taxonomic names to the sequencing reads using BLAST, USEARCH, GAST or the RDP classifier. Next-generation sequencing methods produce ample reads, but they are short, currently ∼100–450 nt (depending on the technology), as compared to the full rRNA gene of ∼1550 nt. It is important, therefore, to select the right rRNA gene region for sequencing. The primers should amplify the species of interest and the hypervariable regions should differentiate their taxonomy. Here, we introduce TaxMan: a web-based tool that trims reference sequences based on user-selected primer pairs and returns an assessment of the primer specificity by taxa. It allows interactive plotting of taxa, both amplified and missed in silico by the primers used. Additionally, using the trimmed sequences improves the speed of sequence matching algorithms. The smaller database greatly improves run times (up to 98%) and memory usage, not only of similarity searching (BLAST), but also of chimera checking (UCHIME) and of clustering the reads (UCLUST). TaxMan is available at http://www.ibi.vu.nl/programs/taxmanwww/.

University of Amsterdam under the research priority area ‘Oral Infections and Inflammation’ (to B.W.B.); National Science Foundation [NSF/BDI 0960626 to S.M.H.]; the European Union Seventh Framework Programme (FP7/2007-2013) under ANTIRESDEV grant agreement no. 241446 (to E.Z.).
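
The primer-based trimming described in the TaxMan abstract above can be illustrated with a minimal sketch. This is not TaxMan's implementation: the function names are hypothetical, matching is exact (degenerate IUPAC bases are expanded, but no mismatches are tolerated), and the reverse primer is assumed to be given 5'→3' on the opposite strand.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also covers the degenerate codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse-complement a (possibly degenerate) DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Turn a primer with IUPAC codes into a regular expression."""
    return "".join(IUPAC[base] for base in primer.upper())


def trim_to_amplicon(reference, forward, reverse):
    """Return the reference region between the two primer sites.

    Returns None when either primer has no exact match, i.e. the
    reference would be "missed in silico" by this primer pair.
    """
    ref = reference.upper().replace("U", "T")  # accept RNA-style input
    fwd = re.search(primer_regex(forward), ref)
    if fwd is None:
        return None
    # The reverse primer binds the opposite strand, so its reverse
    # complement is what appears downstream in the reference.
    downstream = ref[fwd.end():]
    rev = re.search(primer_regex(reverse_complement(reverse)), downstream)
    if rev is None:
        return None
    return downstream[:rev.start()]


# Hypothetical toy reference and primers, for illustration only.
reference = "TTTTACGTACGGGCCCAAAGATTACACCCC"
amplicon = trim_to_amplicon(reference, "ACGTRC", "TGTAATC")
```

Running the same function over every sequence in a reference database and counting the `None` results per taxon would give a rough analogue of the per-taxon primer coverage the abstract mentions.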

Crossref

VU Research Portal

Woods Hole Open Access Server

PubMed Central

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Reducing the Effects of PCR Amplification and Sequencing Artifacts on 16S rRNA-Based Studies

Author: Dirk Gevers
Jack Anthony Gilbert
Patrick D. Schloss
Sarah L. Westcott
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/07/2011

The advent of next generation sequencing has coincided with a growth in interest in using these approaches to better understand the role of the structure and function of the microbial communities in human, animal, and environmental health. Yet, use of next generation sequencing to perform 16S rRNA gene sequence surveys has resulted in considerable controversy surrounding the effects of sequencing errors on downstream analyses. We analyzed 2.7×10^6 reads distributed among 90 identical mock community samples, which were collections of genomic DNA from 21 different species with known 16S rRNA gene sequences; we observed an average error rate of 0.0060. To improve this error rate, we evaluated numerous methods of identifying bad sequence reads, identifying regions within reads of poor quality, and correcting base calls and were able to reduce the overall error rate to 0.0002. Implementation of the PyroNoise algorithm provided the best combination of error rate, sequence length, and number of sequences. Perhaps more problematic than sequencing errors was the presence of chimeras generated during PCR. Because we knew the true sequences within the mock community and the chimeras they could form, we identified 8% of the raw sequence reads as chimeric. After quality filtering the raw sequences and using the Uchime chimera detection program, the overall chimera rate decreased to 1%. The chimeras that could not be detected were largely responsible for the identification of spurious operational taxonomic units (OTUs) and genus-level phylotypes. The number of spurious OTUs and phylotypes increased with sequencing effort indicating that comparison of communities should be made using an equal number of sequences.
Finally, we applied our improved quality-filtering pipeline to several benchmarking studies and observed that even with our stringent data curation pipeline, biases in the data generation pipeline and batch effects were observed that could potentially confound the interpretation of microbial community data.

National Institutes of Health (U.S.) (1R01HG005975-01); National Science Foundation (U.S.) (award #0743432); National Institutes of Health (U.S.) (grant NIHU54HG004969

Public Library of Science (PLOS)

DSpace@MIT

Crossref

Directory of Open Access Journals

PubMed Central

Reconstruction of Ribosomal RNA Genes from Metagenomic Data

Author: A Brady
A Stamatakis
AC McHardy
B Liu
BJ Haas
C Pedrós-Alió
C Quince
C Quince
C Simon
CS Miller
D McDonald
D Wu
DB Rusch
DH Huson
E Pruesse
Francisco Rodriguez-Valera
G Talavera
H Teeling
I Saeed
J Bengtsson
J Peterson
JC Venter
JF Siqueira
JG Caporaso
K Mavromatis
KE Ashelford
KE Ashelford
KE McElroy
Kerensa McElroy
L Fan
Lu Fan
M Liu
M Margulies
M Stark
M Wu
ML Sogin
MW Taylor
MW Taylor
MW Taylor
NR Pace
PD Schloss
PD Schloss
PDA Schloss Gevers
PY Yung
Q Wang
R Radax
R Schmieder
RC Edgar
S Hong
S Schmitt
SE Dowd
SG Tringe
T Huber
T Thomas
T Thomas
TJ Sharpton
Torsten Thomas
Publication venue: Public Library of Science
Publication date: 01/06/2012

Direct sequencing of environmental DNA (metagenomics) has a great potential for describing the 16S rRNA gene diversity of microbial communities. However, current approaches using this 16S rRNA gene information to describe community diversity suffer from low taxonomic resolution or chimera problems. Here we describe a new strategy that involves stringent assembly and data filtering to reconstruct full-length 16S rRNA genes from metagenomic pyrosequencing data. Simulations showed that reconstructed 16S rRNA genes provided a true picture of the community diversity, had minimal rates of chimera formation and gave taxonomic resolution down to genus level. The strategy was furthermore compared to PCR-based methods to determine the microbial diversity in two marine sponges. This showed that about 30% of the abundant phylotypes reconstructed from metagenomic data failed to be amplified by PCR. Our approach is readily applicable to existing metagenomic datasets and is expected to lead to the discovery of new microbial phylotypes.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

University of Queensland eSpace

FigShare

Bacterial diversity assessment in Antarctic terrestrial and aquatic microbial mats : a comparison between bidirectional pyrosequencing and cultivation

Author: D'hondt Sofie
De Meyer Tim
De Wever Aaike
Obbels Dagmar
Peeters Karolien
Tytgat Bjorn
Van Criekinge Wim
Verleyen Elie
Vyverman Wim
Willems Anne
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2014

The application of high-throughput sequencing of the 16S rRNA gene has increased the size of microbial diversity datasets by several orders of magnitude, providing improved access to the rare biosphere compared with cultivation-based approaches and more established cultivation-independent techniques. By contrast, cultivation-based approaches allow the retrieval of both common and uncommon bacteria that can grow in the conditions used and provide access to strains for biotechnological applications. We performed bidirectional pyrosequencing of the bacterial 16S rRNA gene diversity in two terrestrial and seven aquatic Antarctic microbial mat samples previously studied by heterotrophic cultivation. While, not unexpectedly, 77.5% of genera recovered by pyrosequencing were not among the isolates, 25.6% of the genera picked up by cultivation were not detected by pyrosequencing. To allow comparison between both techniques, we focused on the five phyla (Proteobacteria, Actinobacteria, Bacteroidetes, Firmicutes and Deinococcus-Thermus) recovered by heterotrophic cultivation. Four of these phyla were among the most abundantly recovered by pyrosequencing. Strikingly, there was relatively little overlap between cultivation and the forward and reverse pyrosequencing-based datasets at the genus (17.1–22.2%) and OTU (3.5–3.6%) level (defined on a 97% similarity cut-off level). Comparison of the V1–V2 and V3–V2 datasets of the 16S rRNA gene revealed remarkable differences in number of OTUs and genera recovered. The forward dataset missed 33% of the genera from the reverse dataset despite comprising 50% more OTUs, while the reverse dataset did not contain 40% of the genera of the forward dataset. Similar observations were evident when comparing the forward and reverse cultivation datasets. Our results indicate that the region under consideration can have a large impact on perceived diversity, and should be considered when comparing different datasets. 
Finally, a high number of OTUs could not be classified using the RDP reference database, suggesting the presence of a large amount of novel diversity.

Ghent University Academic Bibliography

Directory of Open Access Journals

PubMed Central

Open Marine Archive

FigShare

Analytical Tools and Databases for Metagenomics in the Next-Generation Sequencing Era

Author: 김민철
김봉수
윤석환
이기현
이하나
천종식
Publication venue: 'Korea Genome Organization'
Publication date: 01/09/2013

Metagenomics has become one of the indispensable tools in microbial ecology over the last few decades, and a new revolution in metagenomic studies is now about to begin, with the help of recent advances in sequencing techniques. The massive data production and substantial cost reduction in next-generation sequencing have led to the rapid growth of metagenomic research both quantitatively and qualitatively. It is evident that metagenomics will be a standard tool for studying the diversity and function of microbes in the near future, as fingerprinting methods were previously. As the speed of data accumulation is accelerating, bioinformatic tools and associated databases for handling those datasets have become more urgent and necessary. To facilitate the bioinformatics analysis of metagenomic data, we review some recent tools and databases that are used widely in this field and give insights into the current challenges and future of metagenomics from a bioinformatics perspective.

SNU Open Repository and Archive

Interpreting 16S metagenomic data without clustering to achieve sub-OTU resolution

Author: A Klindworth
A Shade
A Shade
AM Eren
BJ Haas
C Huttenhower
C Lozupone
C Quince
C Quince
DE Hunt
DN Fredricks
EK Costello
EK Costello
H Ochman
JG Caporaso
JG Caporaso
JI Prosser
JJ Faith
JL VandeWalle
JR Brestoff
M Hamady
MGI Langille
Mikhail Tikhonov
MJ Morgan
MJ Rosen
N Fierer
N Kamada
ND Youngblut
Ned S Wingreen
O Lukjancenko
PD Schloss
PD Schloss
PD Schloss
PJ Turnbaugh
RC Edgar
RC Edgar
RC Edgar
Robert W Leach
SJ Song
SM Huse
SP Preheim
TP Tourova
V Kunin
WJ Sul
Y Huang
ZJ Zheng
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/07/2014

The standard approach to analyzing 16S tag sequence data, which relies on clustering reads by sequence similarity into Operational Taxonomic Units (OTUs), underexploits the accuracy of modern sequencing technology. We present a clustering-free approach to multi-sample Illumina datasets that can identify independent bacterial subpopulations regardless of the similarity of their 16S tag sequences. Using published data from a longitudinal time-series study of human tongue microbiota, we are able to resolve within standard 97% similarity OTUs up to 20 distinct subpopulations, all ecologically distinct but with 16S tags differing by as little as 1 nucleotide (99.2% similarity). A comparative analysis of oral communities of two cohabiting individuals reveals that most such subpopulations are shared between the two communities at 100% sequence identity, and that dynamical similarity between subpopulations in one host is strongly predictive of dynamical similarity between the same subpopulations in the other host. Our method can also be applied to samples collected in cross-sectional studies and can be used with the 454 sequencing platform. We discuss how the sub-OTU resolution of our approach can provide new insight into factors shaping community assembly.

Comment: Updated to match the published version. 12 pages, 5 figures + supplement. Significantly revised for clarity, references added, results not change

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

PubMed Central