6 research outputs found

    Species Identification and Profiling of Complex Microbial Communities Using Shotgun Illumina Sequencing of 16S rRNA Amplicon Sequences

    The high throughput and cost-effectiveness afforded by short-read sequencing technologies, in principle, enable researchers to perform 16S rRNA profiling of complex microbial communities at unprecedented depth and resolution. Existing Illumina sequencing protocols are, however, limited by the fraction of the 16S rRNA gene that is interrogated and therefore limit the resolution and quality of the profiling. To address this, we present the design of a novel protocol for shotgun Illumina sequencing of the bacterial 16S rRNA gene, optimized to capture more than 90% of sequences in the Greengenes database and with nearly twice the resolution of existing protocols. Using several in silico and experimental datasets, we demonstrate that despite the presence of multiple variable and conserved regions, the resulting shotgun sequences can be used to accurately quantify the diversity of complex microbial communities. The reconstruction of a significant fraction of the 16S rRNA gene also enabled high precision (>90%) in species-level identification, thereby opening up potential application of this approach for clinical microbial characterization. Comment: 17 pages, 2 tables, 2 figures, supplementary material

    Evaluation of EMIRGE, modQIIME and RTAX on different datasets.

    Precision and recall rates for the “Oral”, “Gut”, “Complex” and ABC33 datasets using EMIRGE, modQIIME and RTAX at a 0.1% relative abundance threshold. The percentage of sequences/OTUs removed because of the abundance threshold is given in parentheses for each method.
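As a rough illustration of the evaluation described in this caption, the sketch below computes precision and recall after applying a relative abundance threshold. The function name, the input shapes and the exact definitions (taxa at or above the threshold count as predictions; the reference is a set of taxa known to be present) are assumptions made for illustration, not the procedure used in the paper.

```python
def precision_recall(predicted, truth, threshold=0.001):
    """Precision/recall after an abundance filter (hypothetical helper).

    predicted: dict mapping taxon name -> estimated relative abundance (fraction).
    truth: set of taxon names known to be present in the sample.
    threshold: minimum relative abundance to keep a prediction (0.001 = 0.1%).
    Returns (precision, recall, percent_removed_by_threshold).
    """
    # Keep only predictions at or above the abundance threshold.
    kept = {taxon for taxon, abundance in predicted.items() if abundance >= threshold}

    # Percentage of predictions discarded by the threshold.
    removed_pct = 100.0 * (len(predicted) - len(kept)) / len(predicted) if predicted else 0.0

    true_positives = len(kept & truth)
    precision = true_positives / len(kept) if kept else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall, removed_pct
```

With a toy input such as `{"A": 0.6, "B": 0.3995, "C": 0.0005}` and a reference set `{"A", "B", "D"}`, taxon C falls below the 0.1% threshold and is dropped before precision and recall are computed.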

    In silico evaluation of 16S rRNA PCR primers.

    A) Percentage of sequences matching individual primers, with the top two primers highlighted in boxes. B) Percentage of sequences amplifiable by various primer pairs (338F*/1061R is the best pair). Percentage of matched sequences is measured against the Greengenes 16S rRNA sequence database. See Table S4 in File S1 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0060811#pone.0060811.s001) for primer sequences and results measured against the RDP and SILVA databases. Primer numbering is based on the E. coli system of nomenclature as in Brosius et al. [37] and for simplicity the same name (say 784F) is used for both forward and reverse primers at a given position.
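A minimal sketch of how the percentage of database sequences matching a primer could be computed in silico. It assumes exact matching with IUPAC degenerate bases, uppercase DNA strings, and forward-strand primers only (reverse primers would first need reverse-complementing). The matching criteria actually used in the paper are not stated in this caption, and the helper names are hypothetical.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_matches(primer, sequence):
    """Return True if the (possibly degenerate) primer occurs anywhere in sequence."""
    n = len(primer)
    for start in range(len(sequence) - n + 1):
        # Every primer position must allow the base found in the sequence.
        if all(sequence[start + j] in IUPAC[primer[j]] for j in range(n)):
            return True
    return False


def percent_matching(primer, sequences):
    """Percentage of sequences that contain at least one exact primer match."""
    if not sequences:
        return 0.0
    hits = sum(primer_matches(primer, seq) for seq in sequences)
    return 100.0 * hits / len(sequences)
```

For a primer pair, one plausible extension is to count a sequence as amplifiable only when both primers match; that choice is likewise an assumption rather than something the caption specifies.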

    Community composition based on 16S rRNA sequence reconstruction using EMIRGE.

    A) Correlation between known and estimated relative abundances of predicted species on three in silico datasets. A log-scaled version of this plot can be seen in Figure S1 in File S1 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0060811#pone.0060811.s001). B) Composition at the phylum level for the throat swab and stool sequencing datasets.

    Species- and genus-level resolution of various sequencing approaches.

    Resolution was measured by the number of OTUs/clusters produced using UCLUST [21] at the species (97% identity) and genus level (95% identity) for 16S rRNA sequences in the Greengenes database, based on various end-sequencing (76 bases in length from either the 5′ or 3′ end) and shotgun-sequencing approaches and primer combinations. A higher OTU/cluster number indicates a theoretically higher level of resolution for taxonomic classification. The numbers in parentheses provide the purity of clusters as measured by the percentage of clusters with homogeneous taxonomy assignments in Greengenes. Entries with the highest resolution and/or purity for each sequencing approach are marked in bold. The primer sequences can be found in Table S4 in File S1 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0060811#pone.0060811.s001).
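The purity measure described in this caption (percentage of clusters whose members all share one taxonomy assignment) can be sketched as follows. The input format, a list of (cluster ID, taxonomy label) pairs, and the function name are assumptions for illustration; the clustering itself (UCLUST in the paper) is not reproduced here.

```python
from collections import defaultdict


def cluster_purity(assignments):
    """Percentage of clusters with a single, homogeneous taxonomy label.

    assignments: iterable of (cluster_id, taxonomy_label) pairs, one per sequence.
    """
    labels_per_cluster = defaultdict(set)
    for cluster_id, taxonomy in assignments:
        labels_per_cluster[cluster_id].add(taxonomy)

    if not labels_per_cluster:
        return 0.0

    # A cluster is "pure" when all of its sequences carry the same label.
    pure = sum(1 for labels in labels_per_cluster.values() if len(labels) == 1)
    return 100.0 * pure / len(labels_per_cluster)
```

Resolution in the caption's sense would then simply be the number of distinct cluster IDs, with purity reported alongside it.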