Files for taxonomy comparison
Synthetic mock dataset used to compare taxonomy assignment across multiple read lengths. The file includes the scripts used and the raw validation data.
Files for validation of beta diversity using 16S rDNA regions V6 to V9
Synthetic mock datasets (100 replicates) used for validation of beta diversity across synthetic communities using 16S rDNA regions V6 to V9. The file also includes the scripts and the resulting raw data for the validation.
Files for validation of beta diversity using 16S rDNA regions V3 to V5
Synthetic mock datasets (100 replicates) used for validation of beta diversity across synthetic communities using 16S rDNA regions V3 to V5. The file also includes the scripts and the resulting raw data for the validation.
Comparison of alignment tools.
<p>Plot of the gamma log-likelihoods of 100 trees created from paired reads selected from the Greengenes 13_5 database and aligned using PyNAST and Infernal version 1.1. Likelihoods of the trees created using Infernal are significantly better than those of the trees created with PyNAST (<i>p</i> < 0.0001, Wilcoxon signed-rank test), strongly suggesting that Infernal produces better-quality alignments than PyNAST for the same input reads.</p>
Library comparison in a realistic sample.
<p>Mantel <i>r</i> statistic comparing the unweighted UniFrac matrices for different short-read 16S libraries against a full-length 16S library, based on a real paired-end library sequenced from stool samples. The paired reads have a higher correlation to the full-length library than any of the single-read libraries.</p>
Files for validation with realistic synthetic reads
Synthetic mock datasets used for validation, based on a realistic human stool microbiome dataset. These are NOT real bacterial reads; the real reads are available as example data from the IM-TORNADO pipeline. The file includes scripts and raw validation data.
Taxonomy comparison.
<p>Comparison of the accuracy of taxonomy calls using paired reads, read 1 (R1), read 2 (R2), and full-length 16S rDNA. Analyses were carried out using the Ribosomal Database Project taxonomy classifier with the complete Greengenes database. Paired-read analysis provides more accurate taxonomic classification than either R1 or R2 alone across all taxonomic levels. Full-length 16S rDNA was used for comparison purposes.</p>
Files for taxonomy comparison using reads with errors
Synthetic mock datasets (100 replicates) used for validation of taxonomy using reads with errors and of different lengths. The file also includes the scripts and the resulting raw data for the validation.
Comparison of <i>β</i>-diversity between libraries.
<p>Plot of a Mantel correlation test comparing unweighted UniFrac distance matrices created using synthetic mock communities from paired, R1, and R2 reads (for both the V3–V5 and V6–V9 pairs) versus the distance matrix created from the corresponding full-length 16S synthetic mock communities. A higher correlation value means the distance matrices, and hence their <i>β</i>-diversity, are more closely related to those of the full-length communities. Here, the communities from paired reads are significantly closer to the full-length communities than the R1 and R2 communities (<i>p</i> < 0.0001, Wilcoxon signed-rank test, using 100 synthetic mock communities), strongly suggesting that using paired reads leads to results closer to those obtained from full-length reads, even when the chosen primers create non-overlapping reads.</p>
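The Mantel comparison described in this caption can be illustrated with a minimal Python sketch. It is not the authors' script (those are in the accompanying files). The function names and the two 4×4 toy matrices (`full_length`, `paired`) are hypothetical stand-ins for UniFrac distance matrices, and the Pearson-based statistic with a one-sided permutation p-value is an assumed, common formulation of the test.

```python
import random
from math import sqrt


def upper_triangle(matrix, order):
    """Above-diagonal entries of a square matrix whose rows and columns
    are both reindexed by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (Mantel r, one-sided permutation p-value) for two symmetric
    distance matrices computed over the same samples in the same order."""
    n = len(dist_a)
    identity = list(range(n))
    flat_b = upper_triangle(dist_b, identity)
    r_observed = pearson(upper_triangle(dist_a, identity), flat_b)

    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(permutations):
        # Shuffle the sample labels of one matrix (rows and columns together).
        order = identity[:]
        rng.shuffle(order)
        if pearson(upper_triangle(dist_a, order), flat_b) >= r_observed:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return r_observed, p_value


# Hypothetical toy distance matrices for four samples (not data from the files).
full_length = [
    [0.0, 0.2, 0.7, 0.8],
    [0.2, 0.0, 0.6, 0.9],
    [0.7, 0.6, 0.0, 0.3],
    [0.8, 0.9, 0.3, 0.0],
]
paired = [
    [0.0, 0.25, 0.65, 0.8],
    [0.25, 0.0, 0.6, 0.85],
    [0.65, 0.6, 0.0, 0.35],
    [0.8, 0.85, 0.35, 0.0],
]

r, p = mantel(paired, full_length)
print(f"Mantel r = {r:.3f}, permutation p = {p:.3f}")
```

In the setting the caption describes, one such <i>r</i> value would be computed per library type and per synthetic community, and the resulting sets of <i>r</i> values would then be compared between library types.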
Files for comparison of sequence aligners
Synthetic mock datasets used to compare two multiple sequence aligners: cmalign (from the Infernal package) and the NAST algorithm (as implemented in PyNAST). The file includes R scripts and the resulting raw validation data.