Files for validation of beta diversity using 16S rDNA regions V6 to V9
Synthetic mock datasets (100 replicates) used for validation of beta diversity across synthetic communities using 16S rDNA regions V6 to V9. File also includes scripts and resulting raw data for the validation.
Files for validation of beta diversity using 16S rDNA regions V3 to V5
Synthetic mock datasets (100 replicates) used for validation of beta diversity across synthetic communities using 16S rDNA regions V3 to V5. File also includes scripts and resulting raw data for the validation.
Files for taxonomy comparison
Synthetic mock dataset used to compare taxonomy assignment across multiple read lengths. File includes the scripts used and raw validation data.
Additional file 2: Table S1. of An expansion of rare lineage intestinal microbes characterizes rheumatoid arthritis
Association of clinical variables with gut microbial diversity. The association of α-diversity was assessed by a linear model and the association of β-diversity was assessed by PERMANOVA. Table S2. Differential abundance analysis of phylum, family, and genus-level taxa. Table S3. Summary statistics of differentially abundant taxa at a false discovery rate of 15 %. Table S4. Differential abundance analysis results of major KEGG pathways. Table S5. Importance scores of the genera determined by the random forests algorithm. Table S6. Abundance of metabolites in RA and relatives. Mean and standard deviation are given. (XLSX 36 kb)
Library comparison in a realistic sample.
<p>Mantel <i>r</i> statistic comparing the unweighted UniFrac distance matrices for different short-read 16S libraries against a full-length 16S library, based on a real paired-end library sequenced from stool samples. The paired reads correlate more strongly with the full-length library than any of the single-read libraries.</p>
Files for validation with realistic synthetic reads
Synthetic mock datasets used for validation, based on a realistic human stool microbiome dataset. These are NOT real bacterial reads; the real reads are available as example data from the IM-TORNADO pipeline. File includes scripts and raw validation data.
Comparison of <i>β</i>-diversity between libraries.
<p>Plot of a Mantel correlation test comparing unweighted UniFrac distance matrices created using synthetic mock communities from paired, R1 and R2 reads (for both the V3–V5 and V6–V9 primer pairs) versus the distance matrix created from the corresponding full-length 16S synthetic mock communities. A higher correlation value means the distance matrices, and hence their <i>β</i>-diversity, are more closely related to those of the full-length communities. Here, the communities from paired reads are significantly closer to the full-length communities than the R1 and R2 communities (<i>p</i> &lt; 0.0001, Wilcoxon signed-rank test, using 100 synthetic mock communities), strongly suggesting that using paired reads leads to results closer to those obtained from full-length reads, even when the chosen primers create non-overlapping reads.</p>
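The Mantel comparison described in this caption can be illustrated with a minimal sketch. It is not the authors' script (those are in the validation files listed here); it assumes a Pearson-based, one-sided permutation test, and the function names and the toy matrices are invented for illustration.

```python
import random
from itertools import combinations


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def mantel(d1, d2, permutations=999, seed=0):
    """Return (r, p) for a one-sided Mantel test between two distance matrices.

    Rows and columns of d1 are permuted together; p is the fraction of
    permutations (plus the observed case) with a correlation >= the observed r.
    """
    rng = random.Random(seed)
    n = len(d1)
    flat2 = upper_triangle(d2)
    r_obs = pearson(upper_triangle(d1), flat2)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        permuted = [[d1[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), flat2) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Toy data (hypothetical): a "full-length" matrix and a "paired-read" matrix
# whose distances are exactly twice as large, so the two correlate perfectly.
full_length = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
paired = [[2 * value for value in row] for row in full_length]
r, p = mantel(paired, full_length)
print(r, p)
```

In the setting of the caption, one such <i>r</i> value would be computed per synthetic community and library type, and the resulting sets of values compared with a paired test such as the Wilcoxon signed-rank test.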
Files for taxonomy comparison using reads with errors
Synthetic mock datasets (100 replicates) used for validation of taxonomy using reads with errors and different lengths. File also includes scripts and resulting raw data for the validation.
Files for comparison of sequence aligners
Synthetic mock datasets used to compare the multiple sequence aligner cmalign (from the Infernal package) with the NAST algorithm (as implemented in PyNAST). File includes R scripts and resulting raw validation data.
Comparison of phylogenetic trees between libraries.
<p>Plot of a Mantel correlation test comparing cophenetic distance matrices calculated from phylogenetic trees created using paired, R1 and R2 reads (for both the V3–V5 and V6–V9 primer pairs) versus the distance matrix created from the corresponding full-length 16S trees. A higher correlation value means the trees are more closely related to the full-length trees. Here, the paired trees are significantly closer to the full-length trees than the R1 and R2 trees (<i>p</i> &lt; 0.0001, Wilcoxon signed-rank test, using 100 synthetic mock communities), strongly suggesting that using paired reads leads to phylogenies closer to those obtained from full-length reads, even when the chosen primers create non-overlapping reads.</p>
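As a rough illustration of the cophenetic distances mentioned in this caption, the sketch below computes the pairwise tip-to-tip path length (the sum of branch lengths through the most recent common ancestor) on a small rooted tree. The tree encoding, the function names, and the toy tree are assumptions made for this example; the source does not state which tool produced the matrices.

```python
from itertools import combinations


def distances_to_ancestors(node, parent):
    """Map each ancestor of `node` (and `node` itself) to its path length from `node`.

    `parent` maps a child name to a (parent name, branch length) tuple;
    the root is the only node without an entry.
    """
    dist = {node: 0.0}
    total = 0.0
    while node in parent:
        up, length = parent[node]
        total += length
        dist[up] = total
        node = up
    return dist


def cophenetic(leaves, parent):
    """Return {(a, b): distance} for every pair of leaves."""
    result = {}
    for a, b in combinations(leaves, 2):
        from_a = distances_to_ancestors(a, parent)
        from_b = distances_to_ancestors(b, parent)
        # Among shared ancestors, the most recent one gives the shortest path.
        result[(a, b)] = min(from_a[n] + from_b[n] for n in from_a if n in from_b)
    return result


# Toy tree (hypothetical): ((A:0.5, B:0.25)X:1.0, C:2.0)root
toy_parent = {
    "A": ("X", 0.5),
    "B": ("X", 0.25),
    "X": ("root", 1.0),
    "C": ("root", 2.0),
}
toy_distances = cophenetic(["A", "B", "C"], toy_parent)
print(toy_distances)
```

Arranged as a square matrix, distances like these are what a Mantel test would correlate between a short-read tree and the full-length tree.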