Additional file 1: of MICRA: an automatic pipeline for fast characterization of microbial genomes from high-throughput sequencing data
Supplementary figures, notes, and tables. (PDF 3297 kb)
Comparison of F-measures between the 200(V3) and 400(V4-V5) amplicon at the family level (left) and at the genus level (right) on the HC 50k dataset with error simulation.
Schematic overview of the evaluation protocol.
Distinctions between clustering-first and assignment-first approaches.
A question mark indicates an unclassified read and/or taxon.
Chao1 values before taxonomic merging for clustering-first pipelines, and at the family level after taxonomic merging for all pipelines, at three different complexities, on the 50k 200(V3) with error simulation datasets.
LC, MC and HC were all composed of 50 bacterial families, in varying proportions.
Comparison of the richness (Chao1) and diversity (Inverse Simpson) indexes for clustering-first pipelines before taxonomic merging, on the 200(V3) HC dataset with sequencing errors simulation when generating 25k, 50k and 100k sequences.
Proportions of the top 10 families per pipeline on the LC, MC and HC 50k 200(V3) with error simulation datasets, and their matching 1-NID clustering indexes (computed after taxonomic merging) at the genus and family levels.
F-measure and richness index error percentage after taxonomic merging for each pipeline on the 200(V3) 50k HC dataset with error simulation, when using different databases (the recommended database for each pipeline is marked with *).
Comparison of F-measures (top) and richness error (bottom) in the error-free and error-prone sequencing models on the 200(V3) HC 50k dataset.
Proportions of the top 10 families per pipeline on a real dataset, and their matching Chao1 diversity indexes (computed after taxonomic merging) at the family level.
Below, average-linkage hierarchical clustering of all pipelines, based on Euclidean distances computed on the number of reads per family per pipeline (excluding unclassified reads). Pipelines are marked with a * when executed with their default database.