19 research outputs found

    Comparative analysis of subtyping methods against a whole-genome-sequencing standard for Salmonella enterica serotype Enteritidis.

    A retrospective investigation was performed to evaluate whole-genome sequencing as a benchmark for comparing molecular subtyping methods for Salmonella enterica serotype Enteritidis and to survey the population structure of commonly encountered S. enterica serotype Enteritidis outbreak isolates in the United States. A total of 52 S. enterica serotype Enteritidis isolates representing 16 major outbreaks and three sporadic cases collected between 2001 and 2012 were sequenced and subjected to subtyping by four different methods: (i) whole-genome single-nucleotide-polymorphism typing (WGST), (ii) multiple-locus variable-number tandem-repeat (VNTR) analysis (MLVA), (iii) clustered regularly interspaced short palindromic repeats combined with multi-virulence-locus sequence typing (CRISPR-MVLST), and (iv) pulsed-field gel electrophoresis (PFGE). WGST resolved all outbreak clusters and provided robust phylogenetic inference with high epidemiological correlation. While both MLVA and CRISPR-MVLST yielded higher discriminatory power than PFGE, MLVA outperformed the other methods in delineating outbreak clusters, whereas CRISPR-MVLST showed the potential to trace major lineages and ecological origins of S. enterica serotype Enteritidis. Our results suggested that whole-genome sequencing makes a viable platform for the evaluation and benchmarking of molecular subtyping methods.

    Evaluation of whole and core genome multilocus sequence typing allele schemes for Salmonella enterica outbreak detection in a national surveillance network, PulseNet USA

    Salmonella enterica is a leading cause of bacterial foodborne and zoonotic illnesses in the United States. For this study, we applied four different whole genome sequencing (WGS)-based subtyping methods: high quality single-nucleotide polymorphism (hqSNP) analysis, whole genome multilocus sequence typing using either all loci [wgMLST (all loci)] or only chromosome-associated loci [wgMLST (chrom)], and core genome multilocus sequence typing (cgMLST) to a dataset of isolate sequences from 9 well-characterized Salmonella outbreaks. For each outbreak, we evaluated the genomic and epidemiologic concordance between hqSNP and allele-based methods. We first compared pairwise genomic differences using all four methods. We observed discrepancies in allele difference ranges when using wgMLST (all loci), likely caused by inflated genetic variation due to loci found on plasmids and/or other mobile genetic elements in the accessory genome. Therefore, we excluded wgMLST (all loci) results from any further comparisons in the study. Then, we created linear regression models and phylogenetic tanglegrams using the remaining three methods. K-means analysis using the silhouette method was applied to compare the ability of the three methods to partition outbreak and sporadic isolate sequences. Our results showed that pairwise hqSNP differences had high concordance with cgMLST and wgMLST (chrom) allele differences. The slopes of the regressions for hqSNP vs. allele pairwise differences were 0.58 (cgMLST) and 0.74 [wgMLST (chrom)], and the slope of the regression was 0.77 for cgMLST vs. wgMLST (chrom) pairwise differences. Tanglegrams showed high clustering concordance between methods using two statistical measures, the Baker’s gamma index (BGI) and cophenetic correlation coefficient (CCC), where 9/9 (100%) of outbreaks yielded BGI values ≥ 0.60 and CCCs were ≥ 0.97 across all nine outbreaks and all three methods. K-means analysis showed separation of outbreak and sporadic isolate groups with average silhouette widths ≥ 0.87 for outbreak groups and ≥ 0.16 for sporadic groups. This study demonstrates that Salmonella isolates clustered in concordance with epidemiologic data using three WGS-based subtyping methods and supports using cgMLST as the primary method for national surveillance of Salmonella outbreak clusters.
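
    The two quantities this abstract leans on, pairwise allele differences and silhouette widths, can be illustrated with a minimal sketch. It is not the study's pipeline: the isolate names, allele profiles, and group labels below are invented for illustration, and the silhouette widths are computed from known outbreak/sporadic labels rather than from a k-means partition as in the study.

```python
from itertools import combinations


def allele_distance(a, b):
    """Count loci where two allele profiles differ.

    Loci that are missing (None) in either profile are skipped.
    """
    return sum(
        1 for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


def pairwise_distances(profiles):
    """Build a symmetric distance table: dist[p][q] = allele differences."""
    names = sorted(profiles)
    dist = {n: {n: 0} for n in names}
    for p, q in combinations(names, 2):
        d = allele_distance(profiles[p], profiles[q])
        dist[p][q] = d
        dist[q][p] = d
    return dist


def silhouette_widths(dist, labels):
    """Silhouette width per isolate, s = (b - a) / max(a, b).

    a = mean distance to the other members of the isolate's own group,
    b = smallest mean distance to the members of any other group.
    """
    widths = {}
    groups = set(labels.values())
    for i in labels:
        own = [j for j in labels if labels[j] == labels[i] and j != i]
        if not own:
            widths[i] = 0.0  # convention for a singleton group
            continue
        a = sum(dist[i][j] for j in own) / len(own)
        b = min(
            sum(dist[i][j] for j in labels if labels[j] == g)
            / sum(1 for j in labels if labels[j] == g)
            for g in groups if g != labels[i]
        )
        widths[i] = 0.0 if max(a, b) == 0 else (b - a) / max(a, b)
    return widths


# Hypothetical five-locus allele profiles (illustrative values only).
profiles = {
    "iso1": [1, 1, 1, 1, 1],
    "iso2": [1, 1, 1, 1, 2],
    "iso3": [1, 1, 1, 1, 1],
    "iso4": [2, 3, 4, 5, 6],
    "iso5": [2, 3, 4, 5, 7],
}
labels = {
    "iso1": "outbreak", "iso2": "outbreak", "iso3": "outbreak",
    "iso4": "sporadic", "iso5": "sporadic",
}

dist = pairwise_distances(profiles)
widths = silhouette_widths(dist, labels)
for name in sorted(widths):
    print(name, labels[name], round(widths[name], 2))
```

    Widths near 1 indicate that an isolate sits much closer to its own group than to any other, which is the sense in which the abstract reports high average silhouette widths for outbreak groups.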

    Epidemiology of Salmonella enterica Serotype Dublin Infections among Humans, United States, 1968–2013

    Salmonella enterica serotype Dublin is a cattle-adapted bacterium that typically causes bloodstream infections in humans. To summarize demographic, clinical, and antimicrobial drug resistance characteristics of human infections with this organism in the United States, we analyzed data for 1968–2013 from 5 US surveillance systems. During this period, the incidence rate for infection with Salmonella Dublin increased more than that for infection with other Salmonella. Data from 1 system (FoodNet) showed that a higher percentage of persons with Salmonella Dublin infection were hospitalized and died during 2005–2013 (78% hospitalized, 4.2% died) than during 1996–2004 (68% hospitalized, 2.7% died). Susceptibility data showed that a higher percentage of isolates were resistant to >7 classes of antimicrobial drugs during 2005–2013 (50.8%) than during 1996–2004 (2.4%).

    Table_6_Evaluation of whole and core genome multilocus sequence typing allele schemes for Salmonella enterica outbreak detection in a national surveillance network, PulseNet USA.XLSX


    Data_Sheet_1_Evaluation of whole and core genome multilocus sequence typing allele schemes for Salmonella enterica outbreak detection in a national surveillance network, PulseNet USA.ZIP


    Table_4_Evaluation of whole and core genome multilocus sequence typing allele schemes for Salmonella enterica outbreak detection in a national surveillance network, PulseNet USA.XLSX


    Table_3_Evaluation of whole and core genome multilocus sequence typing allele schemes for Salmonella enterica outbreak detection in a national surveillance network, PulseNet USA.XLSX


    Table_2_Evaluation of whole and core genome multilocus sequence typing allele schemes for Salmonella enterica outbreak detection in a national surveillance network, PulseNet USA.XLSX
