4 research outputs found
Sample description with microsatellite dataset
Sample description (sheet 1) with microsatellite dataset (genotypes in sheet 2, allelic frequencies in sheet 3, genetic diversity estimates in sheet 3). The locality label is composed of the habitat type (M=marina, R=natural rocky reef, F=cultivated population), the bay number code, and the year of sampling, as shown in Fig. 1 of the associated paper. The geographic names of each locality and bay are detailed in Table 1 of the associated paper.
RADseq VCF file (Undaria pinnatifida - Brittany)
VCF file with the ddRAD-seq dataset. This dataset comprises 10,615 single-SNP loci polymorphic in a sample of 735 Undaria pinnatifida sporophytes originating from 36 temporal or spatial samples. Individual codes are described in sheet 1 of the associated microsatellite dataset (see Excel file).
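As a quick sanity check on a file of this kind, the sample and locus counts quoted above can be compared with what the VCF itself contains. The sketch below is a minimal, standard-library-only illustration that assumes an uncompressed, tab-delimited VCF. The function name `summarize_vcf` and the two-individual toy input are hypothetical and are not taken from the deposited dataset.

```python
import io


def summarize_vcf(handle):
    """Count samples, records, and single-base SNP records in a VCF text stream."""
    samples = []
    n_records = 0
    n_snps = 0
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Sample columns follow the nine fixed VCF columns.
            samples = fields[9:]
            continue
        n_records += 1
        ref, alt = fields[3], fields[4]
        # A record counts as a SNP when REF and every ALT allele are one base.
        if alt != "." and len(ref) == 1 and all(len(a) == 1 for a in alt.split(",")):
            n_snps += 1
    return {"samples": len(samples), "records": n_records, "snps": n_snps}


# Made-up toy input with two individuals and two loci (illustration only).
toy_vcf = "\n".join([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tind1\tind2",
    "locus1\t10\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t1/1",
    "locus2\t25\t.\tC\tT\t.\tPASS\t.\tGT\t0/0\t0/1",
])
summary = summarize_vcf(io.StringIO(toy_vcf))
print(summary)
```

For the real file, one would pass an open file handle instead of the toy string and expect the counts to match the 735 individuals and 10,615 loci stated in the description, provided the file follows the assumptions above.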
Data_Sheet_1_Development and validation of a random forest algorithm for source attribution of animal and human Salmonella Typhimurium and monophasic variants of S. Typhimurium isolates in England and Wales utilising whole genome sequencing data.zip
Source attribution has traditionally involved combining epidemiological data with different pathogen characterisation methods, including 7-gene multi locus sequence typing (MLST) or serotyping; however, these approaches have limited resolution. In contrast, whole genome sequencing data provide an overview of the whole genome that can be used by attribution algorithms. Here, we applied a random forest (RF) algorithm to predict the primary sources of human clinical Salmonella Typhimurium (S. Typhimurium) and monophasic variants (monophasic S. Typhimurium) isolates. To this end, we utilised single nucleotide polymorphism diversity in the core genome MLST alleles obtained from 1,061 laboratory-confirmed human and animal S. Typhimurium and monophasic S. Typhimurium isolates as inputs into an RF model. The algorithm was used for supervised learning to classify 399 animal S. Typhimurium and monophasic S. Typhimurium isolates into one of eight distinct primary source classes comprising common livestock and pet animal species: cattle, pigs, sheep, other mammals (pets: mostly dogs and horses), broilers, layers, turkeys, and game birds (pheasants, quail, and pigeons). When applied to the training set animal isolates, model accuracy was 0.929 and kappa 0.905, whereas for the test set animal isolates, for which the primary source class information was withheld from the model, the accuracy was 0.779 and kappa 0.700. Subsequently, the model was applied to assign 662 human clinical cases to the eight primary source classes. In the dataset, 60/399 (15.0%) of the animal and 141/662 (21.3%) of the human isolates were associated with a known outbreak of S. Typhimurium definitive type (DT) 104. All but two of the 141 DT104 outbreak linked human isolates were correctly attributed by the model to the primary source classes identified as the origin of the DT104 outbreak.
A model that was run without the clonal DT104 animal isolates produced largely congruent outputs (training set accuracy 0.989 and kappa 0.985; test set accuracy 0.781 and kappa 0.663). Overall, our results show that RF offers considerable promise as a suitable methodology for epidemiological tracking and source attribution for foodborne pathogens.
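The abstract reports both accuracy and kappa for the classifier. Assuming the kappa in question is Cohen's kappa (the abstract does not say so explicitly), the two metrics can be derived from a confusion matrix as sketched below. The function name `accuracy_and_kappa` and the 2x2 toy matrix are hypothetical and bear no relation to the study's eight-class results.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of items of true class i predicted as class j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of items on the diagonal (plain accuracy).
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the marginal class frequencies.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Made-up two-class example (illustration only, not data from the study).
toy_matrix = [
    [8, 2],
    [1, 9],
]
acc, kappa = accuracy_and_kappa(toy_matrix)
print(round(acc, 3), round(kappa, 3))
```

Kappa discounts the agreement that would arise by chance from the class frequencies alone, which is one reason it may be reported alongside accuracy when source classes are unevenly represented.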
Data_Sheet_1_Geographical and temporal distribution of multidrug-resistant Salmonella Infantis in Europe and the Americas.zip
Recently emerged S. Infantis strains carrying resistance to several commonly used antimicrobials have been reported from different parts of the globe, causing human cases of salmonellosis and with occurrence reported predominantly in broiler chickens. Here, we performed phylogenetic and genetic clustering analyses to describe the population structure of 417 S. Infantis originating from multiple European countries and the Americas collected between 1985 and 2019. Of these, 171 were collected from 56 distinct premises located in England and Wales (E/W) between 2009 and 2019, including isolates linked to incursions of multidrug-resistant (MDR) strains from Europe associated with imported poultry meat. The analysis facilitated the comparison of isolates from different E/W sources with isolates originating from other countries. There was a high degree of congruency between the outputs of different types of population structure analyses revealing that the E/W and central European (Germany, Hungary, and Poland) isolates formed several disparate groups, which were distinct from the cluster relating to the United States (USA) and Ecuador/Peru, but that isolates from Brazil were closely related to the E/W and the central European isolates. Nearly half of the analysed strains/genomes (194/417) harboured the IncFIB(pN55391) replicon typical of the “parasitic” pESI-like megaplasmid found in diverse strains of S. Infantis. The isolates that contained the IncFIB(pN55391) replicon clustered together, despite originating from different parts of the globe. This outcome was corroborated by the time-measured phylogeny, which indicated that the initial acquisition of IncFIB(pN55391) likely occurred in Europe in the late 1980s, with a single introduction of IncFIB(pN55391)-carrying S. Infantis to the Americas several years later. 
Most of the antimicrobial resistance (AMR) genes were identified in isolates that harboured one or more different plasmids. Based on the short-read assemblies, only a minority of the resistance genes found in these isolates were identified as being associated with the detected plasmids, whereas the hybrid assemblies comprising the short and long reads demonstrated that the majority of the identified AMR genes were associated with IncFIB(pN55391) and other detected plasmid replicon types. This finding underlines the importance of applying appropriate methodologies to investigate associations of AMR genes with bacterial plasmids.
