14 research outputs found

    Patchy promiscuity: machine learning applied to predict the host specificity of Salmonella enterica and Escherichia coli

    Supporting data for Patchy promiscuity: machine learning applied to predict the host specificity of Salmonella enterica and Escherichia coli, as published in Microbial Genomics

    The advantage of intergenic regions as genomic features for machine-learning-based host attribution of Salmonella Typhimurium from the USA

    Salmonella enterica is a taxonomically diverse pathogen with over 2600 serovars associated with a wide variety of animal hosts including humans, other mammals, birds and reptiles. Some serovars are host-specific or host-restricted and cause disease in distinct host species, while others, such as serovar S. Typhimurium (STm), are generalists and have the potential to colonize a wide variety of species. However, even within generalist serovars such as STm it is becoming clear that pathovariants exist that differ in tropism and virulence. Identifying the genetic factors underlying host specificity is complex, but the availability of thousands of genome sequences and advances in machine learning have made it possible to build specific host prediction models to aid outbreak control and predict the human pathogenic potential of isolates from animals and other reservoirs. We have advanced this area by building host-association prediction models trained on a wide range of genomic features and compared them with predictions based on nearest-neighbour phylogeny. SNPs, protein variants (PVs), antimicrobial resistance (AMR) profiles and intergenic regions (IGRs) were extracted from 3883 high-quality STm assemblies collected from humans, swine, bovine and poultry in the USA, and used to construct Random Forest (RF) machine learning models. An additional 244 recent STm assemblies from farm animals were used as a test set for further validation. The models based on PVs and IGRs had the best performance in terms of predicting the host of origin of isolates and outperformed nearest-neighbour phylogenetic host prediction as well as models based on SNPs or AMR data. However, the models did not yield reliable predictions when tested with isolates that were phylogenetically distinct from the training set. The IGR and PV models were often able to differentiate human isolates in clusters where the majority of isolates were from a single animal source. 
    Notably, IGRs were the feature with the best performance across multiple models, which may be because IGRs both represent their flanking genes, as PVs do, and capture genomic regulatory variation, such as altered promoter regions. The IGR and PV models predict that ~45 % of the human infections with STm in the USA originate from bovine, ~40 % from poultry and ~14.5 % from swine, although sequences of isolates from other sources were not used for training. In summary, the research demonstrates a significant gain in accuracy for models with IGRs and PVs as features compared to SNP-based and core genome phylogeny predictions when applied within the existing population structure. This article contains data hosted by Microreact.

    Acquisition and loss of CTX-M plasmids in Shigella species associated with MSM transmission in the UK

    Shigellosis in men who have sex with men (MSM) is caused by multidrug-resistant Shigellae exhibiting resistance to antimicrobials including azithromycin, ciprofloxacin and, more recently, the third-generation cephalosporins. We sequenced four blaCTX-M-27-positive MSM Shigella isolates (2018–20) using Oxford Nanopore Technologies; three S. sonnei (identified as two MSM clade 2, one MSM clade 5) and one S. flexneri 3a, to explore AMR context. All S. sonnei isolates harboured Tn7/Int2 chromosomal integrons, whereas S. flexneri 3a contained the Shigella Resistance Locus. All strains harboured IncFII pKSR100-like plasmids (67–83 kbp); where present, blaCTX-M-27 was located on these plasmids flanked by IS26 and IS903B. However, blaCTX-M-27 was lost in S. flexneri 3a during storage between Illumina and Nanopore sequencing. IncFII AMR regions were mosaic and likely reorganised by IS26; three of the four plasmids contained azithromycin-resistance genes erm(B) and mph(A) and one harboured the pKSR100 integron. Additionally, all S. sonnei isolates possessed a large IncB/O/K/Z plasmid, two of which carried aph(3’)-Ib/aph(6)-Id/sul2 and tet(A). Monitoring the transmission of mobile genetic elements with co-located AMR determinants is necessary to inform empirical treatment guidance and clinical management of MSM-associated shigellosis

    Enteroaggregative Escherichia coli have evolved independently as distinct complexes within the E. coli population with varying ability to cause disease

    Enteroaggregative E. coli (EAEC) is an established diarrhoeagenic pathotype. The association between virulence gene content and ability to cause disease has been studied, but little is known about the population structure of EAEC and how this pathotype evolved. Analysis by multilocus sequence typing of 564 EAEC isolates from cases and controls in Bangladesh, Nigeria and the UK spanning the past 29 years revealed multiple successful lineages of EAEC. The population structure of EAEC indicates that some clusters are statistically associated with disease or carriage, further highlighting the heterogeneous nature of this group of organisms. Different clusters have evolved independently as a result of both mutational and recombination events; the EAEC phenotype is distributed throughout the population of E. coli

    MinION nanopore sequencing identifies the position and structure of a bacterial antibiotic resistance island

    Short-read, high-throughput sequencing technology cannot identify the chromosomal position of repetitive insertion sequences that typically flank horizontally acquired genes such as bacterial virulence genes and antibiotic resistance genes. The MinION nanopore sequencer can produce long sequencing reads on a device similar in size to a USB memory stick. Here we apply a MinION sequencer to resolve the structure and chromosomal insertion site of a composite antibiotic resistance island in Salmonella Typhi Haplotype 58. Nanopore sequencing data from a single 18-h run was used to create a scaffold for an assembly generated from short-read Illumina data. Our results demonstrate the potential of the MinION device in clinical laboratories to fully characterize the epidemic spread of bacterial pathogens

    Analysis of whole genome sequencing for the Escherichia coli O157:H7 typing phages

    Background: Shiga toxin-producing Escherichia coli O157 can cause severe bloody diarrhoea and haemolytic uraemic syndrome. Phage typing of E. coli O157 facilitates public health surveillance and outbreak investigations; certain phage types are more likely to occupy specific niches and are associated with specific age groups and disease severity. The aim of this study was to analyse the genome sequences of 16 (fourteen T4 and two T7) E. coli O157 typing phages and to determine the genes responsible for the subtle differences in phage type profiles. Results: The typing phages were sequenced using paired-end Illumina sequencing at The Genome Analysis Centre and the Animal Health and Veterinary Laboratories Agency, and bioinformatics programs including Velvet, BRIG and Easyfig were used to analyse them. A two-way Euclidean cluster analysis highlighted the associations between groups of phage types and typing phages. The analysis showed that the T7 typing phages (9 and 10) differed by only three genes and that the T4 typing phages formed three distinct groups of similar genomic sequences: Group 1 (1, 8, 11, 12 and 15, 16), Group 2 (3, 6, 7 and 13) and Group 3 (2, 4, 5 and 14). The E. coli O157 phage typing scheme exhibited a significantly modular network linked to the genetic similarity of each group, showing that these groups are specialised to infect a subset of phage types. Conclusion: Sequencing the typing phages has enabled us to identify the variable genes within each group and to determine how this corresponds to changes in phage type. Public Health England; National Institute for Health Research scientific research development fund; Biotechnology and Biological Sciences Research Council (BBSRC)

    Detection of the plasmid-mediated mcr-1 gene conferring colistin resistance in human and food isolates of Salmonella enterica and Escherichia coli in England and Wales.

    No full text
    OBJECTIVES: In response to the first report of transmissible colistin resistance mediated by the mcr-1 gene in Escherichia coli and Klebsiella spp. from animals and humans in China, we sought to determine its presence in Enterobacteriaceae isolated in the UK. METHODS: The PHE archive of whole-genome sequences of isolates from surveillance collections, submissions to reference services and research projects was retrospectively analysed for the presence of mcr-1 using Genefinder. The genetic environment of the gene was also analysed. RESULTS: Rapid screening of the genomes of ∼24 000 Salmonella enterica, E. coli, Klebsiella spp., Enterobacter spp., Campylobacter spp. and Shigella spp. isolated from food or humans identified 15 mcr-1-positive isolates. These comprised: 10 human S. enterica isolates submitted between 2012 and 2015 (8 Salmonella Typhimurium, 1 Salmonella Paratyphi B var Java and 1 Salmonella Virchow) from 10 patients; 3 isolates of E. coli from 2 patients; and 2 isolates of Salmonella Paratyphi B var Java from poultry meat imported from the EU. The mcr-1 gene was located on diverse plasmids belonging to the IncHI2, IncI2 and IncX4 replicon types and its association with ISApl1 varied. Six mcr-1-positive S. enterica isolates were from patients who had recently travelled to Asia. CONCLUSIONS: Analysis of WGS data allowed rapid confirmation of the presence of the plasmid-mediated colistin resistance gene mcr-1 in diverse genetic environments and plasmids. It has been present in E. coli and Salmonella spp. harboured by humans in England and Wales since at least 2012