182 research outputs found
The EnteroBase user's guide, with case studies on Salmonella transmissions, Yersinia pestis phylogeny and Escherichia core genomic diversity
EnteroBase is an integrated software environment which supports the identification of global population structures within several bacterial genera that include pathogens. Here, we provide an overview of how EnteroBase works, what it can do, and its future prospects. EnteroBase has currently assembled more than 300,000 genomes from Illumina short reads from Salmonella, Escherichia, Yersinia, Clostridioides, Helicobacter, Vibrio, and Moraxella, and genotyped those assemblies by core genome Multilocus Sequence Typing (cgMLST). Hierarchical clustering of cgMLST sequence types allows mapping a new bacterial strain to predefined population structures at multiple levels of resolution within a few hours after uploading its short reads. Case study 1 illustrates this process for local transmissions of Salmonella enterica serovar Agama between neighboring social groups of badgers and humans. EnteroBase also supports SNP calls from both genomic assemblies and after extraction from metagenomic sequences, as illustrated by case study 2, which summarizes the microevolution of Yersinia pestis over the last 5,000 years of pandemic plague. EnteroBase can also provide a global overview of the genomic diversity within an entire genus, as illustrated by case study 3, which presents a novel, global overview of the population structure of all of the species, subspecies and clades within Escherichia
Comparison of classical multi-locus sequence typing software for next-generation sequencing data
Multi-locus sequence typing (MLST) is a widely used method for categorizing bacteria. Increasingly, MLST is being performed using next-generation sequencing (NGS) data by reference laboratories and for clinical diagnostics. Many software applications have been developed to calculate sequence types from NGS data; however, there has been no comprehensive review to date on these methods. We have compared eight of these applications against real and simulated data, and present results on: (1) the accuracy of each method against traditional typing methods, (2) the performance on real outbreak datasets, (3) the impact of contamination and varying depth of coverage, and (4) the computational resource requirements
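As a rough illustration of what "calculating a sequence type" involves once alleles have been called from NGS data, the sketch below looks up an allelic profile in a profile table. It is not taken from any of the eight compared applications: the locus names, the three-locus scheme, the `PROFILES` table and the function `assign_sequence_type` are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical three-locus scheme, used only to keep the example short.
LOCI = ("locusA", "locusB", "locusC")

# Hypothetical profile table: allele numbers (in LOCI order) -> sequence type.
PROFILES: Dict[Tuple[int, ...], int] = {
    (1, 1, 1): 1,
    (1, 2, 1): 2,
    (3, 2, 4): 3,
}


def assign_sequence_type(alleles: Dict[str, Optional[int]]) -> Optional[int]:
    """Return the sequence type for a complete, known allele combination.

    Returns None when any locus has no allele call or when the combination
    is absent from the profile table.
    """
    profile = tuple(alleles.get(locus) for locus in LOCI)
    if any(allele is None for allele in profile):
        return None
    return PROFILES.get(profile)


if __name__ == "__main__":
    print(assign_sequence_type({"locusA": 1, "locusB": 2, "locusC": 1}))
```

Returning `None` for incomplete or unknown profiles keeps the sketch conservative; how real tools report partial or novel profiles is not described in the abstract.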
GrapeTree : visualization of core genomic relationships among 100,000 bacterial pathogens
Current methods struggle to reconstruct and visualise the genomic relationships of ≥100,000 bacterial genomes. GrapeTree facilitates the analyses of allelic profiles from tens of thousands of core genomes within a web browser window. GrapeTree implements a novel minimum spanning tree algorithm to reconstruct genetic relationships despite missing data, together with a static "GrapeTree Layout" algorithm to render interactive visualisations of large trees. GrapeTree is a stand-alone package for investigating Newick trees plus associated metadata and is also integrated into EnteroBase to facilitate cutting-edge navigation of genomic relationships among >160,000 genomes from bacterial pathogens. The GrapeTree package was released under the GPL v3.0 Licence
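To make the idea of a minimum spanning tree over allelic profiles with missing data concrete, here is a minimal sketch. It is not GrapeTree's novel algorithm, which the abstract does not specify: it uses classical Prim's algorithm, and it assumes that a locus with a missing call in either profile is simply skipped when counting differences. The names `Profile`, `allelic_distance` and `minimum_spanning_tree` are hypothetical.

```python
from typing import List, Optional, Tuple

Profile = List[Optional[int]]  # None marks a missing allele call


def allelic_distance(a: Profile, b: Profile) -> int:
    """Count loci where both profiles have a call and the calls differ."""
    return sum(
        1 for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


def minimum_spanning_tree(profiles: List[Profile]) -> List[Tuple[int, int, int]]:
    """Classical Prim's algorithm; returns edges as (parent, child, distance)."""
    n = len(profiles)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [float("inf")] * n   # cheapest known link into the tree
    best_parent = [-1] * n           # tree node providing that link
    best_dist[0] = 0                 # start the tree at profile 0
    edges: List[Tuple[int, int, int]] = []
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_parent[u] >= 0:
            edges.append((best_parent[u], u, int(best_dist[u])))
        # Relax the links from the newly added node.
        for v in range(n):
            if not in_tree[v]:
                d = allelic_distance(profiles[u], profiles[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_parent[v] = u
    return edges


if __name__ == "__main__":
    demo = [[1, 1, 1, 1], [1, 1, 1, 2], [1, None, 2, 2], [3, 3, 3, 3]]
    for parent, child, dist in minimum_spanning_tree(demo):
        print(parent, child, dist)
```

This version recomputes distances on the fly and runs in quadratic time, which is adequate for illustration but not for the dataset sizes the abstract mentions.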
Naming the unnamed: over 65,000 Candidatus names for unnamed Archaea and Bacteria in the Genome Taxonomy Database
Thousands of new bacterial and archaeal species and higher-level taxa are discovered each year through the analysis of genomes and metagenomes. The Genome Taxonomy Database (GTDB) provides hierarchical sequence-based descriptions and classifications for new and as-yet-unnamed taxa. However, bacterial nomenclature, as currently configured, cannot keep up with the need for new well-formed names. Instead, microbiologists have been forced to use hard-to-remember alphanumeric placeholder labels. Here, we exploit an approach to the generation of well-formed arbitrary Latinate names at a scale sufficient to name tens of thousands of unnamed taxa within GTDB. These newly created names represent an important resource for the microbiology community, facilitating communication between bioinformaticians, microbiologists and taxonomists, while populating the emerging landscape of microbial taxonomic and functional discovery with accessible and memorable linguistic labels
Genome-wide identification and characterization of a superfamily of bacterial extracellular contractile injection systems
Mechanisms involved in acquisition of blaNDM genes by IncA/C2 and IncFIIY plasmids
blaNDM genes confer carbapenem resistance and have been identified on transferable plasmids belonging to different incompatibility (Inc) groups. Here we present the complete sequences of four plasmids carrying a blaNDM gene, pKP1-NDM-1, pEC2-NDM-3, pECL3-NDM-1, and pEC4-NDM-6, from four clinical samples originating from four different patients. Different plasmids carry segments that align to different parts of the blaNDM region found on Acinetobacter plasmids. pKP1-NDM-1 and pEC2-NDM-3, from Klebsiella pneumoniae and Escherichia coli, respectively, were identified as type 1 IncA/C2 plasmids with almost identical backbones. Different regions carrying blaNDM are inserted in different locations in the antibiotic resistance island known as ARI-A, and ISCR1 may have been involved in the acquisition of blaNDM-3 by pEC2-NDM-3. pECL3-NDM-1 and pEC4-NDM-6, from Enterobacter cloacae and E. coli, respectively, have similar IncFIIY backbones, but different regions carrying blaNDM are found in different locations. Tn3-derived inverted-repeat transposable elements (TIME) appear to have been involved in the acquisition of blaNDM-6 by pEC4-NDM-6 and the rmtC 16S rRNA methylase gene by IncFIIY plasmids. Characterization of these plasmids further demonstrates that even very closely related plasmids may have acquired blaNDM genes by different mechanisms. These findings also illustrate the complex relationships between antimicrobial resistance genes, transposable elements, and plasmids and provide insights into the possible routes for transmission of blaNDM genes among species of the Enterobacteriaceae family
Data for Millennia of genomic stability within the invasive Para C Lineage of Salmonella enterica: date estimation 1
Salmonella enterica serovar Paratyphi C is the causative agent of enteric (paratyphoid) fever. While today a potentially lethal infection of humans that occurs in Africa and Asia, early 20th century observations in Eastern Europe suggest it may once have had a wider-ranging impact on human societies. We recovered a draft Paratyphi C genome from the 800-year-old skeleton of a young woman in Trondheim, Norway, who likely died of enteric fever. Analysis of this genome against a new, significantly expanded database of related modern genomes demonstrated that Paratyphi C is descended from the ancestors of swine pathogens, serovars Choleraesuis and Typhisuis, together forming the Para C Lineage. Our results indicate that Paratyphi C has been a pathogen of humans for at least 1,000 years, and may have evolved after zoonotic transfer from swine during the Neolithic period
Genomic diversity and epidemiological significance of non-typhoidal Salmonella found in retail food collected in Norfolk, UK
Non-typhoidal Salmonella (NTS) is a major cause of bacterial gastroenteritis. Although many countries have implemented whole genome sequencing (WGS) of NTS, there is limited knowledge on NTS diversity on food and its contribution to human disease. In this study, the aim was to characterise the NTS genomes from retail foods in a particular region of the UK and assess the contribution to human NTS infections. Raw food samples were collected at retail in a repeated cross-sectional design in Norfolk, UK, including chicken (n=311), leafy green (n=311), pork (n=311), prawn (n=279) and salmon (n=157) samples. Up to eight presumptive NTS isolates per positive sample underwent WGS and were compared to publicly available NTS genomes from UK human cases. NTS was isolated from chicken (9.6 %), prawn (2.9 %) and pork (1.3 %) samples and included 14 serovars, of which Salmonella Infantis and Salmonella Enteritidis were the most common. The S. Enteritidis isolates were only isolated from imported chicken. No antimicrobial resistance determinants were found in prawn isolates, whilst 5.1 % of chicken and 0.64 % of pork samples contained multi-drug resistant NTS. The maximum number of pairwise core non-recombinant single nucleotide polymorphisms (SNPs) amongst isolates from the same sample was used to measure diversity and most samples had a median of two SNPs (range: 0–251). NTS isolates that were within five SNPs to clinical UK isolates belonged to specific serovars: S. Enteritidis and S. Infantis (chicken), and S. I 4,[5],12:i- (pork and chicken). Most NTS isolates that were closely related to human-derived isolates were obtained from imported chicken, but further epidemiological data are required to assess definitively the probable source of the human cases. 
Continued WGS surveillance of Salmonella on retail food involving multiple isolates from each sample is necessary to capture the diversity of Salmonella and determine the relative importance of different sources of human disease
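The within-sample diversity measure described above, the maximum number of pairwise SNPs amongst isolates from the same sample, can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: it assumes the isolates are already represented as equal-length aligned core sequences with recombinant regions removed upstream, and it ignores positions where either base is not A, C, G or T. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations
from typing import Dict

VALID_BASES = set("ACGT")


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count aligned positions where both bases are unambiguous and differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1 for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in VALID_BASES and b in VALID_BASES and a != b
    )


def max_pairwise_snps(isolates: Dict[str, str]) -> int:
    """Largest SNP distance between any two isolates of one sample.

    Returns 0 when the sample has fewer than two isolates.
    """
    return max(
        (snp_distance(a, b) for a, b in combinations(isolates.values(), 2)),
        default=0,
    )


if __name__ == "__main__":
    sample = {"iso1": "ACGTACGT", "iso2": "ACGTACGA", "iso3": "TCGAACGA"}
    print(max_pairwise_snps(sample))
```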
Evolution of Salmonella enterica serotype Typhimurium driven by anthropogenic selection and niche adaptation
Salmonella enterica serotype Typhimurium (S. Typhimurium) is a leading cause of gastroenteritis and bacteraemia worldwide, and a model organism for the study of host-pathogen interactions. Two S. Typhimurium strains (SL1344 and ATCC14028) are widely used to study host-pathogen interactions, yet genotypic variation results in strains with diverse host range, pathogenicity and risk to food safety. The population structure of diverse strains of S. Typhimurium revealed a major phylogroup of predominantly sequence type 19 (ST19) and a minor phylogroup of ST36. The major phylogroup had a population structure with two high-order clades (α and β) and multiple subclades on extended internal branches that exhibited distinct signatures of host adaptation and anthropogenic selection. Clade α contained a number of subclades composed of strains from well-characterized epidemics in domesticated animals, while clade β contained multiple subclades associated with wild avian species. The contrasting epidemiology of strains in clades α and β was reflected by the distinct distribution of antimicrobial resistance (AMR) genes, accumulation of hypothetically disrupted coding sequences (HDCS), and signatures of functional diversification. These observations were consistent with elevated anthropogenic selection of clade α lineages from adaptation to circulation in populations of domesticated livestock, and the predisposition of clade β lineages to undergo adaptation to an invasive lifestyle by a process of convergent evolution with host-adapted Salmonella serotypes. Gene flux was predominantly driven by acquisition and recombination of prophage and associated cargo genes, with only occasional loss of these elements. The acquisition of large chromosomally-encoded genetic islands was limited but was, notably, a feature of two recent pandemic clones (DT104 and monophasic S. Typhimurium ST34) of clade α (SGI-1 and SGI-4)
