Open-access bacterial population genomics: BIGSdb software, the PubMLST.org website and their applications [version 1; referees: 2 approved]
The PubMLST.org website hosts a collection of open-access, curated databases that integrate population sequence data with provenance and phenotype information for over 100 different microbial species and genera. Although the PubMLST website was conceived as part of the development of the first multi-locus sequence typing (MLST) scheme in 1998, the software it uses, the Bacterial Isolate Genome Sequence database (BIGSdb, published in 2010), enables PubMLST to include all levels of sequence data, from single gene sequences up to and including complete, finished genomes. Here we describe developments in the BIGSdb software made from publication to June 2018 and show how the platform realises microbial population genomics for a wide range of applications. The system is based on the gene-by-gene analysis of microbial genomes, with each deposited sequence annotated and curated to identify the genes present and systematically catalogue their variation. Although the system was originally intended as a means of characterising isolates with typing schemes, the synthesis of sequences and records of genetic variation with provenance and phenotype data provides a highly scalable (whole genome sequence data for tens of thousands of isolates) means of addressing a wide range of functional questions, including: the prediction of antimicrobial resistance; likely cross-reactivity with vaccine antigens; and the functional activities of different variants that lead to key phenotypes. There are no limitations to the number of sequences, genetic loci, allelic variants or schemes (combinations of loci) that can be included, enabling each database to represent an expanding catalogue of the genetic variation of the population in question.
In addition to providing web-accessible analyses and links to third-party analysis and visualisation tools, the BIGSdb software includes a RESTful application programming interface (API) that enables access to all the underlying data for third-party applications and data analysis pipelines.
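The gene-by-gene idea described in this abstract can be illustrated with a minimal sketch: each locus in an isolate is assigned an allele number, and the ordered combination of allele numbers across a scheme's loci (the allelic profile) is looked up to obtain a sequence type. The locus names, allele numbers and ST values below are invented placeholders, not PubMLST definitions, and the functions are hypothetical helpers rather than BIGSdb code.

```python
from typing import Dict, Optional, Tuple

# Hypothetical three-locus scheme; names and numbers are placeholders only.
SCHEME_LOCI: Tuple[str, ...] = ("locusA", "locusB", "locusC")

# Made-up lookup table: allelic profile -> sequence type (ST).
PROFILE_TO_ST: Dict[Tuple[int, ...], int] = {
    (1, 1, 1): 1,
    (1, 2, 1): 2,
    (3, 2, 4): 3,
}


def allelic_profile(alleles: Dict[str, int]) -> Tuple[int, ...]:
    """Order an isolate's allele numbers by the scheme's locus order."""
    return tuple(alleles[locus] for locus in SCHEME_LOCI)


def sequence_type(alleles: Dict[str, int]) -> Optional[int]:
    """Return the ST for a known profile, or None if the profile is not defined."""
    return PROFILE_TO_ST.get(allelic_profile(alleles))


# Allele calls may arrive in any order; the scheme fixes the order.
isolate = {"locusB": 2, "locusA": 1, "locusC": 1}
st = sequence_type(isolate)
novel = sequence_type({"locusA": 9, "locusB": 9, "locusC": 9})
print(st, novel)
```

A profile that is absent from the table returns `None`, which in a curated database would correspond to a combination that has not yet been assigned a type.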
The global meningitis genome partnership
Genomic surveillance of bacterial meningitis pathogens is essential for effective disease control globally, enabling identification of emerging and expanding strains and consequent public health interventions. While there has been a rise in the use of whole genome sequencing, this has been driven predominantly by a subset of countries with adequate capacity and resources. Global capacity to participate in surveillance needs to be expanded, particularly in low and middle-income countries with high disease burdens. In light of this, the WHO-led collaboration, Defeating Meningitis by 2030 Global Roadmap, has called for the establishment of a Global Meningitis Genome Partnership that links resources for: N. meningitidis (Nm), S. pneumoniae (Sp), H. influenzae (Hi) and S. agalactiae (Sa) to improve worldwide co-ordination of strain identification and tracking. Existing platforms containing relevant genomes include: PubMLST: Nm (31,622), Sp (15,132), Hi (1935), Sa (9026); The Wellcome Sanger Institute: Nm (13,711), Sp (> 24,000), Sa (6200), Hi (1738); and BMGAP: Nm (8785), Hi (2030). A steering group is being established to coordinate the initiative and encourage high-quality data curation. Next steps include: developing guidelines on open-access sharing of genomic data; defining a core set of metadata; and facilitating development of user-friendly interfaces that represent publicly available data.
Validation of a Bioinformatics Workflow for Routine Analysis of Whole-Genome Sequencing Data and Related Challenges for Pathogen Typing in a European National Reference Center: Neisseria meningitidis as a Proof-of-Concept
Despite being a well-established research method, the use of whole-genome sequencing (WGS) for routine molecular typing and pathogen characterization remains a substantial challenge due to the required bioinformatics resources and/or expertise. Moreover, many national reference laboratories and centers, as well as other laboratories working under a quality system, require extensive validation to demonstrate that employed methods are “fit-for-purpose” and provide high-quality results. However, a harmonized framework with guidelines for the validation of WGS workflows does not yet exist, despite several recent case studies highlighting the urgent need for one. We present a validation strategy focusing specifically on the exhaustive characterization of the bioinformatics analysis of a WGS workflow designed to replace conventionally employed molecular typing methods for microbial isolates in a representative small-scale laboratory, using the pathogen Neisseria meningitidis as a proof-of-concept. We adapted several classically employed performance metrics specifically toward three different bioinformatics assays: resistance gene characterization (based on the ARG-ANNOT, ResFinder, CARD, and NDARO databases), several commonly employed typing schemas (including, among others, core genome multilocus sequence typing), and serogroup determination. We analyzed a core validation dataset of 67 well-characterized samples typed by means of classical genotypic and/or phenotypic methods that were sequenced in-house, allowing us to evaluate repeatability, reproducibility, accuracy, precision, sensitivity, and specificity of the different bioinformatics assays. We also analyzed an extended validation dataset composed of publicly available WGS data for 64 samples by comparing results of the different bioinformatics assays against results obtained from commonly used bioinformatics tools.
We demonstrate high performance, with values for all performance metrics >87%, >97%, and >90% for the resistance gene characterization, sequence typing, and serogroup determination assays, respectively, for both validation datasets. Our WGS workflow has been made publicly available as a “push-button” pipeline for Illumina data at https://galaxy.sciensano.be to showcase its implementation for non-profit and/or academic usage. Our validation strategy can be adapted to other WGS workflows for other pathogens of interest and demonstrates the added value and feasibility of employing WGS with the aim of being integrated into routine use in an applied public health setting.
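As a rough illustration of four of the performance metrics named in this abstract, the sketch below computes accuracy, precision, sensitivity and specificity from confusion-matrix counts using their standard textbook definitions. The abstract states that the authors adapted these metrics to each assay, so their exact definitions may differ; the class, the function name and the counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Confusion:
    """Counts of true/false positives and negatives for one assay."""
    tp: int
    fp: int
    tn: int
    fn: int


def _ratio(numerator: int, denominator: int) -> float:
    """Divide safely; return NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def performance_metrics(c: Confusion) -> Dict[str, float]:
    """Textbook definitions; an assay-specific adaptation may differ."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "precision": _ratio(c.tp, c.tp + c.fp),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
    }


# Invented counts for illustration only.
example = Confusion(tp=45, fp=2, tn=50, fn=3)
results = performance_metrics(example)
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```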
Profiling potentially pathogenic bacteria from neonatal feeding tubes and sepsis cases
Recently, there has been a rise in the incidence of neonatal infections among babies born with low birth-weights and under-developed immune systems in neonatal intensive care units (NICUs). There are several risk factors for neonatal infection, the most important of which include the use of medical devices such as nasogastric enteral feeding tubes (NEFTs) and the contamination of infant feeding formula. Therefore, bacterial analysis of feeding tubes used in the NICU is important to identify infection risk factors during neonatal enteral feeding.
The aims of this study were (a) to determine the potential risk to neonates posed by ingestion of A. baumannii and Enterobacter spp., in particular E. hormaechei, through feeding tubes, infant formula, or contaminated milk, and (b) to determine whether some of the isolated strains originate from common sources, such as being transferred between the babies within specific neonatal units. Additionally, a longitudinal study of premature twin babies aimed to compare potentially pathogenic E. faecium isolates within and between the feeding tubes and faeces of the twins over time.
PFGE indicated that all of the A. baumannii strains formed two different STs (ST193 and ST113). All ST113 strains were multidrug-resistant and demonstrated an ability to form significant biofilms at 37 °C in infant formula. Tolerance of acidic conditions, desiccation, resistance to human serum and persistence inside macrophages were shown by the majority of strains tested. E. hormaechei strains from feeding tubes exhibited similar behaviour to those isolated from sepsis cases, since both were able to adhere to and invade Caco-2 and HBMEC cell lines. Also, these strains were able to persist and replicate inside macrophages for up to 72 hours.
In the longitudinal study, all E. faecium isolates recovered from the preterm infant twins during their hospitalisation in the NICU were typed as ST80, belonging to clonal complex CC17. Furthermore, they were resistant to ampicillin and were found to carry several virulence-associated genes such as esp. Based on their sequence type and genomic analysis, all of these isolates were found to be essentially the same strain and were shown to have high pathogenic potential. The isolates from different neonatal locations were indeed the same clone, showing that the bacteria were able to persist and be transferred between the two premature infants in the NICU.
This study has provided evidence that opportunistic ESKAPE group pathogens, which are important causes of nosocomial infection and of the dissemination of multidrug-resistant (MDR) strains, colonise and persist in neonatal feeding tubes.
A RESTful application programming interface for the PubMLST molecular typing and genome databases
Molecular typing is used to differentiate microorganisms at the subspecies or strain level for epidemiological investigations, infection control, public health and environmental sampling. DNA sequence-based typing methods require authoritative databases that link sequence variants to nomenclature in order to facilitate communication and comparison of identified types in national or global settings. The PubMLST website (https://pubmlst.org/) fulfils this role for over a hundred microorganisms for which it hosts curated molecular sequence typing data, providing sequence and allelic profile definitions for multi-locus sequence typing (MLST) and single-gene typing approaches. In recent years, these have expanded to cover the whole genome with schemes such as core genome MLST (cgMLST) and whole genome MLST (wgMLST) which catalogue the allelic diversity found in hundreds to thousands of genes. These approaches provide a common nomenclature for high-resolution strain characterization and comparison. Molecular typing information is linked to isolate provenance, phenotype, and increasingly genome assemblies, providing a resource for outbreak investigation and research into population structure, gene association, global epidemiology and vaccine coverage. A Representational State Transfer (REST) Application Programming Interface (API) has been developed for the PubMLST website to make these large quantities of structured molecular typing and whole genome sequence data available for programmatic access by any third party application. The API is an integral component of the Bacterial Isolate Genome Sequence Database (BIGSdb) platform that is used to host PubMLST resources, and exposes all public data within the site. In addition to data browsing, searching and download, the API supports authentication and submission of new data to curator queues.
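A minimal sketch of the programmatic access this abstract describes is given below. It builds the URL for an MLST profile and retrieves the JSON response using only the Python standard library. The base URL, the route layout, the database name `pubmlst_neisseria_seqdef` and the scheme and profile numbers are assumptions that do not appear in the abstract and should be checked against the current API documentation; `profile_url` and `fetch_json` are hypothetical helper names.

```python
import json
from typing import Any
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL for the REST interface; verify before use.
BASE_URL = "https://rest.pubmlst.org"


def profile_url(database: str, scheme_id: int, profile_id: int,
                base_url: str = BASE_URL) -> str:
    """Build the URL for one allelic profile (route layout is an assumption)."""
    return (f"{base_url}/db/{quote(database, safe='')}"
            f"/schemes/{scheme_id}/profiles/{profile_id}")


def fetch_json(url: str, timeout: float = 30.0) -> Any:
    """GET a URL and decode the JSON body; needs network access."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Assumed identifiers: database name, scheme 1, profile 11.
url = profile_url("pubmlst_neisseria_seqdef", 1, 11)
print(url)
# record = fetch_json(url)  # uncomment to query the live service
```

The network call is left commented out so that the example runs offline; only the URL construction is exercised here.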