67 research outputs found

    mlplasmids: a user-friendly tool to predict plasmid- and chromosome-derived sequences for single species

    Assembly of bacterial short-read whole-genome sequencing data frequently results in hundreds of contigs for which the origin, plasmid or chromosome, is unclear. Complete genomes resolved by long-read sequencing can be used to generate and label short-read contigs. These were used to train several popular machine learning methods to classify the origin of contigs from Enterococcus faecium, Klebsiella pneumoniae and Escherichia coli using pentamer frequencies. We selected support-vector machine (SVM) models as the best classifier for all three bacterial species (F1-score E. faecium=0.92, F1-score K. pneumoniae=0.90, F1-score E. coli=0.76), which outperformed other existing plasmid prediction tools using a benchmarking set of isolates. We demonstrated the scalability of our models by accurately predicting the plasmidome of a large collection of 1644 E. faecium isolates and illustrated their applicability by predicting the location of antibiotic-resistance genes in all three species. The SVM classifiers are publicly available as an R package and graphical-user interface called 'mlplasmids'. We anticipate that this tool may significantly facilitate research on the dissemination of plasmids encoding antibiotic resistance and/or contributing to host adaptation.
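The pentamer-frequency features mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the mlplasmids implementation (which is an R package); the function name `pentamer_frequencies` is hypothetical, and the choices to count only the forward strand and to skip windows containing ambiguous bases are assumptions made for illustration.

```python
from itertools import product

K = 5  # pentamers, as named in the abstract

# All 4^5 = 1024 possible pentamers in a fixed order, with a lookup index.
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def pentamer_frequencies(sequence):
    """Return a 1024-long list of relative pentamer frequencies for one contig.

    Assumptions (not taken from the abstract): forward strand only, and
    windows containing characters other than A/C/G/T are ignored.
    """
    counts = [0] * len(KMERS)
    seq = sequence.upper()
    for start in range(len(seq) - K + 1):
        idx = INDEX.get(seq[start:start + K])
        if idx is not None:
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        # Contig shorter than K, or no unambiguous window.
        return [0.0] * len(KMERS)
    return [c / total for c in counts]
```

A vector like this, computed per contig, is the kind of input a classifier such as an SVM could be trained on, with the plasmid or chromosome label taken from the complete genomes.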

    gplas: a comprehensive tool for plasmid analysis using short-read graphs

    Summary: Plasmids can horizontally transmit genetic traits, enabling rapid bacterial adaptation to new environments and hosts. Short-read whole-genome sequencing data are often applied to large-scale bacterial comparative genomics projects, but the reconstruction of plasmids from these data faces severe limitations, such as the inability to distinguish plasmids from each other in a bacterial genome. We developed gplas, a new approach to reliably separate plasmid contigs into discrete components using sequence composition, coverage, assembly graph information and network partitioning based on a pruned network of plasmid unitigs. Gplas facilitates the analysis of large numbers of bacterial isolates and allows a detailed analysis of plasmid epidemiology based solely on short-read sequence data.

    Mode and dynamics of vanA-type vancomycin resistance dissemination in Dutch hospitals

    Background: Enterococcus faecium is a commensal of the gastrointestinal tract of animals and humans but also a causative agent of hospital-acquired infections. Resistance to glycopeptides, including vancomycin, has motivated the inclusion of E. faecium in the WHO global priority list. Vancomycin resistance can be conferred by the vanA gene cluster on the transposon Tn1546, which is frequently present in plasmids. The vanA gene cluster can be disseminated clonally but also horizontally, either by plasmid dissemination or by Tn1546 transposition between different genomic locations. Methods: We performed a retrospective study of the genomic epidemiology of 309 vancomycin-resistant E. faecium (VRE) isolates across 32 Dutch hospitals (2012–2015). Genomic information regarding clonality and Tn1546 characterization was extracted using hierBAPS sequence clusters (SC) and TETyper, respectively. Plasmids were predicted using gplas in combination with a network approach based on shared k-mer content. Next, we conducted a pairwise comparison between isolates sharing a potential epidemiological link to elucidate whether clonal, plasmid, or Tn1546 spread accounted for vanA-type resistance dissemination. Results: On average, we estimated that 59% of VRE cases with a potential epidemiological link were unrelated, defined as VRE pairs with a distinct Tn1546 variant. Clonal dissemination accounted for 32% of cases, in which the same SC and Tn1546 variants were identified. Horizontal plasmid dissemination accounted for 7% of VRE cases, in which we observed VRE pairs belonging to a distinct SC but carrying an identical plasmid and Tn1546 variant. In 2% of cases, we observed the same Tn1546 variant in distinct SC and plasmid types, which could be explained by mixed and consecutive events of clonal and plasmid dissemination. Conclusions: In related VRE cases, the dissemination of the vanA gene cluster in Dutch hospitals between 2012 and 2015 was dominated by clonal spread. However, we also identified outbreak settings with high frequencies of plasmid dissemination in which the spread of resistance was mainly driven by horizontal gene transfer (HGT). This study demonstrates the feasibility of distinguishing between modes of dissemination with short-read data and provides a novel assessment to estimate the relative contribution of nested genomic elements in the dissemination of vanA-type resistance.
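The pairwise definitions given in the Results of this abstract can be read as a small decision rule. The Python sketch below is a literal reading of those four definitions only; the `Isolate` fields and the function name `dissemination_mode` are hypothetical, and the study's actual pipeline may apply additional criteria that the abstract does not describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Isolate:
    # Hypothetical field names; the values would come from hierBAPS (SC),
    # the plasmid prediction step, and TETyper (Tn1546 variant).
    sequence_cluster: str
    plasmid_type: str
    tn1546_variant: str


def dissemination_mode(a, b):
    """Classify a pair of epidemiologically linked VRE isolates."""
    if a.tn1546_variant != b.tn1546_variant:
        return "unrelated"  # distinct Tn1546 variant
    if a.sequence_cluster == b.sequence_cluster:
        return "clonal"     # same SC and same Tn1546 variant
    if a.plasmid_type == b.plasmid_type:
        return "plasmid"    # distinct SC, identical plasmid and Tn1546 variant
    return "mixed"          # same Tn1546 variant, distinct SC and plasmid type
```

The order of the checks matters: the Tn1546 variant is compared first because, in the abstract's definitions, a distinct variant marks a pair as unrelated regardless of sequence cluster or plasmid.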

    Metagenomic assembly is the main bottleneck in the identification of mobile genetic elements

    Antimicrobial resistance genes (ARGs) are commonly found on acquired mobile genetic elements (MGEs) such as plasmids or transposons. Understanding the spread of resistance genes associated with mobile elements (mARGs) across different hosts and environments requires linking ARGs to the existing mobile reservoir within bacterial communities. However, reconstructing mARGs in metagenomic data from diverse ecosystems poses computational challenges, including genome fragment reconstruction (assembly), high-throughput annotation of MGEs, and identification of their association with ARGs. Recently, several bioinformatics tools have been developed to identify assembled fragments of plasmids, phages, and insertion sequence (IS) elements in metagenomic data. These methods can help in understanding the dissemination of mARGs. To streamline the process of identifying mARGs in multiple samples, we combined these tools in an automated high-throughput open-source pipeline, MetaMobilePicker, that identifies ARGs associated with plasmids, IS elements and phages, starting from short metagenomic sequencing reads. This pipeline was used to identify these three elements on a simplified simulated metagenome dataset, comprising whole genome sequences from seven clinically relevant bacterial species containing 55 ARGs, nine plasmids and five phages. The results demonstrated moderate precision for the identification of plasmids (0.57) and phages (0.71), and moderate sensitivity of identification of IS elements (0.58) and ARGs (0.70). In this study, we aim to assess the main causes of this moderate performance of the MGE prediction tools in a comprehensive manner. We conducted a systematic benchmark, considering metagenomic read coverage and contig length cutoffs, and investigating the performance of the classification algorithms. Our analysis revealed that the metagenomic assembly process, rather than ARG and MGE identification by the different tools, is the primary bottleneck when linking ARGs to identified MGEs in short-read metagenomics sequencing experiments.
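The precision and sensitivity figures quoted in this abstract follow the standard definitions: precision = TP / (TP + FP) and sensitivity (recall) = TP / (TP + FN), with the F1-score as their harmonic mean. A minimal Python sketch is shown below; the function name and the idea of comparing sets of element identifiers are assumptions for illustration, not the benchmark code used in the study.

```python
def precision_recall_f1(predicted, truth):
    """Compute precision, recall (sensitivity) and F1 from two collections of IDs.

    predicted: identifiers a tool reported (e.g. contigs called as plasmid).
    truth:     identifiers that are truly positive in the reference.
    """
    predicted, truth = set(predicted), set(truth)
    tp = len(predicted & truth)   # reported and correct
    fp = len(predicted - truth)   # reported but wrong
    fn = len(truth - predicted)   # missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Returning 0.0 when a denominator is empty is a convention chosen here to keep the function total; other tools may report such cases as undefined.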

    PlasmidEC and gplas2: an optimized short-read approach to predict and reconstruct antibiotic resistance plasmids in Escherichia coli

    Accurate reconstruction of Escherichia coli antibiotic resistance gene (ARG) plasmids from Illumina sequencing data has proven to be a challenge with current bioinformatic tools. In this work, we present an improved method to reconstruct E. coli plasmids using short reads. We developed plasmidEC, an ensemble classifier that identifies plasmid-derived contigs by combining the output of three different binary classification tools. We showed that plasmidEC is especially suited to classify contigs derived from ARG plasmids, with a high recall of 0.941. Additionally, we optimized gplas, a graph-based tool that bins plasmid-predicted contigs into distinct plasmid predictions. Gplas2 is more effective at recovering plasmids with large sequencing coverage variations and can be combined with the output of any binary classifier. The combination of plasmidEC with gplas2 showed a high completeness (median=0.818) and F1-score (median=0.812) when reconstructing ARG plasmids and exceeded the binning capacity of the reference-based method MOB-suite. In the absence of long-read data, our method offers an excellent alternative to reconstruct ARG plasmids in E. coli.
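One common way to combine the output of three binary classifiers is a majority vote per contig. The abstract does not state which combination rule plasmidEC uses, so the Python sketch below is an assumption for illustration only; the function name `majority_vote` and the tool names in the usage note are hypothetical.

```python
from collections import Counter

LABELS = {"plasmid", "chromosome"}


def majority_vote(calls_per_tool):
    """Combine per-contig binary calls from several tools by majority vote.

    calls_per_tool maps a tool name to a dict of {contig_id: label}.
    Every tool is expected to label every contig; a missing contig
    raises KeyError rather than being silently skipped.
    """
    tools = list(calls_per_tool)
    if len(tools) % 2 == 0:
        raise ValueError("use an odd number of tools so that ties cannot occur")
    consensus = {}
    for contig in sorted(calls_per_tool[tools[0]]):
        votes = Counter(calls_per_tool[tool][contig] for tool in tools)
        if not set(votes) <= LABELS:
            raise ValueError(f"unexpected label for contig {contig!r}")
        # With two labels and an odd number of voters, the top label is unique.
        consensus[contig] = votes.most_common(1)[0][0]
    return consensus
```

For example, with three hypothetical tools `tool_a`, `tool_b` and `tool_c`, a contig called "plasmid" by two of them and "chromosome" by the third would receive the consensus label "plasmid".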

    Mge-cluster: a reference-free approach for typing bacterial plasmids

    Extrachromosomal elements of bacterial cells such as plasmids are notorious for their importance in evolution and adaptation to changing ecology. However, high-resolution population-wide analysis of plasmids has only become accessible recently with the advent of scalable long-read sequencing technology. Current typing methods for the classification of plasmids remain limited in their scope, which motivated us to develop a computationally efficient approach to simultaneously recognize novel types and classify plasmids into previously identified groups. Here, we introduce mge-cluster, which can easily handle thousands of input sequences that are compressed using a unitig representation in a de Bruijn graph. Our approach offers a faster runtime than existing algorithms, with moderate memory usage, and enables an intuitive visualization, classification and clustering scheme that users can explore interactively within a single framework. The mge-cluster platform for plasmid analysis can be easily distributed and replicated, enabling a consistent labelling of plasmids across past, present, and future sequence collections. We underscore the advantages of our approach by analysing a population-wide plasmid data set obtained from the opportunistic pathogen Escherichia coli, studying the prevalence of the colistin resistance gene mcr-1.1 within the plasmid population, and describing an instance of resistance plasmid transmission within a hospital environment.

    Design of the EPIGENEC Study: Assessing the EPIdemiology and GENetics of Escherichia coli in the Netherlands

    Background: Infections caused by E. coli cause considerable disease burden and range from frequently occurring and relatively innocent urinary tract infection (UTI) to severe bloodstream infection (BSI). The incidence of infections caused by ESBL-producing E. coli (ESBL-PEc) is increasing, justifying surveillance and development of preventive strategies in several domains. Faecal carriage is universal and believed to be the most important reservoir for E. coli from which infections can originate. It is currently unknown to what extent Dutch E. coli carriage strains in the community reflect isolates causing disease. In this study, we will perform comparative genomics to infer the population structures of human-derived ESBL-PEc from community- and hospital-acquired infections and from community-based faecal carriage samples in the Netherlands. Furthermore, we will describe the molecular epidemiology of E. coli isolates causing invasive disease (BSI). Methods: This study uses four different microbiological data sources: 1) ESBL-PEc from patients with community-acquired UTI tested in primary care between May and November 2017, 2) ESBL-PEc from urine cultures obtained from patients hospitalized between January 2014 and December 2016, 3) E. coli from blood cultures obtained from patients hospitalized between January 2014 and December 2016, and 4) ESBL-PEc from faecal samples collected in a national population-prevalence study performed between January 2014 and January 2017. Clinical epidemiological data were collected from all patients and all isolates were subjected to whole genome sequencing. Discussion: The EPIGENEC study (EPIdemiology and GENetics of E. coli) will describe the molecular epidemiology of E. coli BSI and assess the genomic population structure of ESBL-PEc strains from community-acquired and nosocomial infections, and of ESBL-PEc reflecting community-based faecal carriage. Information from these studies may assist in optimizing surveillance strategies and determining targets and potential impact of future new preventive measures.

    Characterization of blaKPC-2 and blaNDM-1 Plasmids of a K. pneumoniae ST11 Outbreak Clone

    The most common resistance mechanism to carbapenems is the production of carbapenemases. In 2021, the Pan American Health Organization warned of the emergence and increase in new carbapenemase combinations in Enterobacterales in Latin America. In this study, we characterized four Klebsiella pneumoniae isolates harboring blaKPC and blaNDM from an outbreak during the COVID-19 pandemic in a Brazilian hospital. We assessed their plasmids’ transference ability, fitness effects, and relative copy number in different hosts. The K. pneumoniae BHKPC93 and BHKPC104 strains were selected for whole genome sequencing (WGS) based on their pulsed-field gel electrophoresis profile. The WGS revealed that both isolates belong to ST11, and 20 resistance genes were identified in each isolate, including blaKPC-2 and blaNDM-1. The blaKPC gene was present on a ~56 kbp IncN plasmid and the blaNDM-1 gene on a ~102 kbp IncC plasmid, along with five other resistance genes. Although the blaNDM plasmid contained genes for conjugational transfer, only the blaKPC plasmid conjugated to E. coli J53, without apparent fitness effects. The minimum inhibitory concentrations (MICs) of meropenem/imipenem against BHKPC93 and BHKPC104 were 128/64 and 256/128 mg/L, respectively. Although the meropenem and imipenem MICs against E. coli J53 transconjugants carrying the blaKPC gene were 2 mg/L, this was a substantial increment in the MIC relative to the original J53 strain. The blaKPC plasmid copy number was higher in K. pneumoniae BHKPC93 and BHKPC104 than in E. coli and higher than that of the blaNDM plasmids. In conclusion, two ST11 K. pneumoniae isolates that were part of a hospital outbreak co-harbored blaKPC-2 and blaNDM-1. The blaKPC-harboring IncN plasmid has been circulating in this hospital since at least 2015, and its high copy number might have contributed to the conjugative transfer of this particular plasmid to an E. coli host. The observation that the blaKPC-containing plasmid had a lower copy number in this E. coli strain may explain why this plasmid did not confer phenotypic resistance against meropenem and imipenem.