21 research outputs found

    Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads.

    The Illumina DNA sequencing platform generates accurate but short reads, which can be used to produce accurate but fragmented genome assemblies. Pacific Biosciences and Oxford Nanopore Technologies DNA sequencing platforms generate long reads that can produce complete genome assemblies, but the sequencing is more expensive and error-prone. There is significant interest in combining data from these complementary sequencing technologies to generate more accurate "hybrid" assemblies. However, few tools exist that truly leverage the benefits of both types of data, namely the accuracy of short reads and the structural resolving power of long reads. Here we present Unicycler, a new tool for assembling bacterial genomes from a combination of short and long reads, which produces assemblies that are accurate, complete and cost-effective. Unicycler builds an initial assembly graph from short reads using the de novo assembler SPAdes and then simplifies the graph using information from short and long reads. Unicycler uses a novel semi-global aligner to align long reads to the assembly graph. Tests on both synthetic and real reads show Unicycler can assemble larger contigs with fewer misassemblies than other hybrid assemblers, even when long-read depth and accuracy are low. Unicycler is open source (GPLv3) and available at github.com/rrwick/Unicycler.

    Completing bacterial genome assemblies with multiplex MinION sequencing

    Abstract: Illumina sequencing platforms have enabled widespread bacterial whole genome sequencing. While Illumina data is appropriate for many analyses, its short read length limits its ability to resolve genomic structure. This has major implications for tracking the spread of mobile genetic elements, including those which carry antimicrobial resistance determinants. Fully resolving a bacterial genome requires long reads such as those generated by Oxford Nanopore Technologies (ONT) platforms. Here we describe our use of the ONT MinION to sequence 12 isolates of Klebsiella pneumoniae on a single flow cell. We assembled each genome using a combination of ONT reads and previously available Illumina reads, and little to no manual intervention was needed to achieve fully resolved assemblies using the Unicycler hybrid assembler. Assembling only ONT reads with Canu was less effective, resulting in fewer resolved genomes and higher error rates even following error correction with Nanopolish. We demonstrate that multiplexed ONT sequencing is a valuable tool for high-throughput bacterial genome finishing. Specifically, we advocate the use of Illumina sequencing as a first analysis step, followed by ONT reads as needed to resolve genomic structure.
    Data summary: Sequence read files for all 12 isolates have been deposited in SRA, accessible through these NCBI BioSample accession numbers: SAMEA3357010, SAMEA3357043, SAMN07211279, SAMN07211280, SAMEA3357223, SAMEA3357193, SAMEA3357346, SAMEA3357374, SAMEA3357320, SAMN07211281, SAMN07211282, SAMEA3357405. A full list of SRA run accession numbers (both Illumina reads and ONT reads) for these samples is available in Table S1. Assemblies and sequencing reads corresponding to each stage of processing and analysis are provided in the following figshare project: https://figshare.com/projects/Completing_bacterial_genome_assemblies_with_multiplex_MinION_sequencing/23068. Source code is provided in the following public GitHub repositories: https://github.com/rrwick/Bacterial-genome-assemblies-with-multiplex-MinION-sequencing, https://github.com/rrwick/Porechop and https://github.com/rrwick/Fast5-to-Fastq.
    Impact statement: Like many research and public health laboratories, we frequently perform large-scale bacterial comparative genomics studies using Illumina sequencing, which assays gene content and provides the high-confidence variant calls needed for phylogenomics and transmission studies. However, problems often arise with resolving genome assemblies, particularly around regions that matter most to our research, such as mobile genetic elements encoding antibiotic resistance or virulence genes. These complexities can often be resolved by long sequence reads generated with PacBio or Oxford Nanopore Technologies (ONT) platforms. While effective, this has proven difficult to scale, due to the relatively high costs of generating long reads and the manual intervention required for assembly. Here we demonstrate the use of barcoded ONT libraries sequenced in multiplex on a single ONT MinION flow cell, coupled with hybrid assembly using Unicycler, to resolve 12 large bacterial genomes. Minor manual intervention was required to fully resolve small plasmids in five isolates, which we found to be underrepresented in ONT data. Cost per sample for the ONT sequencing was equivalent to Illumina sequencing, and there is potential for significant savings by multiplexing more samples on the ONT run. This approach paves the way for high-throughput and cost-effective generation of completely resolved bacterial genomes to become widely accessible.

    Genetic diversity, mobilisation and spread of the yersiniabactin-encoding mobile element ICEKp in Klebsiella pneumoniae populations.

    Mobile genetic elements (MGEs) that frequently transfer within and between bacterial species play a critical role in bacterial evolution, and often carry key accessory genes that associate with a bacterium's ability to cause disease. MGEs carrying antimicrobial resistance (AMR) and/or virulence determinants are common in the opportunistic pathogen Klebsiella pneumoniae, which is a leading cause of highly drug-resistant infections in hospitals. Well-characterised virulence determinants in K. pneumoniae include the polyketide synthesis loci ybt and clb (also known as pks), encoding the iron-scavenging siderophore yersiniabactin and genotoxin colibactin, respectively. These loci are located within an MGE called ICEKp, which is the most common virulence-associated MGE of K. pneumoniae, providing a mechanism for these virulence factors to spread within the population. Here we apply population genomics to investigate the prevalence, evolution and mobility of ybt and clb in K. pneumoniae populations through comparative analysis of 2498 whole-genome sequences. The ybt locus was detected in 40 % of K. pneumoniae genomes, particularly amongst those associated with invasive infections. We identified 17 distinct ybt lineages and 3 clb lineages, each associated with one of 14 different structural variants of ICEKp. Comparison with the wider population of the family Enterobacteriaceae revealed occasional ICEKp acquisition by other members. The clb locus was present in 14 % of all K. pneumoniae and 38.4 % of ybt+ genomes. Hundreds of independent ICEKp integration events were detected affecting hundreds of phylogenetically distinct K. pneumoniae lineages, including at least 19 in the globally-disseminated carbapenem-resistant clone CG258. A novel plasmid-encoded form of ybt was also identified, representing a new mechanism for ybt dispersal in K. pneumoniae populations. These data indicate that MGEs carrying ybt and clb circulate freely in the K. pneumoniae population, including among multidrug-resistant strains, and should be considered a target for genomic surveillance along with AMR determinants.

    Bridging of Neisseria gonorrhoeae lineages across sexual networks in the HIV pre-exposure prophylaxis era

    Whole genome sequencing (WGS) has been used to investigate transmission of Neisseria gonorrhoeae, but to date, most studies have not combined genomic data with detailed information on sexual behaviour to define the extent of transmission across population risk groups (bridging). Here, through combined epidemiological and genomic analysis of 2,186 N. gonorrhoeae isolates from Australia, we show widespread transmission of N. gonorrhoeae within and between population groups. We describe distinct transmission clusters associated with men who have sex with men (MSM) and heterosexuals, and men who have sex with men and women (MSMW) are identified as a possible bridging population between these groups. Further, the study identifies transmission of N. gonorrhoeae between HIV-positive and HIV-negative individuals receiving pre-exposure prophylaxis (PrEP). Our data highlight several groups that can be targeted for interventions aimed at improving gonorrhoea control, including returning travellers, sex workers, and PrEP users.
    D.A.W. (GNT1123854), E.P.F.C. (GNT1091226), and J.C.K. (GNT1142613) are supported by Early Career Fellowships from the National Health and Medical Research Council (NHMRC) of Australia. B.P.H. is supported by a Practitioner Fellowship from the NHMRC (GNT1105905). D.J.I. is supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement 643476. Work in this study was supported by a Project Grant from the NHMRC (GNT1147735) and a Partnership grant from the NHMRC (GNT1149991). MDU PHL is funded by the Victorian Department of Health and Human Services.

    Antimicrobial resistant Klebsiella pneumoniae carriage and infection in specialized geriatric care wards linked to acquisition in the referring hospital

    Background: Klebsiella pneumoniae is a leading cause of extended-spectrum beta-lactamase (ESBL) producing hospital-associated infections, for which elderly patients are at increased risk.
    Methods: We conducted a 1-year prospective cohort study, in which a third of patients admitted to two geriatric wards in a specialized hospital were recruited and screened for carriage of K. pneumoniae by microbiological culture. Clinical isolates were monitored via the hospital laboratory. Colonizing and clinical isolates were subjected to whole genome sequencing and antimicrobial susceptibility testing.
    Results: K. pneumoniae throat carriage prevalence was 4.1%, rectal carriage 10.8% and ESBL carriage 1.7%. K. pneumoniae infection incidence was 1.2%. The isolates were diverse, and most patients were colonized or infected with a unique phylogenetic lineage, with no evidence of transmission in the wards. ESBL strains carried blaCTX-M-15 and belonged to clones associated with hospital-acquired ESBL infections in other countries (ST29, ST323, ST340). One also carried the carbapenemase blaIMP-26. Genomic and epidemiological data provided evidence that ESBL strains were acquired in the referring hospital. Nanopore sequencing also identified strain-to-strain transmission of a blaCTX-M-15 FIBK/FIIK plasmid in the referring hospital.
    Conclusions: The data suggest the major source of K. pneumoniae was the patient’s own gut microbiome, but ESBL strains were acquired in the referring hospital. This highlights the importance of the wider hospital network to understanding K. pneumoniae risk and infection control. Rectal screening for ESBL organisms upon admission to geriatric wards could help inform patient management and infection control in such facilities.
    Summary: Patients’ own gut microbiota were the major source of K. pneumoniae, but extended-spectrum beta-lactamase strains were acquired in the referring hospital. This highlights the potential for rectal screening, and the importance of the wider hospital network, for local risk management.

    Genomic dissection of Klebsiella pneumoniae infections in hospital patients reveals insights into an opportunistic pathogen.

    Klebsiella pneumoniae is a major cause of opportunistic healthcare-associated infections, which are increasingly complicated by the presence of extended-spectrum beta-lactamases (ESBLs) and carbapenem resistance. We conducted a year-long prospective surveillance study of K. pneumoniae clinical isolates in hospital patients. Whole-genome sequence (WGS) data reveals a diverse pathogen population, including other species within the K. pneumoniae species complex (18%). Several infections were caused by K. variicola/K. pneumoniae hybrids, one of which shows evidence of nosocomial transmission. A wide range of antimicrobial resistance (AMR) phenotypes are observed, and diverse genetic mechanisms identified (mainly plasmid-borne genes). ESBLs are correlated with presence of other acquired AMR genes (median n = 10). Bacterial genomic features associated with nosocomial onset are ESBLs (OR 2.34, p = 0.015) and rhamnose-positive capsules (OR 3.12, p < 0.001). Virulence plasmid-encoded features (aerobactin, hypermucoidy) are observed at low prevalence (<3%), mostly in community-onset cases. WGS-confirmed nosocomial transmission is implicated in just 10% of cases, but strongly associated with ESBLs (OR 21, p < 1 × 10^-11). We estimate 28% risk of onward nosocomial transmission for ESBL-positive strains vs 1.7% for ESBL-negative strains. These data indicate that K. pneumoniae infections in hospitalised patients are due largely to opportunistic infections with diverse strains, with an additional burden from nosocomially-transmitted AMR strains and community-acquired hypervirulent strains.

    Simulated hybrid assemblies: Read length and accuracy.

    NGA50 values segregated by read length and read accuracy. These plots summarise results across all reference genomes and replicate tests, but only include the tests of 8x long-read depth. For read lengths, the p-value is from a two-tailed t-test. For read accuracies, the p-value is from a one-way ANOVA test.

    Reference genomes for simulated read sets.


    Simulated short-read assemblies: Errors.

    Misassembly and small-error (mismatches and indels) rates for assemblies of simulated short-read sets, summarising results across all reference genomes and replicate tests (total 360 per assembler).