91 research outputs found

    Circlator: automated circularization of genome assemblies using long sequencing reads

    The assembly of DNA sequence data is undergoing a renaissance thanks to emerging technologies capable of producing reads tens of kilobases long. Assembling complete bacterial and small eukaryotic genomes is now possible, but the final step of circularizing sequences remains unsolved. Here we present Circlator, the first tool to automate assembly circularization and produce accurate linear representations of circular sequences. Using Pacific Biosciences and Oxford Nanopore data, Circlator correctly circularized 26 of 27 circularizable sequences, comprising 11 chromosomes and 12 plasmids from bacteria, the apicoplast and mitochondrion of Plasmodium falciparum and a human mitochondrion. Circlator is available at http://sanger-pathogens.github.io/circlator/

    Comparison of classical multi-locus sequence typing software for next-generation sequencing data

    Multi-locus sequence typing (MLST) is a widely used method for categorizing bacteria. Increasingly, MLST is being performed using next-generation sequencing (NGS) data by reference laboratories and for clinical diagnostics. Many software applications have been developed to calculate sequence types from NGS data; however, there has been no comprehensive review to date on these methods. We have compared eight of these applications against real and simulated data, and present results on: (1) the accuracy of each method against traditional typing methods, (2) the performance on real outbreak datasets, (3) the impact of contamination and varying depth of coverage, and (4) the computational resource requirements.

    ARIBA: rapid antimicrobial resistance genotyping directly from sequencing reads.

    Antimicrobial resistance (AMR) is one of the major threats to human and animal health worldwide, yet few high-throughput tools exist to analyse and predict the resistance of a bacterial isolate from sequencing data. Here we present a new tool, ARIBA, that identifies AMR-associated genes and single nucleotide polymorphisms directly from short reads, and generates detailed and customizable output. The accuracy and advantages of ARIBA over other tools are demonstrated on three datasets from Gram-positive and Gram-negative bacteria, with ARIBA outperforming existing methods.

    Robust high-throughput prokaryote de novo assembly and improvement pipeline for Illumina data.

    The rapidly reducing cost of bacterial genome sequencing has led to its routine use in large-scale microbial analysis. Though mapping approaches can be used to find differences relative to the reference, many bacteria are subject to constant evolutionary pressures resulting in events such as the loss and gain of mobile genetic elements, horizontal gene transfer through recombination and genomic rearrangements. De novo assembly is the reconstruction of the underlying genome sequence, an essential step to understanding bacterial genome diversity. Here we present a high-throughput bacterial assembly and improvement pipeline that has been used to generate nearly 20 000 annotated draft genome assemblies in public databases. We demonstrate its performance on a public data set of 9404 genomes. We find all the genes used in multi-locus sequence typing schema present in 99.6 % of assembled genomes. When tested on low-, neutral- and high-GC organisms, more than 94 % of genes were present and completely intact. The pipeline has been proven to be scalable and robust with a wide variety of datasets without requiring human intervention. All of the software is available on GitHub under the GNU GPL open source license.

    PlasmidTron: assembling the cause of phenotypes and genotypes from NGS data.

    Increasingly rich metadata are now being linked to samples that have been whole-genome sequenced. However, much of this information is ignored. This is because linking this metadata to genes, or regions of the genome, usually relies on knowing the gene sequence(s) responsible for the particular trait being measured and looking for its presence or absence in that genome. Examples of this would be the spread of antimicrobial resistance genes carried on mobile genetic elements (MGEs). However, although it is possible to routinely identify the resistance gene, identifying the unknown MGE upon which it is carried can be much more difficult if the starting point is short-read whole-genome sequence data. The reason for this is that MGEs are often full of repeats and so assemble poorly, leading to fragmented consensus sequences. Since mobile DNA, which can carry many clinically and ecologically important genes, has a different evolutionary history from the host, its distribution across the host population will, by definition, be independent of the host phylogeny. It is possible to use this phenomenon in a genome-wide association study to identify both the genes associated with the specific trait and also the DNA linked to that gene, for example the flanking sequence of the plasmid vector on which it is encoded, which follows the same patterns of distribution as the marker gene/sequence itself. We present PlasmidTron, which utilizes the phenotypic data normally available in bacterial population studies, such as antibiograms, virulence factors, or geographical information, to identify traits that are likely to be present on DNA that can randomly reassort across defined bacterial populations. It is also possible to use this methodology to associate unknown genes/sequences (e.g. plasmid backbones) with a specific molecular signature or marker (e.g. resistance gene presence or absence) using PlasmidTron. 
    PlasmidTron uses a k-mer-based approach to identify reads associated with a phylogenetically unlinked phenotype. These reads are then assembled de novo to produce contigs in a fast and scalable-to-large manner. PlasmidTron is written in Python 3 and is available under the open source licence GNU GPL3 from https://github.com/sanger-pathogens/plasmidtron

    The Unexpectedly Bright Comet C/2012 F6 (Lemmon) Unveiled at Near-Infrared Wavelengths

    We acquired near-infrared spectra of the Oort cloud comet C/2012 F6 (Lemmon) at three different heliocentric distances (R_h) during the comet's 2013 perihelion passage, providing a comprehensive measure of the outgassing behavior of parent volatiles and cosmogonic indicators. Our observations were performed pre-perihelion at R_h = 1.2 AU with CRIRES (on 2013 February 2 and 4), and post-perihelion at R_h = 0.75 AU with CSHELL (on March 31 and April 1) and R_h = 1.74 AU with NIRSPEC (on June 20). We detected 10 volatile species (H2O, OH* prompt emission, C2H6, CH3OH, H2CO, HCN, CO, CH4, NH3, and NH2), and obtained upper limits for two others (C2H2 and HDO). One-dimensional spatial profiles displayed different distributions for some volatiles, confirming either the existence of polar and apolar ices, or of chemically distinct active vents in the nucleus. The ortho-para ratio for water was 3.31 +/- 0.33 (weighted mean of CRIRES and NIRSPEC results), implying a spin temperature >37 K at the 95% confidence limit. Our (3σ) upper limit for HDO corresponds to D/H < 2.45 × 10^-3 (i.e., <16 Vienna Standard Mean Ocean Water, VSMOW). At R_h = 1.2 AU (CRIRES), the production rate for water was Q(H2O) = (1.9 +/- 0.1) × 10^29 s^-1 and its rotational temperature was T_rot ~ 69 K. At R_h = 0.75 AU (CSHELL), we measured Q(H2O) = (4.6 +/- 0.6) × 10^29 s^-1 and T_rot = 80 K on March 31, and (6.6 +/- 0.9) × 10^29 s^-1 and T_rot = 100 K on April 1. At R_h = 1.74 AU (NIRSPEC), we obtained Q(H2O) = (1.1 +/- 0.1) × 10^29 s^-1 and T_rot ~ 50 K. The measured volatile abundance ratios classify comet C/2012 F6 as rather depleted in C2H6 and CH3OH, while HCN, CH4, and CO displayed abundances close to their median values found among comets. H2CO was the only volatile showing a relative enhancement. The relative paucity of C2H6 and CH3OH (with respect to H2O) suggests formation within warm regions of the nebula. However, the normal abundance of HCN and hypervolatiles CH4 and CO, and the enhancement of H2CO, may indicate a possible heterogeneous nucleus of comet C/2012 F6 (Lemmon), possibly as a result of radial mixing within the protoplanetary dis

    The origins of haplotype 58 (H58) Salmonella enterica serovar Typhi

    Antimicrobial resistance (AMR) poses a serious threat to the clinical management of typhoid fever. AMR in Salmonella Typhi (S. Typhi) is commonly associated with the H58 lineage, a lineage that arose comparatively recently before becoming globally disseminated. To better understand when and how H58 emerged and became dominant, we performed detailed phylogenetic analyses on contemporary genome sequences from S. Typhi isolated in the period spanning the emergence. Our dataset, which contains the earliest described H58 S. Typhi organism, indicates that ancestral H58 organisms were already multi-drug resistant (MDR). These organisms emerged spontaneously in India in 1987 and became radially distributed throughout South Asia and then globally in the ensuing years. These early organisms were associated with a single long branch, possessing mutations associated with increased bile tolerance, suggesting that the first H58 organism was generated during chronic carriage. The subsequent use of fluoroquinolones led to several independent mutations in gyrA. The ability of H58 to acquire and maintain AMR genes continues to pose a threat, as extensively drug-resistant (XDR; MDR plus resistance to ciprofloxacin and third-generation cephalosporins) variants have emerged recently in this lineage. Understanding where and how H58 S. Typhi originated and became successful is key to understanding how AMR drives successful lineages of bacterial pathogens. Additionally, these data can inform optimal targeting of typhoid conjugate vaccines (TCVs) for reducing the potential for emergence and the impact of new drug-resistant variants. Emphasis should also be placed upon the prospective identification and treatment of chronic carriers to prevent the emergence of new drug-resistant variants with the ability to spread efficiently.

    Phylogenetic Analysis of Klebsiella pneumoniae from Hospitalized Children, Pakistan.

    Klebsiella pneumoniae shows increasing emergence of multidrug-resistant lineages, including strains resistant to all available antimicrobial drugs. We conducted whole-genome sequencing of 178 highly drug-resistant isolates from a tertiary hospital in Lahore, Pakistan. Phylogenetic analyses to place these isolates into global context demonstrate the expansion of multiple independent lineages, including K. quasipneumoniae. This work was supported by National Health and Medical Research Council program grants (0606788 to R.A.S. and T.L.; 1092262 to R.A.S., G.D., and T.L.); the Wellcome Trust (206194); and the Higher Education Commission of Pakistan and The Children’s Hospital & The Institute of Child Health, Lahore, Pakistan. H.E. was supported by a scholarship from Higher Education Commission Pakistan under the International Research Support Initiative Program.

    Predicting the immediate impact of national lockdown on neovascular age-related macular degeneration and associated visual morbidity: an INSIGHT Health Data Research Hub for Eye Health report

    OBJECTIVE: Predicting the impact of neovascular age-related macular degeneration (nAMD) service disruption on visual outcomes following national lockdown in the UK to contain SARS-CoV-2. METHODS AND ANALYSIS: This retrospective cohort study includes deidentified data from 2229 UK patients from the INSIGHT Health Data Research digital hub. We forecasted the number of treatment-naïve nAMD patients requiring anti-vascular endothelial growth factor (anti-VEGF) initiation during UK lockdown (16 March 2020 through 31 July 2020) at Moorfields Eye Hospital (MEH) and University Hospitals Birmingham (UHB). Best-measured visual acuity (VA) changes without anti-VEGF therapy were predicted using post hoc analysis of Minimally Classic/Occult Trial of the Anti-VEGF Antibody Ranibizumab in the Treatment of Neovascular AMD trial sham-control arm data (n=238). RESULTS: At our centres, 376 patients were predicted to require anti-VEGF initiation during lockdown (MEH: 325; UHB: 51). Without treatment, mean VA was projected to decline after 12 months. The proportion of eyes in the MEH cohort predicted to maintain the key positive visual outcome of ≥70 ETDRS letters (Snellen equivalent 6/12) fell from 25.5% at baseline to 5.8% at 12 months (UHB: 9.8%-7.8%). Similarly, eyes with VA <25 ETDRS letters (6/96) were predicted to increase from 4.3% to 14.2% at MEH (UHB: 5.9%-7.8%) after 12 months without treatment. CONCLUSIONS: Here, we demonstrate how combining data from a recently founded national digital health data repository with historical industry-funded clinical trial data can enhance predictive modelling in nAMD. The demonstrated detrimental effects of prolonged treatment delay should incentivise healthcare providers to support nAMD patients accessing care in safe environments. TRIAL REGISTRATION NUMBER: NCT00056836

    A retrospective investigation of the population structure and geospatial distribution of Salmonella Paratyphi A in Kathmandu, Nepal

    Salmonella Paratyphi A, one of the major etiologic agents of enteric fever, has increased in prevalence in recent decades in certain endemic regions in comparison to S. Typhi, the most prevalent cause of enteric fever. Despite this increase, data on the prevalence and molecular epidemiology of S. Paratyphi A remain generally scarce. Here, we analysed the whole genome sequences of 216 S. Paratyphi A isolates originating from Kathmandu, Nepal between 2005 and 2014, of which 200 were from patients with acute enteric fever and 16 from the gallbladder of people with suspected chronic carriage. By exploiting the recently developed genotyping framework for S. Paratyphi A (Paratype), we identified several genotypes circulating in Kathmandu. Notably, we observed an unusual clonal expansion of genotype 2.4.3 over a four-year period that spread geographically and systematically replaced other genotypes. This rapid genotype replacement is hypothesised to have been driven by both reduced susceptibility to fluoroquinolones and genetic changes to virulence factors, such as functional and structural genes encoding the type 3 secretion systems. Finally, we show that person-to-person spread is likely the most common mode of transmission and that chronic carriers seem to play a limited role in maintaining disease circulation.