27 research outputs found

    Integrating standardized whole genome sequence analysis with a global Mycobacterium tuberculosis antibiotic resistance knowledgebase.

    Drug-resistant tuberculosis poses a persistent public health threat. The ReSeqTB platform is a collaborative, curated knowledgebase, designed to standardize and aggregate global Mycobacterium tuberculosis complex (MTBC) variant data from whole genome sequencing (WGS) with phenotypic drug susceptibility testing (DST) and clinical data. We developed a unified analysis variant pipeline (UVP) (https://github.com/CPTR-ReSeqTB/UVP) to identify variants and assign lineage from MTBC sequence data. Stringent thresholds and quality control measures were incorporated in this open source tool. The pipeline was validated using a well-characterized dataset of 90 diverse MTBC isolates with conventional DST and DNA Sanger sequencing data. The UVP exhibited 98.9% agreement with the variants identified using Sanger sequencing and was 100% concordant with conventional methods of assigning lineage. We analyzed 4636 publicly available MTBC isolates in the ReSeqTB platform representing all seven major MTBC lineages. The variants detected predict drug resistance with above 94% accuracy, based on the accompanying DST results in the platform. The aggregation of variants over time in the platform will establish confidence-graded mutations statistically associated with phenotypic drug resistance. These tools serve as critical reference standards for future molecular diagnostic assay developers, researchers, public health agencies and clinicians working towards the control of drug-resistant tuberculosis.

    A standardised method for interpreting the association between mutations and phenotypic drug resistance in Mycobacterium tuberculosis

    A clear understanding of the genetic basis of antibiotic resistance in Mycobacterium tuberculosis is required to accelerate the development of rapid drug susceptibility testing methods based on genetic sequence. Raw genotype–phenotype correlation data were extracted as part of a comprehensive systematic review to develop a standardised analytical approach for interpreting resistance-associated mutations for rifampicin, isoniazid, ofloxacin/levofloxacin, moxifloxacin, amikacin, kanamycin, capreomycin, streptomycin, ethionamide/prothionamide and pyrazinamide. Mutation frequencies in resistant and susceptible isolates were calculated, together with novel statistical measures to classify mutations as high, moderate, minimal or indeterminate confidence for predicting resistance. We identified 286 confidence-graded mutations associated with resistance. Compared to phenotypic methods, sensitivity (95% CI) for rifampicin was 90.3% (89.6–90.9%), while for isoniazid it was 78.2% (77.4–79.0%) and their specificities were 96.3% (95.7–96.8%) and 94.4% (93.1–95.5%), respectively. For second-line drugs, sensitivity varied from 67.4% (64.1–70.6%) for capreomycin to 88.2% (85.1–90.9%) for moxifloxacin, with specificity ranging from 90.0% (87.1–92.5%) for moxifloxacin to 99.5% (99.0–99.8%) for amikacin. This study provides a standardised and comprehensive approach for the interpretation of mutations as predictors of M. tuberculosis drug-resistant phenotypes. These data have implications for the clinical interpretation of molecular diagnostics and next-generation sequencing as well as efficient individualised therapy for patients with drug-resistant tuberculosis.
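
The sensitivity and specificity figures with 95% confidence intervals quoted above can be illustrated with a minimal Python sketch. The abstract does not state which interval method the authors used, so the Wilson score interval here is an assumption, and the function names and the counts in the usage line are hypothetical, not data from the study.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed method, not
    necessarily the one used in the study). z=1.96 gives ~95% coverage."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def sensitivity_specificity(tp, fn, tn, fp):
    """Compare genotypic predictions against phenotypic DST.

    tp: resistant isolates carrying a resistance-associated mutation
    fn: resistant isolates without such a mutation
    tn: susceptible isolates without such a mutation
    fp: susceptible isolates carrying such a mutation
    Returns (estimate, lower, upper) for each measure.
    """
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": (sens, *wilson_interval(tp, tp + fn)),
        "specificity": (spec, *wilson_interval(tn, tn + fp)),
    }


# Hypothetical counts, for illustration only.
result = sensitivity_specificity(tp=90, fn=10, tn=95, fp=5)
```

Each entry of `result` holds the point estimate followed by the lower and upper bounds, so a report line such as "90.3% (89.6–90.9%)" corresponds to one such triple computed from the real isolate counts.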

    Fine-scale differentiation between Bacillus anthracis and Bacillus cereus group signatures in metagenome shotgun data

    Background. It is possible to detect bacterial species in shotgun metagenome datasets through the presence of only a few sequence reads. However, false positive results can arise, as was the case in the initial findings of a recent New York City subway metagenome project. False positives are especially likely when two closely related species are present in the same sample. Bacillus anthracis, the etiologic agent of anthrax, is a high-consequence pathogen that shares >99% average nucleotide identity with Bacillus cereus group (BCerG) genomes. Our goal was to create an analysis tool that used k-mers to detect B. anthracis, incorporating information about the coverage of BCerG in the metagenome sample. Methods. Using public complete genome sequence datasets, we identified a set of 31-mer signatures that differentiated B. anthracis from other members of the B. cereus group (BCerG), and another set which differentiated BCerG genomes (including B. anthracis) from other Bacillus strains. We also created a set of 31-mers for detecting the lethal factor gene, the key genetic diagnostic of the presence of anthrax-causing bacteria. We created synthetic sequence datasets based on existing genomes to test the accuracy of a k-mer-based detection model. Results. We found 239,503 B. anthracis-specific 31-mers (the Ba31 set), 10,183 BCerG 31-mers (the BCerG31 set), and 2,617 lethal factor k-mers (the lef31 set). We showed that false positive B. anthracis k-mers, which arise from random sequencing errors, are observable at high genome coverages of B. cereus. We also showed that there is a "gray zone" below 0.184× coverage of the B. anthracis genome sequence, in which we cannot expect with high probability to identify lethal factor k-mers. We created a linear regression model to differentiate the presence of B. anthracis-like chromosomes from sequencing errors given the BCerG background coverage. We showed that while shotgun datasets from the New York City subway metagenome project had no matches to lef31 k-mers and hence were negative for B. anthracis, some samples showed evidence of strains very closely related to the pathogen. Discussion. This work shows how extensive libraries of complete genomes can be used to create organism-specific signatures to help interpret metagenomes. We contrast "specialist" approaches to metagenome analysis such as this work to "generalist" software that seeks to classify all organisms present in the sample and note the more general utility of a k-mer filter approach when taxonomic boundaries lack clarity or high levels of precision are required.
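
The k-mer signature idea described in this abstract can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' tool: the function names are hypothetical, a signature is assumed to be the set of canonical k-mers present in every target genome and absent from every background genome, and the usage lines use toy sequences with k=5 rather than real genomes with k=31.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse
    complement, so both strands map to the same key."""
    rc = kmer.translate(COMPLEMENT)[::-1]
    return min(kmer, rc)


def iter_kmers(seq, k=31):
    """Yield canonical k-mers from a sequence, skipping windows that
    contain ambiguous bases such as N."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            yield canonical(kmer)


def build_signature(target_seqs, background_seqs, k=31):
    """Assumed definition: k-mers shared by all target genomes and
    found in none of the background genomes."""
    if not target_seqs:
        raise ValueError("at least one target sequence is required")
    shared = set.intersection(*(set(iter_kmers(s, k)) for s in target_seqs))
    for seq in background_seqs:
        shared -= set(iter_kmers(seq, k))
    return shared


def count_hits(reads, signature, k=31):
    """Count k-mer occurrences in the reads that match the signature."""
    return sum(1 for read in reads for km in iter_kmers(read, k) if km in signature)


# Toy sequences, for illustration only.
signature = build_signature(["ACGTACGTTAGC"], ["ACGTACGTTTTT"], k=5)
hits = count_hits(["CGTTAGC", "ACGTTTTT"], signature, k=5)
```

A raw hit count like this is only the first step. As the abstract notes, sequencing errors in high-coverage B. cereus reads can produce spurious matches, which is why the authors model the expected error-driven hits against BCerG background coverage before calling a sample positive.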