
    Application of machine learning techniques to tuberculosis drug resistance analysis

    Motivation: Timely identification of Mycobacterium tuberculosis (MTB) resistance to existing drugs is vital to decrease mortality and prevent the amplification of existing antibiotic resistance. Machine learning methods have been widely applied to the timely prediction of MTB resistance to a specific drug and to the identification of resistance markers. However, they have not been validated, for either resistance prediction or resistance marker identification, on a large cohort of MTB samples from multiple centers across the world. Several machine learning classifiers and linear dimension reduction techniques were developed and compared on a cohort of 13 402 isolates collected from 16 countries across 6 continents and tested against 11 drugs. Results: Compared to the conventional molecular diagnostic test, the area under the curve of the best machine learning classifier increased for all drugs, most notably by 23.11%, 15.22% and 10.14% for pyrazinamide, ciprofloxacin and ofloxacin, respectively (P < 0.01). Logistic regression and gradient tree boosting were found to perform better than other techniques. Moreover, logistic regression/gradient tree boosting with a sparse principal component analysis/non-negative matrix factorization step, compared with the classifier alone, improved the best F1-score by 12.54%, 4.61%, 7.45% and 9.58% for amikacin, moxifloxacin, ofloxacin and capreomycin, respectively, as well as increasing the area under the curve for amikacin and capreomycin. The results provide a comprehensive comparison of various techniques and confirm that machine learning improves prediction on large, diverse tuberculosis data. Furthermore, mutation ranking showed the possibility of finding new resistance/susceptibility markers. Availability and implementation: The source code can be found at http://www.robots.ox.ac.uk/ davidc/code.php. Supplementary information: Supplementary data are available at Bioinformatics online.
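
    The abstract above reports its gains in terms of area under the curve (AUC) and F1-score. As a minimal sketch of how these two metrics are computed for a binary resistant/susceptible classifier, the following uses only the Python standard library. The function names and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive (resistant = 1) class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 1 = resistant, 0 = susceptible.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical 0.5 threshold

auc = roc_auc(y_true, scores)
f1 = f1_score(y_true, y_pred)
```

    AUC depends only on how the scores rank the isolates, whereas F1 depends on the chosen decision threshold, which is one reason a study may report both.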

    Machine learning predicts accurately Mycobacterium tuberculosis drug resistance from whole genome sequencing data

    Background: Tuberculosis disease, caused by Mycobacterium tuberculosis, is a major public health problem. The emergence of M. tuberculosis strains resistant to existing treatments threatens to derail control efforts. Resistance is mainly conferred by mutations in genes coding for drug targets or converting enzymes, but our knowledge of these mutations is incomplete. Whole genome sequencing (WGS) is an increasingly common approach to rapidly characterize isolates and identify mutations that predict antimicrobial resistance, thereby providing a diagnostic tool to assist clinical decision making. Methods: We applied machine learning approaches to 16,688 M. tuberculosis isolates that have undergone WGS and laboratory drug-susceptibility testing (DST) across 14 antituberculosis drugs, with 22.5% of samples being multidrug resistant and 2.1% being extensively drug resistant. We used non-parametric classification-tree and gradient-boosted-tree models to predict drug resistance and uncover any associated novel putative mutations. We fitted separate models for each drug, with and without "co-occurrent resistance" markers known to cause resistance to drugs other than the one of interest. Predictive performance was measured using sensitivity, specificity, and the area under the receiver operating characteristic curve, assuming DST results as the gold standard. Results: The predictive performance was highest for resistance to first-line drugs, amikacin, kanamycin, ciprofloxacin, moxifloxacin, and multidrug-resistant tuberculosis (area under the receiver operating characteristic curve above 96%), and lowest for third-line drugs such as D-cycloserine and para-aminosalicylic acid (area under the curve below 85%). The inclusion of co-occurrent resistance markers led to improved performance for some drugs and superior results when compared to similar models in other large-scale studies, which had smaller sample sizes. Overall, the gradient-boosted-tree models performed better than the classification-tree models. The mutation-rank analysis detected no new single nucleotide polymorphisms linked to drug resistance. Discordance between DST and genotypically inferred resistance may be explained by DST errors, novel rare mutations, hetero-resistance, and nongenomic drivers such as efflux-pump upregulation. Conclusion: Our work demonstrates the utility of machine learning as a flexible approach to drug resistance prediction that is able to accommodate a much larger number of predictors and to summarize their predictive ability, thus assisting clinical decision making and single nucleotide polymorphism detection in an era of increasing WGS data generation.
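
    This study scores its models by sensitivity and specificity, taking DST results as the gold standard. A minimal sketch of that calculation follows, assuming labels are encoded as "R" (resistant) and "S" (susceptible). The encoding, the function name and the toy data are assumptions made for illustration and do not come from the study.

```python
def sensitivity_specificity(dst_labels, predicted_labels, positive="R"):
    """Return (sensitivity, specificity), treating the DST labels as ground truth."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(dst_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1  # resistant by DST, predicted resistant
            else:
                fn += 1  # resistant by DST, predicted susceptible
        else:
            if pred == positive:
                fp += 1  # susceptible by DST, predicted resistant
            else:
                tn += 1  # susceptible by DST, predicted susceptible
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one resistant and one susceptible isolate")
    return tp / (tp + fn), tn / (tn + fp)


# Toy example (made-up values): 4 resistant and 6 susceptible isolates by DST.
dst = ["R", "R", "R", "R", "S", "S", "S", "S", "S", "S"]
predicted = ["R", "R", "R", "S", "S", "S", "S", "S", "S", "R"]

sensitivity, specificity = sensitivity_specificity(dst, predicted)
```

    Each false negative or false positive in such a table is a discordance between DST and the genotype-based prediction, which the abstract attributes to causes such as DST errors or rare mutations.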

    11th German Conference on Chemoinformatics (GCC 2015) : Fulda, Germany. 8-10 November 2015.


    An Interpretable Classification Method for Predicting Drug Resistance in M. Tuberculosis

    Motivation: The prediction of drug resistance and the identification of its mechanisms in bacteria such as Mycobacterium tuberculosis, the etiological agent of tuberculosis, is a challenging problem. Modern methods based on testing against a catalogue of previously identified mutations often yield poor predictive performance. On the other hand, machine learning techniques have demonstrated high predictive accuracy, but many of them lack the interpretability needed to identify the specific mutations that lead to resistance. We propose a novel technique, inspired by the group testing problem and Boolean compressed sensing, which yields highly accurate predictions and interpretable results at the same time. Results: We develop a modified version of the Boolean compressed sensing problem for identifying drug resistance, and implement its formulation as an integer linear program. This allows us to characterize the predictive accuracy of the technique and select an appropriate metric to optimize. A simple adaptation of the problem also allows us to quantify the sensitivity-specificity trade-off of our model under different regimes. We test the predictive accuracy of our approach on a variety of antibiotics commonly used in treating tuberculosis and find that it has accuracy comparable to that of standard machine learning models and points to several genes with previously identified association to drug resistance.
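
    In the group testing view described above, an isolate is predicted resistant if it carries at least one mutation from a small selected set, which makes the model an interpretable Boolean OR rule. The sketch below illustrates that idea only. It replaces the paper's integer linear program with an exhaustive search over small mutation sets, and the function names, the `max_size` parameter and the toy genotype matrix are all hypothetical.

```python
from itertools import combinations


def predict(genotype, rule):
    """OR rule: resistant if the isolate carries any mutation listed in the rule."""
    return any(genotype[j] for j in rule)


def fit_or_rule(X, y, max_size=2):
    """Find the smallest mutation set (up to max_size) with the fewest misclassifications.

    Exhaustive search stands in for an ILP solver here; it is only practical
    for a handful of candidate mutations.
    """
    n_features = len(X[0])
    best_rule = ()
    best_errors = sum(y)  # the empty rule predicts every isolate susceptible
    for size in range(1, max_size + 1):
        for rule in combinations(range(n_features), size):
            errors = sum(
                predict(row, rule) != bool(label) for row, label in zip(X, y)
            )
            if errors < best_errors:  # strict: smaller rules win ties
                best_rule, best_errors = rule, errors
    return best_rule, best_errors


# Toy data (made up): rows are isolates, columns are presence (1) or absence (0)
# of four candidate mutations; y is 1 for resistant, 0 for susceptible.
X = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
y = [1, 1, 1, 0, 0, 0]

rule, errors = fit_or_rule(X, y)
```

    The returned rule is a tuple of column indices, so it can be read directly as "resistant if mutation 0 or mutation 1 is present", which is the kind of interpretability the abstract emphasizes.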

    Prediction of multidrug-resistant TB from CT pulmonary images based on deep learning techniques

    Although tuberculosis (TB) was discovered more than a century ago, it has not yet been eradicated. On the contrary, TB is at present one of the top 10 causes of death and has shown signs of increasing. To complement the conventional diagnostic procedure of microbiological culture, which takes several weeks and remains expensive, high-resolution computed tomography (CT) of the lungs has been used not only to help clinicians expedite diagnosis but also to monitor prognosis when administering antibiotic drugs. This research investigates distinguishing multi-drug resistant (MDR) patients from drug sensitive (DS) ones based on CT lung images, in order to monitor the effectiveness of treatment. To contend with small datasets (i.e. in the hundreds) and with CT TB images in which only limited regions capture abnormalities, a patch-based deep convolutional neural network (CNN) allied to a support vector machine (SVM) classifier is implemented on a collection of datasets from 230 patients obtained from the ImageCLEF 2017 competition. As a result, the proposed CNN+SVM+patch architecture performs the best, with a classification accuracy of 91.11% (79.80% in terms of patches). In addition, a hand-crafted SIFT-based approach achieves 88.88% in terms of subjects and 83.56% in terms of patches, the highest patch-level result in this study, which can be explained by the small size of the datasets. Significantly, in the Tuberculosis Competition at ImageCLEF 2017, the authors took part in the task of classifying 5 types of TB disease and ranked first in averaged classification accuracy (ACC = 0.4067), also using the CNN+SVM+patch approach. On the other hand, when whole slices of the 3D TB datasets are used to train a CNN, the best result, an accuracy of 64.71%, is achieved by a CNN coupled with orderless pooling and an SVM.
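
    The abstract reports accuracy both per patch and per subject, which implies that patch-level predictions are combined into one label per patient. It does not say how, so the sketch below assumes a simple majority vote. The function name, the patient identifiers and the toy predictions are hypothetical.

```python
from collections import Counter, defaultdict


def subject_level_labels(patch_predictions):
    """Combine (patient_id, predicted_label) pairs into one label per patient.

    Majority vote is an assumed aggregation rule. On a tie, Counter.most_common
    returns the label that was seen first for that patient.
    """
    votes = defaultdict(Counter)
    for patient_id, label in patch_predictions:
        votes[patient_id][label] += 1
    return {
        patient_id: counter.most_common(1)[0][0]
        for patient_id, counter in votes.items()
    }


# Toy patch-level predictions (made up) for two patients.
patches = [
    ("patient_1", "MDR"),
    ("patient_1", "MDR"),
    ("patient_1", "DS"),
    ("patient_2", "DS"),
    ("patient_2", "MDR"),
    ("patient_2", "DS"),
]

labels = subject_level_labels(patches)
```

    Under a vote like this, a minority of misclassified patches does not change the patient's label, which is one way subject-level accuracy can exceed patch-level accuracy.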

    Perspectives for systems biology in the management of tuberculosis

    Standardised management of tuberculosis may soon be replaced by individualised, precision medicine-guided therapies informed with knowledge provided by the field of systems biology. Systems biology is a rapidly expanding field of computational and mathematical analysis and modelling of complex biological systems that can provide insights into mechanisms underlying tuberculosis, identify novel biomarkers, and help to optimise prevention, diagnosis and treatment of disease. These advances are critically important in the context of the evolving epidemic of drug-resistant tuberculosis. Here, we review the available evidence on the role of systems biology approaches - human and mycobacterial genomics and transcriptomics, proteomics, lipidomics/metabolomics, immunophenotyping, systems pharmacology and gut microbiomes - in the management of tuberculosis including prediction of risk for disease progression, severity of mycobacterial virulence and drug resistance, adverse events, comorbidities, response to therapy and treatment outcomes. Application of the Grading of Recommendations, Assessment, Development and Evaluation (GRADE) approach demonstrated that at present most of the studies provide "very low" certainty of evidence for answering clinically relevant questions. Further studies in large prospective cohorts of patients, including randomised clinical trials, are necessary to assess the applicability of the findings in tuberculosis prevention and more efficient clinical management of patients.

    Early Detection of Tuberculosis with Machine Learning Cough Audio Analysis: Towards More Accessible Global Triaging Usage

    Tuberculosis (TB), a bacterial disease mainly affecting the lungs, is one of the leading infectious causes of mortality worldwide. To prevent TB from spreading within the body, which causes life-threatening complications, timely and effective anti-TB treatment is crucial. Cough, an objective biomarker for TB, is a triage tool that monitors treatment response and regresses with successful therapy. Current gold standards for TB diagnosis are slow or inaccessible, especially in rural areas where TB is most prevalent. In addition, current machine learning (ML) diagnosis research, such as that using chest radiographs, is ineffective and does not monitor treatment progression. To enable effective diagnosis, an ensemble model with a novel ML architecture was developed that detects TB by analyzing coughs' acoustic epidemiologies recorded through smartphone microphones. The architecture includes a 2D-CNN and XGBoost and was trained on 724,964 cough audio samples and demographics from 7 countries. After feature extraction (Mel-spectrograms) and data augmentation (IR-convolution), the model achieved an AUROC (area under the receiver operating characteristic curve) of 88%, surpassing WHO's requirements for screening tests. The results are available within 15 seconds and can easily be accessed via a mobile app. This research helps to improve TB diagnosis through a promising, accurate, quick, and accessible triaging tool.
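
    Two preprocessing steps are named above: Mel-spectrogram features and IR-convolution augmentation. The sketch below shows only the basic operations behind each, namely a Hz-to-mel conversion using one common convention, m = 2595 log10(1 + f/700), and a direct discrete convolution of a signal with an impulse response (IR). The abstract does not specify which mel formula or impulse responses were used, so these choices, the function names and the toy values are assumptions.

```python
import math


def hz_to_mel(freq_hz):
    """Convert frequency in Hz to mels using the 2595 * log10(1 + f / 700) convention."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def convolve(signal, impulse_response):
    """Full discrete convolution: out[n] = sum over k of signal[k] * impulse_response[n - k]."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


# Toy example (made-up samples): a short "cough" signal and a two-tap impulse
# response that adds a half-amplitude echo one sample later.
cough = [1.0, 2.0, 3.0]
room_ir = [1.0, 0.5]
augmented = convolve(cough, room_ir)
```

    Convolving a recording with the impulse response of a different room or microphone simulates how the same cough would sound in that setting, which is the usual motivation for this kind of augmentation.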

    Nanomotion technology in combination with machine learning: a new approach for a rapid antibiotic susceptibility test for Mycobacterium tuberculosis.

    Nanomotion technology is a growth-independent approach that can be used to detect and record the vibrations of bacteria attached to cantilevers. We have developed a nanomotion-based antibiotic susceptibility test (AST) protocol for Mycobacterium tuberculosis (MTB). The protocol was used to predict strain phenotype towards isoniazid (INH) and rifampicin (RIF) using machine learning techniques evaluated with leave-one-out cross-validation (LOOCV). This MTB-nanomotion protocol takes 21 h, including cell suspension preparation, optimized bacterial attachment to the functionalized cantilever, and nanomotion recording before and after antibiotic exposure. We applied this protocol to MTB isolates (n = 40) and were able to discriminate between susceptible and resistant strains for INH and RIF with a maximum sensitivity of 97.4% and 100%, respectively, and a maximum specificity of 100% for both antibiotics when considering each nanomotion recording to be a distinct experiment. Grouping recordings as triplicates based on source isolate improved sensitivity and specificity to 100% for both antibiotics. Nanomotion technology can potentially reduce time-to-result significantly compared to the days and weeks needed for current phenotypic ASTs for MTB. It can further be extended to other anti-TB drugs to help guide more effective TB treatment.
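
    Leave-one-out cross-validation, as used above, holds out each sample in turn, trains on the rest, and checks the prediction for the held-out sample. The sketch below illustrates the procedure with a nearest-centroid classifier on a single numeric feature. The classifier, the feature and the toy values are assumptions chosen for brevity and are not the model or data used in the study.

```python
def nearest_centroid_predict(train, x):
    """Predict the label whose mean training feature is closest to x.

    train is a list of (feature, label) pairs.
    """
    sums = {}
    for feature, label in train:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + feature, count + 1)
    centroids = {label: total / count for label, (total, count) in sums.items()}
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def loocv_accuracy(samples):
    """Hold out each sample once, train on the others, and score the prediction."""
    correct = 0
    for i, (feature, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if nearest_centroid_predict(train, feature) == label:
            correct += 1
    return correct / len(samples)


# Toy data (made up): one hypothetical nanomotion-derived feature per recording,
# labelled "S" (susceptible) or "R" (resistant).
recordings = [
    (0.10, "S"),
    (0.20, "S"),
    (0.30, "S"),
    (0.80, "R"),
    (0.90, "R"),
    (0.45, "R"),
]

accuracy = loocv_accuracy(recordings)
```

    LOOCV suits small cohorts such as n = 40 because every sample is used for testing exactly once while the model is still trained on nearly all of the data.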