19 research outputs found

    A large scale evaluation of TBProfiler and Mykrobe for antibiotic resistance prediction in Mycobacterium tuberculosis

    Recent years have seen growing interest in predicting antibiotic resistance from whole-genome sequencing data, with promising results obtained for Staphylococcus aureus and Mycobacterium tuberculosis. In this work, we gathered 6,574 sequencing read datasets of public M. tuberculosis genomes with associated antibiotic resistance profiles for both first- and second-line antibiotics. We performed a systematic evaluation of TBProfiler and Mykrobe, two widely recognized software tools for predicting resistance in M. tuberculosis. The size of the dataset allowed us to obtain confident estimates of their overall predictive performance, to assess precisely the individual predictive power of the markers they rely on, and to study how these tools behave across the major M. tuberculosis lineages. While this study confirmed the overall good performance of these tools, it revealed that a substantial fraction of the catalog of mutations they embed is of limited predictive power. It also revealed that these tools offer different sensitivity/specificity trade-offs, mainly because of the different sets of mutations they embed but also because of their underlying genotyping pipelines. More importantly, it showed that their predictive performance varies greatly across lineages for some antibiotics, suggesting that the confidence placed in their predictions should depend on the inferred lineage and on the predictive performance of the marker(s) actually detected. Finally, we evaluated the relevance of machine learning approaches operating on the set of markers detected by these tools and show that they offer an attractive alternative strategy, reaching better performance for several drugs while significantly reducing the number of candidate mutations to consider.
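The per-lineage sensitivity/specificity comparison described in this abstract can be illustrated with a minimal sketch. The record layout `(lineage, predicted_resistant, phenotype_resistant)`, the function names, and the toy data are all assumptions made for illustration; none of them comes from TBProfiler, Mykrobe, or the study's dataset.

```python
from collections import defaultdict


def sens_spec(pairs):
    """Return (sensitivity, specificity) for (predicted, truth) boolean pairs.

    A metric is None when its denominator is zero (no resistant or no
    susceptible isolates in the group).
    """
    tp = fn = tn = fp = 0
    for predicted, truth in pairs:
        if truth and predicted:
            tp += 1
        elif truth:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


def by_lineage(records):
    """Group (lineage, predicted, truth) records and score each lineage."""
    groups = defaultdict(list)
    for lineage, predicted, truth in records:
        groups[lineage].append((predicted, truth))
    return {lineage: sens_spec(pairs) for lineage, pairs in groups.items()}


# Hypothetical toy data: lineage label, tool prediction, phenotypic result.
records = [
    ("L2", True, True), ("L2", True, True), ("L2", False, True), ("L2", False, False),
    ("L4", True, True), ("L4", True, False), ("L4", False, False), ("L4", False, False),
]
per_lineage = by_lineage(records)
for lineage, (sens, spec) in sorted(per_lineage.items()):
    print(lineage, sens, spec)
```

Scoring each lineage separately, rather than pooling all isolates, is what makes lineage-dependent differences in performance visible.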

    Single-cell scattering and auto-fluorescence-based fast antibiotic susceptibility testing for gram-negative and gram-positive bacteria

    In this study, we assess the scattering of light and auto-fluorescence from single bacterial cells to address the challenge of fast (<2 h), label-free phenotypic antimicrobial susceptibility testing (AST). Label-free flow cytometry is used to monitor both the respiration-related auto-fluorescence in two fluorescence channels corresponding to FAD and NADH, and the morphological and structural information contained in the light scattered by individual bacteria during incubation with or without antibiotic. Large multi-parameter data are analyzed using dimensionality reduction methods, based either on a combination of 2D binning and Principal Component Analysis or on a one-class Support Vector Machine approach, with the aim of predicting the Susceptible or Resistant phenotype of the strain. For the first time, both Escherichia coli (Gram-negative) and Staphylococcus epidermidis (Gram-positive) isolates were tested with a label-free approach in the presence of two groups of bactericidal antibiotics, aminoglycosides and beta-lactams. Our results support the feasibility of label-free AST in less than 2 h and suggest that single-cell auto-fluorescence adds value to the Susceptible/Resistant phenotyping over single-cell scattering alone, in particular for the mecA+ Staphylococcus (i.e., resistant) strains treated with oxacillin.

    The evolving SARS-CoV-2 epidemic in Africa: Insights from rapidly expanding genomic surveillance

    INTRODUCTION Investment in Africa over the past year with regard to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequencing has led to a massive increase in the number of sequences, which, to date, exceeds 100,000 sequences generated to track the pandemic on the continent. These sequences have profoundly affected how public health officials in Africa have navigated the COVID-19 pandemic. RATIONALE We demonstrate how the first 100,000 SARS-CoV-2 sequences from Africa have helped monitor the epidemic on the continent, how genomic surveillance expanded over the course of the pandemic, and how we adapted our sequencing methods to deal with an evolving virus. Finally, we also examine how viral lineages have spread across the continent in a phylogeographic framework to gain insights into the underlying temporal and spatial transmission dynamics for several variants of concern (VOCs). RESULTS Our results indicate that the number of countries in Africa that can sequence the virus within their own borders is growing and that this is coupled with a shorter turnaround time from the time of sampling to sequence submission. Ongoing evolution necessitated the continual updating of primer sets, and, as a result, eight primer sets were designed in tandem with viral evolution and used to ensure effective sequencing of the virus. The pandemic unfolded through multiple waves of infection that were each driven by distinct genetic lineages, with B.1-like ancestral strains associated with the first pandemic wave of infections in 2020. Successive waves on the continent were fueled by different VOCs, with Alpha and Beta cocirculating in distinct spatial patterns during the second wave and Delta and Omicron affecting the whole continent during the third and fourth waves, respectively. 
Phylogeographic reconstruction points toward distinct differences in viral importation and exportation patterns associated with the Alpha, Beta, Delta, and Omicron variants and subvariants, when considering both Africa versus the rest of the world and viral dissemination within the continent. Our epidemiological and phylogenetic inferences therefore underscore the heterogeneous nature of the pandemic on the continent and highlight key insights and challenges, for instance, recognizing the limitations of low testing proportions. We also highlight the early warning capacity that genomic surveillance in Africa has had for the rest of the world with the detection of new lineages and variants, the most recent being the characterization of various Omicron subvariants. CONCLUSION Sustained investment for diagnostics and genomic surveillance in Africa is needed as the virus continues to evolve. This is important not only to help combat SARS-CoV-2 on the continent but also because it can be used as a platform to help address the many emerging and reemerging infectious disease threats in Africa. In particular, capacity building for local sequencing within countries or within the continent should be prioritized because this is generally associated with shorter turnaround times, providing the most benefit to local public health authorities tasked with pandemic response and mitigation and allowing for the fastest reaction to localized outbreaks. These investments are crucial for pandemic preparedness and response and will serve the health of the continent well into the 21st century

    Recherche de biomarqueurs par l’analyse multivariée d’images paramétriques multimodales pour le bilan non-invasif préchirurgical de l’épilepsie focale pharmaco-résistante

    One third of patients suffering from epilepsy are resistant to medication. For these patients, surgical removal of the epileptogenic zone offers the possibility of a cure. Surgical success relies heavily on the accurate localization of the epileptogenic zone. The analysis of neuroimaging data such as magnetic resonance imaging (MRI) and positron emission tomography (PET) is increasingly used in the pre-surgical work-up of patients and may offer an alternative to the invasive reference of stereo-electroencephalography (SEEG) monitoring. To assist clinicians in screening these lesions, we developed a computer-aided diagnosis (CAD) system based on a multivariate data analysis approach. Our first contribution was to formulate the problem of epileptogenic lesion detection as an outlier detection problem. The main motivation for this formulation was to avoid the dependence on labelled data and the class imbalance inherent to this detection task. The proposed system builds upon the one-class support vector machine (OC-SVM) classifier. OC-SVM was trained using features extracted from MRI scans of healthy control subjects, allowing a voxelwise assessment of the deviation of a test subject pattern from the learned patterns. System performance was evaluated using realistic simulations of challenging detection tasks as well as clinical data of patients with intractable epilepsy. The outlier detection framework was further extended to take into account the specificities of neuroimaging data and the detection task at hand. We first proposed a reformulation of the support vector data description (SVDD) method to deal with the presence of uncertain observations in the training data. Second, to handle the multi-parametric nature of neuroimaging data, we proposed an optimal fusion approach for combining multiple base one-class classifiers. Finally, to help with score interpretation, threshold selection and score combination, we proposed to transform the score outputs of the outlier detection algorithm into well-calibrated probabilities.
    About 150,000 people in France suffer from partial epilepsy that is refractory to all medication. Surgery, currently the best therapeutic option, requires a complex preoperative work-up. The analysis of imaging data such as anatomical magnetic resonance imaging (MRI) and FDG (fluorodeoxyglucose) positron emission tomography (PET) plays a growing role in this protocol and could eventually limit the need for intracerebral electroencephalography (SEEG), a highly invasive procedure that nevertheless remains the reference technique. To assist clinicians in their diagnostic task, we developed a computer-aided diagnosis (CAD) system based on the multivariate analysis of imaging data. Given the difficulty of building annotated, class-balanced databases, our first contribution was to place the study within the methodological framework of change detection. The support vector machine algorithm adapted to this framework (OC-SVM) was used to learn, from multi-parametric maps extracted from T1 MRI of normal subjects, a predictive model characterizing normality at the voxel level. The model then highlights, in patient images, suspicious brain regions that deviate from this normality. System performance was evaluated on simulated lesions as well as on a patient database. Three extensions were then proposed. First, a new detection scheme that is more robust to label noise in the training database. Second, an optimal fusion strategy for combining several OC-SVM classifiers, each associated with one MRI sequence. Finally, a generalization of the anomaly detection algorithm that converts the CAD output into a probability, offering better interpretation of the system output and easier integration into the overall preoperative work-up.

    PLoSONE_MRI_CAD_paper

    Voxelwise one-class SVM predictive models used to estimate the support distribution of brain patterns extracted from MRI images of healthy control subjects. SPM models and the results of the statistical analysis comparing 11 patients suffering from intractable epilepsy against the same population of healthy control subjects.

    Example MIP of the detected cluster maps (blue) for patient #2 (MRI+) overlaid on the MIP of the expert delineated lesion (red).

    (a) OC-SVM distance map thresholded at p < 0.001; (b) SPM analysis based on the T-score map from the conjunction of both contrasts thresholded at p < 0.001; (c) SPM junction-based T-score map thresholded at p < 0.001; (d) SPM extension-based T-score map thresholded at p < 0.001.

    Comparison of OC-SVM and SPM classification performance.

    Data are differences in AUCs, with 95% confidence intervals in brackets. All differences are significant and in favour of OC-SVM, except for the detection of the blurred junction, where no difference between the techniques can be shown.
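A difference in AUCs with a 95% confidence interval, as reported in this caption, can be sketched as follows. The caption does not say how the intervals were computed; the paired percentile bootstrap below is one common choice and is an assumption, as are the function names and the toy scores.

```python
import random


def auc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def auc_difference_ci(pos_a, neg_a, pos_b, neg_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC(A) - AUC(B).

    Both methods score the same cases, so each resample draws one set of
    case indices and applies it to both methods (paired bootstrap).
    """
    rng = random.Random(seed)
    n_pos, n_neg = len(pos_a), len(neg_a)
    diffs = []
    for _ in range(n_boot):
        pi = [rng.randrange(n_pos) for _ in range(n_pos)]
        ni = [rng.randrange(n_neg) for _ in range(n_neg)]
        diff = (auc([pos_a[i] for i in pi], [neg_a[i] for i in ni])
                - auc([pos_b[i] for i in pi], [neg_b[i] for i in ni]))
        diffs.append(diff)
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical scores for the same 4 lesion and 4 non-lesion cases under two methods.
pos_a, neg_a = [0.9, 0.8, 0.7, 0.6], [0.1, 0.2, 0.3, 0.4]
pos_b, neg_b = [0.9, 0.3, 0.7, 0.2], [0.1, 0.4, 0.3, 0.5]

point_diff = auc(pos_a, neg_a) - auc(pos_b, neg_b)
ci_low, ci_high = auc_difference_ci(pos_a, neg_a, pos_b, neg_b)
print(point_diff, (ci_low, ci_high))
```

Under this scheme, an interval that excludes zero would be read as a significant difference between the two methods, while one that spans zero would correspond to the "no difference can be shown" case.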

    Hyper-parameter optimization curve.

    The bigger the width of the RBF kernel, the worse the generalisability due to the risk of under-fitting. Similarly, the smaller the value of ν, the higher the risk of over-fitting (fewer observations may be excluded); for better generalisability and given noise in medical images, the value should not be too small. Here, the pair (ν = 0.03, σ = 4) is therefore the optimal combination.
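The roles of the two hyper-parameters in this caption can be made concrete with a small sketch. It assumes the common RBF parametrisation k(x, y) = exp(-||x - y||² / (2σ²)); the caption does not state which convention the authors used, and the helper names are hypothetical. In the standard one-class SVM formulation, ν is an upper bound on the fraction of training observations left outside the learned support.

```python
import math


def rbf_kernel(x, y, sigma):
    """RBF kernel under the assumed convention exp(-||x - y||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def max_training_outliers(nu, n_train):
    """Largest number of training points that nu allows to fall outside the support."""
    return math.floor(nu * n_train)


x, y = (0.0, 0.0), (3.0, 4.0)  # two points at Euclidean distance 5
for sigma in (1.0, 4.0, 16.0):
    # A wider kernel makes distant points look more alike (value closer to 1),
    # which smooths the decision function and pushes towards under-fitting.
    print(sigma, rbf_kernel(x, y, sigma))

# With nu = 0.03, only a small share of healthy-control observations
# can be excluded as training outliers.
print(max_training_outliers(0.03, 1000))
```

A very small ν forces the model to enclose nearly every training observation, including noisy ones, which matches the over-fitting risk the caption describes.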