    Bayesian reconstruction of Mycobacterium tuberculosis transmission networks in a high incidence area over two decades in Malawi reveals associated risk factors and genomic variants.

    Understanding host and pathogen factors that influence tuberculosis (TB) transmission can inform strategies to eliminate the spread of Mycobacterium tuberculosis (Mtb). Determining transmission links between cases of TB is complicated by a long and variable latency period and undiagnosed cases, although methods are improving through the application of probabilistic modelling and whole-genome sequence analysis. Using a large dataset of 1857 whole-genome sequences and comprehensive metadata from Karonga District, Malawi, over 19 years, we reconstructed Mtb transmission networks using a two-step Bayesian approach that identified likely infector and recipient cases, whilst robustly allowing for incomplete case sampling. We investigated demographic and pathogen genomic variation associated with transmission and clustering in our networks. Whilst there was a significant decrease in the proportion of infectors over time, we found higher transmissibility and large transmission clusters for lineage 2 (Beijing) strains. By performing evolutionary convergence testing (phyC) and genome-wide association analysis (GWAS) on transmitting versus non-transmitting cases, we identified six loci, PPE54, accD2, PE_PGRS62, rplI, Rv3751 and Rv2077c, that were associated with transmission. This study provides a framework for reconstructing large-scale Mtb transmission networks. We have highlighted potential host and pathogen characteristics that were linked to increased transmission in a high-burden setting and identified genomic variants that, with validation, could inform further studies into transmissibility and TB eradication.

    Tuberculosis: new era for diagnosis and surveillance using whole-genome sequencing-based approaches

    Tuberculosis (TB) was declared a global public health emergency by the WHO in 1993. It still accounts for almost 2 million deaths each year, making it the ninth leading cause of death worldwide. The major obstacle to effective TB control is antimicrobial resistance; thus, to be successful, new strategies must be adopted, for instance, the implementation of new rapid TB diagnostic technologies that could translate into early treatment initiation and blocking of transmission chains. Considering the major constraints regarding the isolation and time of growth of M. tuberculosis strains, the main goal of this PhD dissertation was to acknowledge the potential of WGS-based methodologies for routine diagnostics and epidemiological surveillance. We evaluated several software tools for in silico prediction of antibiotic resistance and developed bioinformatics pipelines for surveillance purposes, in particular for the identification of transmission chains. As they revealed high sensitivity, these approaches are already implemented in the routine of the Portuguese National Reference Laboratory (NRL). We also recognised the possibility of applying these same approaches directly to samples collected from TB patients, lowering the time-to-results for a complete drug resistance pattern and phylogeny analysis to five to eight days. The validation of this methodology is ongoing and it will be implemented in the near future. Additionally, and according to the new recommendations for TB treatment, we have initiated studies to identify new mutations associated with resistance to the recently adopted drugs, in order to enrich the available databases and improve the performance of the genotypic diagnostics pipelines. 
This PhD dissertation highlights WGS-based methodologies as powerful tools to overcome the difficulties of phenotypic TB diagnosis and surveillance and to provide much faster information regarding resistance prediction and eventual transmission chains. It also supported the technological transition performed at the NRL for TB surveillance.

    Bedaquiline and clofazimine resistance in Mycobacterium tuberculosis: an in-vitro and in-silico data analysis

    Background: Bedaquiline is a core drug for the treatment of multidrug-resistant tuberculosis; however, the understanding of resistance mechanisms is poor, which is hampering rapid molecular diagnostics. Some bedaquiline-resistant mutants are also cross-resistant to clofazimine. To decipher bedaquiline and clofazimine resistance determinants, we combined experimental evolution, protein modelling, genome sequencing, and phenotypic data. Methods: For this in-vitro and in-silico data analysis, we used a novel in-vitro evolutionary model using subinhibitory drug concentrations to select bedaquiline-resistant and clofazimine-resistant mutants. We determined bedaquiline and clofazimine minimum inhibitory concentrations and did Illumina and PacBio sequencing to characterise selected mutants and establish a mutation catalogue. This catalogue also includes phenotypic and genotypic data of a global collection of more than 14 000 clinical Mycobacterium tuberculosis complex isolates, and publicly available data. We investigated variants implicated in bedaquiline resistance by protein modelling and dynamic simulations. Findings: We discerned 265 genomic variants implicated in bedaquiline resistance, with 250 (94%) variants affecting the transcriptional repressor (Rv0678) of the MmpS5–MmpL5 efflux system. We identified 40 new variants in vitro, and a new bedaquiline resistance mechanism caused by a large-scale genomic rearrangement. Additionally, we identified in vitro 15 (7%) of 208 mutations found in clinical bedaquiline-resistant isolates. From our in-vitro work, we detected 14 (16%) of 88 mutations so far identified as being associated with clofazimine resistance and also seen in clinically resistant strains, and catalogued 35 new mutations. Structural modelling of Rv0678 showed four major mechanisms of bedaquiline resistance: impaired DNA binding, reduction in protein stability, disruption of protein dimerisation, and alteration in affinity for its fatty acid ligand. 
Interpretation: Our findings advance the understanding of drug resistance mechanisms in M tuberculosis complex strains. We have established an extended mutation catalogue, comprising variants implicated in resistance and susceptibility to bedaquiline and clofazimine. Our data emphasise that genotypic testing can delineate clinical isolates with borderline phenotypes, which is essential for the design of effective treatments. Funding: Leibniz ScienceCampus Evolutionary Medicine of the Lung, Deutsche Forschungsgemeinschaft, Research Training Group 2501 TransEvo, Rhodes Trust, Stanford University Medical Scientist Training Program, National Institute for Health and Care Research Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Bill & Melinda Gates Foundation, Wellcome Trust, and Marie Skłodowska-Curie Actions

    The Impact of Bioinformatics on Vaccine Design and Development

    Vaccines are the pharmaceutical products that offer the best cost‐benefit ratio in the prevention or treatment of diseases. In that a vaccine is a pharmaceutical product, vaccine development and production are costly and it takes years for this to be accomplished. Several approaches have been applied to reduce the times and costs of vaccine development, mainly focusing on the selection of appropriate antigens or antigenic structures, carriers, and adjuvants. One of these approaches is the incorporation of bioinformatics methods and analyses into vaccine development. This chapter provides an overview of the application of bioinformatics strategies in vaccine design and development, supplying some successful examples of vaccines in which bioinformatics has furnished a cutting edge in their development. Reverse vaccinology, immunoinformatics, and structural vaccinology are described and addressed in the design and development of specific vaccines against infectious diseases caused by bacteria, viruses, and parasites. These include some emerging or re‐emerging infectious diseases, as well as therapeutic vaccines to fight cancer, allergies, and substance abuse, which have been facilitated and improved by using bioinformatics tools or which are under development based on bioinformatics strategies

    Semantic systems biology of prokaryotes : heterogeneous data integration to understand bacterial metabolism

    The goal of this thesis is to improve the prediction of genotype to phenotype associations with a focus on metabolic phenotypes of prokaryotes. This goal is achieved through data integration, which in turn required the development of supporting solutions based on semantic web technologies. Chapter 1 provides an introduction to the challenges associated with data integration. Semantic web technologies provide solutions to some of these challenges and the basics of these technologies are explained in the Introduction. Furthermore, the basics of constraint based metabolic modeling and construction of genome scale models (GEM) are also provided. The chapters in the thesis are separated in three related topics: chapters 2, 3 and 4 focus on data integration based on heterogeneous networks and their application to the human pathogen M. tuberculosis; chapters 5, 6, 7, 8 and 9 focus on the semantic web based solutions to genome annotation and applications thereof; and chapter 10 focuses on the final goal to associate genotypes to phenotypes using GEMs. Chapter 2 provides the prototype of a workflow to efficiently analyze information generated by different inference and prediction methods. This method relies on providing the user the means to simultaneously visualize and analyze the coexisting networks generated by different algorithms, heterogeneous data sets, and a suite of analysis tools. As a show case, we have analyzed the gene co-expression networks of M. tuberculosis generated using over 600 expression experiments. Hereby we gained new knowledge about the regulation of the DNA repair, dormancy, iron uptake and zinc uptake systems. Furthermore, it enabled us to develop a pipeline to integrate ChIP-seq data and a tool to uncover multiple regulatory layers. In chapter 3 the prototype presented in chapter 2 is further developed into the Synchronous Network Data Integration (SyNDI) framework, which is based on Cytoscape and Galaxy. 
The functionality and usability of the framework are highlighted with three biological examples. We analyzed the distinct connectivity of plasma metabolites in networks associated with high or low latent cardiovascular disease risk. We obtained deeper insights from a few similar inflammatory response pathways in Staphylococcus aureus infection common to human and mouse. We identified not yet reported regulatory motifs associated with transcriptional adaptations of M. tuberculosis. In chapter 4 we present a review providing a systems level overview of the molecular and cellular components involved in divalent metal homeostasis and their role in regulating the three main virulence strategies of M. tuberculosis: immune modulation, dormancy and phagosome escape. With the use of the tools presented in chapters 2 and 3 we identified a single regulatory cascade for these three virulence strategies that responds to limited availability of divalent metals in the phagosome. The tools presented in chapters 2 and 3 achieve data integration through the use of multiple similarity, coexistence, coexpression and interaction gene and protein networks. However, the presented tools cannot store additional (genome) annotations. Therefore, we applied semantic web technologies to store and integrate heterogeneous annotation data sets. An increasing number of widely used biological resources are already available in the RDF data model. There are, however, no tools available that provide structural overviews of these resources. Such structural overviews are essential to efficiently query these resources and to assess their structural integrity and design. Therefore, in chapter 5, I present RDF2Graph, a tool that automatically recovers the structure of an RDF resource. The generated overview enables users to create complex queries on these resources and to structurally validate newly created resources. 
Direct functional comparison supports genotype to phenotype predictions. A prerequisite for a direct functional comparison is consistent annotation of the genetic elements with evidence statements. However, the standard structured formats used by the public sequence databases to present genome annotations provide limited support for data mining, hampering comparative analyses at large scale. To enable interoperability of genome annotations for data mining applications, we have developed the Genome Biology Ontology Language (GBOL) and associated infrastructure (GBOL stack), which is presented in chapter 6. GBOL is provenance aware and thus provides a consistent representation of functional genome annotations linked to the provenance. The provenance of a genome annotation describes the contextual details and derivation history of the process that resulted in the annotation. GBOL is modular in design, extensible and linked to existing ontologies. The GBOL stack of supporting tools enforces consistency within and between the GBOL definitions in the ontology. Based on GBOL, we developed the genome annotation pipeline SAPP (Semantic Annotation Platform with Provenance) presented in chapter 7. SAPP automatically predicts, tracks and stores structural and functional annotations and associated dataset- and element-wise provenance in a Linked Data format, thereby enabling information mining and retrieval with Semantic Web technologies. This greatly reduces the administrative burden of handling multiple analysis tools and versions thereof and facilitates multi-level large scale comparative analysis. In turn this can be used to make genotype to phenotype predictions. The development of GBOL and SAPP was done simultaneously. During the development we realized that we had to constantly validate the data exported to RDF to ensure coherence with the ontology. This was an extremely time consuming and error-prone process; therefore, we developed the Empusa code generator. 
Empusa is presented in chapter 8. SAPP has been successfully used to annotate 432 sequenced Pseudomonas strains and integrate the resulting annotation in a large scale functional comparison using protein domains. This comparison is presented in chapter 9. Additionally, data from six metabolic models, nearly a thousand transcriptome measurements and four large scale transposon mutagenesis experiments were integrated with the genome annotations. In this way, we linked gene essentiality, persistence and expression variability. This gave us insight into the diversity, versatility and evolutionary history of the Pseudomonas genus, which contains some important pathogens as well as some useful species for bioengineering and bioremediation purposes. Genome annotation can be used to create GEMs, which can be used to better link genotypes to phenotypes. Bio-Growmatch, presented in chapter 10, is a tool that can automatically suggest modifications to improve a GEM based on phenotype data, thereby integrating growth data into the complete process of modelling the metabolism of an organism. Chapter 11 presents a general discussion on how the chapters contributed to the central goal, after which I discuss provenance requirements for data reuse and integration. I further discuss how this can be used to further improve knowledge generation. The acquired knowledge could, in turn, be used to design new experiments. The principles of the dry-lab cycle and how semantic technologies can contribute to establish these cycles are discussed in chapter 11. Finally a discussion is presented on how to apply these principles to improve the creation and usability of GEMs.

    Discovery, Characterization and Structural Studies of Inhibitors against Mycobacterium Tuberculosis Adenosine Kinase and Biotin Protein Ligase

    The rapid emergence of drug-resistant Mycobacterium tuberculosis (Mtb) coupled to the high incidence of HIV-Mtb coinfection is of global concern. Consequently, there is a worldwide necessity to develop new drugs with novel mechanisms of action and new molecular targets. In this dissertation, we describe a crystallographic and high-throughput screening (HTS) approach towards the identification and structural characterization of inhibitors against Mycobacterium tuberculosis adenosine kinase (MtbAdoK) and biotin protein ligase (MtbBpL). Parallel studies were also performed to evaluate the in vitro potency and antimycobacterial profile of the compounds. Finally, X-ray crystallography was employed to investigate the structural basis of inhibition and to perform structure-guided drug design. In the first study, we focused on the biochemical, chemical synthesis and structural characterization of adenosine analogs as inhibitors of MtbAdoK. Here, we adopted a bottom-up structural approach towards the discovery, design, and synthesis of a series of compounds that displayed inhibitory constants ranging from 4.3 to 121.0 nM against the enzyme. Two of these compounds exhibited low micromolar activity against Mtb with 50.0% minimum inhibitory concentrations of 1.7 and 4.0 µM. Our selectivity studies showed that the compounds display a higher degree of specificity for MtbAdoK when compared to the human enzyme (hAdoK). Finally, our crystallographic studies revealed the presence of a potentially therapeutically relevant cavity that is unique to the MtbAdoK homodimer. Next, we describe the discovery, biochemical and structural characterization of novel dihydro spiro compounds as inhibitors of MtbAdoK. Here, we utilized an HTS approach for the identification of the aforementioned compounds. Our enzymatic assays showed that the compounds are selective inhibitors of MtbAdoK when compared to hAdoK. 
In addition, our antimycobacterial studies revealed that the compounds possess nanomolar potency against Mtb (500.0-810.0 nM). Finally, the crystallographic studies revealed that the inhibitors bind in a previously unknown pocket within the enzyme. Lastly, we explore the potential of MtbBpL as a drug target. Following identification of the compounds via HTS, we demonstrate that the inhibitors lacked any activity against human dermal fibroblasts but possess antimycobacterial properties. Finally, steady-state kinetic experiments revealed that the compounds are noncompetitive inhibitors of the enzyme, suggesting the presence of a previously uncharacterized allosteric site.

    Evolutionary constraints on the complexity of genetic regulatory networks allow predictions of the total number of genetic interactions

    Genetic regulatory networks (GRNs) have been widely studied, yet there is a lack of understanding with regard to the final size and properties of these networks, mainly because no network is currently complete. In this study, we analyzed the distribution of GRN structural properties across a large set of distinct prokaryotic organisms and found a set of constrained characteristics such as network density and number of regulators. Our results allowed us to estimate the number of interactions that complete networks would have, a valuable insight that could aid in the daunting task of network curation, prediction, and validation. Using state-of-the-art statistical approaches, we also provided new evidence to settle a previously stated controversy that raised the possibility of complete biological networks being random and therefore attributing the observed scale-free properties to an artifact emerging from the sampling process during network discovery. Furthermore, we identified a set of properties that enabled us to assess the consistency of the connectivity distribution for various GRNs against different alternative statistical distributions. Our results favor the hypothesis that highly connected nodes (hubs) are not a consequence of network incompleteness. Finally, an interaction coverage computed for the GRNs as a proxy for completeness revealed that high-throughput based reconstructions of GRNs could yield biased networks with a low average clustering coefficient, showing that classical targeted discovery of interactions is still needed. Comment: 28 pages, 5 figures, 12 pages supplementary information

    Proteins with Complex Architecture as Potential Targets for Drug Design: A Case Study of Mycobacterium tuberculosis

    Lengthy co-evolution of Homo sapiens and Mycobacterium tuberculosis, the main causative agent of tuberculosis, resulted in a dramatically successful pathogen species that presents a considerable challenge for modern medicine. The continuous and ever increasing appearance of multi-drug resistant mycobacteria necessitates the identification of novel drug targets and drugs with new mechanisms of action. However, further insights are needed to establish automated protocols for target selection based on the available complete genome sequences. In the present study, we perform complete proteome level comparisons between M. tuberculosis, mycobacteria, other prokaryotes and available eukaryotes based on protein domains, local sequence similarities and protein disorder. We show that the enrichment of certain domains in the genome can indicate an important function specific to M. tuberculosis. We identified two families, termed pkn and PE/PPE, that stand out in this respect. The common property of these two protein families is a complex domain organization that combines species-specific regions, commonly occurring domains and disordered segments. Besides highlighting promising novel drug target candidates in M. tuberculosis, the presented analysis can also be viewed as a general protocol to identify proteins involved in species-specific functions in a given organism. We conclude that target selection protocols should be extended to include proteins with complex domain architectures instead of focusing on sequentially unique and essential proteins only.