12 research outputs found

    Proteochemometric Modeling of the Susceptibility of Mutated Variants of the HIV-1 Virus to Reverse Transcriptase Inhibitors

    BACKGROUND: Reverse transcriptase is a major drug target in highly active antiretroviral therapy (HAART) against HIV, which typically comprises two nucleoside/nucleotide analog reverse transcriptase (RT) inhibitors (NRTIs) in combination with a non-nucleoside RT inhibitor or a protease inhibitor. Unfortunately, HIV is capable of escaping the therapy by mutating into drug-resistant variants. Computational models that correlate HIV drug susceptibilities to the virus genotype and to drug molecular properties might facilitate selection of improved combination treatment regimens. METHODOLOGY/PRINCIPAL FINDINGS: We applied our previously developed proteochemometric modeling technology to analyze HIV mutant susceptibility to the eight clinically approved NRTIs. The data set covered 728 virus variants genotyped for 240 sequence residues of the DNA polymerase domain of the RT; 165 of these residues contained mutations. In total, the data set covered susceptibility data for 4,495 inhibitor-RT combinations. Inhibitors and RT sequences were represented numerically by 3D-structural and physicochemical property descriptors, respectively. The two sets of descriptors and their derived cross-terms were correlated to the susceptibility data by partial least-squares projections to latent structures. The model identified more than ten frequently occurring mutations, each conferring more than two-fold loss of susceptibility for one or several NRTIs. The most deleterious mutations were K65R, Q151M, M184V/I, and T215Y/F, each of them decreasing susceptibility to most of the NRTIs. The predictive ability of the model was estimated by cross-validation and by external predictions for new HIV variants; both procedures showed very high correlation between the predicted and actual susceptibility values (Q2=0.89 and Q2ext=0.86).
The model is available at www.hivdrc.org as a free web service for predicting the susceptibility of any HIV-1 mutant variant to any of the clinically used NRTIs. CONCLUSIONS/SIGNIFICANCE: Our results indicate how to develop approaches for selecting genome-based optimal combination therapy for patients harboring mutated HIV variants.
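
    The abstract above describes combining inhibitor descriptors, RT sequence descriptors, and their derived cross-terms into one model. A minimal sketch of how such a feature vector might be assembled is given below. It assumes cross-terms are formed as pairwise products of ligand and protein descriptors, which is one common choice and is not confirmed by the abstract; the function name and all descriptor values are hypothetical, and the partial least-squares fitting step is not shown.

```python
from itertools import product


def pcm_features(ligand_desc, protein_desc):
    """Build one proteochemometric feature vector (illustrative only).

    Assumption: cross-terms are pairwise products of each ligand
    descriptor with each protein descriptor.
    """
    cross_terms = [l * p for l, p in product(ligand_desc, protein_desc)]
    return list(ligand_desc) + list(protein_desc) + cross_terms


# Hypothetical, made-up descriptor values for one inhibitor-RT pair.
ligand = [0.5, -1.0]         # stand-ins for 3D-structural descriptors
protein = [1.0, 2.0, -0.5]   # stand-ins for physicochemical descriptors

features = pcm_features(ligand, protein)
print(len(features))  # 2 ligand + 3 protein + 2*3 cross-terms
```

    Each row of the resulting matrix corresponds to one inhibitor-variant combination, so a single regression model can share information across both drugs and mutants.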

    Benchmarking of protein descriptor sets in proteochemometric modeling (part 2): modeling performance of 13 amino acid descriptor sets.

    Background: While a large body of work exists on comparing and benchmarking descriptors of molecular structures, a similar comparison of protein descriptor sets is lacking. Hence, in the current work a total of 13 amino acid descriptor sets have been benchmarked with respect to their ability to establish bioactivity models. The descriptor sets included in the study are Z-scales (3 variants), VHSE, T-scales, ST-scales, MS-WHIM, FASGAI, BLOSUM, and a novel protein descriptor set termed ProtFP (4 variants); in addition, we created and benchmarked three pairs of descriptor combinations. Prediction performance was evaluated in seven structure-activity benchmarks which comprise Angiotensin Converting Enzyme (ACE) dipeptidic inhibitor data, and three proteochemometric data sets, namely (1) GPCR ligands modeled against a GPCR panel, (2) enzyme inhibitors (NNRTIs) with associated bioactivities against a set of HIV enzyme mutants, and (3) enzyme inhibitors (PIs) with associated bioactivities on a large set of HIV enzyme mutants. Results: The amino acid descriptor sets compared here show similar performance ( 0.3 log units RMSE difference and >0.7 difference in MCC). Combining different descriptor sets generally leads to better modeling performance than utilizing individual sets. The best performers were Z-scales (3) combined with ProtFP (Feature), or Z-scales (3) combined with an average Z-scale value for each target, while ProtFP (PCA8), ST-scales, and ProtFP (Feature) rank last. Conclusions: While amino acid descriptor sets capture different aspects of amino acids, their ability to be used for bioactivity modeling is still, on average, surprisingly similar. Still, combining sets describing complementary information leads to a small but consistent improvement in modeling performance (average MCC 0.01 better, average RMSE 0.01 log units lower).
Finally, performance differences exist between the targets compared, underlining that choosing an appropriate descriptor set is of fundamental importance for bioactivity modeling, from both the ligand and the protein side.
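
    The benchmark above compares descriptor sets by RMSE (for regression) and MCC (for classification). For reference, a minimal sketch of both metrics using their standard definitions follows; the function names and example values are made up for illustration and are not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: report 0.0 when the denominator is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Made-up example values, in log units for the regression case.
example_rmse = rmse([6.0, 7.0], [6.5, 6.5])
example_mcc = mcc([1, 1, 0, 0], [1, 0, 0, 0])
print(example_rmse, example_mcc)
```

    RMSE is in the same units as the modeled activity (log units here), while MCC ranges from -1 to 1, with 0 corresponding to chance-level classification.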

    Significantly Improved HIV Inhibitor Efficacy Prediction Employing Proteochemometric Models Generated From Antivirogram

    Infection with HIV cannot currently be cured; however, it can be controlled by combination treatment with multiple anti-retroviral drugs. Given different viral genotypes for virtually each individual patient, the question now arises which drug combination to use to achieve effective treatment. With the availability of viral genotypic data and clinical phenotypic data, it has become possible to create computational models able to predict an optimal treatment regimen for an individual patient. Current models are based only on sequence data derived from viral genotyping; chemical similarity of drugs is not considered. To explore the added value of chemical similarity inclusion we applied proteochemometric models, combining chemical and protein target properties in a single bioactivity model. Our dataset was a large scale clinical database of genotypic and phenotypic information (in total ca. 300,000 drug-mutant bioactivity data points, 4 (NNRTI), 8 (NRTI) or 9 (PI) drugs, and 10,700 (NNRTI), 10,500 (NRTI) or 27,000 (PI) mutants). Our models achieved a prediction error below 0.5 Log Fold Change. Moreover, when directly compared with previously published models derived from sequence data, PCM performed better in resistance classification and prediction of Log Fold Change (0.76 log units versus 0.91). Furthermore, we were able both to confirm known and to identify previously unpublished resistance-conferring mutations of HIV Reverse Transcriptase (e.g. K102Y, T216M) and HIV Protease (e.g. Q18N, N88G) from our dataset. Finally, we applied our models prospectively to the public HIV resistance database from Stanford University, obtaining a correct resistance prediction rate of 84% on the full set (compared to 80% in previous work on a high quality subset). We conclude that proteochemometric models are able to accurately predict the phenotypic resistance based on genotypic data even for novel mutants and mixtures.
Furthermore, we add an applicability domain to the prediction, informing the user about the reliability of predictions.
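
    The abstract above reports errors in Log Fold Change and evaluates resistance classification. A minimal sketch of how these quantities are commonly computed is shown below. It assumes fold change is the ratio of a mutant's IC50 to a wild-type reference IC50, which the abstract does not spell out; the function names, IC50 values, and cutoff are hypothetical.

```python
import math


def log_fold_change(ic50_mutant, ic50_wild_type):
    """log10 of the mutant IC50 relative to a wild-type reference.

    Assumption: both values are in the same units and positive.
    """
    return math.log10(ic50_mutant / ic50_wild_type)


def is_resistant(lfc, cutoff):
    """Classify a variant as resistant when its LFC reaches the cutoff."""
    return lfc >= cutoff


# Hypothetical values: the mutant needs 10x more drug than wild type.
lfc = log_fold_change(ic50_mutant=100.0, ic50_wild_type=10.0)

# Hypothetical cutoff; real cutoffs depend on the drug and the assay.
HYPOTHETICAL_CUTOFF = 0.6
print(lfc, is_resistant(lfc, HYPOTHETICAL_CUTOFF))
```

    Working on the log scale makes a ten-fold loss and a ten-fold gain in susceptibility symmetric around zero, which is why regression errors are reported in log units.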

    Which Compound to Select in Lead Optimization? Prospectively Validated Proteochemometric Models Guide Preclinical Development

    In quite a few diseases, drug resistance due to target variability poses a serious problem in pharmacotherapy. This is certainly true for HIV, and hence, it is often unknown which drug is best to use or to develop against an individual HIV strain. In this work we applied ‘proteochemometric’ modeling of HIV non-nucleoside reverse transcriptase inhibitors (NNRTIs) to support preclinical development by predicting compound performance on multiple mutants in the lead selection stage. Proteochemometric models are based on both small molecule and target properties and can thus capture multi-target activity relationships simultaneously, the targets in this case being a set of 14 HIV Reverse Transcriptase (RT) mutants. We validated our model by experimentally confirming model predictions for 317 untested compound-mutant pairs, with a prediction error comparable with assay variability (RMSE 0.62). Furthermore, dependent on the similarity of a new mutant to the training set, we could predict with high accuracy which compound will be most effective on a sequence with a previously unknown genotype. Hence, our models allow the evaluation of compound performance on untested sequences and the selection of the most promising leads for further preclinical research. The modeling concept is likely to be applicable also to other target families with genetic variability, like other viruses or bacteria, or with similar orthologs, like GPCRs.

    Proteochemometric Modeling of the Antigen-Antibody Interaction : New Fingerprints for Antigen, Antibody and Epitope-Paratope Interaction

    Despite the high specificity of antigen-antibody binding, similar epitopes can be recognized or cross-neutralized by antibody paratopes with different binding affinities. Accurately characterizing this slight variation, which may or may not change the antigen-antibody binding affinity, is a key issue in this area. In this report, by combining a cylinder model with a shell structure model, a new fingerprint was introduced to describe both the structural and physicochemical features of the antigen and antibody proteins. Furthermore, besides the description of the individual proteins, a specific epitope-paratope interaction fingerprint (EPIF) was developed to reflect the bond and the environment of the antigen-antibody interface. Finally, proteochemometric modeling of the antigen-antibody interaction was established and evaluated on 429 antigen-antibody complexes. Using only protein descriptors, our model achieved the best performance (R2 = 0.91; Q2test = 0.68) among peers. Further, with EPIF as a new cross-term, our model (R2 = 0.92; Q2test = 0.74) can significantly outperform peers that use the multiplication of ligand and protein descriptors as a cross-term (R2

    Linking the Resource Description Framework to cheminformatics and proteochemometrics

    Background: Semantic web technologies are finding their way into the life sciences. Ontologies and semantic markup have already been used for more than a decade in molecular sciences, but have not found widespread use yet. The semantic web technology Resource Description Framework (RDF) and related methods prove to be sufficiently versatile to change that situation. Results: The work presented here focuses on linking RDF approaches to existing molecular chemometrics fields, including cheminformatics, QSAR modeling and proteochemometrics. Applications are presented that link RDF technologies to methods from statistics and cheminformatics, including data aggregation, visualization, chemical identification, and property prediction. They demonstrate how this can be done using various existing RDF standards and cheminformatics libraries. For example, we show how IC50 and Ki values are modeled for a number of biological targets using data from the ChEMBL database. Conclusions: We have shown that existing RDF standards can suitably be integrated into existing molecular chemometrics methods. Platforms that unite these technologies, like Bioclipse, make this even simpler and more transparent. Being able to create and share workflows that integrate data aggregation and analysis (visual and statistical) is beneficial to interoperability and reproducibility. The current work shows that RDF approaches are sufficiently powerful to support molecular chemometrics workflows.

    Investigating the role of Gag in protease inhibitor susceptibility amongst West African HIV-1 subtypes

    HIV-1 Gag contributes to susceptibility to protease inhibitors (PIs) in the absence of known resistance mutations in the protease gene. For the majority of HIV-infected patients worldwide, PIs are the second- and last-line therapy. Clinically, only around 20% of individuals who fail a PI regimen develop major resistance mutations in protease. We previously showed that full-length Gag-protease-derived phenotypic susceptibility to PIs differed between HIV-1 CRF02_AG and subtype G-infected patients who went on to successfully suppress viral replication versus those who experienced virological failure of boosted lopinavir monotherapy as first-line treatment in a clinical trial. We hypothesised therefore that baseline PI susceptibility by Gag-protease phenotyping could be used to predict treatment outcomes for patients on second-line, boosted-PI treatment in the real-world clinical setting in Nigeria, where subtypes CRF02_AG/G dominate the epidemic. We used clinical and demographic data (HIV-1 subtype, sex, age, viral load, duration of treatment and baseline CD4 count) to match individuals who experienced second-line failure with ritonavir-boosted PI-based ART (‘baseline failures’) to those who achieved virological response (‘baseline successes’), with virological failure defined by viral load <400 copies of HIV-1 RNA/mL by week 48. Using a single replication-cycle assay, we carried out in vitro phenotypic susceptibility testing of patient-derived viruses from these two groups. We found no impact of baseline HIV-1 Gag-protease-derived phenotypic susceptibility on outcomes of PI-based second-line ART; treatment outcome could not be predicted using baseline susceptibility alone. Secondly, we sought to explore the role of mutation in Gag-protease genotypic and phenotypic changes within patients who failed PI-based regimens without known drug resistance-associated protease mutations in order to identify novel determinants of PI resistance.
We used longitudinal samples collected at baseline and at virological failure to explore the role of Gag mutations. Using target enrichment and next-generation sequencing (NGS), followed by haplotype reconstruction, phenotypic drug assays and phylogenetic analysis, we reported for the first time a four-amino acid mutation signature in the HIV-1 CRF02_AG matrix (S126del, H127del, T122A and G123E) which confers reduced susceptibility to the PIs lopinavir and atazanavir. Our multi-pronged genotypic and phenotypic approach to documenting the emergence and temporal dynamics of a novel protease inhibitor resistance signature in the HIV-1 matrix domain reveals the interplay between Gag-associated resistance and fitness.

    Drug Repurposing

    This book focuses on various aspects and applications of drug repurposing, the understanding of which is important for treating diseases. Due to the high costs and time associated with the new drug discovery process, the inclination toward drug repurposing is increasing for common as well as rare diseases. A major focus of this book is understanding the role of drug repurposing to develop drugs for infectious diseases, including antivirals, antibacterial and anticancer drugs, as well as immunotherapeutics

    IN SILICO METHODS FOR DRUG DESIGN AND DISCOVERY

    Computer-aided drug design (CADD) methodologies are playing an ever-increasing role in drug discovery and are critical in the cost-effective identification of promising drug candidates. These computational methods are relevant in limiting the use of animal models in pharmacological research, for aiding the rational design of novel and safe drug candidates, and for repositioning marketed drugs, supporting medicinal chemists and pharmacologists during the drug discovery trajectory. Within this field of research, we launched a Research Topic in Frontiers in Chemistry in March 2019 entitled “In silico Methods for Drug Design and Discovery,” which involved two sections of the journal: Medicinal and Pharmaceutical Chemistry and Theoretical and Computational Chemistry. For the reasons mentioned, this Research Topic attracted the attention of scientists and received a large number of submitted manuscripts. Among them, 27 Original Research articles, five Review articles, and two Perspective articles have been published within the Research Topic. The Original Research articles cover most of the topics in CADD, reporting advanced in silico methods in drug discovery, while the Review articles offer a point of view on some computer-driven techniques applied to drug research. Finally, the Perspective articles provide a vision of specific computational approaches with an outlook on the modern era of CADD.