13 research outputs found

    Construction of Quantitative Structure Activity Relationship (QSAR) Models to Predict Potency of Structurally Diversed Janus Kinase 2 Inhibitors

    Janus kinase 2 (JAK2) inhibitors represent a promising therapeutic class of anticancer agents against many myeloproliferative disorders. Bioactivity data (pIC50) for 2229 JAK2 inhibitors were employed in the construction of quantitative structure-activity relationship (QSAR) models. The models were built from 100 data splits using decision tree (DT), support vector machine (SVM), deep neural network (DNN) and random forest (RF). The predictive power of the RF models was assessed via 10-fold cross-validation, which afforded R² and RMSE of 0.74 ± 0.05 and 0.63 ± 0.05, respectively. The test set performed comparably, with R² of 0.75 ± 0.03 and RMSE of 0.62 ± 0.04. In addition, Y-scrambling was used to evaluate the possibility of chance correlation in the predictive model. A thorough analysis of the substructure fingerprint count was conducted to provide insights into the inhibitory properties of JAK2 inhibitors. Molecular cluster analysis revealed that pyrazine scaffolds have nanomolar potency against JAK2.
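The Y-scrambling check mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a single hypothetical descriptor, a plain least-squares line in place of random forest, and synthetic toy data. The idea is only that a model refitted on shuffled activities should score far worse than the model fitted on the real ones.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = a*x + b with one descriptor (a stand-in for RF)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx


def r2(y_true, y_pred):
    """Coefficient of determination."""
    my = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def y_scramble_r2(x, y, n_rounds=100, seed=0):
    """Refit on shuffled activities and collect the resulting R² values."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        y_perm = y[:]
        rng.shuffle(y_perm)  # break the descriptor-activity link
        a, b = fit_line(x, y_perm)
        scores.append(r2(y_perm, [a * xi + b for xi in x]))
    return scores


# Synthetic toy data (assumption): one descriptor, noisy linear "pIC50".
toy_rng = random.Random(42)
x = [i / 10 for i in range(50)]
y = [5.0 + 0.8 * xi + toy_rng.gauss(0, 0.3) for xi in x]

a, b = fit_line(x, y)
pred = [a * xi + b for xi in x]
real_r2 = r2(y, pred)
real_rmse = rmse(y, pred)
scrambled = y_scramble_r2(x, y)
print(f"real R2={real_r2:.2f}, RMSE={real_rmse:.2f}, max scrambled R2={max(scrambled):.2f}")
```

If the scrambled scores approached the real one, the apparent fit would more likely reflect chance correlation than a genuine structure-activity signal.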

    Characterizing the Relationship Between the Chemical Structures of Drugs and their Activities on Primary Cultures of Pediatric Solid Tumors

    Background: Despite continued efforts to develop new treatments, there is an urgent need to discover new drug leads to treat tumors exhibiting primary or secondary resistance to existing drugs. Cell cultures derived from patient-derived orthotopic xenografts are promising pre-clinical models to better predict drug response in cancer recurrence. Objective: The aim of the study was to investigate the relationship between the physicochemical properties of drugs and their in vitro potency, and to identify chemical scaffolds biased towards selectivity or promiscuity of such drugs. Methods: The bioactivities of 158 drugs screened against cell cultures derived from 30 cancer orthotopic patient-derived xenograft (O-PDX) models were considered. Drugs were represented by physicochemical descriptors and chemical structure fingerprints. Supervised learning was employed to model the relationship between features and in vitro potency. Results: Drugs with in vitro potency for alveolar rhabdomyosarcoma and osteosarcoma tend to have a higher number of rings, two carbon-hetero bonds and halogens. Selective and promiscuous scaffolds for these phenotypic targets were identified. Highly predictive models of in vitro potency were obtained across these 30 targets, which can be applied to unseen molecules via a webserver (https://rnewbie.shinyapps.io/Shobek-master). Conclusion: It is possible to identify privileged chemical scaffolds and predict the in vitro potency of unseen molecules across these 30 targets. This information and these models should be helpful in selecting which molecules to screen against these primary cultures of pediatric solid tumors.

    Structure-based virtual screening for PDL1 dimerizers: evaluating generic scoring functions

    An innovative mechanism to inhibit the PD1/PDL1 interaction is PDL1 dimerization induced by small-molecule PDL1 binders. Structure-based virtual screening is a promising approach to discovering such small-molecule PD1/PDL1 inhibitors. Here we investigate which type of generic scoring functions is most suitable to tackle this problem. We consider CNN-Score, an ensemble of convolutional neural networks, as the representative of machine-learning scoring functions. We also evaluate Smina, a commonly used classical scoring function, and IFP, a top structural fingerprint similarity scoring function. These three types of scoring functions were evaluated on two test sets sharing the same set of small-molecule PD1/PDL1 inhibitors, but using different types of inactives: either true inactives (molecules with no in vitro PD1/PDL1 inhibition activity) or assumed inactives (property-matched decoy molecules generated from each active). On both test sets, CNN-Score performed much better than Smina, which in turn strongly outperformed IFP. The fact that the latter was the case, despite precluding any possibility of exploiting decoy bias, demonstrates the predictive value of CNN-Score for PDL1. These results suggest that re-scoring Smina-docked molecules with CNN-Score is a promising structure-based virtual screening method to discover new small-molecule inhibitors of this important therapeutic target.

    Structure-based virtual screening for PDL1 dimerizers: Evaluating generic scoring functions

    The interaction between PD1 and its ligand PDL1 has been shown to render tumor cells resistant to apoptosis and promote tumor progression. An innovative mechanism to inhibit the PD1/PDL1 interaction is PDL1 dimerization induced by small-molecule PDL1 binders. Structure-based virtual screening is a promising approach to discovering such small-molecule PD1/PDL1 inhibitors. Here we investigate which type of generic scoring functions is most suitable to tackle this problem. We consider CNN-Score, an ensemble of convolutional neural networks, as the representative of machine-learning scoring functions. We also evaluate Smina, a commonly used classical scoring function, and IFP, a top structural fingerprint similarity scoring function. These three types of scoring functions were evaluated on two test sets sharing the same set of small-molecule PD1/PDL1 inhibitors, but using different types of inactives: either true inactives (molecules with no in vitro PD1/PDL1 inhibition activity) or assumed inactives (property-matched decoy molecules generated from each active). On both test sets, CNN-Score performed much better than Smina, which in turn strongly outperformed IFP. The fact that the latter was the case, despite precluding any possibility of exploiting decoy bias, demonstrates the predictive value of CNN-Score for PDL1. These results suggest that re-scoring Smina-docked molecules with CNN-Score is a promising structure-based virtual screening method to discover new small-molecule inhibitors of this therapeutic target.

    Towards reproducible computational drug discovery

    The reproducibility of experiments has been a long-standing impediment to further scientific progress. Computational methods have been instrumental in drug discovery efforts owing to their multifaceted utilization for data collection, pre-processing, analysis and inference. This article provides in-depth coverage of the reproducibility of computational drug discovery. This review explores the following topics: (1) the current state-of-the-art on reproducible research, (2) research documentation (e.g. electronic laboratory notebook, Jupyter notebook, etc.), (3) science of reproducible research (i.e. comparison and contrast with related concepts such as replicability, reusability and reliability), (4) model development in computational drug discovery, (5) computational issues on model development and deployment, (6) use case scenarios for streamlining the computational drug discovery protocol. In computational disciplines, it has become common practice to share data and programming codes used for numerical calculations, not only to facilitate reproducibility but also to foster collaborations (i.e. to drive the project further by introducing new ideas, growing the data, augmenting the code, etc.). It is therefore inevitable that the field of computational drug design would adopt an open approach towards the collection, curation and sharing of data/code. Nalini Schaduangrat, Samuel Lampa and Saw Simeon contributed equally to this work.

    Predicting the Oligomeric States of Fluorescent Proteins

    Dataset and R source code from the article "Predicting the Oligomeric States of Fluorescent Proteins".

    osFP : a web server for predicting the oligomeric states of fluorescent proteins

    Background: Currently, monomeric fluorescent proteins (FP) are ideal markers for protein tagging. The prediction of oligomeric states is helpful for enhancing live biomedical imaging. Computational prediction of FP oligomeric states can accelerate protein engineering efforts to create monomeric FPs. To the best of our knowledge, this study represents the first computational model for predicting and analyzing FP oligomerization directly from the amino acid sequence. Results: After data curation, an exhaustive data set consisting of 397 non-redundant FP oligomeric states was compiled from the literature. Benchmarking of the protein descriptors revealed that the model built with amino acid composition descriptors was the top-performing model, with accuracy, sensitivity and specificity in excess of 80% and MCC greater than 0.6 for all three data subsets (i.e. training, tenfold cross-validation and external sets). The model provided insights into the important residues governing the oligomerization of FP. To maximize the benefit of the generated predictive model, it was implemented as a web server under the R programming environment. Conclusion: osFP affords a user-friendly interface that can be used to predict the oligomeric state of FP using the protein sequence. The advantage of osFP is that it is platform-independent, meaning that it can be accessed via a web browser on any operating system and device. osFP is freely accessible at http://codes.bio/osfp/ while the source code and data set are provided on GitHub at https://github.com/chaninn/osFP/
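Amino acid composition, the descriptor family the abstract above reports as top-performing, is simply the fraction of each of the 20 standard residues in a sequence. The following is a minimal sketch of that descriptor only. The function name and the short example peptide are illustrative assumptions, and the sketch is written in Python rather than the R environment the osFP authors used.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(sequence):
    """Return a 20-element vector of residue fractions, ordered as AMINO_ACIDS.

    Non-standard characters (gaps, X, etc.) are ignored, which is an
    assumption of this sketch rather than a documented osFP behavior.
    """
    counts = {aa: 0 for aa in AMINO_ACIDS}
    total = 0
    for residue in sequence.upper():
        if residue in counts:
            counts[residue] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Arbitrary ten-residue peptide used purely for illustration.
example = "MSKGEELFTG"
composition = amino_acid_composition(example)
for aa, fraction in zip(AMINO_ACIDS, composition):
    if fraction > 0:
        print(aa, round(fraction, 2))
```

Because the vector has a fixed length regardless of sequence length, it can be fed directly to a standard classifier.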

    Probing the origins of human acetylcholinesterase inhibition via QSAR modeling and molecular docking

    Alzheimer's disease (AD) is a chronic neurodegenerative disease which leads to the gradual loss of neuronal cells. Several hypotheses for AD exist (e.g., cholinergic, amyloid, tau hypotheses, etc.). As per the cholinergic hypothesis, the deficiency of choline is responsible for AD; therefore, the inhibition of AChE is a lucrative therapeutic strategy for the treatment of AD. Acetylcholinesterase (AChE) is an enzyme that catalyzes the breakdown of the neurotransmitter acetylcholine that is essential for cognition and memory. A large non-redundant data set of 2,570 compounds with reported IC50 values against AChE was obtained from ChEMBL and employed in a quantitative structure-activity relationship (QSAR) study so as to gain insights into their origin of bioactivity. AChE inhibitors were described by a set of 12 fingerprint descriptors and predictive models were constructed from 100 different data splits using random forest. Generated models afforded R², Q²(CV) and Q²(Ext) values in ranges of 0.66-0.93, 0.55-0.79 and 0.56-0.81 for the training set, 10-fold cross-validated set and external set, respectively. The best model built using the substructure count was selected according to the OECD guidelines and it afforded R², Q²(CV) and Q²(Ext) values of 0.92 ± 0.01, 0.78 ± 0.06 and 0.78 ± 0.05, respectively. Furthermore, Y-scrambling was applied to evaluate the possibility of chance correlation of the predictive model. Subsequently, a thorough analysis of the substructure fingerprint count was conducted to provide informative insights into the inhibitory activity of AChE inhibitors. Moreover, Kennard-Stone sampling of the actives was applied to select 30 diverse compounds for further molecular docking studies in order to gain structural insights into the origin of AChE inhibition. Site-moiety mapping of compounds from the diversity set revealed three binding anchors encompassing both hydrogen bonding and van der Waals interaction. Molecular docking revealed that compounds 13, 5 and 28 exhibited the lowest binding energies of -12.2, -12.0 and -12.0 kcal/mol, respectively, against human AChE, which is modulated by hydrogen bonding, pi-pi stacking and hydrophobic interaction inside the binding pocket. This information may be used as guidelines for the design of novel and robust AChE inhibitors.
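Kennard-Stone sampling, used in the abstract above to pick a diverse subset of actives, can be sketched as follows. This is a generic textbook version under stated assumptions: compounds are represented as plain numeric descriptor vectors, Euclidean distance is used, and the toy points are invented for illustration. The descriptors and distance metric used in the study are not stated in the abstract.

```python
import math


def kennard_stone(points, k):
    """Select k diverse row indices from `points` with the Kennard-Stone scheme.

    Start from the two mutually most distant points, then repeatedly add the
    point whose distance to its nearest already-selected point is largest.
    """
    n = len(points)
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of points")
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Seed with the most distant pair.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: dist[ij[0]][ij[1]],
    )
    selected = [i0, j0]

    # min_d[i] tracks the distance from point i to its nearest selected point.
    min_d = [min(dist[i][i0], dist[i][j0]) for i in range(n)]
    while len(selected) < k:
        candidates = [i for i in range(n) if i not in selected]
        nxt = max(candidates, key=lambda i: min_d[i])
        selected.append(nxt)
        min_d = [min(min_d[i], dist[i][nxt]) for i in range(n)]
    return selected


# Invented 2D "descriptor" points: two tight pairs and one point in the middle.
toy_points = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (10.0, 0.0), (10.1, 0.0)]
picked = kennard_stone(toy_points, 3)
print(picked)
```

On the toy data the method picks the two extremes and then the isolated middle point, skipping the near-duplicates, which is the behavior that makes it useful for building a diversity set.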

    Origin of aromatase inhibitory activity via proteochemometric modeling

    Aromatase, the rate-limiting enzyme that catalyzes the conversion of androgen to estrogen, plays an essential role in the development of estrogen-dependent breast cancer. Side effects due to aromatase inhibitors (AIs) necessitate the pursuit of novel inhibitor candidates with high selectivity, lower toxicity and increased potency. Designing a novel therapeutic agent against aromatase could be achieved computationally by means of ligand-based and structure-based methods. For over a decade, we have utilized both approaches to design potential AIs for which quantitative structure–activity relationships and molecular docking were used to explore inhibitory mechanisms of AIs towards aromatase. However, such approaches do not consider the effects that aromatase variants have on different AIs. In this study, proteochemometrics modeling was applied to analyze the interaction space between AIs and aromatase variants as a function of their substructural and amino acid features. Good predictive performance was achieved, as rigorously verified by 10-fold cross-validation, external validation, leave-one-compound-out cross-validation, leave-one-protein-out cross-validation and Y-scrambling tests. The investigations presented herein provide important insights into the mechanisms of aromatase inhibitory activity that could aid in the design of novel potent AIs as breast cancer therapeutic agents.