657 research outputs found

    A statistical framework to evaluate virtual screening

    Background: The receiver operating characteristic (ROC) curve is widely used to evaluate virtual screening (VS) studies. However, the method fails to address the "early recognition" problem specific to VS. Although many other metrics that emphasize "early recognition", such as RIE, BEDROC, and pROC, have been proposed, there are no rigorous statistical guidelines for determining thresholds and performing significance tests. Nor have these metrics been compared under a statistical framework to better understand their performance. Results: We propose a statistical framework to evaluate VS studies in which the threshold for deciding whether a ranking method is better than random ranking is derived by bootstrap simulations, and two ranking methods are compared by a permutation test. We found that different metrics emphasize "early recognition" differently. BEDROC and RIE are statistically equivalent metrics. Our newly proposed metric, SLR, is superior to pROC. Through extensive simulations, we observed a "seesaw effect": overemphasizing early recognition reduces the statistical power of a metric to detect true early recognition. Conclusion: The statistical framework we developed and tested is applicable to any other metric as well, even if its exact distribution is unknown. Under this framework, a threshold can be easily selected according to a pre-specified type I error rate, and statistical comparisons between two ranking methods become possible. The theoretical null distribution of the SLR metric is available, so the SLR threshold can be determined exactly without resorting to bootstrap simulations, which makes it easy to use in practical virtual screening studies.
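    As a rough illustration of the framework described in the abstract above, the sketch below estimates a better-than-random threshold by simulating random rankings and compares two ranking methods with a paired permutation test. It is not the authors' implementation: the function names, the RIE-style metric with `alpha=20`, the simulation counts, and the rank-swapping permutation scheme are all assumptions chosen for brevity.

```python
import math
import random
from functools import lru_cache


@lru_cache(maxsize=None)
def _expected_weight(n_total, alpha):
    # Mean exponential weight of a uniformly random rank in 1..n_total.
    return sum(math.exp(-alpha * k / n_total) for k in range(1, n_total + 1)) / n_total


def rie(active_ranks, n_total, alpha=20.0):
    """RIE-style score: about 1.0 on average for random rankings, larger for early actives."""
    observed = sum(math.exp(-alpha * r / n_total) for r in active_ranks) / len(active_ranks)
    return observed / _expected_weight(n_total, alpha)


def null_threshold(metric, n_actives, n_total, type1=0.05, n_sim=2000, seed=0):
    """Simulate random rankings and return the (1 - type1) quantile of the metric."""
    rng = random.Random(seed)
    scores = sorted(
        metric(rng.sample(range(1, n_total + 1), n_actives), n_total)
        for _ in range(n_sim)
    )
    index = min(n_sim - 1, math.ceil((1.0 - type1) * n_sim) - 1)
    return scores[index]


def paired_permutation_pvalue(metric, ranks_a, ranks_b, n_total, n_perm=2000, seed=0):
    """Two-sided test: ranks_a[i] and ranks_b[i] are the ranks of active i under each method.

    Assumed scheme: for every active, the two ranks are swapped with probability 0.5.
    """
    rng = random.Random(seed)
    observed = metric(ranks_a, n_total) - metric(ranks_b, n_total)
    hits = 0
    for _ in range(n_perm):
        perm_a, perm_b = [], []
        for ra, rb in zip(ranks_a, ranks_b):
            if rng.random() < 0.5:
                ra, rb = rb, ra
            perm_a.append(ra)
            perm_b.append(rb)
        diff = metric(perm_a, n_total) - metric(perm_b, n_total)
        if abs(diff) >= abs(observed):
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)
```

    Because `null_threshold` and `paired_permutation_pvalue` take the metric as an argument, any other early-recognition score with the same `(active_ranks, n_total)` signature could be swapped in, which mirrors the abstract's point that the framework does not depend on knowing a metric's exact distribution.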

    Systematic Exploitation of Multiple Receptor Conformations for Virtual Ligand Screening

    The role of virtual ligand screening in modern drug discovery is to mine large chemical collections and to prioritize for experimental testing a comparatively small and diverse set of compounds with expected activity against a target. Several studies have pointed out that the performance of virtual ligand screening can be improved by taking into account receptor flexibility. Here, we systematically assess how multiple crystallographic receptor conformations, a powerful way of discretely representing protein plasticity, can be exploited in screening protocols to separate binders from non-binders. Our analyses encompass 36 targets of pharmaceutical relevance and are based on actual molecules with reported activity against those targets. The results suggest that an ensemble receptor-based protocol displays a stronger discriminating power between active and inactive molecules as compared to its standard single rigid receptor counterpart. Moreover, such a protocol can be engineered not only to enrich a higher number of active compounds, but also to enhance their chemical diversity. Finally, some clear indications can be gathered on how to select a subset of receptor conformations that is most likely to provide the best performance in a real life scenario

    In Silico Evaluation of Ibuprofen and Two Benzoylpropionic Acid Derivatives with Potential Anti-Inflammatory Activity

    Inflammation is a complex reaction involving cellular and molecular components and an unspecific response to a specific aggression. The use of scientific and technological innovations as a research tool, combining multidisciplinary knowledge in informatics, biotechnology, chemistry and biology, is essential for optimizing time and reducing costs in drug design. Thus, the integration of these in silico techniques makes it possible to search for new anti-inflammatory drugs with better pharmacokinetic and toxicological profiles than commercially used drugs. This in silico study evaluated the anti-inflammatory potential of two benzoylpropionic acid derivatives (MBPA and DHBPA) using molecular docking and their thermodynamic profiles by molecular dynamics, in addition to predicting oral bioavailability, bioactivity and toxicity. According to our predictions, the proposed derivatives have the potential to inhibit COX-2 in the human and mouse enzymes, because their interactions are similar to those of the control compound (ibuprofen). Ibuprofen was predicted to cause hepatotoxicity (in human, mouse and rat; toxicophoric group 2-arylacetic or 3-arylpropionic acid) and irritation of the gastrointestinal tract (in human, mouse and rat; toxicophoric group alpha-substituted propionic acid or ester), which confirms the literature data as well as the efficiency of the DEREK 10.0.2 program. Moreover, the proposed compounds are predicted to have a good oral bioavailability profile, low toxicity (LD50 < 700 mg/kg) and safety when compared to the commercial compound. Therefore, future studies are necessary to confirm the anti-inflammatory potential of these compounds.

    Evaluation of cross-validation strategies in sequence-based binding prediction using deep learning

    Binding prediction between targets and drug-like compounds through deep neural networks has generated promising results in recent years, outperforming traditional machine learning-based methods. However, the generalization capability of these classification models is still an issue to be addressed. In this work, we explored how different cross-validation strategies applied to data from different molecular databases affect the performance of proteochemometrics models for binding prediction. These strategies are (1) random splitting, (2) splitting based on K-means clustering (of both actives and inactives), (3) splitting based on source database, and (4) splitting based on both the clustering and the source database. These schemas are applied to a deep learning proteochemometrics model and to a simple logistic regression model used as a baseline. Additionally, two different ways of describing molecules in the model are tested: (1) by their SMILES and (2) by three fingerprints. The classification performance of our deep learning-based proteochemometrics model is comparable to the state of the art. Our results show that the lack of generalization of these models is due to a bias in public molecular databases and that a restrictive cross-validation schema based on compound clustering leads to worse but more robust and credible results. Our results also show better performance when representing molecules by their fingerprints. Peer reviewed. Postprint (author's final draft).
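    To make the clustering-based strategy (2) above concrete, here is a minimal sketch of a split in which every cluster lands entirely in one fold, so structurally similar compounds never appear on both sides of a train/test boundary. It assumes the cluster labels (for example from K-means) have already been computed; the function name `cluster_folds` and the greedy size-balancing rule are illustrative choices, not details taken from the paper.

```python
from collections import defaultdict


def cluster_folds(cluster_labels, n_folds=3):
    """Assign whole clusters to folds; returns a list of folds, each a list of sample indices."""
    members = defaultdict(list)
    for index, label in enumerate(cluster_labels):
        members[label].append(index)

    folds = [[] for _ in range(n_folds)]
    # Place the largest clusters first, each into the currently smallest fold.
    for label in sorted(members, key=lambda c: (-len(members[c]), str(c))):
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(members[label])
    return folds


if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1, 2, 2, 3]  # hypothetical cluster label per compound
    for k, test_indices in enumerate(cluster_folds(labels)):
        print(f"fold {k}: test indices {test_indices}")
```

    A random split, by contrast, would scatter members of the same cluster across folds, which is one way the optimistic performance estimates discussed in the abstract can arise.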

    Application of 3D Zernike descriptors to shape-based ligand similarity searching

    Background: The identification of promising drug leads from a large database of compounds is an important step in the preliminary stages of drug design. Although shape is known to play a key role in the molecular recognition process, its application to virtual screening poses significant hurdles both in terms of the encoding scheme and speed. Results: In this study, we have examined the efficacy of the alignment independent three-dimensional Zernike descriptor (3DZD) for fast shape based similarity searching. Performance of this approach was compared with several other methods including the statistical moments based ultrafast shape recognition scheme (USR) and SIMCOMP, a graph matching algorithm that compares atom environments. Three benchmark datasets are used to thoroughly test the methods in terms of their ability for molecular classification, retrieval rate, and performance under the situation that simulates actual virtual screening tasks over a large pharmaceutical database. The 3DZD performed better than or comparable to the other methods examined, depending on the datasets and evaluation metrics used. Reasons for the success and the failure of the shape based methods for specific cases are investigated. Based on the results for the three datasets, general conclusions are drawn with regard to their efficiency and applicability
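    For readers unfamiliar with the moment-based USR baseline mentioned above, the following sketch computes a 12-number shape descriptor from atom coordinates: three moments of the atom-distance distribution measured from each of four reference points (the centroid, the atom closest to it, the atom farthest from it, and the atom farthest from that one). The moment convention used here (mean, standard deviation, signed cube root of the third central moment) is one common variant and is an assumption, as are the function names; the 3DZD descriptor itself is not shown.

```python
import math


def _moments(distances):
    # Mean, standard deviation, and signed cube root of the third central moment.
    n = len(distances)
    mean = sum(distances) / n
    var = sum((d - mean) ** 2 for d in distances) / n
    third = sum((d - mean) ** 3 for d in distances) / n
    return [mean, math.sqrt(var), math.copysign(abs(third) ** (1.0 / 3.0), third)]


def usr_descriptor(coords):
    """coords: list of (x, y, z) atom positions. Returns a list of 12 floats."""
    n = len(coords)
    centroid = tuple(sum(c[i] for c in coords) / n for i in range(3))
    d_centroid = [math.dist(c, centroid) for c in coords]
    closest = coords[d_centroid.index(min(d_centroid))]
    farthest = coords[d_centroid.index(max(d_centroid))]
    d_farthest = [math.dist(c, farthest) for c in coords]
    far_from_farthest = coords[d_farthest.index(max(d_farthest))]

    descriptor = []
    for ref in (centroid, closest, farthest, far_from_farthest):
        descriptor.extend(_moments([math.dist(c, ref) for c in coords]))
    return descriptor


def usr_similarity(desc_a, desc_b):
    """Inverse of one plus the mean absolute difference; 1.0 means identical descriptors."""
    mean_abs = sum(abs(a - b) for a, b in zip(desc_a, desc_b)) / len(desc_a)
    return 1.0 / (1.0 + mean_abs)
```

    Because the descriptor is built only from interatomic distances, it does not change under translation or rotation, which is why such methods need no alignment step and can scan large databases quickly.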

    RANDOM WALK APPLIED TO HETEROGENOUS DRUG-TARGET NETWORKS FOR PREDICTING BIOLOGICAL OUTCOMES

    Thesis (Ph.D.) - Indiana University, Informatics and Computing, 2016. Prediction of unknown drug-target interactions from bioassay data is critical not only for understanding these interactions but also for developing new drugs and repurposing old ones. Conventional methods for predicting such interactions can be divided into 2D-based and 3D-based methods. 3D methods are more CPU-expensive and require more manual interpretation, whereas 2D methods, such as machine learning and similarity search using chemical fingerprints, are fast. One problem with using traditional machine learning methods to predict drug-target pairs is that they require labeled information on true and false interactions. A major problem of supervised learning methods is the selection of negative samples: unknown drug-target interactions are regarded as false interactions, which may influence the predictive accuracy of the model. Network-based methods have become an effective tool for predicting drug-target interactions because they overcome this negative sampling problem. In this dissertation, I describe traditional machine learning methods and 3D methods of pharmacophore modeling for drug-target prediction and show how these methods work in a drug discovery scenario. I then introduce a new framework for drug-target prediction based on bipartite networks of drug-target relations, known as Random Walk with Restart (RWR). RWR integrates various networks, including drug-drug similarity networks, protein-protein similarity networks and drug-target interaction networks, into a heterogeneous network that is capable of predicting novel drug-target relations. I describe how the chemical features used for measuring drug-drug similarity do not affect performance in predicting interactions, and further show the performance of RWR using an external dataset from the ChEMBL database. I also describe further implementations of the RWR approach in multilayered networks consisting of biological data such as diseases, tissue-based gene expression data, protein complexes and metabolic pathways, to predict associations between human diseases and metabolic pathways, which are crucial in drug discovery. I have further developed a software package, netpredictor, in R (standalone and web) for unipartite and bipartite networks, and implemented network-based predictive algorithms and network properties for drug-target prediction. This package will be described.
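    The core iteration behind random walk with restart can be sketched in a few lines. This is a generic textbook-style version on a single adjacency matrix, not the netpredictor implementation: the function name, the restart probability of 0.3, and the convergence tolerance are assumptions, and the heterogeneous drug-target network of the thesis would first have to be assembled into one matrix.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-12, max_iter=1000):
    """adjacency: square list of lists of non-negative weights; seeds: list of node indices.

    Returns the steady-state visiting probability of every node.
    """
    n = len(adjacency)
    # Column sums are used to turn weights into transition probabilities.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]

    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)

    p = p0[:]
    for _ in range(max_iter):
        nxt = []
        for i in range(n):
            # Probability mass flowing into node i from its neighbours.
            walk = sum(
                adjacency[i][j] / col_sums[j] * p[j]
                for j in range(n)
                if col_sums[j] > 0
            )
            nxt.append((1.0 - restart) * walk + restart * p0[i])
        delta = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if delta < tol:
            break
    return p


if __name__ == "__main__":
    # Hypothetical 4-node path graph 0 - 1 - 2 - 3, with the walk restarting at node 0.
    path = [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
    print(random_walk_with_restart(path, seeds=[0]))
```

    In a drug-target setting, the seed would typically be a drug node and the resulting scores on target nodes would be used to rank candidate interactions; nodes with no edges simply receive no walk mass in this sketch.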

    Modelling of serotonergic receptors and molecular optimization of X-ray crystal structures of serotonin transporter and their interactions with exogenous compounds

    The serotonin (5-hydroxytryptamine, 5-HT) receptors and transporter are part of the serotonergic neurotransmission system and are believed to have a major role in the pathology of depression. They are of pharmacological importance, being targeted by many current antidepressants. It is therefore of great interest to understand their structural and functional properties for the development of future drugs. There is generally little knowledge today about the effects of environmental toxicants on the human brain. If exogenous compounds interact with the serotonin receptors and transporter, they may interfere with serotonergic neurotransmission in the brain and with the effects of CNS drugs. Homology modelling is an in silico method used to predict the 3D structure of proteins whose structure is unknown. Models of serotonergic receptors (5-HT1A, 5-HT2A, 5-HT2C) were constructed by the homology approach from known structures in the PDB. The newly released X-ray crystal structures of the human serotonin transporter (SERT) were also imported from the PDB and optimized with molecular modelling techniques. Molecular docking was used to predict putative harmful effects and drug interactions of the toxicants in the Tox21 database with these protein targets. Many toxic compounds were predicted to interact with the serotonergic receptors and the SERT, and many of these had physicochemical properties suggesting that they may act in the CNS. Detailed interaction analysis of the selected compounds with the serotonergic receptors and the SERT indicated that, besides the crucial interaction with an aspartic acid, aromatic interactions with phenylalanine residues are also very important. The high CNS MPO scores obtained, and the similar Glide scores of the known high-affinity binders and the toxicants, could suggest harmful effects and drug interactions in the serotonergic system of the CNS.

    Fuzzy virtual ligands for virtual screening

    A new method to bridge the gap between ligand and receptor-based methods in virtual screening (VS) is presented. We introduce a structure-derived virtual ligand (VL) model as an extension to a previously published pseudo-ligand technique [1]: LIQUID [2] fuzzy pharmacophore virtual screening is combined with grid-based protein binding site predictions of PocketPicker [3]. This approach might help reduce bias introduced by manual selection of binding site residues and introduces pocket shape information to the VL. It allows for a combination of several protein structure models into a single "fuzzy" VL representation, which can be used to scan screening compound collections for ligand structures with a similar potential pharmacophore. PocketPicker employs an elaborate grid-based scanning procedure to determine buried cavities and depressions on the protein's surface. Potential binding sites are represented by clusters of grid probes characterizing the shape and accessibility of a cavity. A rule-based system is then applied to project reverse pharmacophore types onto the grid probes of a selected pocket. The pocket pharmacophore types are assigned depending on the properties and geometry of the protein residues surrounding the pocket with regard to their relative position towards the grid probes. LIQUID is used to cluster representative pocket probes by their pharmacophore types describing a fuzzy VL model. The VL is encoded in a correlation vector, which can then be compared to a database of pre-calculated ligand models. A retrospective screening using the fuzzy VL and several protein structures was evaluated by ten fold cross-validation with ROC-AUC and BEDROC metrics, obtaining a significant enrichment of actives. Future work will be devoted to prospective screening using a novel protein target of Helicobacter pylori and compounds from commercial providers

    Development of Potential Multi-Target Inhibitors for Human Cholinesterases and Beta-Secretase 1: A Computational Approach

    Alzheimer’s disease causes chronic neurodegeneration and is the leading cause of dementia in the world. The causes of this disease are not fully understood but seem to involve two essential cerebral pathways: cholinergic and amyloid. The simultaneous inhibition of AChE, BuChE, and BACE-1, essential enzymes involved in those pathways, is a promising therapeutic approach to treat the symptoms and, hopefully, also halt the disease progression. This study sought to identify triple enzymatic inhibitors based on stereo-electronic requirements deduced from molecular modeling of AChE, BuChE, and BACE-1 active sites. A pharmacophore model was built, displaying four hydrophobic centers, three hydrogen bond acceptors, and one positively charged nitrogen, and used to prioritize molecules found in virtual libraries. Compounds showing adequate overlapping rates with the pharmacophore were subjected to molecular docking against the three enzymes and those with an adequate docking score (n = 12) were evaluated for physicochemical and toxicological parameters and commercial availability. The structure exhibiting the greatest inhibitory potential against all three enzymes was subjected to molecular dynamics simulations (100 ns) to assess the stability of the inhibitor-enzyme systems. The results of this in silico approach indicate ZINC1733 can be a potential multi-target inhibitor of AChE, BuChE, and BACE-1, and future enzymatic assays are planned to validate those results. Funding: PPBE and PPGCF/UEFS; Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG), grants APQ-02741-17, APQ-00855-19, APQ-01733-21, and APQ-04559-22; Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq-Brazil), grants 305117/2017-3 and 426261/2018-6; Fellowship of 2021 (grant 310108/2020-9).

    What do we know and when do we know it?

    Two essential aspects of virtual screening are considered: experimental design and performance metrics. In the design of any retrospective virtual screen, choices have to be made as to the purpose of the exercise. Is the goal to compare methods? Is the interest in a particular type of target or all targets? Are we simulating a ‘real-world’ setting, or teasing out distinguishing features of a method? What are the confidence limits for the results? What should be reported in a publication? In particular, what criteria should be used to decide between different performance metrics? Comparing the field of molecular modeling to other endeavors, such as medical statistics, criminology, or computer hardware evaluation indicates some clear directions. Taken together these suggest the modeling field has a long way to go to provide effective assessment of its approaches, either to itself or to a broader audience, but that there are no technical reasons why progress cannot be made