266,224 research outputs found
Fuzzy virtual ligands for virtual screening
A new method to bridge the gap between ligand- and receptor-based methods in virtual screening (VS) is presented. We introduce a structure-derived virtual ligand (VL) model as an extension to a previously published pseudo-ligand technique [1]: LIQUID [2] fuzzy pharmacophore virtual screening is combined with grid-based protein binding site predictions of PocketPicker [3]. This approach might help reduce bias introduced by manual selection of binding site residues and introduces pocket shape information to the VL. It allows for a combination of several protein structure models into a single "fuzzy" VL representation, which can be used to scan screening compound collections for ligand structures with a similar potential pharmacophore. PocketPicker employs an elaborate grid-based scanning procedure to determine buried cavities and depressions on the protein's surface. Potential binding sites are represented by clusters of grid probes characterizing the shape and accessibility of a cavity. A rule-based system is then applied to project reverse pharmacophore types onto the grid probes of a selected pocket. The pocket pharmacophore types are assigned depending on the properties and geometry of the protein residues surrounding the pocket with regard to their relative position towards the grid probes. LIQUID is used to cluster representative pocket probes by their pharmacophore types describing a fuzzy VL model. The VL is encoded in a correlation vector, which can then be compared to a database of pre-calculated ligand models. A retrospective screening using the fuzzy VL and several protein structures was evaluated by ten-fold cross-validation with ROC-AUC and BEDROC metrics, obtaining a significant enrichment of actives. Future work will be devoted to prospective screening using a novel protein target of Helicobacter pylori and compounds from commercial providers
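The ROC-AUC metric mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' evaluation code; the function name `roc_auc` and the toy scores are assumptions made for illustration. It uses the rank interpretation of the metric: the probability that a randomly chosen active is scored above a randomly chosen inactive, with ties counted as one half.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison of actives and inactives.

    labels: 1 for an active compound, 0 for an inactive one (assumed encoding).
    """
    actives = [s for s, l in zip(scores, labels) if l]
    inactives = [s for s, l in zip(scores, labels) if not l]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive compound")
    # Count active/inactive pairs ranked correctly; a tie counts as half.
    wins = sum(
        1.0 if a > i else 0.5 if a == i else 0.0
        for a in actives
        for i in inactives
    )
    return wins / (len(actives) * len(inactives))


# Hypothetical screening scores: two actives, two inactives.
example_auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

A value of 0.5 corresponds to random ranking and 1.0 to all actives ranked ahead of all inactives. The pairwise loop is quadratic, which is acceptable for a sketch but not for large libraries.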
Ligand-based virtual screening using binary kernel discrimination
This paper discusses the use of a machine-learning technique called binary kernel discrimination (BKD) for virtual screening in drug- and pesticide-discovery programmes. BKD is compared with several other ligand-based tools for virtual screening in databases of 2D structures represented by fragment bit-strings, and is shown to provide an effective, and reasonably efficient, way of prioritising compounds for biological screening
Spherical harmonics coefficients for ligand-based virtual screening of cyclooxygenase inhibitors
Background: Molecular descriptors are essential for many applications in computational chemistry, such as ligand-based similarity searching. Spherical harmonics have previously been suggested as comprehensive descriptors of molecular structure and properties. We investigate a spherical harmonics descriptor for shape-based virtual screening. Methodology/Principal Findings: We introduce and validate a partially rotation-invariant three-dimensional molecular shape descriptor based on the norm of spherical harmonics expansion coefficients. Using this molecular representation, we parameterize molecular surfaces, i.e., isosurfaces of spatial molecular property distributions. We validate the shape descriptor in a comprehensive retrospective virtual screening experiment. In a prospective study, we virtually screen a large compound library for cyclooxygenase inhibitors, using a self-organizing map as a pre-filter and the shape descriptor for candidate prioritization. Conclusions/Significance: 12 compounds were tested in vitro for direct enzyme inhibition and in a whole blood assay. Active compounds containing a triazole scaffold were identified as direct cyclooxygenase-1 inhibitors. This outcome corroborates the usefulness of spherical harmonics for representation of molecular shape in virtual screening of large compound collections. The combination of pharmacophore and shape-based filtering of screening candidates proved to be a straightforward approach to finding novel bioactive chemotypes with minimal experimental effort
Similarity-based virtual screening using 2D fingerprints
This paper summarises recent work at the University of Sheffield on virtual screening methods that use 2D fingerprint measures of structural similarity. A detailed comparison of a large number of similarity coefficients demonstrates that the well-known Tanimoto coefficient remains the method of choice for the computation of fingerprint-based similarity, despite possessing some inherent biases related to the sizes of the molecules that are being sought. Group fusion involves combining the results of similarity searches based on multiple reference structures and a single similarity measure. We demonstrate the effectiveness of this approach to screening, and also describe an approximate form of group fusion, turbo similarity searching, that can be used when just a single reference structure is available
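As a rough illustration of the two ideas in the abstract above, the following sketch computes the Tanimoto coefficient on fingerprints stored as sets of on-bit indices and then fuses searches from several reference structures. The set-based representation, the function names, and the choice of the maximum as the fusion rule are assumptions made for this example; the abstract does not specify which fusion rule was used.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

Fingerprint = FrozenSet[int]  # indices of the bits set in a 2D fingerprint


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared on-bits divided by on-bits set in either."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def group_fusion(
    references: Sequence[Fingerprint], database: Dict[str, Fingerprint]
) -> List[Tuple[str, float]]:
    """Rank database molecules by their best similarity to any reference.

    Assumption: scores are fused with the maximum (other rules are possible).
    """
    fused = [
        (name, max(tanimoto(ref, fp) for ref in references))
        for name, fp in database.items()
    ]
    return sorted(fused, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: two reference actives and three database molecules.
refs = [frozenset({1, 2, 3}), frozenset({7, 8})]
db = {"mol_a": frozenset({1, 2, 3}), "mol_b": frozenset({7, 9}), "mol_c": frozenset({20})}
ranking = group_fusion(refs, db)
```

Because the denominator is the size of the union, a small query matched against large molecules tends to score low, which is one way to see the size-related bias the abstract mentions.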
Kernel learning for ligand-based virtual screening: discovery of a new PPARgamma agonist
Poster presentation at 5th German Conference on Cheminformatics: 23. CIC-Workshop Goslar, Germany. 8-10 November 2009 We demonstrate the theoretical and practical application of modern kernel-based machine learning methods to ligand-based virtual screening by successful prospective screening for novel agonists of the peroxisome proliferator-activated receptor gamma (PPARgamma) [1]. PPARgamma is a nuclear receptor involved in lipid and glucose metabolism, and related to type-2 diabetes and dyslipidemia. Applied methods included a graph kernel designed for molecular similarity analysis [2], kernel principal component analysis [3], multiple kernel learning [4], and Gaussian process regression [5]. In the machine learning approach to ligand-based virtual screening, one uses the similarity principle [6] to identify potentially active compounds based on their similarity to known reference ligands. Kernel-based machine learning [7] uses the "kernel trick", a systematic approach to the derivation of non-linear versions of linear algorithms like separating hyperplanes and regression. Prerequisites for kernel learning are similarity measures with the mathematical property of positive semidefiniteness (kernels). The iterative similarity optimal assignment graph kernel (ISOAK) [2] is defined directly on the annotated structure graph, and was designed specifically for the comparison of small molecules. In our virtual screening study, its use improved results, e.g., in principal component analysis-based visualization and Gaussian process regression. Following a thorough retrospective validation using a data set of 176 published PPARgamma agonists [8], we screened a vendor library for novel agonists. Subsequent testing of 15 compounds in a cell-based transactivation assay [9] yielded four active compounds. The most interesting hit, a natural product derivative with a cyclobutane scaffold, is a full selective PPARgamma agonist (EC50 = 10 ± 0.2 microM, inactive on PPARalpha and PPARbeta/delta at 10 microM). We demonstrate how the interplay of several modern kernel-based machine learning approaches can successfully improve ligand-based virtual screening results
Prediction by Nonparametric Posterior Estimation in Virtual Screening
The ability to rank molecules according to their effectiveness in some domain, e.g. as pesticides or drugs, is important owing to the cost of synthesising and testing chemical compounds. Virtual screening seeks to do this computationally, with potential savings of millions of pounds and large profits associated with reduced time to market. Binary kernel discrimination (BKD) was recently introduced and is becoming popular in the chemoinformatics domain. It produces scores based on the estimated likelihood ratio of active to inactive compounds, and the compounds are then ranked by these scores. The likelihoods are estimated through a Parzen windows approach using the binomial distribution function (to accommodate binary descriptor or "fingerprint" vectors representing the presence, or not, of certain sub-structural arrangements of atoms) in place of the usual Gaussian choice. This research aims to compute the likelihood ratio via a direct estimate of the posterior probability, using a non-parametric generalisation of logistic regression, the so-called "Kernel Logistic Regression". Complexity is controlled by penalising the likelihood function with an Lq-norm. The compounds are then ranked in descending order of posterior probability. The 11 activity classes from the MDL Drug Data Report (MDDR) database are used. The results are found to be less accurate than a currently leading approach but are still comparable in a number of cases
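The BKD scoring idea summarised in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the published implementation: the kernel is taken as a binomial function of the Hamming distance with a smoothing parameter `lam` (an assumed name and default), and the class likelihoods are estimated as plain averages of kernel values over the training actives and inactives. The published method may scale the kernel exponent differently.

```python
from typing import Sequence

Bits = Sequence[int]  # binary fingerprint as a sequence of 0/1 values


def hamming(a: Bits, b: Bits) -> int:
    """Number of positions at which two equal-length fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


def binomial_kernel(a: Bits, b: Bits, lam: float = 0.9) -> float:
    """Binomial kernel: lam^(n - d) * (1 - lam)^d for Hamming distance d.

    lam in (0.5, 1) is an assumed smoothing parameter; values closer to 1
    concentrate the weight on near-identical fingerprints.
    """
    n, d = len(a), hamming(a, b)
    return lam ** (n - d) * (1.0 - lam) ** d


def bkd_score(
    query: Bits, actives: Sequence[Bits], inactives: Sequence[Bits], lam: float = 0.9
) -> float:
    """Ratio of Parzen-window likelihood estimates: active over inactive."""
    p_active = sum(binomial_kernel(query, x, lam) for x in actives) / len(actives)
    p_inactive = sum(binomial_kernel(query, x, lam) for x in inactives) / len(inactives)
    return p_active / p_inactive


# Hypothetical 4-bit training set with one active and one inactive.
train_actives = [[1, 1, 0, 0]]
train_inactives = [[0, 0, 1, 1]]
score_like_active = bkd_score([1, 1, 0, 0], train_actives, train_inactives)
score_like_inactive = bkd_score([0, 0, 1, 1], train_actives, train_inactives)
```

Scores above 1 indicate a query that resembles the training actives more than the inactives. For fingerprints of realistic length the raw products underflow, so a practical version would work with logarithms of the kernel values.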
A Novel Scoring Based Distributed Protein Docking Application to Improve Enrichment
Molecular docking is a computational technique which predicts the binding energy and the preferred binding mode of a ligand to a protein target. Virtual screening is a tool which uses docking to investigate large chemical libraries to identify ligands that bind favorably to a protein target. We have developed a novel scoring based distributed protein docking application to improve enrichment in virtual screening. The application addresses the issue of time and cost of screening in contrast to conventional systematic parallel virtual screening methods in two ways. Firstly, it automates the process of creating and launching multiple independent dockings on a high performance computing cluster. Secondly, it uses a Naïve Bayes scoring function to calculate the binding energy of un-docked ligands in order to identify and preferentially dock better binders (as predicted by Autodock). The application was tested on four proteins using a library of 10,573 ligands. In all the experiments, (i) 200 of the 1000 best binders are identified after docking only 14% of the chemical library, (ii) 9 or 10 best-binders are identified after docking only 19% of the chemical library, and (iii) no significant enrichment is observed after docking 70% of the chemical library. The results show a significant increase in enrichment of potential drug leads in early rounds of virtual screening
Evaluation of machine-learning methods for ligand-based virtual screening
Machine-learning methods can be used for virtual screening by analysing the structural characteristics of molecules of known (in)activity, and we here discuss the use of kernel discrimination and naive Bayesian classifier (NBC) methods for this purpose. We report a kernel method that allows the processing of molecules represented by binary, integer and real-valued descriptors, and show that it is little different in screening performance from a previously described kernel that had been developed specifically for the analysis of binary fingerprint representations of molecular structure. We then evaluate the performance of an NBC when the training-set contains only a very few active molecules. In such cases, a simpler approach based on group fusion would appear to provide superior screening performance, especially when structurally heterogeneous datasets are to be processed
In-silico Predictive Mutagenicity Model Generation Using Supervised Learning Approaches
With the advent of High Throughput Screening techniques, it is feasible to filter possible leads from a mammoth chemical space that can act against a particular target and inhibit its action. Virtual screening complements the in-vitro assays which are costly and time consuming. This process is used to sort biologically active molecules by utilizing the structural and chemical information of the compounds and the target proteins in order to screen potential hits. Various data mining and machine learning tools utilize Molecular Descriptors through the knowledge discovery process using classifier algorithms that classify the potentially active hits for the drug development process.

