Similarity Methods in Chemoinformatics
Evaluation of a Bayesian inference network for ligand-based virtual screening
Background
Bayesian inference networks enable the computation of the probability that an event will occur. They have been used previously to rank textual documents in order of decreasing relevance to a user-defined query. Here, we modify the approach to enable a Bayesian inference network to be used for chemical similarity searching, where a database is ranked in order of decreasing probability of bioactivity.
Results
Bayesian inference networks were implemented using two different types of network and four different types of belief function. Experiments with the MDDR and WOMBAT databases show that a Bayesian inference network can be used to provide effective ligand-based screening, especially when the active molecules being sought have a high degree of structural homogeneity; in such cases, the network substantially outperforms a conventional, Tanimoto-based similarity searching system. However, the network is much less effective when structurally heterogeneous sets of actives are being sought.
Conclusion
A Bayesian inference network provides an interesting alternative to existing tools for ligand-based virtual screening.
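The conventional Tanimoto-based similarity search used as the baseline in this abstract can be sketched as follows. This is a minimal illustration, not the authors' system: fingerprints are assumed to be sets of on-bit indices, and the molecule names and bit values are hypothetical.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient for two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        return 0.0  # convention assumed here for two empty fingerprints
    return len(a & b) / union


def rank_by_similarity(query: frozenset, database: dict) -> list:
    """Rank database molecules in order of decreasing similarity to the query."""
    scored = [(name, tanimoto(query, fp)) for name, fp in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy fingerprints (on-bit indices), for illustration only.
query = frozenset({1, 4, 7, 9})
database = {
    "mol_a": frozenset({1, 4, 7, 9}),
    "mol_b": frozenset({1, 4, 8}),
    "mol_c": frozenset({2, 3, 5}),
}

ranking = rank_by_similarity(query, database)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

A Bayesian inference network, as described in the abstract, would instead rank the database by a probability of bioactivity; the ranking step itself has the same shape.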
Clustering files of chemical structures using the Szekely-Rizzo generalization of Ward's method
Ward's method is extensively used for clustering chemical structures represented by 2D fingerprints. This paper compares Ward clusterings of 14 datasets (containing between 278 and 4332 molecules) with those obtained using the Székely–Rizzo clustering method, a generalization of Ward's method. The clusters resulting from these two methods were evaluated by the extent to which the various classifications were able to group active molecules together, using a novel criterion of clustering effectiveness. Analysis of a total of 1400 classifications (Ward and Székely–Rizzo clustering methods, 14 different datasets, 5 different fingerprints and 10 different distance coefficients) demonstrated the general superiority of the Székely–Rizzo method. The distance coefficient first described by Soergel performed extremely well in these experiments, and this was also the case when it was used in simulated virtual screening experiments.
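As an illustration of the Soergel distance coefficient mentioned above, the sketch below uses its usual definition for non-negative vectors: the sum of absolute differences divided by the sum of element-wise maxima. For binary fingerprints this equals one minus the Tanimoto coefficient. The vectors are hypothetical, and the value returned for two all-zero vectors is an assumed convention.

```python
def soergel(x: list, y: list) -> float:
    """Soergel distance between two non-negative vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(max(a, b) for a, b in zip(x, y))
    if denominator == 0:
        return 0.0  # assumed convention: two all-zero vectors are identical
    return numerator / denominator


# Binary fingerprints: 2 differing bits out of 4 bits set in either vector.
binary_distance = soergel([1, 1, 0, 1, 0], [1, 0, 0, 1, 1])

# Count-based fingerprints work with the same formula.
count_distance = soergel([2, 0, 3], [1, 1, 3])

print(binary_distance, count_distance)
```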
The use of 2D fingerprint methods to support the assessment of structural similarity in orphan drug legislation.
In the European Union, medicines are authorised for a rare disease only if they are judged to be dissimilar to authorised orphan drugs for that disease. This paper describes the use of 2D fingerprints to show the extent of the relationship between computed levels of structural similarity for pairs of molecules and expert judgments of the similarities of those pairs. The resulting relationship can be used to provide input to the assessment of new active compounds for which orphan drug authorisation is being sought.
Encountering on the road to Serendip? Browsing in new information environments
Considers the continuing relevance of the ideas of browsing, serendipity, information encountering, and literature discovery in a digital information environment
Discovery of Potential Orthosteric and Allosteric Antagonists of P2Y1R from Chinese Herbs by Molecular Simulation Methods
The P2Y1 receptor (P2Y1R), a G protein-coupled receptor (GPCR), is an important target in ADP-induced platelet aggregation. The recently solved crystal structure of P2Y1R revealed orthosteric and allosteric ligand-binding sites, together with details of the ligand-protein binding modes. This suggests that P2Y1R antagonists recognizing the two distinct sites could potentially provide an efficacious and safe antithrombotic profile. In the present paper, 2D similarity search, pharmacophore-based screening, and molecular docking were used to explore potential natural P2Y1R antagonists. 2D similarity search was used to classify orthosteric and allosteric antagonists of P2Y1R. Based on the result, pharmacophore models were constructed and validated with a test set. The optimal models were selected to discover potential P2Y1R antagonists for the orthosteric and allosteric sites from Traditional Chinese Medicine (TCM), and the hits were filtered by Lipinski’s rule. Molecular docking was then used to refine the results of the pharmacophore-based screening and to analyze the binding modes of the hits with P2Y1R. Finally, two potential orthosteric compounds and one potential allosteric compound were obtained, which might be used in future P2Y1R antagonist design. This work provides a reliable guide for discovering natural P2Y1R antagonists acting on two distinct sites from TCM.
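The Lipinski filtering step can be sketched as below, using the standard rule-of-five thresholds (molecular weight at most 500, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). The abstract does not say how many violations were tolerated, so the `max_violations` parameter is an assumption, and the descriptor values and hit names are hypothetical. Descriptors are taken as precomputed inputs rather than calculated from structures.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    name: str
    mol_weight: float   # molecular weight in Da
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five criteria are violated."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def passes_lipinski(d: Descriptors, max_violations: int = 1) -> bool:
    """Assumed policy: accept a compound with at most `max_violations` violations."""
    return lipinski_violations(d) <= max_violations


# Hypothetical screening hits with made-up descriptor values.
hits = [
    Descriptors("hit_1", 320.0, 2.1, 2, 5),
    Descriptors("hit_2", 720.0, 6.3, 3, 12),
    Descriptors("hit_3", 540.0, 4.0, 1, 6),
]

kept = [d.name for d in hits if passes_lipinski(d)]
kept_strict = [d.name for d in hits if passes_lipinski(d, max_violations=0)]
print(kept, kept_strict)
```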
Discovering Highly Potent Molecules from an Initial Set of Inactives Using Iterative Screening.
The versatility of similarity searching and quantitative structure-activity relationships to model the activity of compound sets within given bioactivity ranges (i.e., interpolation) is well established. However, their relative performance in the common scenario in early stage drug discovery where many inactive data points but no active data points are available (i.e., extrapolation from the low-activity to the high-activity range) has not yet been thoroughly examined. To this end, we have designed an iterative virtual screening strategy which was evaluated on 25 diverse bioactivity data sets from ChEMBL. We benchmark the efficiency of random forest (RF), multiple linear regression, ridge regression, similarity searching, and random selection of compounds to identify a highly active molecule in the test set among a large number of low-potency compounds. We use the number of iterations required to find this active molecule to evaluate the performance of each experimental setup. We show that linear and ridge regression often outperform RF and similarity searching, reducing the number of iterations to find an active compound by a factor of 2 or more. Even simple regression methods seem better able to extrapolate to high-bioactivity ranges than RF, which only provides output values in the range covered by the training set. In addition, examination of the scaffold diversity in the data sets used shows that in some cases similarity searching and RF require two times as many iterations as random selection depending on the chemical space covered in the initial training data. Lastly, we show using bioactivity data for COX-1 and COX-2 that our framework can be extended to multitarget drug discovery, where compounds are selected by concomitantly considering their activity against multiple targets.
Overall, this study provides an approach for iterative screening where only inactive data are present in early stages of drug discovery in order to discover highly potent compounds and the best experimental setup in which to do so. This project has received funding from the European Union’s Framework Programme For Research and Innovation Horizon 2020 (2014–2020) under the Marie Curie Sklodowska-Curie Grant Agreement No. 703543 (I.C.-C.). A.B. thanks the European Research Commission (Starting Grant ERC-2013-StG 336159 MIXTURE) for funding. N.C.F is funded by EPSRC (EP/M006093/1).
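The iterative screening idea in the abstract above can be illustrated with a toy loop. This is a sketch under strong simplifying assumptions and not the authors' protocol: each compound has a single numeric descriptor, the model is an ordinary least-squares line, one compound is tested per iteration, and all names and values are hypothetical. It shows why a linear model can point beyond the activity range of the training set, whereas a predictor that only averages training values cannot exceed the training maximum.

```python
def fit_line(xs: list, ys: list) -> tuple:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, mean_y  # all descriptors identical: fall back to the mean
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def iterative_screen(pool: dict, initial: list, threshold: float) -> tuple:
    """Repeatedly refit, test the top-predicted untested compound, and stop
    once a compound with activity >= threshold is found.

    pool maps name -> (descriptor, measured activity); the activity of an
    untested compound is only read after it has been selected.
    Returns (name of the active found or None, number of iterations used).
    """
    tested = list(initial)
    untested = [name for name in pool if name not in tested]
    iterations = 0
    while untested:
        xs = [pool[name][0] for name in tested]
        ys = [pool[name][1] for name in tested]
        slope, intercept = fit_line(xs, ys)
        best = max(untested, key=lambda name: slope * pool[name][0] + intercept)
        untested.remove(best)
        tested.append(best)
        iterations += 1
        if pool[best][1] >= threshold:
            return best, iterations
    return None, iterations


# Hypothetical data: (descriptor, activity); the initial set is low-activity only.
pool = {
    "c1": (1.0, 4.0), "c2": (2.0, 4.5), "c3": (3.0, 5.1),
    "c4": (4.0, 5.4), "c5": (5.0, 6.0), "c6": (9.0, 8.2),
}
found, n_iterations = iterative_screen(pool, ["c1", "c2", "c3"], threshold=8.0)
print(found, n_iterations)
```

In the abstract, performance is measured by the number of iterations needed to find the active molecule; here that corresponds to the second value returned by the hypothetical `iterative_screen` function.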