479 research outputs found
Study of ligand-based virtual screening tools in computer-aided drug design
Virtual screening is a central technique in drug discovery today. Millions of molecules can be tested in silico so that only the most promising are selected for experimental testing. This thesis examines ligand-based virtual screening tools, which take existing active molecules as the starting point for finding new drug candidates.
One goal of this thesis was to build a model that gives the probability that two molecules are biologically similar as a function of one or more chemical similarity scores. Another important goal was to evaluate how well different ligand-based virtual screening tools distinguish active molecules from inactive ones. A further criterion for the virtual screening tools was their applicability to scaffold hopping, i.e. finding new active chemotypes.
In the first part of the work, a link was defined between the abstract chemical similarity score given by a screening tool and the probability that the two molecules are biologically similar. These results help to decide objectively which virtual screening hits to test experimentally. The work also resulted in a new type of data fusion method for use with two or more tools. In the second part, five ligand-based virtual screening tools were evaluated and their performance was found to be generally poor. Three reasons for this were proposed: false negatives in the benchmark sets, active molecules that do not share the binding mode, and activity cliffs. In the third part of the study, a novel visualization and quantification method is presented for evaluating the scaffold-hopping ability of virtual screening tools.
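As an illustration of linking a similarity score to a probability of biological similarity, the following is a minimal sketch that estimates the probability empirically by binning scores. The abstract does not state the form of the thesis's model, so the binning approach, the function name `binned_probability`, the assumption that scores lie in [0, 1], and the toy data are all hypothetical.

```python
def binned_probability(scores, labels, n_bins=5):
    """Estimate P(biologically similar | score bin) for scores in [0, 1].

    scores: chemical similarity scores for molecule pairs (assumed in [0, 1]).
    labels: 1 if the pair is biologically similar, else 0.
    Returns one estimate per bin; None marks a bin with no pairs.
    """
    counts = [0] * n_bins
    positives = [0] * n_bins
    for score, label in zip(scores, labels):
        # A score of exactly 1.0 is placed in the last bin.
        idx = min(int(score * n_bins), n_bins - 1)
        counts[idx] += 1
        positives[idx] += int(label)
    return [p / c if c else None for p, c in zip(positives, counts)]


# Toy data, invented for illustration only.
toy_scores = [0.10, 0.15, 0.50, 0.90, 0.95, 0.99]
toy_labels = [0, 0, 1, 1, 1, 0]
print(binned_probability(toy_scores, toy_labels))
```

A hit list could then be prioritized by looking up the estimated probability for the bin each hit's score falls into, rather than by the raw score alone.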
Comparison of structure- and ligand-based scoring functions for deep generative models: a GPCR case study.
Deep generative models have shown the ability to devise both valid and novel chemistry, which could significantly accelerate the identification of bioactive compounds. Many current models, however, use molecular descriptors or ligand-based predictive methods to guide molecule generation towards a desirable property space. This restricts their application to relatively data-rich targets, neglecting those where too little data is available to train a predictor. Moreover, ligand-based approaches often bias molecule generation towards previously established chemical space, thereby limiting their ability to identify truly novel chemotypes. In this work, we assess the use of molecular docking via Glide (a structure-based approach) as a scoring function to guide the deep generative model REINVENT, and we compare model performance and behaviour to a ligand-based scoring function. Additionally, we modify the previously published MOSES benchmarking dataset to remove any induced bias towards non-protonatable groups. We also propose a new metric to measure dataset diversity, which is less confounded by the distribution of heavy atom count than the commonly used internal diversity metric. With respect to the main findings, we found that when optimizing the docking score against DRD2, the model improves predicted ligand affinity beyond that of known DRD2 active molecules. In addition, generated molecules occupy complementary chemical and physicochemical space compared to the ligand-based approach, and novel physicochemical space compared to known DRD2 active molecules. Furthermore, the structure-based approach learns to generate molecules that satisfy crucial residue interactions, which is information only available when taking protein structure into account.
Overall, this work demonstrates the advantage of using molecular docking to guide de novo molecule generation over ligand-based predictors with respect to predicted affinity, novelty, and the ability to identify key interactions between ligand and protein target. Practically, this approach has applications in early hit generation campaigns to enrich a virtual library towards a particular target, and also in novelty-focused projects, where de novo molecule generation either has no prior ligand knowledge available or should not be biased by it.
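For readers unfamiliar with the "commonly used internal diversity metric" mentioned above, the following is a minimal sketch of one common form: one minus the mean pairwise Tanimoto similarity of a set of fingerprints. It does not implement the new metric the authors propose, which the abstract does not specify. The fingerprints here are hypothetical toy sets of on-bit indices; the function names and the convention for empty fingerprints are assumptions made for illustration.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = a | b
    if not union:
        # Assumed convention: two empty fingerprints have similarity 0.
        return 0.0
    return len(a & b) / len(union)


def internal_diversity(fingerprints):
    """One minus the mean Tanimoto similarity over all distinct pairs."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    mean_similarity = sum(tanimoto(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_similarity


# Toy fingerprints, invented for illustration only.
toy_fps = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
print(round(internal_diversity(toy_fps), 3))
```

In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets; the averaging logic stays the same.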
One Scaffold, Three Binding Modes: Novel and Selective Pteridine Reductase 1 Inhibitors Derived from Fragment Hits Discovered by Virtual Screening
The enzyme pteridine reductase 1 (PTR1) is a potential target for new compounds to treat human African trypanosomiasis. A virtual screening campaign for fragments inhibiting PTR1 was carried out. Two novel chemical series were identified, containing aminobenzothiazole and aminobenzimidazole scaffolds, respectively. One of the hits (2-amino-6-chloro-benzimidazole) was subjected to crystal structure analysis, and a high-resolution crystal structure in complex with PTR1 was obtained, confirming the predicted binding mode. However, the crystal structures of two analogues (2-amino-benzimidazole and 1-(3,4-dichloro-benzyl)-2-amino-benzimidazole) in complex with PTR1 revealed two alternative binding modes. In these complexes, previously unobserved protein movements and water-mediated protein-ligand contacts occurred, which prevented a correct prediction of the binding modes. On the basis of the alternative binding mode of 1-(3,4-dichloro-benzyl)-2-amino-benzimidazole, derivatives were designed, and selective PTR1 inhibitors with low nanomolar potency and favorable physicochemical properties were obtained.
BINARY QUANTITATIVE STRUCTURE-ACTIVITY RELATIONSHIP ANALYSIS IN RETROSPECTIVE STRUCTURE-BASED VIRTUAL SCREENING CAMPAIGNS TARGETING ESTROGEN RECEPTOR ALPHA
Objective: The objective of this study is to construct predictive, unbiased structure-based virtual screening (SBVS) protocols to identify potent ligands for estrogen receptor alpha by combining molecular docking, protein-ligand interaction fingerprinting (PLIF), and binary quantitative structure-activity relationship (QSAR) analysis using the recursive partitioning and regression tree method. Methods: Employing the enhanced version of the Directory of Useful Decoys, SBVS protocols using molecular docking simulations and PLIF were constructed and retrospectively validated. To avoid bias, the SMILES format of the compounds was used. The predictive abilities of the SBVS protocols were then compared based on the enrichment factor (EF) and the F-measure values. Results: The SBVS protocols resulting from this research were SBVS_1 (using the docking score of the best pose of every compound to rank the results and selecting compounds within 1% false positives as positive), SBVS_2 (using the decision tree resulting from the binary QSAR analysis with docking scores and PLIF bitstrings of the best pose of every compound as descriptors), and SBVS_3 (using the decision tree resulting from the binary QSAR analysis with ensemble PLIF of the poses selected using the optimized docking score as the cutoff). The EF values of SBVS_1, SBVS_2, and SBVS_3 are 28.315, 576.084, and 713.472, respectively, while their F-measure values are 0.310, 0.573, and 0.769, respectively. Conclusion: Highly predictive, unbiased SBVS protocols to identify potent estrogen receptor alpha ligands were constructed. Further application in prospective screening is therefore strongly recommended.
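The two evaluation measures named above can be sketched from confusion-matrix counts. The abstract does not state which enrichment factor definition the authors used, so the version below is one common textbook form (hit rate among selected compounds divided by the hit rate of the whole library) and may differ from theirs. The F-measure is the standard harmonic mean of precision and recall. The function names and the counts in the usage lines are hypothetical.

```python
def enrichment_factor(tp, fp, n_actives, n_total):
    """EF as (actives among selected / selected) / (actives in library / library size).

    This is one common definition; the source abstract does not give its formula.
    """
    n_selected = tp + fp
    if n_selected == 0 or n_actives == 0 or n_total == 0:
        return 0.0
    return (tp / n_selected) / (n_actives / n_total)


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (F1)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical counts, invented for illustration only.
print(enrichment_factor(tp=8, fp=2, n_actives=10, n_total=100))
print(f_measure(tp=8, fp=2, fn=2))
```

Both measures reward retrieving actives while penalizing false positives, but the F-measure also penalizes missed actives through recall, which is why the two can rank protocols differently.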
Analyzing multitarget activity landscapes using protein-ligand interaction fingerprints: interaction cliffs.
This is the original submitted version, before peer review. The final peer-reviewed version is available from ACS at http://pubs.acs.org/doi/abs/10.1021/ci500721x.
Activity landscape modeling is mostly a descriptive technique that allows rationalizing continuous and discontinuous SARs. Nevertheless, the interpretation of some landscape features, especially of activity cliffs, is not straightforward. As the nature of activity cliffs depends on the ligand and the target, information regarding both should be included in the analysis. A specific way to include this information is using protein-ligand interaction fingerprints (IFPs). In this paper we report the activity landscape modeling of 507 ligand-kinase complexes (from the KLIFS database) including IFP, which facilitates the analysis and interpretation of activity cliffs. Here we introduce the structure-activity-interaction similarity (SAIS) maps that incorporate information on ligand-target contact similarity. We also introduce the concept of interaction cliffs, defined as ligand-target complexes with high structural and interaction similarity but a large potency difference of the ligands. Moreover, the information retrieved regarding the specific interaction allowed the identification of activity cliff hot spots, which help to rationalize activity cliffs from the target point of view. In general, the information provided by IFPs provides a structure-based understanding of some activity landscape features. This paper shows examples of analyses that can be carried out when IFPs are added to the activity landscape model.
M-L is very grateful to CONACyT (No. 217442/312933) and the Cambridge Overseas Trust for funding. AB thanks Unilever for funding and the European Research Council for a Starting Grant (ERC-2013-StG-336159 MIXTURE). J.L.M-F. is grateful to the School of Chemistry, Department of Pharmacy of the National Autonomous University of Mexico (UNAM) for support. This work was supported by a scholarship from the Secretariat of Public Education and the Mexican government.
Computer-aided Design of Chalcone Derivatives as Lead Compounds Targeting Acetylcholinesterase
One of the well-established biological activities of chalcone derivatives is acetylcholinesterase inhibition, which can be developed for the therapy of Alzheimer’s disease. Assisted by a retrospectively validated structure-based virtual screening (SBVS) protocol to identify potent acetylcholinesterase inhibitors, 80 chalcone derivatives were designed and virtually screened. The F-measure value, used as the parameter of the predictive ability of the SBVS protocol developed in the research presented in this article, was 0.413, which was considerably better than that of the original SBVS protocol (F-measure = 0.226). Among the screened chalcone derivatives, two were selected as potential lead compounds to design potent inhibitors for acetylcholinesterase: 3-[4-(benzyloxy)-3-methoxyphenyl]-1-(4-hydroxy-3-methoxyphenyl)prop-2-en-1-one (3k) and 3-[4-(benzyloxy)-3-methoxyphenyl]-1-(4-hydroxyphenyl)prop-2-en-1-one (4k).
Optimized Design of Combinatorial Compound Libraries by Genetic Algorithms and Their Evaluation Using Knowledge-Based Protein-Ligand Binding Profiles
In this work, the two new computational methods DrugScore Fingerprint (DrugScoreFP) and GARLig are presented and validated with respect to their theory and mode of operation.
DrugScoreFP is a novel approach for scoring computer-generated binding modes of potential ligands for a given target structure. The program is based on the established scoring function DrugScoreCSD and differs from it in that a reference vector is generated from already known crystal structures of the receptor under study; this vector contains, for each binding-pocket atom, potential values for all possible interactions. A corresponding vector can be generated for each new computer-generated binding mode of a ligand. Its distance to the reference vector is a measure of how similar the generated binding modes are to known ones. For the protein structures studied in our group, namely trypsin, thermolysin, and tRNA-guanine transglycosylase (TGT), experimental validation of the ligands predicted as similar by DrugScoreFP yielded six fragment-sized inhibitors and one thermolysin crystal structure in complex with one of the fragments found.
GARLig, the program developed in this work, is a method based on a genetic algorithm for efficiently carrying out chemical side-chain modifications of small-molecule compounds with respect to a receptor under study. The aim is to assemble a compound library that represents a subset, of user-defined size, of all possible chemical modifications of ligand-like scaffolds. The central quality criterion for individual members of the compound library is the ligand geometries generated by docking and their scores from protein-ligand scoring functions. In several validation scenarios on the proteins trypsin, thrombin, factor Xa, plasmin, and cathepsin D, it was shown that efficiently assembling receptor-specific substrate or ligand libraries requires searching less than 8% of the given search spaces, and that GARLig is nevertheless able to enrich known inhibitors in the target library.