210 research outputs found

    Development and application of fast fuzzy pharmacophore-based virtual screening methods for scaffold hopping

    The goal of this thesis was the development, evaluation and application of novel virtual screening approaches for the rational compilation of high-quality pharmacological screening libraries. The criteria for high quality were a high probability that the selected molecules are active compared to randomly selected molecules, and diversity in the retrieved chemotypes, so that the library is prepared for the attrition of single lead structures. To meet the latter criterion the virtual screening approach had to perform “scaffold hopping”. The first molecular descriptor that was explicitly reported for that purpose was the topological pharmacophore CATS descriptor, which represents a correlation vector (CV) of all pharmacophore points in a molecule. The representation is alignment-free and thus renders fast screening of large databases feasible. In a first series of experiments the CATS descriptor was conceptually extended to the three-dimensional pharmacophore-pair CATS3D descriptor and the molecular-surface-based SURFCATS descriptor. The scaling of the CATS3D descriptor, the combination of CATS3D with different similarity metrics and the dependence of the CATS3D descriptor on the three-dimensional conformations of the molecules in the virtual screening database were evaluated in retrospective screening experiments. The “scaffold hopping” capabilities of CATS3D and SURFCATS were compared to CATS and the substructure fingerprint MACCS keys. Prospective virtual screening with CATS3D similarity searching was applied to TAR RNA and the metabotropic glutamate receptor 5 (mGluR5). A combination of supervised and unsupervised neural networks trained on CATS3D descriptors was applied prospectively to compile a focused but still diverse library of mGluR5 modulators. In a second series of experiments the SQUID fuzzy pharmacophore model method was developed, which aimed to provide a more general query for virtual screening than the CATS family descriptors.
A prospective application of the fuzzy pharmacophore models was performed for TAR RNA ligands. In a last experiment a structure-/ligand-based pharmacophore model was developed for taspase1 based on a homology model of the enzyme. This model was applied prospectively in the screening for the first inhibitors of taspase1. The effect of different similarity metrics (Euc: Euclidean distance, Manh: Manhattan distance and Tani: Tanimoto similarity) and different scaling methods (unscaled, scaling1: scaling by the number of atoms, and scaling2: scaling by the added incidences of potential pharmacophore points of atom pairs) on CATS3D similarity searching was evaluated in retrospective virtual screening experiments. Twelve target classes of the COBRA database of annotated ligands from recent scientific literature were used for that purpose. Scaling2, a new development for the CATS3D descriptor, was shown to perform best on average in combination with all three similarity metrics (enrichment factor ef (1%): Manh = 11.8 ± 4.3, Euc = 11.9 ± 4.6, Tani = 12.8 ± 5.1). The Tanimoto coefficient was found to perform best with the new scaling method. With the other scaling methods the Manhattan distance performed best (ef (1%): unscaled: Manh = 9.6 ± 4.0, Euc = 8.1 ± 3.5, Tani = 8.3 ± 3.8; scaling1: Manh = 10.3 ± 4.1, Euc = 8.8 ± 3.6, Tani = 9.1 ± 3.8). Since CATS3D is independent of an alignment, its dependence on a “receptor relevant” conformation might also be weaker than that of other methods such as docking. Using such methods might be a way to overcome problems like protein flexibility or the computationally expensive calculation of many conformers. To test this hypothesis, co-crystal structures of 11 target classes served as queries for virtual screening of the COBRA database. Different numbers of conformations were calculated for the COBRA database.
Using only a single conformation already resulted in a significant enrichment of isofunctional molecules on average (ef (1%) = 6.0 ± 6.5). This observation was also made for ligand classes with many rotatable bonds (e.g. HIV-protease: 19.3 ± 6.2 rotatable bonds in COBRA, ef (1%) = 12.2 ± 11.8). On average, using the maximum number of conformations (on average 37 conformations / molecule) improved the enrichment only 1.1-fold over using single conformations. With more conformations, actives and inactives alike became more similar to the reference compounds according to their CATS3D representations. Applying the same parameters as before to calculate conformations for the crystal structure ligands resulted in an average Cartesian RMSD of the single conformations to the crystal structure conformations of 1.7 ± 0.7 Å. For the maximum number of conformations, the RMSD decreased to 1.0 ± 0.5 Å (1.8-fold improvement on average). To assess the virtual screening performance and the scaffold hopping potential of CATS3D and SURFCATS, these descriptors were compared to CATS and the MACCS keys, a fingerprint based on exact chemical substructures. Retrospective screening of ten classes of the COBRA database was performed. According to the average enrichment factors the MACCS keys performed best (ef (1%): MACCS = 17.4 ± 6.4, CATS = 14.6 ± 5.4, CATS3D = 13.9 ± 4.9, SURFCATS = 12.2 ± 5.5). The classes in which MACCS performed best had a lower average fraction of different scaffolds relative to the number of molecules (0.44 ± 0.13) than the classes in which CATS performed best (0.65 ± 0.13). CATS3D was the best performing method for only a single target class, which had an intermediate fraction of scaffolds (0.55). SURFCATS did not perform best for any class. These results indicate that CATS and the CATS3D descriptors might be better suited to find novel scaffolds than the MACCS keys.
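The enrichment factor ef (1%) quoted throughout this abstract is not defined in it. A minimal Python sketch under the common definition (fraction of actives in the top-ranked 1% divided by the fraction of actives in the whole database) is given here; the function name, the rounding of the cut-off and the input format are assumptions for illustration, not details taken from the thesis.

```python
def enrichment_factor(ranked_labels, fraction=0.01):
    """Enrichment factor for a ranked screening list.

    ranked_labels: list of 0/1 flags (1 = active), best-ranked molecule first.
    fraction: top fraction of the ranked list to inspect (0.01 = 1%).
    """
    n_total = len(ranked_labels)
    n_actives = sum(ranked_labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty list with at least one active")
    # Number of molecules in the inspected top fraction (at least one).
    n_top = max(1, round(n_total * fraction))
    hits_in_top = sum(ranked_labels[:n_top])
    # Ratio of the hit rate in the top fraction to the overall hit rate.
    return (hits_in_top / n_top) / (n_actives / n_total)


# Toy example: 1000 molecules, 20 actives, 5 of them ranked in the top 10.
ranked = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0] + [1] * 15 + [0] * 975
ef_1_percent = enrichment_factor(ranked, fraction=0.01)
```

Under this definition an ef of 1 corresponds to random selection, so the reported value of 6.0 for single conformations would mean six times more actives in the top 1% than expected by chance.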
All methods were also shown to complement each other by retrieving scaffolds that were not found by the other methods. A prospective evaluation of CATS3D similarity searching was carried out for allosteric modulators of metabotropic glutamate receptor 5 (mGluR5). Seven known antagonists of mGluR5 with sub-micromolar IC50 were used as reference ligands for virtual screening of the 20,000 most drug-like compounds, as predicted by an artificial neural network approach, of the Asinex vendor database (194,563 compounds). Eight of the 29 virtual screening hits showed a Ki below 50 µM in a binding assay. Most of the ligands were only moderately specific for mGluR5 (maximum of > 4.2-fold selectivity) relative to mGluR1, the receptor most similar to mGluR5. One ligand even exhibited a better Ki for mGluR1 than for mGluR5 (mGluR5: Ki > 100 µM, mGluR1: Ki = 14 µM). All hits had scaffolds different from those of the reference molecules. It was demonstrated that the compiled library contained molecules that were different from the reference structures, as estimated by MACCS substructure fingerprints, but were still considered isofunctional by both the CATS and CATS3D pharmacophore approaches. Artificial neural networks (ANN) provide an alternative to similarity searching in virtual screening, with the advantage that they incorporate knowledge from a learning procedure. A combination of artificial neural networks for the compilation of a focused but still structurally diverse screening library was employed prospectively for mGluR5. Ensembles of neural networks were trained on CATS3D representations of the training data for the prediction of “mGluR5-likeness” and of “mGluR5/mGluR1 selectivity” (mGluR1 being the receptor most similar to mGluR5), yielding Matthews correlation coefficients between 0.88 and 0.92 and between 0.88 and 0.91, respectively. The best 8,403 hits (the focused library: the intersection of the best hits from both prediction tasks) from virtually ranking the Enamine vendor database (ca.
1,000,000 molecules) were further analyzed by two self-organizing maps (SOMs), trained on CATS3D descriptors and on MACCS substructure fingerprints. A diverse and representative subset of the hits was obtained by selecting the molecules most similar to each SOM neuron. Binding studies of the selected compounds (16 molecules from each map) showed that three of the molecules from the CATS3D SOM and two of the molecules from the MACCS SOM bound to mGluR5. The best hit, with a Ki of 21 µM, was found in the CATS3D SOM. The selectivity of the compounds for mGluR5 over mGluR1 was low. Since the binding pockets in the two receptors are similar, the general CATS3D representation might not have been appropriate for the prediction of selectivity. In both SOMs new active molecules were found in neurons that did not contain molecules from the training set, i.e. the approach was able to enter new areas of chemical space with respect to mGluR5. The combination of supervised and unsupervised neural networks and CATS3D seemed to be suited for the retrieval of dissimilar molecules with the same class of biological activity, rather than for the optimization of molecules with respect to activity or selectivity. A new virtual screening approach was developed with the SQUID (Sophisticated Quantification of Interaction Distributions) fuzzy pharmacophore method. In SQUID, pairs of Gaussian probability densities are used for the construction of a CV descriptor. The Gaussians represent clusters of atoms comprising the same pharmacophoric feature within an alignment of several active reference molecules. The fuzzy representation of the molecules should enhance the performance in scaffold hopping. Pharmacophore models with different degrees of fuzziness (resolution) can be defined, which might be an appropriate means to compensate for ligand and receptor flexibility.
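For orientation, the correlation-vector (CV) idea that SQUID builds on can be sketched in its crisp, atom-pair form: every pair of pharmacophore points contributes a count to the bin given by its feature-type pair and its distance. The Python sketch below is an illustration, not the thesis implementation; the three feature types, the 1 Å bin width, the 20 bins and the function name are assumptions, and the Gaussian (fuzzy) weighting used by SQUID is deliberately left out.

```python
import math
from itertools import combinations_with_replacement

# Illustrative feature types (assumption; the descriptors in the text use their own set).
TYPES = ("donor", "acceptor", "hydrophobic")
# Unordered type pairs, e.g. ("donor", "acceptor"); 6 pairs for 3 types.
TYPE_PAIRS = list(combinations_with_replacement(TYPES, 2))
PAIR_INDEX = {pair: i for i, pair in enumerate(TYPE_PAIRS)}


def correlation_vector(points, n_bins=20, bin_width=1.0):
    """Count pharmacophore-point pairs per (type pair, distance bin).

    points: list of (feature_type, (x, y, z)) tuples for one 3D conformation.
    Returns a flat list of length len(TYPE_PAIRS) * n_bins.
    """
    cv = [0.0] * (len(TYPE_PAIRS) * n_bins)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            (t1, p1), (t2, p2) = points[i], points[j]
            distance_bin = int(math.dist(p1, p2) // bin_width)
            if distance_bin >= n_bins:
                continue  # pair is farther apart than the largest bin
            # Look the pair up in either order, since pairs are unordered.
            pair = (t1, t2) if (t1, t2) in PAIR_INDEX else (t2, t1)
            cv[PAIR_INDEX[pair] * n_bins + distance_bin] += 1.0
    return cv


toy_molecule = [
    ("donor", (0.0, 0.0, 0.0)),
    ("acceptor", (2.5, 0.0, 0.0)),
    ("hydrophobic", (0.0, 4.2, 0.0)),
]
toy_cv = correlation_vector(toy_molecule)
```

Because the vector depends only on pairwise distances and feature types, two molecules can be compared without superimposing them, which is the alignment-free property the abstract relies on.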
For virtual screening the 3D distribution of Gaussian densities is transformed into a two-point correlation vector representation, which describes the probability density for the presence of atom pairs comprising defined pharmacophoric features. The fuzzy pharmacophore CV was used to rank CATS3D representations of molecules. The approach was validated by retrospective screening for cyclooxygenase 2 (COX-2) and thrombin ligands. A variety of models with different degrees of fuzziness were calculated and tested for both classes of molecules. The best performance was obtained with pharmacophore models reflecting an intermediate degree of fuzziness. Appropriately weighted fuzzy pharmacophore models performed better in retrospective screening than CATS3D similarity searching using single query molecules, for both COX-2 and thrombin (ef (1%): COX-2: SQUID = 39.2, best CATS3D result = 26.6; thrombin: SQUID = 18.0, best CATS3D result = 16.7). The new pharmacophore method was shown to complement MOE pharmacophore models. SQUID fuzzy pharmacophore and CATS3D virtual screening were applied prospectively to retrieve novel scaffolds of RNA-binding molecules that inhibit the Tat-TAR interaction. A pharmacophore model was built from one ligand (acetylpromazine, IC50 = 500 µM) and a fragment of another known ligand (CGP40336A), which was assumed to bind in a binding mode comparable to that of acetylpromazine. The fragment was flexibly aligned to the TAR-bound NMR conformation of acetylpromazine. Using an optimized SQUID pharmacophore model, the 20,000 most drug-like molecules from the SPECS database (229,658 compounds) were screened for Tat-TAR ligands. Both reference inhibitors were also used for CATS3D similarity searching. A set of 19 molecules from the SQUID and CATS3D results was selected for experimental testing.
In a fluorescence resonance energy transfer (FRET) assay the best SQUID hit showed an IC50 value of 46 µM, which represents an approximately tenfold improvement over the reference acetylpromazine. The best hit from CATS3D similarity searching showed an IC50 comparable to that of acetylpromazine (IC50 = 500 µM). Both hits contained molecular scaffolds different from those of the reference molecules. Structure-based pharmacophores provide an alternative to ligand-based approaches, with the advantage that no ligands have to be known in advance and no topological bias is introduced. The latter is favorable, for example, for hopping from peptide-like substrates to drug-like molecules. A homology model of the threonine aspartase taspase1 was calculated based on the crystal structures of a homologous isoaspartyl peptidase. Docking studies of the substrate with GOLD identified a binding mode in which the cleaved bond was situated directly above the reactive N-terminal threonine. The predicted enzyme-substrate complex was used to derive a pharmacophore model for virtual screening for novel taspase1 inhibitors. Virtual screening with the pharmacophore model identified 85 molecules as potential taspase1 inhibitors; however, biochemical data were not available before the end of this thesis. In summary, this thesis demonstrated the successful development, improvement and application of pharmacophore-based virtual screening methods for the compilation of molecule libraries for early-phase drug development. The highest potential of such methods seemed to be in scaffold hopping, the non-trivial task of finding different molecules with the same biological activity.
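As a reading aid for the retrospective CATS3D experiments in the abstract above, the following Python sketch shows one common way to compute the three measures it compares (Manhattan distance, Euclidean distance and Tanimoto similarity) on real-valued descriptor vectors. The continuous form of the Tanimoto coefficient used here and all function names are assumptions for illustration; the abstract does not state the exact formulation used in the thesis.

```python
import math


def manhattan_distance(a, b):
    """Sum of absolute component differences (smaller = more similar)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line distance between two vectors (smaller = more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def tanimoto_similarity(a, b):
    """Continuous Tanimoto coefficient: a.b / (a.a + b.b - a.b).

    Larger = more similar; equals 1.0 for identical non-zero vectors.
    """
    ab = sum(x * y for x, y in zip(a, b))
    aa = sum(x * x for x in a)
    bb = sum(y * y for y in b)
    denominator = aa + bb - ab
    # Convention chosen for this sketch: two all-zero vectors count as identical.
    return ab / denominator if denominator else 1.0


query = [1.0, 0.0, 2.0]
candidate = [2.0, 0.0, 4.0]
d_manh = manhattan_distance(query, candidate)
d_euc = euclidean_distance(query, candidate)
s_tani = tanimoto_similarity(query, candidate)
```

Note that the two distances rank a database in ascending order, whereas the Tanimoto similarity ranks it in descending order, so a screening script has to sort accordingly.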

    Next generation 3D pharmacophore modeling

    3D pharmacophore models are three-dimensional ensembles of chemically defined interactions of a ligand in its bioactive conformation. They represent an elegant way to decipher chemically encoded ligand information and have therefore become a valuable tool in drug design. In this review, we provide an overview of the basic concept of this method and summarize key studies that apply 3D pharmacophore models in virtual screening and in mechanistic studies of protein functionality. Moreover, we discuss recent developments in the field. The combination of 3D pharmacophore models with molecular dynamics simulations could be a quantum leap forward, since these approaches treat macromolecule–ligand interactions as dynamic and therefore show a physiologically relevant interaction pattern. Other trends include the efficient use of 3D pharmacophore information in machine learning and artificial intelligence applications, and freely accessible web servers for 3D pharmacophore modeling. The recent developments show that 3D pharmacophore modeling is a vibrant field with various applications in drug discovery and beyond.

    Multi-faceted Structure-Activity Relationship Analysis Using Graphical Representations

    A core focus in medicinal chemistry is the interpretation of structure-activity relationships (SARs) of small molecules. SAR analysis is typically carried out on a case-by-case basis for compound sets that share activity against a given target. Although SAR investigations are not a priori dependent on computational approaches, limitations imposed by the steady rise in activity information have necessitated the use of such methodologies. Moreover, understanding SARs in multi-target space is extremely difficult. Conceptually different computational approaches are reported in this thesis for graphical SAR analysis in single- as well as multi-target space. Activity landscape models are often used to describe the underlying SAR characteristics of compound sets. Theoretical activity landscapes that are reminiscent of topological maps intuitively represent distributions of pair-wise similarity and potency difference information as three-dimensional surfaces. These models provide easy access to the identification of various SAR features. Therefore, such landscapes are generated for actual data sets and compared with graph-based representations. Existing graphical data structures are adapted to include mechanism of action information for receptor ligands, to facilitate simultaneous SAR and mechanism-related analyses with the objective of identifying structural modifications responsible for switching molecular mechanisms of action. Typically, SAR analysis focuses on systematic pair-wise relationships of compound similarity and potency differences. Therefore, an approach is reported to calculate SAR feature probabilities on the basis of these pair-wise relationships for individual compounds in a ligand set. The consequent expansion of feature categories improves the analysis of local SAR environments. Graphical representations are designed to avoid a dependence on preconceived SAR models. Such representations are suitable for systematic large-scale SAR exploration. Methods for the navigation of SARs in multi-target space using simple and interpretable data structures are introduced. In summary, multi-faceted SAR analysis aided by computational means forms the primary objective of this dissertation.

    Development and Interpretation of Machine Learning Models for Drug Discovery

    In drug discovery, domain experts from different fields such as medicinal chemistry, biology, and computer science often collaborate to develop novel pharmaceutical agents. Computational models developed in this process must be correct and reliable, but at the same time interpretable. Their findings have to be accessible to experts from fields other than computer science, so that they can be validated and improved with domain knowledge. Only then are interdisciplinary teams able to communicate their scientific results both precisely and intuitively. This work is concerned with the development and interpretation of machine learning models for drug discovery. To this end, it describes the design and application of computational models for specialized use cases, such as compound profiling and hit expansion. Novel insights into machine learning for ligand-based virtual screening are presented, and limitations in the modeling of compound potency values are highlighted. It is shown that compound activity can be predicted based on high-dimensional target profiles, without the presence of molecular structures. Moreover, support vector regression for potency prediction is carefully analyzed, and a systematic misprediction of highly potent ligands is discovered. Furthermore, a key aspect is the interpretation and chemically accessible representation of the models. Therefore, this thesis focuses especially on methods to better understand and communicate modeling results. To this end, two interactive visualizations for the assessment of naive Bayes and support vector machine models on molecular fingerprints are presented. These visual representations of virtual screening models are designed to provide an intuitive chemical interpretation of the results.

    Development and optimisation of computational tools for drug discovery

    The aim of my PhD project was the development, optimisation, and implementation of new in silico virtual screening protocols. This thesis is divided into three main parts, presenting some of the papers published during my doctoral work. The first part, CHEMOMETRIC PROTOCOLS IN DRUG DISCOVERY, covers the optimisation and application of a chemometric protocol developed in house. This part was developed entirely at the University of Palermo, STEBICEF Department, under the guidance of my supervisors. For this part I personally worked on the tuning and optimisation of the algorithm and on the docking campaigns used to obtain molecule conformations. The second part, THE APPLICATION OF MOLECULAR DYNAMICS TO VIRTUAL SCREENING, presents a new approach to virtual screening, focusing on different ways of incorporating protein flexibility and dynamics. This part was carried out in cooperation with the University of Vienna, Department of Pharmaceutical Chemistry. For these works I contributed to the development of the general workflow and, to a lesser extent, to the programming of the applications used; I mainly focused on carrying out the screening campaigns and interpreting the results. The third and last part, COMPUTATIONAL CHEMISTRY IN POLY-PHARMACOLOGY AND DRUG REPURPOSING, concerns in silico methods applied to two main topics of the drug discovery process: drug repurposing and polypharmacology. In this part I briefly describe what was published in two reviews dealing with these topics. In conclusion, this doctoral project demonstrated how in silico tools can be useful in the drug discovery process. The chemometric protocols developed and optimised are a helpful strategy for target fishing, whereas the application of molecular dynamics to virtual screening, especially for pharmacophore modelling, is a new way to gain deeper insight into crucial features to be adopted in the search for new putative active compounds.

    Theoretical Analysis of Biomolecular Systems: Computational Simulations, Core-set Markov State Models, Clustering, Molecular Docking

    The analysis of the structural and dynamical behavior of biomolecules is very important for understanding their biological function, stability or physico-chemical properties. This thesis highlights how different theoretical methods for characterizing these structural and dynamical properties can be used and combined to obtain kinetic information or to detect biomolecule-ligand interactions. The basis for most of the analyses performed in the course of this work is molecular dynamics simulations that sample the conformational space of the biomolecule of interest. Using molecular dynamics simulations, the remarkably stable water-soluble-binding-protein is examined first. On a theoretical basis, structural modifications that can influence the stability of the protein are discussed. Additionally, by combining the simulations with a QM/MM optimization scheme and quantum chemical calculations, spectroscopic properties can be investigated. Markov State Models are applied frequently to capture the slow dynamics within simulation trajectories. They are based on a discretization of the conformational space. This discretization, however, introduces an error in the outcome of the analysis. The application of a core-set discretization can reduce this error. In this thesis, it is discussed how density-based cluster algorithms can be used to determine these core sets, and the application to linear and cyclic peptides is highlighted. The performance of a promising cluster algorithm is investigated and error sources in the construction of the Markov models are discussed. Finally, it is shown how molecular docking combined with molecular dynamics simulations can be used to determine the binding behavior of ligands towards biomolecules. In this context, the important interactions within the active site of an enzyme and different binding modes of DNA intercalators are identified.
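To make the role of the discretization concrete, here is a minimal Python sketch of how a Markov State Model transition matrix can be estimated from an already discretized trajectory by counting transitions at a fixed lag time and normalizing each row. It illustrates the conventional full-partition estimate only; the core-set variant discussed in the abstract is not implemented, and the function name, the sliding-window counting and the toy trajectory are assumptions.

```python
def estimate_transition_matrix(dtraj, n_states, lag=1):
    """Row-normalized transition counts from a discrete trajectory.

    dtraj: list of state indices (0 .. n_states - 1), one per saved frame.
    lag: lag time in frames between the start and end of a counted transition.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    # Sliding-window count of transitions i -> j separated by `lag` frames.
    for t in range(len(dtraj) - lag):
        counts[dtraj[t]][dtraj[t + lag]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # States that were never left keep an all-zero row in this sketch.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Toy trajectory over two conformational states.
toy_dtraj = [0, 0, 1, 1, 0, 1]
toy_matrix = estimate_transition_matrix(toy_dtraj, n_states=2, lag=1)
```

Each row of the result gives the estimated probabilities of moving from one state to every other state within the lag time; how frames are assigned to states in the first place is exactly where the discretization error mentioned above enters.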

    Targeting The Dimerization Of ERBB Receptor Tyrosine Kinases

    The epidermal growth factor receptor (EGFR) is a membrane receptor tyrosine kinase whose over-activation has been implicated in many human cancers. Novel strategies to inhibit the activation of EGF receptors, other than the conventional antibody-based approaches and tyrosine kinase inhibitors, are virtually non-existent but could provide benefits in both laboratory and clinical settings. In an effort to expand the current approaches, this thesis focused on targeting the homodimerization of the EGF receptors themselves and the heterodimerization of EGF receptors with the related ErbB2 receptor. Three sub-projects were completed in the process. The first project explored the feasibility of inhibiting the EGF receptor by targeting receptor dimerization with small molecules. Two lead compounds were initially predicted by virtual screening of the NCI compound library and were biochemically characterized. The benefit gained from the application of virtual screening in this project initiated another project to enhance the accessibility of virtual screening within the non-computational community. The OpenScreening project utilizes distributed computing resources and provides an open-access screening server at: http://omg.phy.umassd.edu/xvhts. A final project identified the structural mechanism that may explain the observed preference for EGFR-ErbB2 heterodimerization over EGFR homodimerization. Key residues were computationally predicted and biochemically tested to reveal the critical dimerization interface.