45 research outputs found

    A novel chemogenomics analysis of G protein-coupled receptors (GPCRs) and their ligands: a potential strategy for receptor de-orphanization.

    BACKGROUND: G protein-coupled receptors (GPCRs) represent a family of well-characterized drug targets with significant therapeutic value. Phylogenetic classifications may help to understand the characteristics of individual GPCRs and their subtypes. Previous phylogenetic classifications were all based on the sequences of receptors, adding only minor information about the ligand binding properties of the receptors. In this work, we compare a sequence-based classification of receptors to a ligand-based classification of the same group of receptors, and evaluate the potential to use sequence relatedness as a predictor for ligand interactions, thus aiding the quest for ligands of orphan receptors. RESULTS: We present a classification of GPCRs that is purely based on their ligands, complementing sequence-based phylogenetic classifications of these receptors. Targets were hierarchically classified into phylogenetic trees, for both sequence space and ligand (substructure) space. The overall organization of the sequence-based tree and substructure-based tree was similar; in particular, the adenosine receptors cluster together, as do most peptide receptor subtypes (e.g. opioid, somatostatin) and adrenoceptor subtypes. In ligand space, the prostanoid and cannabinoid receptors are more distant from the other targets, whereas the tachykinin receptors, the oxytocin receptor, and serotonin receptors are closer to the other targets, which is indicative of ligand promiscuity. In 93% of the receptors studied, de-orphanization of a simulated orphan receptor using the ligands of related receptors performed better than random (AUC > 0.5) and for 35% of receptors de-orphanization performance was good (AUC > 0.7). CONCLUSIONS: We constructed a phylogenetic classification of GPCRs that is solely based on the ligands of these receptors. 
The similarities and differences with traditional sequence-based classifications were investigated: our ligand-based classification uncovers relationships among GPCRs that are not apparent from the sequence-based classification. This will shed light on potential cross-reactivity of GPCR ligands and will aid the design of new ligands with the desired activity profiles. In addition, we linked the ligand-based classification with a ligand-focused sequence-based classification described in literature and proved the potential of this method for de-orphanization of GPCRs. RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work - under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are
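
The AUC thresholds quoted in this abstract (better than random at AUC > 0.5, good at AUC > 0.7) can be illustrated with a minimal sketch. It assumes that each candidate compound for a simulated orphan receptor has been given a score, for example its similarity to ligands of related receptors, and that the true ligands are known. The function name `roc_auc` and all numbers are invented for illustration and are not the authors' implementation.

```python
def roc_auc(scores, labels):
    """Rank-based AUC: the probability that a randomly chosen true ligand
    is scored above a randomly chosen decoy (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one ligand and one decoy")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six candidates; label 1 marks a true ligand.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```

In this toy case eight of the nine ligand/decoy pairs are ranked correctly, so the receptor would count as de-orphanized with good performance under the thresholds above.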

    Computational Methods for the Integration of Biological Activity and Chemical Space

    One general aim of medicinal chemistry is the understanding of structure-activity relationships of ligands that bind to biological targets. Advances in combinatorial chemistry and biological screening technologies allow the analysis of ligand-target relationships on a large-scale. However, in order to extract useful information from biological activity data, computational methods are needed that link activity of ligands to their chemical structure. In this thesis, it is investigated how fragment-type descriptors of molecular structure can be used in order to create a link between activity and chemical ligand space. First, an activity class-dependent hierarchical fragmentation scheme is introduced that generates fragmentation pathways that are aligned using established methodologies for multiple alignment of biological sequences. These alignments are then used to extract consensus fragment sequences that serve as a structural signature for individual biological activity classes. It is also investigated how defined, chemically intuitive molecular fragments can be organized based on their topological environment and co-occurrence in compounds active against closely related targets. Therefore, the Topological Fragment Index is introduced that quantifies the topological environment complexity of a fragment in a given molecule, and thus goes beyond fragment frequency analysis. Fragment dependencies have been established on the basis of common topological environments, which facilitates the identification of activity class-characteristic fragment dependency pathways that describe fragment relationships beyond structural resemblance. Because fragments are often dependent on each other in an activity class-specific manner, the importance of defined fragment combinations for similarity searching is further assessed. Therefore, Feature Co-occurrence Networks are introduced that allow the identification of feature cliques characteristic of individual activity classes. 
Three differently designed molecular fingerprints are compared for their ability to provide such cliques and a clique-based similarity searching strategy is established. For molecule- and activity class-centric fingerprint designs, feature combinations are shown to improve similarity search performance in comparison to standard methods. Moreover, it is demonstrated that individual features can form activity-class specific combinations. Extending the analysis of feature cliques characteristic of individual activity classes, the distribution of defined fragment combinations among several compound classes acting against closely related targets is assessed. Fragment Formal Concept Analysis is introduced for flexible mining of complex structure-activity relationships. It allows the interactive assembly of fragment queries that yield fragment combinations characteristic of defined activity and potency profiles. It is shown that pairs and triplets, rather than individual fragments distinguish between different activity profiles. A classifier is built based on these fragment signatures that distinguishes between ligands of closely related targets. Going beyond activity profiles, compound selectivity is also analyzed. Therefore, Molecular Formal Concept Analysis is introduced for the systematic mining of compound selectivity profiles on a whole-molecule basis. Using this approach, structurally diverse compounds are identified that share a selectivity profile with selected template compounds. Structure-selectivity relationships of obtained compound sets are further analyzed
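
The idea of a feature co-occurrence network with characteristic cliques can be sketched as follows. This is an illustration under stated assumptions, not the thesis method: features are plain strings standing in for fingerprint features, two features are linked when they co-occur in at least `min_count` compounds of one activity class (a hypothetical threshold), and cliques are enumerated with a basic Bron-Kerbosch recursion.

```python
from itertools import combinations


def cooccurrence_graph(compounds, min_count=2):
    """Link two features if they occur together in >= min_count compounds."""
    counts = {}
    for feats in compounds:
        for a, b in combinations(sorted(set(feats)), 2):
            counts[(a, b)] = counts.get((a, b), 0) + 1
    graph = {}
    for (a, b), c in counts.items():
        if c >= min_count:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate maximal cliques with Bron-Kerbosch (no pivoting)."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return found


# Hypothetical feature sets for four compounds of one activity class.
compounds = [
    {"amide", "phenyl", "piperidine"},
    {"amide", "phenyl", "piperidine", "F"},
    {"phenyl", "F"},
    {"amide", "indole"},
]
cliques = maximal_cliques(cooccurrence_graph(compounds))
for clique in cliques:
    print(sorted(clique))
```

A clique found in one class but absent from the others would be a candidate feature combination for clique-based similarity searching.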

    Development and prospective application of chemoinformatic tools to explore new ligand chemistry and protein biology

    Drug discovery and design is a tedious and expensive process whose small chances of success necessitate the development of novel chemoinformatic approaches and concepts. Their common goal is the efficient and robust identification of promising chemical matter and the reliable prediction of its properties. Computer-aided drug discovery and design (CADDD) and its multifarious installments throughout the different phases of the drug discovery pipeline contribute significantly to the expansion of the hits, the understanding of their structure-activity relationship and their rational diversification. They reduce the cost and time of development and thus support the search for the needle in the haystack: a potent hit. The HTS-driven brute-force nature of current and past decades' discovery and design strategies compelled researchers to develop ideas and algorithms in order to intervene in the pipeline and prevent its frequent failures. In the introduction, I describe the drug discovery and design pipeline and point out interfaces where CADDD contributes to its success. In Part 1 of this thesis, I present a novel methodology that supports the early-stage hit discovery processes through a fragment-based reduced graph similarity approach (RedFrag). It is a chimeric algorithm that combines fingerprint-based similarity calculation with scaffold-hopping-enabling graph isomorphism. We thoroughly investigated its performance retro- and prospectively. It uses a new type of reduced graph that does not suffer from information loss during its construction and bypasses the necessity of feature definitions. Built upon chemical epitopes resulting from molecule fragmentation, the reduced graph embodies physico-chemical and 2D-structural properties of a molecule. Reduced graphs are compared with a continuous-similarity-distance-driven maximal common subgraph algorithm, which calculates similarity at the fragmental and topological levels. 
The second chapter, Part 2, is dedicated to PrenDB: A digital compendium of the reaction space of prenyltransferases of the dimethylallyltryptophan synthase (DMATS) superfamily. Their catalytical transformations represent a major skeletal diversification step in the biosynthesis of secondary metabolites including the indole alkaloids. DMATS enzymes thus contribute significantly to the biological and pharmacological diversity of small molecule metabolites. The attachment of the prenyl donor to lead- or drug-like molecules renders the prenyltransferases useful in the access of chemical space that is difficult to reach by conventional synthesis. In PrenDB, we collected the substrates, enzymes and products. We then used a newly developed algorithm based on molecular fragmentation to automatically extract reactive chemical epitopes. The analysis of the collected data sheds light on the thus far explored substrate space of DMATS enzymes. We supplemented the browsable database with algorithmic prediction routines in order to assess the prenylability of novel compounds and did so for a set of 38 molecules. In a case study, Part 3, we investigated the regioselectivity of five prenyltransferases in the presence of unnatural prenyl donors. Detailed biochemical investigations revealed the acceptance of these dimethylallyl pyrophosphate (DMAPP) analogs by all tested enzymes with different relative activities and regioselectivities. In order to understand the activity profiles and their differences on a molecular level we investigated the interaction within the enzyme-prenyl donor-substrate system with molecular dynamics. Our experiments show that the reactivity of a prenyl donor strongly correlates with the distance of its electrophilic, reactive atom and the nucleophilic center of the substrate molecule. 
It renders the first step towards a better mechanistic understanding of the reactivity of prenyltransferases and expands significantly the potential usage and rational design of tryptophan prenylating enzymes as biocatalysts for Friedel–Crafts alkylation. Lastly, in Part 4, we present the synergistic potential of combined ligand- and structure-based drug discovery methodologies applied to the β2-adrenergic receptor (β2AR). The β2AR is a G protein-coupled receptor (GPCR) and a well-explored target. By the joint application of fingerprint-based similarity, substructure-based searches and docking we discovered 13 ligands – ten of which were novel – of this particular GPCR. Of note, two of the molecules used as starting points for the similarity and substructure searches distinguish themselves from other β2AR antagonists by their unique scaffold. Thus, the usage of a multistep hierarchical or parallel screening approach enabled us to use these unique structural features and discover novel chemical matter beyond the bounds of the ligand space known so far and emphasize the intrinsic complementarity of ligand- and structure-based approaches. The molecules described in this work allow us to explore the ligand space around the previously reported molecules in greater detail, leading to insights into their structure-activity relationship. In addition, we also characterized our hits with experimental binding and selectivity data and discussed it based on their putative binding modes derived by docking

    Multi-faceted Structure-Activity Relationship Analysis Using Graphical Representations

    A core focus in medicinal chemistry is the interpretation of structure-activity relationships (SARs) of small molecules. SAR analysis is typically carried out on a case-by-case basis for compound sets that share activity against a given target. Although SAR investigations are not a priori dependent on computational approaches, limitations imposed by steady rise in activity information have necessitated the use of such methodologies. Moreover, understanding SARs in multi-target space is extremely difficult. Conceptually different computational approaches are reported in this thesis for graphical SAR analysis in single- as well as multi-target space. Activity landscape models are often used to describe the underlying SAR characteristics of compound sets. Theoretical activity landscapes that are reminiscent of topological maps intuitively represent distributions of pair-wise similarity and potency difference information as three-dimensional surfaces. These models provide easy access to identification of various SAR features. Therefore, such landscapes for actual data sets are generated and compared with graph-based representations. Existing graphical data structures are adapted to include mechanism of action information for receptor ligands to facilitate simultaneous SAR and mechanism-related analyses with the objective of identifying structural modifications responsible for switching molecular mechanisms of action. Typically, SAR analysis focuses on systematic pair-wise relationships of compound similarity and potency differences. Therefore, an approach is reported to calculate SAR feature probabilities on the basis of these pair-wise relationships for individual compounds in a ligand set. The consequent expansion of feature categories improves the analysis of local SAR environments. Graphical representations are designed to avoid a dependence on preconceived SAR models. Such representations are suitable for systematic large-scale SAR exploration. 
Methods for the navigation of SARs in multi-target space using simple and interpretable data structures are introduced. In summary, multi-faceted SAR analysis aided by computational means forms the primary objective of this dissertation
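
The pair-wise similarity and potency-difference relationships that underlie these SAR representations can be sketched in a few lines. The example assumes fingerprints stored as sets of feature indices, Tanimoto similarity as the similarity measure, and potencies on a logarithmic scale; the compound data and the thresholds `min_sim` and `min_delta` are hypothetical choices, not values taken from the thesis.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def pairwise_sar(compounds):
    """Return (id1, id2, similarity, potency difference) for every pair."""
    rows = []
    for i, j in combinations(sorted(compounds), 2):
        fp_i, pot_i = compounds[i]
        fp_j, pot_j = compounds[j]
        rows.append((i, j, tanimoto(fp_i, fp_j), abs(pot_i - pot_j)))
    return rows


def similar_but_different(rows, min_sim=0.55, min_delta=2.0):
    """Pairs that are structurally similar yet differ strongly in potency."""
    return [(i, j) for i, j, sim, delta in rows
            if sim >= min_sim and delta >= min_delta]


# Hypothetical compounds: (fingerprint feature set, potency in log units).
compounds = {
    "c1": ({1, 2, 3, 4}, 8.5),
    "c2": ({1, 2, 3, 5}, 5.9),
    "c3": ({7, 8, 9}, 6.0),
}
rows = pairwise_sar(compounds)
cliffs = similar_but_different(rows)
print(cliffs)
```

Plotting similarity against potency difference for all rows gives the raw material from which landscape-style or graph-based views could be built.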

    Systematic Identification of Scaffolds Representing Different Types of Structure-Activity Relationships

    In medicinal chemistry, it is of central importance to understand structure-activity relationships (SARs) of small bioactive compounds. Typically, SARs are analyzed on a case-by-case basis for sets of compounds active against a given target. However, the increasing amount of compound activity data that is becoming available allows SARs to be explored on a large-scale. Moreover, molecular scaffolds derived from bioactive compounds are also of high interest for SAR analysis. In general, scaffolds are obtained by removing all substituents from rings and from linkers between rings. This thesis aims at systematically mining compounds for which activity annotations are available and investigating relationships between chemical structure and biological activities at the level of active compounds, in particular, molecular scaffolds. Therefore, data mining approaches are designed to identify scaffolds with different structural and/or activity characteristics. Initially, scaffold distributions in compounds at different stages of pharmaceutical development are analyzed. Sets of scaffolds that overlap between different stages or preferentially occur at certain stages are identified. Furthermore, a systematic selectivity profile analysis of public domain active compounds is carried out. Scaffolds that yield compounds selective for communities of closely related targets and represent compounds selective only for one particular target over others are identified. In addition, the degree of promiscuity of scaffolds is thoroughly examined. Eighty-three scaffolds covering 33 chemotypes correspond to compounds active against at least three different target families and thus are considered to be promiscuous. 
Moreover, by integrating pairwise scaffold similarity and compound potency differences, the propensity of scaffolds to form multi-target activity or selectivity cliffs and, in addition, the global scaffold potential of individual targets are quantitatively assessed, respectively. Finally, structural relationships between scaffolds are systematically explored. Most scaffolds extracted from active compounds are found to be involved in substructure relationships and/or share topological features with others. These substructure relationships are also compared to, and combined with, hierarchical substructure relationships to facilitate activity prediction
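
The scaffold definition given in this abstract (remove all substituents from rings and from linkers between rings) can be sketched on a bare molecular graph. The sketch assumes a molecule given as a list of bonds between integer atom indices and ignores all chemistry such as bond orders; the function name `extract_scaffold` is hypothetical, and in practice a cheminformatics toolkit would be used instead.

```python
def extract_scaffold(bonds):
    """Repeatedly strip terminal atoms; the atoms that remain are the ring
    atoms and the linkers connecting rings."""
    adj = {}
    for a, b in bonds:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    leaves = [a for a, nbrs in adj.items() if len(nbrs) <= 1]
    while leaves:
        atom = leaves.pop()
        if atom not in adj:
            continue  # already removed via another path
        for nbr in adj.pop(atom):
            adj[nbr].discard(atom)
            if len(adj[nbr]) <= 1:
                leaves.append(nbr)
    return set(adj)


# Hypothetical molecule: two six-membered rings joined by a one-atom linker,
# with a substituent on ring A and a two-atom substituent on the linker.
ring_a = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
linker = [(0, 6), (6, 7)]
ring_b = [(7, 8), (8, 9), (9, 10), (10, 11), (11, 12), (12, 7)]
substituents = [(3, 13), (6, 14), (14, 15)]
scaffold = extract_scaffold(ring_a + linker + ring_b + substituents)
print(sorted(scaffold))
```

Compounds that reduce to the same scaffold atoms and connectivity could then be grouped before comparing their activity annotations.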

    Computational Analysis of Structure-Activity Relationships : From Prediction to Visualization Methods

    Understanding how structural modifications affect the biological activity of small molecules is one of the central themes in medicinal chemistry. By no means is structure-activity relationship (SAR) analysis a priori dependent on computational methods. However, as molecular data sets grow in size, we quickly approach our limits to access and compare structures and associated biological properties so that computational data processing and analysis often become essential. Here, different types of approaches of varying complexity for the analysis of SAR information are presented, which can be applied in the context of screening and chemical optimization projects. The first part of this thesis is dedicated to machine-learning strategies that aim at de novo ligand prediction and the preferential detection of potent hits in virtual screening. High emphasis is put on benchmarking of different strategies and a thorough evaluation of their utility in practical applications. However, an often claimed disadvantage of these prediction methods is their "black box" character because they do not necessarily reveal which structural features are associated with biological activity. Therefore, these methods are complemented by more descriptive SAR analysis approaches showing a higher degree of interpretability. Concepts from information theory are adapted to identify activity-relevant structure-derived descriptors. Furthermore, compound data mining methods exploring prespecified properties of available bioactive compounds on a large scale are designed to systematically relate molecular transformations to activity changes. Finally, these approaches are complemented by graphical methods that primarily help to access and visualize SAR data in congeneric series of compounds and allow the formulation of intuitive SAR rules applicable to the design of new compounds. The compendium of SAR analysis tools introduced in this thesis investigates SARs from different perspectives

    Molecular Similarity and Xenobiotic Metabolism

    MetaPrint2D, a new software tool implementing a data-mining approach for predicting sites of xenobiotic metabolism has been developed. The algorithm is based on a statistical analysis of the occurrences of atom centred circular fingerprints in both substrates and metabolites. This approach has undergone extensive evaluation and been shown to be of comparable accuracy to current best-in-class tools, but is able to make much faster predictions, for the first time enabling chemists to explore the effects of structural modifications on a compound’s metabolism in a highly responsive and interactive manner. MetaPrint2D is able to assign a confidence score to the predictions it generates, based on the availability of relevant data and the degree to which a compound is modelled by the algorithm. In the course of the evaluation of MetaPrint2D a novel metric for assessing the performance of site of metabolism predictions has been introduced. This overcomes the bias introduced by molecule size and the number of sites of metabolism inherent to the most commonly reported metrics used to evaluate site of metabolism predictions. This data mining approach to site of metabolism prediction has been augmented by a set of reaction type definitions to produce MetaPrint2D-React, enabling prediction of the types of transformations a compound is likely to undergo and the metabolites that are formed. This approach has been evaluated against both historical data and metabolic schemes reported in a number of recently published studies. Results suggest that the ability of this method to predict metabolic transformations is highly dependent on the relevance of the training set data to the query compounds. MetaPrint2D has been released as an open source software library, and both MetaPrint2D and MetaPrint2D-React are available for chemists to use through the Unilever Centre for Molecular Science Informatics website.----Boehringer-Ingelhie
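
A data-mining approach of this kind can be illustrated with a small occurrence-count sketch. The abstract does not give the exact statistic MetaPrint2D uses, so the ratio below (how often an atom environment was a recorded site of metabolism, divided by how often it was seen at all) is an assumption made for illustration. The environment labels stand in for circular fingerprint keys, and `train`, `score_atoms` and `min_count` are hypothetical names.

```python
from collections import Counter


def train(records):
    """Count, per atom environment, total occurrences and occurrences at a
    recorded site of metabolism. records holds (environment, is_site) pairs."""
    seen, reacted = Counter(), Counter()
    for env, is_site in records:
        seen[env] += 1
        if is_site:
            reacted[env] += 1
    return seen, reacted


def score_atoms(envs, seen, reacted, min_count=1):
    """Score each atom as reacted/seen for its environment; None marks an
    environment with too little training data to score."""
    scores = []
    for env in envs:
        n = seen.get(env, 0)
        scores.append(reacted.get(env, 0) / n if n >= min_count else None)
    return scores


# Hypothetical training atoms taken from substrates with known metabolites.
records = [
    ("aromatic C-H", True), ("aromatic C-H", False), ("aromatic C-H", False),
    ("N-methyl", True), ("N-methyl", True),
    ("carbonyl C", False),
]
seen, reacted = train(records)
query = ["N-methyl", "aromatic C-H", "carbonyl C", "nitrile C"]
scores = score_atoms(query, seen, reacted)
print(dict(zip(query, scores)))
```

Because scoring is a dictionary lookup per atom, a scheme like this is fast enough for interactive use, and the `None` case loosely reflects the point that confidence depends on the availability of relevant data.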

    Discovery of new selective antagonists of G-protein coupled receptors of therapeutic interest

    GPCRs are integral membrane receptor proteins that are characterized by heptahelical transmembrane domains connected by intracellular and extracellular loops. GPCRs are an attractive class of proteins for drug discovery, with more than 50% of all drugs regulating GPCR function, and some 30% of these drugs directly targeting GPCRs. Despite the number of GPCR crystal structures determined recently, they represent only a small fraction of the total number of GPCRs known. Homology modelling has been the methodology used to fill the gap. However, the low sequence similarity between targets and templates hampers these studies. To overcome these drawbacks, template selection and the refinement process were studied in this work. Thus, several atomistic models of rat M3 muscarinic receptor were constructed from human M2 muscarinic receptor, human histamine 1 receptor and bovine rhodopsin receptor as templates. Moreover, in order to determine the effect of the ligand in the simulation system, one model of the M2 receptor was refined with NMS bound inside and another was refined without ligand. Results show that a sampling time of 500 ns is adequate and that molecular dynamics simulation of the protein embedded in a lipid bilayer, used as a refinement process, improves the homology models. Specifically, the refinement process can correct the length of the TM segment of the target receptor; the accuracy of the model greatly depends on the proximity of the template and the target in the phylogenetic tree; and finally, the presence of a ligand produces a faster equilibration of the system. This methodology was used to study the pharmacological profile of bradykinin receptors B1 and B2. The B1 receptor was constructed using the chemokine CXC4 and bovine rhodopsin receptors as templates. Antagonists selected for the docking studies include Compound 11, Compound 12, Chroman28, SSR240612, NVP-SAA164 and PS020990. 
Analysis of the ligand-receptor complexes permitted the definition of a pharmacophore that describes the stereochemical requirements of antagonist binding. For the B2 receptor, a similar procedure was followed using the same templates. In this case, the set of compounds used were Fasitibant, FR173657, Anatibant, WIN64338, Bradyzide, CHEMBL442294, and JSM10292. The outcome of this study is summarized in a 3D pharmacophore that explains the observed structure-activity results and provides insight into the design of novel molecules with an antagonistic profile. To prove the validity of the pharmacophoric hypotheses, a virtual screening process was carried out. The results of the binding studies show a success rate of about 33%, with a correlation between the number of pharmacophore points fulfilled and antagonistic potency. Some of these structures are disclosed in this thesis. Moreover, the B1R and B2R pharmacophores developed were compared, and the observed differences made it possible to explain the stereochemical requirements for receptor-selective ligands. The final part of this study was to establish a rational explanation for the role of zinc in preventing the dimerization of the serotonin 5-hydroxytryptamine 1A receptor (5-HT1A) and galanin receptor 1 (GALR1) involved in depression. Homology modeling was used to build atomistic models of these receptors using the crystallographic structures of 5-HT1B and κ-opioid receptor, respectively. First, prospective zinc binding sites were identified for the 5-HT1A using a molecular probe. Second, heterodimers of the two receptors were constructed with different interfaces: TM4 and TM5; TM6 and TM7; TM1 and TM2. 
Analysis of the 12 zinc-binding sites and the heterodimer interfaces suggests that zinc binding sites coincide with heterodimerization interfaces, providing a rational explanation for the role of zinc in the molecular processes associated with heterodimer prevention.