
    Multi-faceted Structure-Activity Relationship Analysis Using Graphical Representations

    A core focus in medicinal chemistry is the interpretation of structure-activity relationships (SARs) of small molecules. SAR analysis is typically carried out on a case-by-case basis for compound sets that share activity against a given target. Although SAR investigations are not a priori dependent on computational approaches, the steady rise in available activity information has necessitated the use of such methodologies. Moreover, understanding SARs in multi-target space is extremely difficult. Conceptually different computational approaches are reported in this thesis for graphical SAR analysis in single- as well as multi-target space. Activity landscape models are often used to describe the underlying SAR characteristics of compound sets. Theoretical activity landscapes that are reminiscent of topological maps intuitively represent distributions of pair-wise similarity and potency difference information as three-dimensional surfaces. These models make it easy to identify various SAR features. Therefore, such landscapes are generated for actual data sets and compared with graph-based representations. Existing graphical data structures are adapted to include mechanism-of-action information for receptor ligands. This facilitates simultaneous SAR and mechanism-related analyses with the objective of identifying structural modifications responsible for switching molecular mechanisms of action. Typically, SAR analysis focuses on systematic pair-wise relationships of compound similarity and potency differences. Therefore, an approach is reported to calculate SAR feature probabilities on the basis of these pair-wise relationships for individual compounds in a ligand set. The consequent expansion of feature categories improves the analysis of local SAR environments. Graphical representations are designed to avoid a dependence on preconceived SAR models. Such representations are suitable for systematic large-scale SAR exploration. Methods for the navigation of SARs in multi-target space using simple and interpretable data structures are introduced. In summary, multi-faceted SAR analysis aided by computational means forms the primary objective of this dissertation.

    Systematic Computational Analysis of Structure-Activity Relationships

    The exploration of structure–activity relationships (SARs) of small bioactive molecules is a central task in medicinal chemistry. Typically, SARs are analyzed on a case-by-case basis for series of closely related molecules. Classical methods that explore SARs include quantitative SAR (QSAR) modeling and molecular similarity analysis. These methods conceptually rely on the similarity–property principle which states that similar molecules should also have similar biological activity. Although this principle is intuitive and supported by a wealth of observations, it is well-recognized that SARs can have fundamentally different character. Small chemical modifications of active molecules often dramatically alter biological responses, giving rise to “activity cliffs” and “discontinuous” SARs. By contrast, structurally diverse molecules can have similar activity, a situation that is indicative of “continuous” SARs. The combination of continuous and discontinuous components characterizes “heterogeneous” SARs, a phenotype that is frequently encountered in medicinal chemistry. This thesis focuses on the systematic computational analysis of SARs present in sets of active molecules. Approaches to quantitatively describe, classify, and compare SARs at multiple levels of detail are introduced. Initially, a comparative study of crystallographic enzyme–inhibitor complexes is presented that relates two-dimensional and three-dimensional inhibitor similarity and potency to each other. The analysis reveals the presence of systematic and in part unexpected relationships between molecular similarity and potency and explains why apparently inconsistent SARs can coexist in compound activity classes. For the systematic characterization of complex SARs, a numerical function termed SAR Index (SARI) is developed that quantitatively describes continuous and discontinuous SAR components present in sets of active molecules. 
    On the basis of two-dimensional molecular similarity and potency, SARI distinguishes between the three basic SAR categories described above. Heterogeneous SARs are further divided into two previously unobserved subtypes that are distinguished by the way they combine different SAR features. SARI profiling of various enzyme inhibitor classes demonstrates the prevalence of heterogeneous SARs for many classes. Furthermore, control calculations are conducted in order to assess the influence of molecular representation and data set size on SARI scoring. It is shown that SARI scores remain largely stable in response to variation of these critical parameters. Based on the SARI formalism, a methodology is developed to study multiple global and local SAR components of compound activity classes. The approach combines graphical analysis of Network-like Similarity Graphs (NSGs) and SARI score calculations at multiple levels of detail. Compound classes of different global SAR character are found to produce distinct network topologies. Local SAR features are studied in subsets of similar compounds and systematically related to global SAR character. Furthermore, key compounds are identified that are major determinants of local and global SAR characteristics. The approach is also applied to study structure–selectivity relationships (SSRs). Compound selectivity often results from potency differences for multiple targets and presents a critical factor in lead optimization projects. Here, SSRs are explored for sets of compounds that are active against pairs of related targets. For this purpose, the molecular network approach is adapted to the evaluation of SSRs. Results show that SSRs can be quantitatively described and categorized in analogy to single-target SARs. In addition, local SSR environments are identified and compared to SAR features. Within these environments, key compounds are identified that determine characteristic features of single-target SARs and dual-target SSRs. Comparison of similar compounds that have significantly different selectivity reveals chemical modifications that render compounds target-selective. Furthermore, a methodology is introduced to study SAR contributions from functional groups and substitution sites in series of analogous molecules. Analog series are systematically organized according to substitution sites in a hierarchical data structure termed Combinatorial Analog Graph (CAG), and the SARI scoring scheme is applied to evaluate SAR contributions of variable functional groups at specific substitution sites. Combinations of sites that determine SARs within analog series and make large contributions to SAR discontinuity are identified. These sites are prime targets for further chemical modification. In addition to determining key substitution patterns, CAG analysis also identifies substitution sites that have not been thoroughly explored.
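The distinction between continuous and discontinuous SAR components drawn in this abstract can be illustrated with a small pair-wise classification. The following is a minimal sketch and not the SARI function itself, whose formula is not given here. It assumes fingerprints are represented as plain feature sets, potencies are given as negative logarithmic values (for example pKi), and the two thresholds `sim_cut` and `dpot_cut` are illustrative choices. All compound names and values are made up.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets (1.0 if both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def classify_pairs(compounds, sim_cut=0.55, dpot_cut=2.0):
    """Label every compound pair by similarity and potency difference.

    compounds: dict mapping name -> (feature set, potency in log units).
    sim_cut and dpot_cut are assumed thresholds, not values from the thesis.
    """
    labels = {}
    for (n1, (f1, p1)), (n2, (f2, p2)) in combinations(compounds.items(), 2):
        sim = tanimoto(f1, f2)
        dpot = abs(p1 - p2)
        if sim >= sim_cut and dpot >= dpot_cut:
            label = "activity cliff"  # similar structures, very different potency
        elif sim < sim_cut and dpot < dpot_cut:
            label = "continuous"      # different structures, similar potency
        else:
            label = "other"
        labels[(n1, n2)] = label
    return labels


# Hypothetical toy data: feature sets and potencies are invented.
toy = {
    "A": (frozenset({1, 2, 3, 4}), 8.5),
    "B": (frozenset({1, 2, 3, 5}), 5.9),
    "C": (frozenset({7, 8, 9}), 8.3),
}
labels = classify_pairs(toy)
for pair, label in labels.items():
    print(pair, label)
```

A data set in which both labels occur frequently would correspond to what the abstract calls a heterogeneous SAR.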

    Analyzing multitarget activity landscapes using protein-ligand interaction fingerprints: interaction cliffs.

    This is the original submitted version, before peer review. The final peer-reviewed version is available from ACS at http://pubs.acs.org/doi/abs/10.1021/ci500721x.
    Activity landscape modeling is mostly a descriptive technique that allows rationalizing continuous and discontinuous SARs. Nevertheless, the interpretation of some landscape features, especially of activity cliffs, is not straightforward. As the nature of activity cliffs depends on the ligand and the target, information regarding both should be included in the analysis. A specific way to include this information is to use protein-ligand interaction fingerprints (IFPs). In this paper we report the activity landscape modeling of 507 ligand-kinase complexes (from the KLIFS database) including IFPs, which facilitates the analysis and interpretation of activity cliffs. Here we introduce structure-activity-interaction similarity (SAIS) maps that incorporate information on ligand-target contact similarity. We also introduce the concept of interaction cliffs, defined as pairs of ligand-target complexes with high structural and interaction similarity but a large potency difference between the ligands. Moreover, the information retrieved regarding the specific interactions allowed the identification of activity cliff hot spots, which help to rationalize activity cliffs from the target point of view. In general, IFPs provide a structure-based understanding of some activity landscape features. This paper shows examples of analyses that can be carried out when IFPs are added to the activity landscape model.
    M-L is very grateful to CONACyT (No. 217442/312933) and the Cambridge Overseas Trust for funding. AB thanks Unilever for funding and the European Research Council for a Starting Grant (ERC-2013- StG-336159 MIXTURE). J.L.M-F. is grateful to the School of Chemistry, Department of Pharmacy of the National Autonomous University of Mexico (UNAM) for support. This work was supported by a scholarship from the Secretariat of Public Education and the Mexican government.
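The interaction cliff definition given in this abstract (high structural similarity, high interaction similarity, large potency difference) can be written as a simple predicate over two ligand-target complexes. This is a minimal sketch under stated assumptions: ligand fingerprints and IFPs are encoded as equal-length bit strings, similarity is the Tanimoto coefficient, and the three cutoffs are illustrative values, not those used in the paper. The complexes below are invented.

```python
def bit_tanimoto(x, y):
    """Tanimoto coefficient of two equal-length bit strings such as '1010'."""
    if len(x) != len(y):
        raise ValueError("bit strings must have the same length")
    both = sum(a == "1" and b == "1" for a, b in zip(x, y))
    either = sum(a == "1" or b == "1" for a, b in zip(x, y))
    return both / either if either else 1.0


def is_interaction_cliff(c1, c2, struct_cut=0.7, ifp_cut=0.7, dpot_cut=2.0):
    """True if two complexes are similar in structure AND contacts,
    yet differ strongly in ligand potency.

    Each complex is a dict with keys 'ligand_fp', 'ifp' and 'potency'.
    The cutoffs are assumptions chosen for illustration.
    """
    struct_sim = bit_tanimoto(c1["ligand_fp"], c2["ligand_fp"])
    ifp_sim = bit_tanimoto(c1["ifp"], c2["ifp"])
    dpot = abs(c1["potency"] - c2["potency"])
    return struct_sim >= struct_cut and ifp_sim >= ifp_cut and dpot >= dpot_cut


# Hypothetical complexes: fingerprints and potencies are invented.
cx1 = {"ligand_fp": "11110000", "ifp": "110110", "potency": 9.0}
cx2 = {"ligand_fp": "11111000", "ifp": "110111", "potency": 6.5}
cx3 = {"ligand_fp": "11110000", "ifp": "001001", "potency": 6.0}

print(is_interaction_cliff(cx1, cx2))  # similar ligand and similar contacts
print(is_interaction_cliff(cx1, cx3))  # same ligand bits, different contacts
```

In this toy case the pair `cx1`/`cx3` would still count as a conventional activity cliff, but the differing contact pattern excludes it from the interaction cliff category.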

    Computational Methods for the Integration of Biological Activity and Chemical Space

    One general aim of medicinal chemistry is the understanding of structure-activity relationships of ligands that bind to biological targets. Advances in combinatorial chemistry and biological screening technologies allow the analysis of ligand-target relationships on a large scale. However, in order to extract useful information from biological activity data, computational methods are needed that link the activity of ligands to their chemical structure. This thesis investigates how fragment-type descriptors of molecular structure can be used to create a link between activity and chemical ligand space. First, an activity class-dependent hierarchical fragmentation scheme is introduced that generates fragmentation pathways, which are aligned using established methodologies for multiple alignment of biological sequences. These alignments are then used to extract consensus fragment sequences that serve as a structural signature for individual biological activity classes. It is also investigated how defined, chemically intuitive molecular fragments can be organized based on their topological environment and co-occurrence in compounds active against closely related targets. For this purpose, the Topological Fragment Index is introduced, which quantifies the topological environment complexity of a fragment in a given molecule and thus goes beyond fragment frequency analysis. Fragment dependencies are established on the basis of common topological environments, which facilitates the identification of activity class-characteristic fragment dependency pathways that describe fragment relationships beyond structural resemblance. Because fragments are often dependent on each other in an activity class-specific manner, the importance of defined fragment combinations for similarity searching is further assessed. To this end, Feature Co-occurrence Networks are introduced that allow the identification of feature cliques characteristic of individual activity classes. Three differently designed molecular fingerprints are compared for their ability to provide such cliques, and a clique-based similarity searching strategy is established. For molecule- and activity class-centric fingerprint designs, feature combinations are shown to improve similarity search performance in comparison to standard methods. Moreover, it is demonstrated that individual features can form activity class-specific combinations. Extending the analysis of feature cliques characteristic of individual activity classes, the distribution of defined fragment combinations among several compound classes acting against closely related targets is assessed. Fragment Formal Concept Analysis is introduced for flexible mining of complex structure-activity relationships. It allows the interactive assembly of fragment queries that yield fragment combinations characteristic of defined activity and potency profiles. It is shown that pairs and triplets, rather than individual fragments, distinguish between different activity profiles. A classifier is built based on these fragment signatures that distinguishes between ligands of closely related targets. Going beyond activity profiles, compound selectivity is also analyzed. For this purpose, Molecular Formal Concept Analysis is introduced for the systematic mining of compound selectivity profiles on a whole-molecule basis. Using this approach, structurally diverse compounds are identified that share a selectivity profile with selected template compounds. Structure-selectivity relationships of the obtained compound sets are further analyzed.

    Computational Methods Generating High-Resolution Views of Complex Structure-Activity Relationships

    The analysis of structure-activity relationships (SARs) of small bioactive compounds is a central task in medicinal chemistry and pharmaceutical research. The study of SARs is in principle not limited to computational methods. However, as data sets rapidly grow in size, advanced computational approaches become indispensable for SAR analysis. Activity landscapes are among the preferred and most widely used computational models for studying large-scale SARs. Activity cliffs are cardinal features of activity landscape representations and are thought to have high SAR information content. This work addresses major challenges in systematic SAR exploration and specifically focuses on the design of novel activity landscape models and comprehensive activity cliff analysis. In the first part of the thesis, two conceptually different activity landscape representations are introduced for compounds active against multiple targets. These models are designed to provide intuitive graphical access to compounds forming single- and multi-target activity cliffs and displaying multi-target SAR characteristics. Further, a systematic analysis of the frequency and distribution of activity cliffs is carried out. In addition, a large-scale data mining effort is designed to quantify and analyze fingerprint-dependent changes in SAR information. The second part of this work is dedicated to the concept of activity cliffs and their utility in the practice of medicinal chemistry. To this end, a computational approach is introduced to search for detectable SAR advantages associated with activity cliffs. In addition, it is investigated to what extent activity cliffs might be utilized as starting points in practical compound optimization efforts. Finally, all activity cliff configurations formed by currently available bioactive compounds are thoroughly examined. These configurations are further classified and their frequency of occurrence and target distribution are determined. Furthermore, the activity cliff concept is extended to explore the relation between chemical structures and compound promiscuity. The notion of promiscuity cliffs is introduced to deduce structural modifications that might induce large-magnitude promiscuity effects.

    Computational Analysis of Structure-Activity Relationships : From Prediction to Visualization Methods

    Understanding how structural modifications affect the biological activity of small molecules is one of the central themes in medicinal chemistry. By no means is structure-activity relationship (SAR) analysis a priori dependent on computational methods. However, as molecular data sets grow in size, we quickly reach the limits of our ability to access and compare structures and associated biological properties, so that computational data processing and analysis often become essential. Here, different types of approaches of varying complexity for the analysis of SAR information are presented, which can be applied in the context of screening and chemical optimization projects. The first part of this thesis is dedicated to machine-learning strategies that aim at de novo ligand prediction and the preferential detection of potent hits in virtual screening. Strong emphasis is placed on benchmarking different strategies and on a thorough evaluation of their utility in practical applications. However, an often-cited disadvantage of these prediction methods is their "black box" character, because they do not necessarily reveal which structural features are associated with biological activity. Therefore, these methods are complemented by more descriptive SAR analysis approaches that offer a higher degree of interpretability. Concepts from information theory are adapted to identify activity-relevant structure-derived descriptors. Furthermore, compound data mining methods that explore prespecified properties of available bioactive compounds on a large scale are designed to systematically relate molecular transformations to activity changes. Finally, these approaches are complemented by graphical methods that primarily help to access and visualize SAR data in congeneric series of compounds and allow the formulation of intuitive SAR rules applicable to the design of new compounds. The compendium of SAR analysis tools introduced in this thesis investigates SARs from different perspectives.

    Chemoinformatics-Driven Approaches for Kinase Drug Discovery

    Given their importance for the majority of cell physiology processes, protein kinases are among the most extensively studied protein targets in drug discovery. Inappropriate regulation of their basal levels results in pathophysiological disorders. In this regard, small-molecule inhibitors of the human kinome have been developed to treat these conditions effectively and improve the survival rates and quality of life of patients. In recent years, kinase-related data has become increasingly available in the public domain. These large amounts of data provide a rich knowledge source for computational studies of kinase drug discovery concepts. This thesis aims to systematically explore properties of kinase inhibitors on the basis of publicly available data. Hence, an established "selectivity versus promiscuity" conundrum of kinase inhibitors is evaluated, close structural analogs with diverging promiscuity levels are analyzed, and machine learning is employed to classify different kinase inhibitor binding modes. In the first study, kinase inhibitor selectivity trends are explored on the kinase pair level, where kinase structural features and phylogenetic relationships are used to explain the obtained selectivity information. Next, the selectivity of clinical kinase inhibitors is inspected on the basis of cell-based profiling campaign results to consolidate the previous findings. Further, clinical candidates are mapped to medicinal chemistry sources and promiscuity levels of different inhibitor subsets are estimated, including designated chemical probes. Additionally, chemical probe analysis is extended to expert-curated representatives to correlate the views established by the scientific community and evaluate their potential for chemical biology applications. Then, a large-scale promiscuity analysis of kinase inhibitor data combining several public repositories is performed to subsequently explore promiscuity cliffs (PCs) and PC pathways and study structure-promiscuity relationships. Furthermore, an automated extraction protocol prioritizing the most informative pathways is proposed, with a focus on those containing promiscuity hubs. In addition, the generated promiscuity data structures including cliffs, pathways, and hubs are discussed for their potential in experimental and computational follow-ups and subsequently made publicly available. Finally, machine learning methods are used to develop classification models of kinase inhibitors with distinct experimental binding modes, and their potential for the development of novel therapeutics is assessed.
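The promiscuity cliffs and promiscuity hubs mentioned in this abstract can be sketched as a pair search followed by a count per compound. This is a minimal illustration, not the extraction protocol of the thesis. It assumes that structural analogs are approximated by a Tanimoto similarity cutoff on feature sets (a swapped-in stand-in for whatever analog criterion the thesis uses), that promiscuity is the number of annotated targets, and that `sim_cut`, `min_delta` and `min_cliffs` are illustrative parameters. Compound and target names are placeholders.

```python
from collections import Counter
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets (1.0 if both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def promiscuity_cliffs(fps, targets, sim_cut=0.6, min_delta=3):
    """Return pairs of similar compounds whose target counts differ strongly.

    fps: dict name -> feature set; targets: dict name -> set of target ids.
    Thresholds are assumptions for illustration only.
    """
    cliffs = []
    for a, b in combinations(sorted(fps), 2):
        delta = abs(len(targets[a]) - len(targets[b]))
        if tanimoto(fps[a], fps[b]) >= sim_cut and delta >= min_delta:
            cliffs.append((a, b))
    return cliffs


def promiscuity_hubs(cliffs, min_cliffs=2):
    """Compounds that take part in at least min_cliffs promiscuity cliffs."""
    counts = Counter(name for pair in cliffs for name in pair)
    return sorted(name for name, n in counts.items() if n >= min_cliffs)


# Hypothetical inhibitors with placeholder features and target annotations.
fps = {
    "K1": frozenset({1, 2, 3, 4, 5}),
    "K2": frozenset({1, 2, 3, 4, 6}),
    "K3": frozenset({1, 2, 3, 4, 7}),
    "K4": frozenset({10, 11}),
}
targets = {
    "K1": {"T1", "T2", "T3", "T4", "T5"},
    "K2": {"T1"},
    "K3": {"T2"},
    "K4": {"T1", "T2", "T3", "T4", "T5", "T6"},
}

cliffs = promiscuity_cliffs(fps, targets)
print(cliffs)
print(promiscuity_hubs(cliffs))
```

Treating the cliff list as the edge list of a graph would be one way to trace sequences of connected cliffs, which is roughly what the abstract refers to as PC pathways.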

    11th German Conference on Chemoinformatics (GCC 2015) : Fulda, Germany. 8-10 November 2015.
