22 research outputs found

    Computational Methods Generating High-Resolution Views of Complex Structure-Activity Relationships

    The analysis of structure-activity relationships (SARs) of small bioactive compounds is a central task in medicinal chemistry and pharmaceutical research. The study of SARs is in principle not limited to computational methods; however, as data sets rapidly grow in size, advanced computational approaches become indispensable for SAR analysis. Activity landscapes are among the preferred and most widely used computational models for studying large-scale SARs. Activity cliffs are cardinal features of activity landscape representations and are thought to contain high SAR information content. This work addresses major challenges in systematic SAR exploration and specifically focuses on the design of novel activity landscape models and comprehensive activity cliff analysis. In the first part of the thesis, two conceptually different activity landscape representations are introduced for compounds active against multiple targets. These models are designed to provide intuitive graphical access to compounds that form single- and multi-target activity cliffs and display multi-target SAR characteristics. Further, a systematic analysis of the frequency and distribution of activity cliffs is carried out. In addition, a large-scale data mining effort is designed to quantify and analyze fingerprint-dependent changes in SAR information. The second part of this work is dedicated to the concept of activity cliffs and their utility in the practice of medicinal chemistry. To this end, a computational approach is introduced to search for detectable SAR advantages associated with activity cliffs. In addition, the extent to which activity cliffs might be utilized as starting points in practical compound optimization efforts is investigated. Finally, all activity cliff configurations formed by currently available bioactive compounds are thoroughly examined. These configurations are further classified, and their frequency of occurrence and target distribution are determined. Furthermore, the activity cliff concept is extended to explore the relation between chemical structures and compound promiscuity. The notion of promiscuity cliffs is introduced to deduce structural modifications that might induce large-magnitude promiscuity effects.

    Methods for the Analysis of Matched Molecular Pairs and Chemical Space Representations

    Compound optimization is a complex process where different properties are optimized to increase the biological activity and therapeutic effects of a molecule. Frequently, the structure of molecules is modified in order to improve their property values. Therefore, computational analysis of the effects of structure modifications on property values is of great importance for the drug discovery process. It is also essential to analyze chemical space, i.e., the set of all chemically feasible molecules, in order to find subsets of molecules that display favorable property values. This thesis aims to expand the computational repertoire for analyzing the effects of structure alterations and visualizing chemical space. Matched molecular pairs are defined as pairs of compounds that share a large common substructure and only differ by a small chemical transformation. They have been frequently used to study property changes caused by structure modifications. These analyses are expanded in this thesis by studying the effect of chemical transformations on the ionization state and ligand efficiency, both measures of great importance in drug design. Additionally, novel matched molecular pairs based on retrosynthetic rules are developed to increase their utility for prospective use of chemical transformations in compound optimization. Further, new methods based on matched molecular pairs are described to obtain preliminary SAR information for screening hit compounds and to predict the potency change caused by a chemical transformation. Visualizations of chemical space are introduced to aid compound optimization efforts. First, principal component plots are used to rationalize a matched molecular pair-based multi-objective compound optimization procedure. Then, star coordinate and parallel coordinate plots are introduced to analyze drug-like subspaces, where compounds with favorable property values can be found. Finally, a novel network-based visualization of high-dimensional property space is developed. In conclusion, the applications developed in this thesis expand the methodological spectrum of computer-aided compound optimization.

    Computational Methods for Structure-Activity Relationship Analysis and Activity Prediction

    Structure-activity relationship (SAR) analysis of small bioactive compounds is a key task in medicinal chemistry. Traditionally, SARs were established on a case-by-case basis. However, with the arrival of high-throughput screening (HTS) and synthesis techniques, the size and structural heterogeneity of compound data have surged, and the use of computational methods to analyse SARs has become imperative. In recent years, graphical methods have gained prominence for analysing SARs. The choice of molecular representation and the method of assessing similarities affect the outcome of the SAR analysis. Thus, alternative methods providing distinct points of view of SARs are required. In this thesis, a novel graphical representation utilizing the canonical scaffold-skeleton definition to explore meaningful global and local SAR patterns in compound data is introduced. Furthermore, efforts have been made to go beyond the descriptive SAR analysis offered by graphical methods. SAR features inferred from descriptive methods are utilized for compound activity predictions. In this context, a data structure called SAR matrix (SARM), which is reminiscent of conventional R-group tables, is utilized. SARMs suggest many virtual compounds that represent as yet unexplored chemical space. These virtual compounds are candidates for further exploration but are too numerous to prioritize simply on the basis of visual inspection. Conceptually different approaches to enable systematic compound prediction and prioritization are introduced. Much emphasis is put on evolving the predictive ability for prospective compound design. Going beyond SAR analysis, the SARM method has also been adapted to navigate multi-target spaces, primarily for analysing compound promiscuity patterns. Thus, the original SARM methodology has been further developed for a variety of medicinal chemistry and chemogenomics applications.
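
    The R-group-table analogy in the abstract above can be illustrated with a minimal sketch. It is a toy model, not the SARM method of the thesis: compounds are assumed to be already split into a (core, substituent) pair, and the names `build_sar_matrix` and `virtual_compounds` are hypothetical. Empty cells of the table stand in for the virtual compounds the abstract mentions.

```python
from itertools import product


def build_sar_matrix(compounds):
    """Arrange compounds in an R-group-table-like grid.

    compounds: dict mapping (core, substituent) -> potency (e.g. pIC50).
    Assumption: each compound is already decomposed into one core and one
    substituent; a real workflow would derive this by fragmentation.
    """
    cores = sorted({core for core, _ in compounds})
    subs = sorted({sub for _, sub in compounds})
    # Cells without a measured compound are stored as None.
    matrix = {
        core: {sub: compounds.get((core, sub)) for sub in subs}
        for core in cores
    }
    return cores, subs, matrix


def virtual_compounds(cores, subs, matrix):
    """Return the (core, substituent) combinations with no known compound."""
    return [(c, s) for c, s in product(cores, subs) if matrix[c][s] is None]


if __name__ == "__main__":
    # Illustrative, made-up data.
    data = {
        ("coreA", "Cl"): 6.1,
        ("coreA", "OMe"): 7.3,
        ("coreB", "Cl"): 5.8,
    }
    cores, subs, matrix = build_sar_matrix(data)
    print(virtual_compounds(cores, subs, matrix))
```

    Even this toy grid shows why visual inspection does not scale: the number of empty cells grows with the product of cores and substituents, so some ranking step is needed.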

    Multi-faceted Structure-Activity Relationship Analysis Using Graphical Representations

    A core focus in medicinal chemistry is the interpretation of structure-activity relationships (SARs) of small molecules. SAR analysis is typically carried out on a case-by-case basis for compound sets that share activity against a given target. Although SAR investigations are not a priori dependent on computational approaches, limitations imposed by the steady rise in activity information have necessitated the use of such methodologies. Moreover, understanding SARs in multi-target space is extremely difficult. Conceptually different computational approaches are reported in this thesis for graphical SAR analysis in single- as well as multi-target space. Activity landscape models are often used to describe the underlying SAR characteristics of compound sets. Theoretical activity landscapes that are reminiscent of topological maps intuitively represent distributions of pair-wise similarity and potency difference information as three-dimensional surfaces. These models provide easy access to various SAR features. Therefore, such landscapes are generated for actual data sets and compared with graph-based representations. Existing graphical data structures are adapted to include mechanism of action information for receptor ligands to facilitate simultaneous SAR and mechanism-related analyses, with the objective of identifying structural modifications responsible for switching molecular mechanisms of action. Typically, SAR analysis focuses on systematic pair-wise relationships of compound similarity and potency differences. Therefore, an approach is reported to calculate SAR feature probabilities on the basis of these pair-wise relationships for individual compounds in a ligand set. The consequent expansion of feature categories improves the analysis of local SAR environments. Graphical representations are designed to avoid a dependence on preconceived SAR models. Such representations are suitable for systematic large-scale SAR exploration. Methods for the navigation of SARs in multi-target space using simple and interpretable data structures are introduced. In summary, multi-faceted SAR analysis aided by computational means forms the primary objective of this dissertation.
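
    The pair-wise relationships of compound similarity and potency difference described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the dissertation: fingerprints are modeled as plain sets of feature IDs, similarity is the Tanimoto coefficient, and the thresholds (`sim_threshold=0.55`, `potency_gap=2.0` log units) are illustrative choices. The function names are hypothetical.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets (0.0 if both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def find_activity_cliffs(compounds, sim_threshold=0.55, potency_gap=2.0):
    """Flag pairs that are structurally similar yet differ strongly in potency.

    compounds: dict mapping name -> (fingerprint set, potency such as pKi).
    Both thresholds are assumptions chosen for illustration only.
    """
    cliffs = []
    for (n1, (f1, p1)), (n2, (f2, p2)) in combinations(compounds.items(), 2):
        sim = tanimoto(f1, f2)
        diff = abs(p1 - p2)
        if sim >= sim_threshold and diff >= potency_gap:
            cliffs.append((n1, n2, sim, diff))
    return cliffs


if __name__ == "__main__":
    # Illustrative, made-up data.
    data = {
        "cpd1": ({1, 2, 3, 4}, 8.5),
        "cpd2": ({1, 2, 3, 5}, 6.0),
        "cpd3": ({7, 8, 9}, 8.4),
    }
    for pair in find_activity_cliffs(data):
        print(pair)
```

    Which pairs are flagged depends on the fingerprint and on both thresholds, so results from such a sketch should be read as representation-dependent.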

    Chemoinformatics-Driven Approaches for Kinase Drug Discovery

    Given their importance for the majority of cell physiology processes, protein kinases are among the most extensively studied protein targets in drug discovery. Inappropriate regulation of their basal levels results in pathophysiological disorders. In this regard, small-molecule inhibitors of the human kinome have been developed to treat these conditions effectively and improve the survival rates and quality of life of patients. In recent years, kinase-related data have become increasingly available in the public domain. These large amounts of data provide a rich knowledge source for computational studies of kinase drug discovery concepts. This thesis aims to systematically explore properties of kinase inhibitors on the basis of publicly available data. Hence, an established "selectivity versus promiscuity" conundrum of kinase inhibitors is evaluated, close structural analogs with diverging promiscuity levels are analyzed, and machine learning is employed to classify different kinase inhibitor binding modes. In the first study, kinase inhibitor selectivity trends are explored on the kinase pair level, where kinase structural features and phylogenetic relationships are used to explain the obtained selectivity information. Next, the selectivity of clinical kinase inhibitors is inspected on the basis of cell-based profiling campaign results to consolidate the previous findings. Further, clinical candidates are mapped to medicinal chemistry sources and promiscuity levels of different inhibitor subsets are estimated, including designated chemical probes. Additionally, chemical probe analysis is extended to expert-curated representatives to correlate the views established by the scientific community and evaluate their potential for chemical biology applications. Then, large-scale promiscuity analysis of kinase inhibitor data combining several public repositories is performed to explore promiscuity cliffs (PCs) and PC pathways and to study structure-promiscuity relationships. Furthermore, an automated extraction protocol prioritizing the most informative pathways is proposed, with a focus on those containing promiscuity hubs. In addition, the generated promiscuity data structures, including cliffs, pathways, and hubs, are discussed for their potential in experimental and computational follow-ups and subsequently made publicly available. Finally, machine learning methods are used to develop classification models of kinase inhibitors with distinct experimental binding modes, and their potential for the development of novel therapeutics is assessed.