24 research outputs found

    Comparison of Confirmed Inactive and Randomly Selected Compounds as Negative Training Examples in Support Vector Machine-Based Virtual Screening

    The choice of negative training data for machine learning is a little-explored issue in chemoinformatics. In this study, the influence of alternative sets of negative training data and different background databases on support vector machine (SVM) modeling and virtual screening has been investigated. Target-directed SVM models have been derived on the basis of differently composed training sets containing confirmed inactive molecules or randomly selected database compounds as negative training instances. These models were then applied to search background databases consisting of biological screening data or randomly assembled compounds for available hits. Negative training data were found to systematically influence compound recall in virtual screening. In addition, different background databases had a strong influence on the search results. Our findings also indicated that typical benchmark settings lead to an overestimation of SVM-based virtual screening performance compared to search conditions that are more relevant for practical applications.

    Introduction of Target Cliffs as a Concept To Identify and Describe Complex Molecular Selectivity Patterns

    The study of target specificity or selectivity of small molecules is an important task in drug design. In an ideal situation, a compound would exclusively interact with an individual target and hence be target specific. However, such exclusive binding events are likely to be rare, as increasing evidence suggests. Because many compounds are active against more than one target, apparent selectivity often results from potency differences, i.e., a compound that is highly potent against a given target and weakly potent against one or more others displays target selectivity. In a simple case, a compound might have known activity against a pair of targets and be selective for one over the other. Then, selectivity is straightforward to rationalize. However, there are many more complex selectivity relationships associated with multi-target activities of compounds that are difficult to analyze and compare in a consistent manner. For this purpose, we introduce herein target cliffs as a concept to describe complex selectivity patterns. A target cliff is defined as a pair of targets against which at least one compound displays a large difference in potency. As such, target cliffs are distinct from activity cliffs. However, qualifying target pairs (target cliffs) and compound pairs (activity cliffs) can be systematically extracted from the same data structure, termed target-compound matrices. Furthermore, these two types of cliffs can be compared to identify and prioritize compounds that are selective and reveal structure–activity relationship (SAR) information.
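
    The target cliff definition above can be illustrated with a minimal sketch. It is not the authors' implementation: the dictionary layout of the target-compound matrix, the function name `target_cliffs`, the 2.0 log-unit (100-fold) threshold, and the toy potency values are all assumptions made for illustration.

```python
from itertools import combinations

def target_cliffs(matrix, min_delta=2.0):
    """Return {(target_a, target_b): [compounds]} for qualifying target pairs.

    matrix: {compound: {target: potency}}, potency on a log scale (e.g. pKi).
    min_delta: assumed threshold; 2.0 log units is a 100-fold difference.
    """
    cliffs = {}
    for compound, row in matrix.items():
        # Compare every pair of targets for which this compound has a value.
        for a, b in combinations(sorted(row), 2):
            if abs(row[a] - row[b]) >= min_delta:
                cliffs.setdefault((a, b), []).append(compound)
    return cliffs

# Toy target-compound matrix with made-up pKi values (hypothetical data).
toy_matrix = {
    "cpd1": {"T1": 8.5, "T2": 6.0, "T3": 8.0},
    "cpd2": {"T1": 7.0, "T2": 6.8},
    "cpd3": {"T2": 5.0, "T3": 7.5},
}
cliffs = target_cliffs(toy_matrix)
for pair, compounds in sorted(cliffs.items()):
    print(pair, compounds)
```

    In this toy matrix, one compound is enough to qualify a target pair, which follows the "at least one compound" wording of the definition.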

    Similarity Searching for Potent Compounds Using Feature Selection

    In similarity searching, compound potency is usually not taken into account. Given a set of active reference compounds, similarity to database molecules is calculated using different metrics without considering compound potency as a search parameter. Herein, we introduce a feature selection method for fingerprint similarity searching to maximize compound recall and preferentially detect potent compounds. On the basis of training examples, fingerprint features are selected that identify potent compounds and produce high recall. Using the reduced fingerprint representations, potent hits are preferentially detected, even if reference compounds have only moderate or low potency. Small sets of simple chemical features are found to yield high search performance.

    Application of a New Scaffold Concept for Computational Target Deconvolution of Chemical Cancer Cell Line Screens

    Target deconvolution of phenotypic assays is a hot topic in chemical biology and drug discovery. The ultimate goal is the identification of targets for compounds that produce interesting phenotypic readouts. A variety of experimental and computational strategies have been devised to aid this process. A widely applied computational approach infers putative targets of new active molecules on the basis of their chemical similarity to compounds with activity against known targets. Herein, we introduce a molecular scaffold-based variant for similarity-based target deconvolution from chemical cancer cell line screens that were used as a model system for phenotypic assays. A new scaffold type, termed analog series-based (ASB) scaffold, was used for substructure-based similarity assessment. Compared with conventional scaffolds and compound-based similarity calculations, target assignment centered on ASB scaffolds resulting from screening hits and bioactive reference compounds restricted the number of target hypotheses in a meaningful way and led to a significant enrichment of known cancer targets among candidates.

    Composition and Topology of Activity Cliff Clusters Formed by Bioactive Compounds

    The assessment of activity cliffs has thus far mostly focused on compound pairs, although the majority of activity cliffs are not formed in isolation but in a coordinated manner involving multiple active compounds and cliffs. However, the composition of coordinated activity cliff configurations and their topologies are unknown. Therefore, we have identified all activity cliff configurations formed by currently available bioactive compounds and analyzed them in network representations where activity cliff configurations occur as clusters. The composition, topology, frequency of occurrence, and target distribution of activity cliff clusters have been determined. A limited number of large cliff clusters with unique topologies were identified that were centers of activity cliff formation. These clusters originated from a small number of target sets. However, most clusters were of small to moderate size. Three basic topologies were sufficient to describe recurrent activity cliff cluster topologies. For example, frequently occurring clusters with star topology determined the scale-free character of the global activity cliff network and represented a characteristic activity cliff configuration. Large clusters with complex topology were often found to contain different combinations of basic topologies. Our study provides a first view of activity cliff configurations formed by currently available bioactive compounds and of the recurrent topologies of activity cliff clusters. Activity cliff clusters of defined topology can be selected, and from compounds forming the clusters, SAR information can be obtained. The SAR information of activity cliff clusters sharing a specific activity and topology can be compared.
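
    As a rough illustration of the network view described above, the sketch below treats compounds as nodes and activity cliffs as edges, extracts clusters as connected components, and flags those with star topology. The helper names `clusters` and `is_star`, the minimum star size of three compounds, and the toy edge list are assumptions; the abstract does not specify how topologies were assigned.

```python
from collections import defaultdict

def clusters(edges):
    """Return (adjacency, components) for an undirected graph of (u, v) edges."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        # Depth-first traversal collects one connected component (cluster).
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        comps.append(comp)
    return adj, comps

def is_star(adj, comp):
    """True if one hub is linked to all other nodes and no other edges exist."""
    n = len(comp)
    if n < 3:  # assumed minimum size for a star
        return False
    degrees = sorted(len(adj[node]) for node in comp)
    return degrees == [1] * (n - 1) + [n - 1]

# Hypothetical cliff network: each edge is one activity cliff.
cliff_edges = [("A", "B"), ("A", "C"), ("A", "D"),   # star around A
               ("E", "F"),                            # isolated cliff pair
               ("G", "H"), ("H", "I"), ("I", "G")]    # triangle
adj, comps = clusters(cliff_edges)
stars = [sorted(c) for c in comps if is_star(adj, c)]
print(stars)
```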

    Current Compound Coverage of the Kinome

    Publicly available kinase inhibitors have been analyzed in detail. Nearly 19,000 inhibitors have been identified with activity against 266 different kinases. Thus, about half of the human kinome is currently covered with active small molecules. The distribution of inhibitors across the kinome is uneven. Most available kinase inhibitors are likely to be type I inhibitors. By contrast, type II inhibitors are rare but usually have high potency. Kinase inhibitors generally display high scaffold diversity. Activity cliffs with at least a 100-fold difference in potency are only found for inhibitors of 106 kinases, which is partly due to the small numbers of compounds available for many kinases, in addition to scaffold diversity. Moreover, kinase inhibitors are less promiscuous than often thought. More than 70% of available inhibitors are only annotated with a single kinase activity, and only ∼1% of the inhibitors are active against five or more kinases.

    Matched Molecular Pair Analysis of Small Molecule Microarray Data Identifies Promiscuity Cliffs and Reveals Molecular Origins of Extreme Compound Promiscuity

    The study of compound promiscuity is a hot topic in medicinal chemistry and drug discovery research. Promiscuous compounds are increasingly identified, but the molecular basis of promiscuity is currently poorly understood. Utilizing the matched molecular pair formalism, we have analyzed patterns of compound promiscuity in a publicly available small molecule microarray data set. On the basis of our analysis, we introduce “promiscuity cliffs” as pairs of structural analogs with single-site substitutions that lead to large-magnitude differences in apparent compound promiscuity involving between 50 and 97 unrelated targets. No substructures or substructure transformations have been detected that are generally responsible for introducing promiscuity. However, within a given structural context, small chemical replacements were found to lead to dramatic promiscuity effects. On the basis of our analysis, promiscuity is not an inherent feature of molecular scaffolds but can be induced by small chemical substitutions. Promiscuity cliffs provide immediate access to such modifications.

    Prediction of Individual Compounds Forming Activity Cliffs Using Emerging Chemical Patterns

    Activity cliffs are formed by structurally similar or analogous compounds having large potency differences. In medicinal chemistry, pairs or groups of compounds forming activity cliffs are of interest for structure–activity relationship (SAR) analysis and compound optimization. Thus far, activity cliff assessment has mostly been descriptive, i.e., compound data sets and activity landscape representations have been searched for activity cliffs in the context of SAR analysis. Only recently have first attempts been made to depart from descriptive analysis and predict activity cliffs. This has been done by building computational models that distinguish compound pairs forming activity cliffs from non-cliff pairs. However, it is in principle more challenging to predict single compounds that participate in activity cliffs. Here, we show that individual compounds having high or low potency can be accurately predicted to form activity cliffs on the basis of emerging chemical patterns.

    Classification of Compounds with Distinct or Overlapping Multi-Target Activities and Diverse Molecular Mechanisms Using Emerging Chemical Patterns

    The emerging chemical patterns (ECP) approach has been introduced for compound classification. Thus far, only very few ECP applications have been reported. Here, we further investigate the ECP methodology by studying complex classification problems. The analysis involves multi-target data sets with systematically organized subsets of compounds having distinct or overlapping target activities and, in addition, data sets containing classes of specifically active compounds with different mechanisms of action. In systematic classification trials focusing on individual compound subsets or mechanistic classes, ECP calculations utilizing numerical descriptors achieve moderate to high sensitivity, dependent on the data set, and consistently high specificity. Accurate ECP predictions are already obtained on the basis of very small learning sets with only three positive training instances, which distinguishes the ECP approach from many other machine learning techniques.

    Compound Pathway Model To Capture SAR Progression: Comparison of Activity Cliff-Dependent and -Independent Pathways

    A compound pathway model is introduced to monitor SAR progression in compound data sets. Pathways are formed by sequences of structurally analogous compounds with stepwise increasing potency that ultimately yield highly potent compounds. Hence, the model was designed to mimic compound optimization efforts. Different pathway categories were defined. Pathways originating from any active compound in a data set were systematically identified, including compounds forming activity cliffs. The relative frequency of activity cliff-dependent and -independent pathways was determined and compared. In 23 of 39 different compound data sets that qualified for our analysis, significant differences in the relative frequency of activity cliff-dependent and -independent pathways were observed. In 17 of these 23 data sets, activity cliff-dependent pathways occurred with higher relative frequency than cliff-independent pathways. In addition, pathways originating from the majority of activity cliff compounds displayed desired SAR progression, reflecting SAR information gain associated with activity cliffs.
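
    The pathway idea described above (sequences of analogs with stepwise increasing potency that end at a highly potent compound) can be sketched as a simple graph traversal. This is only one plausible reading under stated assumptions: the function name `potency_pathways`, the analog map, the potency cutoff of 8.0 for "highly potent", and the toy values are hypothetical and not taken from the study.

```python
def potency_pathways(analogs, potency, start, min_final=8.0):
    """Enumerate pathways from `start` along analog links with rising potency.

    analogs: {compound: set of structural analogs} (assumed symmetric).
    potency: {compound: potency on a log scale}.
    min_final: assumed cutoff for a "highly potent" end point.
    A pathway is kept only if it cannot be extended further and its last
    compound reaches min_final.
    """
    paths = []

    def extend(path):
        current = path[-1]
        # Only steps to strictly more potent analogs are allowed, so a
        # compound can never be revisited within one pathway.
        steps = [n for n in sorted(analogs.get(current, ()))
                 if potency[n] > potency[current]]
        if not steps:
            if len(path) > 1 and potency[current] >= min_final:
                paths.append(path)
            return
        for nxt in steps:
            extend(path + [nxt])

    extend([start])
    return paths

# Hypothetical analog relationships and potency values.
analogs = {"a": {"b", "c", "f"}, "b": {"a", "d"}, "c": {"a", "d"},
           "d": {"b", "c", "e"}, "e": {"d"}, "f": {"a"}}
potency = {"a": 5.0, "b": 6.0, "c": 5.5, "d": 7.0, "e": 8.5, "f": 6.5}
paths = potency_pathways(analogs, potency, "a")
for p in paths:
    print(" -> ".join(p))
```

    The branch through compound "f" is discarded in this toy example because it ends below the assumed potency cutoff.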