Overview of allosteric models used in the case studies.
<p>In the case of Protein Kinase B, a three-class model was created (natural-ligand-mimicking peptides formed the third class). For class errors, sensitivity is the recall of allosteric small molecules, specificity is the recall of non-allosteric molecules, and in the case of Protein Kinase B a third class error (recall of allosteric biologicals) is added. Likewise, positive predictive value (PPV) quantifies precision for allosteric small molecules, negative predictive value (NPV) quantifies precision for non-allosteric molecules, and a third value quantifies precision for allosteric biologicals in the case of Protein Kinase B. In addition to these parameters, the MCC is calculated. Note that the values for the non-MCC parameters in the three-class model have been scaled to the same range as those of the binary classification models to allow direct comparison. Abbreviations: MCC – Matthews Correlation Coefficient, HIV – Human Immunodeficiency Virus.</p>
ROC curves for out-of-bag validation of the allosteric classifier models trained in case studies 2–4.
<p>(A) ROC curve for the HIV-RT classifier. (B) ROC curve for the adenosine receptors classifier. (C) ROC curve for the Protein Kinase B classifier (note that here a ternary model was used as opposed to a binary model).</p>
Chemical, Target, and Bioactive Properties of Allosteric Modulation
<div><p>Allosteric modulators are ligands for proteins that exert their effects via a binding site other than the natural (orthosteric) ligand site, and hence form a conceptually distinct class of ligands for a target of interest. Here, the physicochemical and structural features of a large set of allosteric and non-allosteric ligands from the ChEMBL database of bioactive molecules are analyzed. In general, allosteric modulators are relatively smaller, more lipophilic and more rigid compounds, though large differences exist between different targets and target classes. Furthermore, there are differences in the distribution of targets that bind these allosteric modulators. Allosteric modulators are over-represented in membrane receptors, ligand-gated ion channels and nuclear receptor targets, but are under-represented in enzymes (primarily proteases and kinases). Moreover, allosteric modulators tend to bind to their targets with a slightly lower potency (5.96 log units versus 6.66 log units, p < 0.01). However, this lower absolute affinity is compensated by their lower molecular weight and more lipophilic nature, leading to similar binding efficiency and surface efficiency indices. Subsequently, a series of classifier models is trained: initially target-class-independent models, followed by finer-grained target (architecture/functional class) based models using the target hierarchy of the ChEMBL database. Applications of these insights include the selection of likely allosteric modulators from existing compound collections, the design of novel chemical libraries biased towards allosteric regulators, and the selection of targets likely to yield allosteric modulators on screening. All data sets used in the paper are available for download.</p></div>
L2 target class distribution of both the allosteric (A) and non-allosteric data (B) sets.
<p>The distribution of the target classes differed between the two sets, which confirmed that targets that are easy to hit via non-allosteric inhibitors are not necessarily easy to hit via an allosteric modulator, and vice versa. Abbreviations: 7TM1 - Class A GPCRs, 7TM2 - Class B GPCRs, 7TM3 - Class C GPCRs, IP3 - Inositol triphosphate receptors, KIR - Killer-cell Immunoglobulin-like Receptors, LGIC - Ligand Gated Ion Channels, RYR - Ryanodine Receptors, SUR - Sulfonylurea Receptors, TRP - Transient receptor potential channels, VGC - Voltage Gated Ion Channels.</p>
(A) Receiver Operating Characteristic (ROC) curve for out-of-bag validation of the allosteric classifier trained on 70% of the allosteric and balanced orthosteric set, which demonstrated good performance.
<p>(B) External validation on the remaining 30% of the data set confirmed good predictive performance.</p>
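<p>The 70%/30% split and the area under a ROC curve can be illustrated with a minimal sketch. It does not reproduce the classifier or its out-of-bag procedure; the function names, the fixed seed and the toy scores are assumptions made for illustration only.</p>

```python
import random


def split_70_30(items, seed=0):
    """Shuffle a copy of `items` and split it 70% / 30% (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1)
    receives a higher score than a randomly chosen negative (label 0);
    ties count as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are needed to compute an AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = allosteric, 0 = non-allosteric (made-up scores).
train, test = split_70_30(range(10))
example_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

<p>In practice the scores would be the class probabilities predicted for the held-out 30%, and a library routine would normally replace the quadratic loop.</p>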
Bioactivity measurements for allosteric and non-allosteric compounds.
<p>A threshold of 6 log units was used to classify compounds as ‘active’. Abbreviations: MAD – Median Absolute Deviation, LE – Ligand Efficiency (kcal/mol per non-hydrogen atom), NBEI – Normalized Binding Efficiency Index (non-hydrogen atoms), BEI – Binding Efficiency Index (molecular weight), SEI – Surface Efficiency Index (polar surface area/100), NSEI – Normalized Surface Efficiency Index (polar atoms), <i>n</i>BEI – Normalized Binding Efficiency Index taking the log after calculation of the ratio (non-hydrogen atoms), <i>m</i>BEI – Normalized Binding Efficiency Index taking the log after calculation of the ratio (molecular weight). See Abad-Zapatero <i>et al.</i> <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003559#pcbi.1003559-AbadZapatero2" target="_blank">[52]</a>.</p>
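<p>The efficiency indices listed above can be sketched as follows, using the denominators given in the abbreviations. The exact conventions are assumptions: molecular weight is taken in kDa for BEI and in Da for <i>m</i>BEI, LE is computed as 2.303·RT·pActivity per non-hydrogen atom at 300 K, and the function and parameter names are hypothetical. The cited reference should be consulted for the definitions actually used.</p>

```python
import math

GAS_CONSTANT_KCAL = 1.987e-3  # kcal / (mol * K)


def efficiency_indices(p_activity, mol_weight_da, psa_a2,
                       heavy_atoms, polar_atoms, temperature_k=300.0):
    """Return a dict of ligand efficiency indices (assumed conventions).

    p_activity    -- negative log10 of the affinity in molar units
    mol_weight_da -- molecular weight in Da
    psa_a2        -- polar surface area in square angstroms
    heavy_atoms   -- number of non-hydrogen atoms
    polar_atoms   -- number of polar atoms (assumed: N and O)
    """
    affinity_molar = 10.0 ** (-p_activity)
    # Free energy (kcal/mol) corresponding to one log unit of affinity.
    kcal_per_log_unit = GAS_CONSTANT_KCAL * temperature_k * math.log(10)
    return {
        "LE": kcal_per_log_unit * p_activity / heavy_atoms,
        "BEI": p_activity / (mol_weight_da / 1000.0),
        "SEI": p_activity / (psa_a2 / 100.0),
        "NBEI": p_activity / heavy_atoms,
        "NSEI": p_activity / polar_atoms,
        # Log taken after forming the ratio, as described in the caption.
        "nBEI": -math.log10(affinity_molar / heavy_atoms),
        "mBEI": -math.log10(affinity_molar / mol_weight_da),
    }


# Made-up compound sitting exactly at the 'active' threshold of 6 log units.
indices = efficiency_indices(p_activity=6.0, mol_weight_da=300.0,
                             psa_a2=60.0, heavy_atoms=20, polar_atoms=4)
```

<p>Because the ratio-based indices divide potency by size or polarity, a smaller, less polar compound with lower potency can reach the same BEI or SEI as a larger, more potent one.</p>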
Data set composition.
<p>Composition of the data sets generated. The allosteric set was obtained via text mining of abstracts; the non-allosteric (full) set was the remainder of ChEMBL obtained using the same constraints as the allosteric set (e.g. limiting bioactivity to primary assays). The non-allosteric (balanced) set was derived from the non-allosteric (full) set by taking a random percentage of each L2 target class present in the allosteric set. The classes ‘Organic’ and ‘Inorganic’ were subsets of the ‘Small molecules’ class. ‘Peptide’ was a subset of the ‘Biologicals’ class. Abbreviations: L1 – Level 1 target classification, L2 – Level 2 target classification.</p>
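<p>The derivation of the balanced set can be sketched as per-class random sampling. The caption does not state the percentage or whether it differs per class, so a single fraction is assumed here; the record layout (a dict with an <code>l2_class</code> key), the function name and the seed are likewise hypothetical.</p>

```python
import random
from collections import defaultdict


def balanced_sample(records, allowed_classes, fraction, seed=0):
    """Draw `fraction` of the records from each allowed L2 target class.

    records         -- iterable of dicts with an 'l2_class' key (assumed layout)
    allowed_classes -- L2 classes present in the allosteric set
    fraction        -- share of each class to keep (assumed equal for all classes)
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for record in records:
        if record["l2_class"] in allowed_classes:
            by_class[record["l2_class"]].append(record)

    sample = []
    for l2_class in sorted(by_class):  # sorted for reproducibility
        members = by_class[l2_class]
        keep = round(len(members) * fraction)
        sample.extend(rng.sample(members, keep))  # sampling without replacement
    return sample


# Made-up 'full' set: class names are placeholders, not counts from the paper.
full_set = (
    [{"id": i, "l2_class": "7TM1"} for i in range(100)]
    + [{"id": 100 + i, "l2_class": "Kinase"} for i in range(50)]
    + [{"id": 150 + i, "l2_class": "Protease"} for i in range(30)]
)
balanced = balanced_sample(full_set, {"7TM1", "Kinase"}, fraction=0.1)
```

<p>Classes absent from the allosteric set (here the placeholder class "Protease") contribute nothing to the balanced set, so its class membership matches the allosteric set.</p>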
Binary classification confusion matrix.
<p>Recall values are calculated over the rows and precision values over the columns.</p>
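<p>A minimal sketch of the binary case, assuming rows hold the actual class and columns the predicted class, with allosteric as the positive class. The counts are made up and the function name is hypothetical; the MCC formula is the standard binary definition.</p>

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Metrics from a 2x2 confusion matrix laid out as
    [[tp, fn],   <- actual allosteric
     [fp, tn]]   <- actual non-allosteric
    """
    # Recall values: computed along the rows (actual classes).
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Precision values: computed along the columns (predicted classes).
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    # Matthews Correlation Coefficient; defined as 0 when a margin is empty.
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "mcc": mcc,
    }


# Made-up counts for illustration only.
metrics = binary_metrics(tp=80, fn=20, fp=10, tn=90)
```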
Ternary classification confusion matrix.
<p>Recall values are calculated over the rows and precision values over the columns.</p>
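<p>The same row and column convention extends to three classes. The sketch below assumes rows are actual classes and columns predicted classes, and uses the common multi-class generalisation of the MCC; whether the original analysis used this generalisation is not stated, and the scaling of the non-MCC values mentioned elsewhere is not reproduced. The counts are made up.</p>

```python
import math


def per_class_recall_precision(matrix):
    """Recall per row and precision per column of a square confusion matrix."""
    size = len(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    recall = [matrix[k][k] / row_totals[k] if row_totals[k] else 0.0
              for k in range(size)]
    precision = [matrix[k][k] / col_totals[k] if col_totals[k] else 0.0
                 for k in range(size)]
    return recall, precision


def multiclass_mcc(matrix):
    """Multi-class MCC (assumed generalisation; reduces to the binary MCC)."""
    size = len(matrix)
    actual = [sum(row) for row in matrix]
    predicted = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    correct = sum(matrix[k][k] for k in range(size))
    total = sum(actual)
    numerator = correct * total - sum(a * p for a, p in zip(actual, predicted))
    denominator = math.sqrt(
        (total ** 2 - sum(p * p for p in predicted))
        * (total ** 2 - sum(a * a for a in actual))
    )
    return numerator / denominator if denominator else 0.0


# Rows/columns: allosteric small molecule, non-allosteric, allosteric biological.
confusion = [
    [50, 5, 5],
    [10, 80, 10],
    [0, 10, 30],
]
recall, precision = per_class_recall_precision(confusion)
mcc = multiclass_mcc(confusion)
```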
The concept of multiple binding sites on a single protein visualized schematically (A) and in Protein Data Bank structure 1JSU (B).
<p>The ATP binding site, commonly referred to as the orthosteric binding site, is shown in green on cyclin-dependent kinase 2 (CDK2, grey). One allosteric binding site (type V inhibitors) is shown in red, located close to the orthosteric binding site. Also shown are a non-allosteric inhibitor (green, projected from PDB 1HCK) and an allosteric inhibitor (red, projected from PDB 3PY1). Finally, cyclin-A (light grey) is shown paired with CDK2, together with the natural inhibitor CDKN1B (dark green), to illustrate the potential of allosteric inhibitors to disrupt the CDK2-cyclin-A protein-protein interaction.</p>