92 research outputs found

    Target prediction utilising negative bioactivity data covering large chemical space.

    BACKGROUND: In silico analyses are increasingly used to support mode-of-action investigations; however, many such approaches do not utilise the large amounts of inactive data held in chemogenomic repositories. The objective of this work is to integrate such bioactivity data into the target prediction of orphan compounds, producing the probability of activity and inactivity for a range of targets. To this end, a novel human bioactivity data set was constructed through the assimilation of over 195 million bioactivity data points deposited in the ChEMBL and PubChem repositories, and the subsequent application of a sphere-exclusion selection algorithm to oversample presumed inactive compounds. RESULTS: A Bernoulli Naïve Bayes algorithm was trained using the data and evaluated using fivefold cross-validation, achieving a mean recall and precision of 67.7% and 63.8% for active compounds and 99.6% and 99.7% for inactive compounds, respectively. We show that the performance of the models is considerably influenced by the underlying intraclass training similarity, the size of a given class of compounds, and the degree of additional oversampling. The method was also validated using compounds extracted from WOMBAT, producing average precision-recall AUC and BEDROC scores of 0.56 and 0.85, respectively. Inactive data points used for this test are based on presumed inactivity, producing an approximate indication of the true extrapolative ability of the models. A distance-based applicability domain analysis was also conducted, indicating that an average Tanimoto coefficient distance of 0.3 or greater between a test and training set can be used to give a global measure of confidence in model predictions. In a final comparison, a method trained solely on active data from ChEMBL achieved precision-recall AUC and BEDROC scores of 0.45 and 0.76.
CONCLUSIONS: The inclusion of inactive data for model training produces models with superior AUC and improved early recognition capabilities, although the results from internal and external validation show differing performance across the breadth of models. The realised target prediction protocol is available at https://github.com/lhm30/PIDGIN. Graphical abstract: The inclusion of large-scale negative training data for in silico target prediction improves the precision-recall AUC and BEDROC scores for target models. The authors thank Krishna C. Bulusu for proofreading the manuscript. LHM would like to thank BBSRC and AstraZeneca for their funding. GD thanks EPSRC and Eli Lilly for funding. This is the final version of the article. It first appeared from Springer via http://dx.doi.org/10.1186/s13321-015-0098-
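
The abstract above mentions a sphere-exclusion selection step and a Tanimoto-distance applicability domain without giving details. The following is a minimal sketch of how such a selection could work, assuming fingerprints are represented as sets of on-bit indices and that distance is defined as 1 minus the Tanimoto coefficient. The function names, the toy fingerprints, and the 0.5 distance cut-off are illustrative assumptions, not taken from the paper or from PIDGIN.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def sphere_exclusion(fingerprints, min_distance):
    """Greedy diverse selection (assumed variant): keep a compound only if its
    Tanimoto distance to every compound already kept is at least min_distance."""
    selected = []
    for idx, fp in enumerate(fingerprints):
        if all(1.0 - tanimoto(fp, fingerprints[j]) >= min_distance for j in selected):
            selected.append(idx)
    return selected


# Toy fingerprints: hypothetical on-bit indices, not real compounds.
pool = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 5}),   # Tanimoto 3/5 with the first, so distance 0.4
    frozenset({10, 11, 12}),   # shares no bits with the first two
    frozenset({10, 11, 13}),   # Tanimoto 2/4 with the previous, so distance 0.5
]
picked = sphere_exclusion(pool, min_distance=0.5)
print(picked)
```

With this cut-off the second compound falls inside the exclusion sphere of the first and is dropped, while the other three are kept. Raising `min_distance` gives a sparser, more diverse sample.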

    Extending in Silico Protein Target Prediction Models to Include Functional Effects.

    In silico protein target deconvolution is frequently used for mechanism-of-action investigations; however, existing protocols usually do not predict compound functional effects, such as activation or inhibition, upon binding to their protein counterparts. This study is hence concerned with including functional effects in target prediction. To this end, we assimilated a bioactivity training set for 332 targets, comprising 817,239 active data points with unknown functional effect (binding data) and 20,761,260 inactive compounds, along with 226,045 activating and 1,032,439 inhibiting data points from functional screens. Chemical space analysis of the data first showed some separation between compound sets (binding and inhibiting compounds were more similar to each other than either binding and activating or activating and inhibiting compounds), providing a rationale for implementing functional prediction models. We employed three different architectures to predict functional response, ranging from simplistic random forest models ('Arch1') to cascaded models which use separate binding and functional effect classification steps ('Arch2' and 'Arch3'), differing in the way training sets were generated. Fivefold stratified cross-validation showed that cascading predictions provide superior precision and recall on an internal test set. We next prospectively validated the architectures using a temporal set of 153,467 in-house data points (after a 4-month interim from initial data extraction). Results showed that Arch3 achieved the highest target-class-averaged precision and recall scores of 71% and 53%, which we attribute to the use of inactive background sets. Distance-based applicability domain (AD) analysis showed that Arch3 provides superior extrapolation into novel areas of chemical space; thus, based on the results presented here, we propose it as the most suitable architecture for the functional effect prediction of small molecules.
We finally conclude that including functional effects could provide vital insight in future studies, to annotate cases of unanticipated functional changeover, as outlined by our CHRM1 case study. LM thanks the Biotechnology and Biological Sciences Research Council (BBSRC) (BB/K011804/1) and AstraZeneca, grant number RG75821
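
The cascaded idea described in this abstract (a binding step followed by a functional-effect step) can be sketched generically as below. This is only an illustration of a two-stage cascade, not the authors' Arch2 or Arch3. The two model functions are hypothetical stand-ins for trained classifiers, and the 0.5 thresholds are assumptions.

```python
from typing import Callable, Sequence

Fingerprint = Sequence[int]


def cascade_predict(
    fp: Fingerprint,
    binding_model: Callable[[Fingerprint], float],
    functional_model: Callable[[Fingerprint], float],
    binding_threshold: float = 0.5,
) -> str:
    """Stage 1 decides whether the compound binds at all; only predicted
    binders are passed to stage 2, which separates activators from inhibitors."""
    if binding_model(fp) < binding_threshold:
        return "inactive"
    p_activating = functional_model(fp)
    return "activating" if p_activating >= 0.5 else "inhibiting"


# Hypothetical stand-in models: real ones would be trained classifiers
# returning a probability for a given fingerprint.
def toy_binding_model(fp: Fingerprint) -> float:
    return sum(fp) / len(fp)      # fraction of bits set


def toy_functional_model(fp: Fingerprint) -> float:
    return float(fp[0])           # first bit decides the functional class


for fp in ([1, 1, 0, 1], [0, 1, 1, 1], [0, 0, 0, 1]):
    print(fp, cascade_predict(fp, toy_binding_model, toy_functional_model))
```

Separating the two decisions lets each stage be trained on the data that suits it: binding and inactive data for the first, functional screen data for the second.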

    Probabilistic Random Forest improves bioactivity predictions close to the classification threshold by taking into account experimental uncertainty.

    Measurements of protein-ligand interactions have reproducibility limits due to experimental errors. Any model based on such assays will consequently have these unavoidable errors influencing its performance, and they should ideally be factored into modelling and output predictions, for example through the actual standard deviation of experimental measurements (σ) or the associated comparability of activity values between the aggregated heterogeneous activity units (i.e., Ki versus IC50 values) during dataset assimilation. However, experimental errors are usually a neglected aspect of model generation. In order to improve upon the current state of the art, we herein present a novel approach toward predicting protein-ligand interactions using a Probabilistic Random Forest (PRF) classifier. The PRF algorithm was applied toward in silico protein target prediction across ~550 tasks from ChEMBL and PubChem. Predictions were evaluated by taking into account various scenarios of experimental standard deviations in both training and test sets, and performance was assessed using fivefold stratified shuffled splits for validation. The largest benefit of incorporating the experimental deviation in PRF was observed for data points close to the binary threshold boundary, where such information was not considered in any way in the original RF algorithm. For example, in cases when σ ranged between 0.4 and 0.6 log units and when ideal probability estimates were between 0.4 and 0.6, the PRF outperformed RF with a median absolute error margin of ~17%. In comparison, the baseline RF outperformed PRF for cases with high confidence to belong to the active class (far from the binary decision threshold), although the RF models gave errors smaller than the experimental uncertainty, which could indicate that they were overtrained and/or over-confident.
Finally, the PRF models trained with putative inactives showed decreased performance compared to PRF models without putative inactives. This could be because putative inactives were not assigned an experimental pXC50 value and were therefore considered inactives with a low uncertainty (which in practice might not be true). In conclusion, PRF can be useful for target prediction models, in particular for data where class boundaries overlap with the measurement uncertainty and where a substantial part of the training data is located close to the classification threshold.
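
To make the idea of an uncertainty-aware label concrete, here is a minimal sketch of one common way to turn a measured pXC50 and an experimental standard deviation σ into a probability of being active: assume normally distributed measurement error and evaluate the normal cumulative distribution function at the classification threshold. The abstract does not state the exact formula used, so this form, the function name, and the threshold of 5.0 log units are assumptions for illustration only.

```python
import math


def activity_probability(pxc50: float, threshold: float, sigma: float) -> float:
    """Probability that the true activity is at or above the threshold,
    assuming Gaussian measurement error with standard deviation sigma."""
    z = (pxc50 - threshold) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


THRESHOLD = 5.0  # hypothetical activity cut-off in log units
SIGMA = 0.5      # hypothetical experimental standard deviation in log units

for value in (4.9, 5.0, 5.1, 7.0):
    p = activity_probability(value, THRESHOLD, SIGMA)
    print(f"pXC50 = {value:.1f}: P(active) = {p:.3f}")
```

Under these assumptions, a measurement exactly at the threshold gets a probability of 0.5, and values within a fraction of σ of the threshold stay close to 0.5 instead of being forced to a hard 0 or 1 label. That is the regime in which the abstract reports the largest benefit.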

    Understanding Cytotoxicity and Cytostaticity in a High-Throughput Screening Collection.

    While mechanisms of cytotoxicity and cytostaticity have been studied extensively from the biological side, relatively little is currently understood regarding areas of chemical space leading to cytotoxicity and cytostasis in large compound collections. Predicting and rationalizing potential adverse mechanisms-of-action (MoAs) of small molecules is, however, crucial for screening library design, given the link between even low-level cytotoxicity and adverse events observed in man. In this study, we analyzed results from a cell-based cytotoxicity screening cascade, comprising 296 970 nontoxic, 5784 cytotoxic and cytostatic, and 2327 cytostatic-only compounds evaluated on the THP-1 cell line. We employed an in silico MoA analysis protocol, utilizing 9.5 million active and 602 million inactive bioactivity points to generate target predictions, annotate predicted targets with pathways, and calculate enrichment metrics to highlight targets and pathways. After review, predictions identify known mechanisms for the top-ranking targets and pathways for both phenotypes and indicate that, while processes involved in cytotoxicity and cytostaticity seem to overlap, some differences between the two phenotypes exist. Cytotoxic predictions highlight many kinases, including the potentially novel cytotoxicity-related target STK32C, while cytostatic predictions outline targets linked with response to DNA damage, metabolism, and cytoskeletal machinery. Fragment analysis was also employed to generate a library of toxicophores to improve general understanding of the chemical features driving toxicity. We highlight substructures with potential kinase-dependent and kinase-independent mechanisms of toxicity. We also trained a cytotoxic classification model on proprietary and public compound readouts, and prospectively validated these on 988 novel compounds comprising difficult and trivial testing instances, to establish the applicability domain of models.
The proprietary model performed with precision and recall scores of 77.9% and 83.8%, respectively. The MoA results and top-ranking substructures with accompanying MoA predictions are available as a platform to assess screening collections. Biotechnology and Biological Sciences Research Council, AstraZenec
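
The abstract refers to enrichment metrics for highlighting targets without naming them. As one possible illustration, the sketch below computes a one-sided hypergeometric enrichment p-value: the chance of seeing at least as many compounds predicted for a target in the phenotype set as observed, if that set were a random draw from the whole collection. The choice of test, the function name, and all counts are assumptions, not the paper's method or data.

```python
from math import comb


def hypergeom_enrichment_p(hits_in_set: int, set_size: int,
                           hits_in_collection: int, collection_size: int) -> float:
    """P(X >= hits_in_set) when set_size compounds are drawn without replacement
    from collection_size compounds, hits_in_collection of which are predicted
    to hit the target."""
    misses = collection_size - hits_in_collection
    upper = min(set_size, hits_in_collection)
    tail = sum(
        comb(hits_in_collection, k) * comb(misses, set_size - k)
        for k in range(hits_in_set, upper + 1)
    )
    return tail / comb(collection_size, set_size)


# Toy counts (hypothetical): 20 compounds in total, 5 predicted for the target,
# 4 compounds in the phenotype set, 3 of which are predicted for the target.
p_value = hypergeom_enrichment_p(hits_in_set=3, set_size=4,
                                 hits_in_collection=5, collection_size=20)
print(f"enrichment p-value = {p_value:.4f}")
```

A small p-value suggests the target is over-represented among the phenotype-associated compounds. In practice such values would need correction for the number of targets tested.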

    Computer-aided design of multi-target ligands at A1R, A2AR and PDE10A, key proteins in neurodegenerative diseases.

    Compounds designed to display polypharmacology may have utility in treating complex diseases, where activity at multiple targets is required to produce a clinical effect. In particular, suitable compounds may be useful in treating neurodegenerative diseases by promoting neuronal survival in a synergistic manner via their multi-target activity at the adenosine A1 and A2A receptors (A1R and A2AR) and phosphodiesterase 10A (PDE10A), which modulate intracellular cAMP levels. Hence, in this work we describe a computational method for the design of synthetically feasible ligands that bind to A1R and A2AR and inhibit PDE10A, involving a retrosynthetic approach employing in silico target prediction and docking, which may be generally applicable to multi-target compound design at several target classes. This approach has identified 2-aminopyridine-3-carbonitriles as the first multi-target ligands at A1R, A2AR and PDE10A, by showing agreement between the ligand- and structure-based predictions at these targets. The series was synthesized via an efficient one-pot scheme and validated pharmacologically as A1R/A2AR–PDE10A ligands, with IC50 values of 2.4–10.0 μM at PDE10A and Ki values of 34–294 nM at A1R and/or A2AR. Furthermore, selectivity profiling of the synthesized 2-aminopyridine-3-carbonitriles against other subtypes of both protein families showed that the multi-target ligand 8 exhibited a minimum of twofold selectivity over all tested off-targets. In addition, both compounds 8 and 16 exhibited the desired multi-target profile and could be considered for further functional efficacy assessment, analog modification to improve selectivity towards A1R, A2AR and PDE10A collectively, and evaluation of their potential synergy in modulating cAMP levels.

    Trisubstituted-imidazoles induce apoptosis in human breast cancer cells by targeting the oncogenic PI3K/Akt/mTOR signaling pathway

    Overactivation of PI3K/Akt/mTOR is linked with carcinogenesis and serves as a potential molecular therapeutic target in the treatment of various cancers. Herein, we report the synthesis of trisubstituted-imidazoles and identify 2-chloro-3-(4,5-diphenyl-1H-imidazol-2-yl)pyridine (CIP) as the lead cytotoxic agent. A Naïve Bayes classifier model for in silico target prediction revealed that CIP targets RAC-beta serine/threonine-protein kinase, which comprises the Akt. Furthermore, CIP downregulated the phosphorylation of Akt, PDK and mTOR proteins, decreased the expression of cyclin D1, Bcl-2, survivin, VEGF and procaspase-3, and increased the cleavage of PARP. In addition, CIP significantly downregulated the CXCL12-induced motility of breast cancer cells, and molecular docking calculations revealed that all compounds bind to Akt2 kinase with high docking scores compared to the library of previously reported Akt2 inhibitors. In summary, we report the synthesis and biological evaluation of imidazoles that induce apoptosis in breast cancer cells by negatively regulating the PI3K/Akt/mTOR signaling pathway.

    Mechanisms of photoreceptor death and survival in mammalian retina

    The mammalian retina, like the rest of the central nervous system, is highly stable and can maintain its structure and function for the full life of the individual, in humans for many decades. Photoreceptor dystrophies are instances of retinal instability. Many are precipitated by genetic mutations, and scores of photoreceptor-lethal mutations have now been identified at the codon level. This review explores the factors which make the photoreceptor more vulnerable to small mutations of its proteins than any other cell of the body, and more vulnerable to environmental factors than any other retinal neurone. These factors include the highly specialised structure and function of the photoreceptors, their high appetite for energy, their self-protective mechanisms and the architecture of their energy supply from the choroidal circulation. Particularly important are the properties of the choroidal circulation, especially its fast flow of near-arterial blood and its inability to autoregulate. Mechanisms which make the retina stable and unstable are then reviewed in three different models of retinal degeneration: retinal detachment, photoreceptor dystrophy and light damage. A two-stage model of the genesis of photoreceptor dystrophies is proposed, comprising an initial "depletion" stage caused by genetic or environmental insult and a second "late" stage during which oxygen toxicity damages and eventually destroys any photoreceptors which survive the initial depletion. It is a feature of the model that the second "late" stage of retinal dystrophies is driven by oxygen toxicity. The implications of these ideas for therapy of retinal dystrophies are discussed.