14 research outputs found

    Analysis of the FLVR motif of SHIP1 and its importance for the protein stability of SH2 containing signaling proteins

    Under embargo until: 2020-08-02
    Binding of proteins with SH2 domains to tyrosine-phosphorylated signaling proteins is a key mechanism for transmission of biological signals within the cell. Characterization of dysregulated proteins in cell signaling pathways is important for the development of therapeutic approaches. The AKT pathway is a frequently upregulated pathway in most cancer cells and the SH2-containing inositol 5-phosphatase SHIP1 is a negative regulator of the AKT pathway. In this study we investigated different mutations of the conserved FLVR motif of the SH2 domain and putative phosphorylation sites of SHIP1 which are located in close proximity to its FLVR motif. We demonstrate that patient-derived SHIP1-FLVR motif mutations, e.g., F28L and L29F, show reduced protein expression and increased phospho-AKT-S473 levels in comparison to SHIP1 wild type. The estimated half-life of SHIP1-F28L protein was reduced from 23.2 h to 0.89 h in TF-1 cells and from 4.7 h to 0.6 h in Jurkat cells. These data indicate that the phenylalanine residue at position 28 of SHIP1 is important for its stability. Replacement of F28 with other aromatic residues like tyrosine and tryptophan preserves protein stability, while replacement with non-aromatic amino acids like leucine, isoleucine, valine or alanine severely affects the stability of SHIP1. In consequence, a SHIP1 mutant with an aromatic amino acid at position 28 (e.g., F28W) can rescue the inhibitory function of wild type SHIP1, whereas SHIP1 mutants with non-aromatic amino acids (e.g., F28V) no longer inhibit cell growth. A detailed structural analysis revealed that F28 forms hydrophobic surface contacts, in particular with W5, I83, L97 and P100, which can be maintained by tyrosine and tryptophan residues, but not by non-aromatic residues at position 28. In line with this model of mutation-induced instability of SHIP1-F28L, treatment of cells with the proteasomal inhibitor MG132 was able to rescue expression of SHIP1-F28L.
In addition, mutation of the putative phosphorylation sites S27 and S33 adjacent to the FLVR motif of SHIP1 has an influence on its protein stability. These results further support a functional role of SHIP1 as a tumor suppressor protein and indicate a regulation of protein expression of SH2 domain containing proteins via the FLVR motif.
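
To give a feel for what the reported half-lives imply, the following minimal sketch converts a half-life into a decay constant and a remaining fraction. It assumes simple first-order (exponential) decay, which the abstract does not state, and it assumes that 23.2 h refers to wild-type SHIP1 in TF-1 cells. The function names are illustrative only.

```python
import math

def decay_constant(half_life_h: float) -> float:
    """First-order rate constant k (per hour) for a given half-life in hours."""
    return math.log(2) / half_life_h

def fraction_remaining(t_h: float, half_life_h: float) -> float:
    """Fraction of protein left after t_h hours, assuming first-order decay."""
    return math.exp(-decay_constant(half_life_h) * t_h)

# Half-lives quoted in the abstract for TF-1 cells (hours).
# Assumption: the longer value belongs to wild-type SHIP1.
HALF_LIFE_WT_TF1 = 23.2
HALF_LIFE_F28L_TF1 = 0.89

if __name__ == "__main__":
    for label, t_half in [("wild type", HALF_LIFE_WT_TF1), ("F28L", HALF_LIFE_F28L_TF1)]:
        print(label, round(fraction_remaining(4.0, t_half), 3))
```

Under this assumption, most of the wild-type protein would still be present after 4 h, while only a small fraction of the F28L mutant would remain.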

    CYPstrate: A Set of Machine Learning Models for the Accurate Classification of Cytochrome P450 Enzyme Substrates and Non-Substrates

    The interaction of small organic molecules such as drugs, agrochemicals, and cosmetics with cytochrome P450 enzymes (CYPs) can lead to substantial changes in the bioavailability of active substances and hence consequences with respect to pharmacological efficacy and toxicity. Therefore, efficient means of predicting the interactions of small organic molecules with CYPs are of high importance to a host of different industries. In this work, we present a new set of machine learning models for the classification of xenobiotics into substrates and non-substrates of nine human CYP isozymes: CYPs 1A2, 2A6, 2B6, 2C8, 2C9, 2C19, 2D6, 2E1, and 3A4. The models are trained on an extended, high-quality collection of known substrates and non-substrates and have been subjected to thorough validation. Our results show that the models yield competitive performance and are favorable for the detection of CYP substrates. In particular, a new consensus model reached high performance, with Matthews correlation coefficients (MCCs) between 0.45 (CYP2C8) and 0.85 (CYP3A4), although at the cost of coverage. The best models presented in this work are accessible free of charge via the “CYPstrate” module of the New E-Resource for Drug Discovery (NERDD)
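
The Matthews correlation coefficient quoted above can be computed directly from a binary confusion matrix. The sketch below is a generic implementation of the standard formula, not code from CYPstrate; returning 0.0 when the denominator vanishes is a common convention assumed here.

```python
import math

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    tp/tn/fp/fn are counts of true positives, true negatives,
    false positives and false negatives.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention (assumed): return 0.0 when a row or column of the matrix is empty.
    return numerator / denominator if denominator else 0.0
```

The value ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect classification), which makes it informative for imbalanced substrate/non-substrate sets.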

    ALADDIN: Docking Approach Augmented by Machine Learning for Protein Structure Selection Yields Superior Virtual Screening Performance

    Protein flexibility and solvation pose major challenges to docking algorithms and scoring functions. One established strategy for addressing these challenges is to use multiple protein conformations for docking (all‐against‐all ensemble docking). Recent studies have shown that the performance of ensemble docking can be improved by selecting the most relevant protein structures for docking. In search of a robust approach to protein structure selection, we have come up with an integrated mAchine Learning AnD DockINg approach (ALADDIN). ALADDIN employs a battery of random forest classifiers to select, individually for each compound of interest, from an ensemble of protein structures, the single most suitable protein structure for docking. ALADDIN outperformed the best single‐structure docking runs, ensemble docking and a similarity‐based docking approach on three out of four investigated targets, with up to 0.15, 0.11 and 0.16 higher area under the receiver operating characteristic curve (AUC) values, respectively. Only in the case of cytochrome P450 3A4 did ALADDIN, like all of the other tested approaches, fail to obtain decent performance. ALADDIN can be particularly useful for structure‐based virtual screening of malleable proteins, including kinases, some viral enzymes and anti‐targets.
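
The AUC values used to compare the screening approaches can be understood as the probability that a randomly chosen active compound is ranked above a randomly chosen inactive one. The sketch below computes it by pairwise comparison. It is a generic illustration, not ALADDIN code, and it assumes that higher scores mean "more likely active" (docking scores where lower is better would need to be negated first).

```python
def roc_auc(scores: list[float], labels: list[int]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    labels: 1 for active compounds, 0 for inactive compounds.
    Assumes a higher score means a compound is more likely active.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one active and one inactive compound")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

The pairwise loop is quadratic in the number of compounds, which is fine for illustration but would be replaced by a rank-based computation for large screening libraries.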

    Alignment-Based Prediction of Sites of Metabolism

    Prediction of metabolically labile atom positions in a molecule (sites of metabolism) is a key component of the simulation of xenobiotic metabolism as a whole, providing crucial information for the development of safe and effective drugs. In 2008, an exploratory study was published in which sites of metabolism were derived based on molecular shape- and chemical feature-based alignment to a molecule whose site of metabolism (SoM) had been determined by experiments. We present a detailed analysis of the breadth of applicability of alignment-based SoM prediction, including transfer of the approach from a structure- to ligand-based method and extension of the applicability of the models from cytochrome P450 2C9 to all cytochrome P450 isozymes involved in drug metabolism. We evaluate the effect of molecular similarity of the query and reference molecules on the ability of this approach to accurately predict SoMs. In addition, we combine the alignment-based method with a leading chemical reactivity model to take reactivity into account. The combined model yielded superior performance in comparison to the alignment-based approach and the reactivity models with an average area under the receiver operating characteristic curve of 0.85 in cross-validation experiments. In particular, early enrichment was improved, as evidenced by higher BEDROC scores (mean BEDROC = 0.59 for α = 20.0, mean BEDROC = 0.73 for α = 80.5)

    GLORY: Generator of the structures of likely cytochrome P450 metabolites based on predicted sites of metabolism

    Computational prediction of xenobiotic metabolism can provide valuable information to guide the development of drugs, cosmetics, agrochemicals, and other chemical entities. We have previously developed FAME 2, an effective tool for predicting sites of metabolism (SoMs). In this work, we focus on the prediction of the chemical structures of metabolites, in particular metabolites of xenobiotics. To this end, we have developed a new tool, GLORY, which combines SoM prediction with FAME 2 and a new collection of rules for metabolic reactions mediated by the cytochrome P450 enzyme family. GLORY has two modes: MaxEfficiency and MaxCoverage. For MaxEfficiency mode, the use of predicted SoMs to restrict the locations in the molecule at which the reaction rules could be applied was explored. For MaxCoverage mode, the predicted SoM probabilities were instead used to develop a new scoring approach for the predicted metabolites. With this scoring approach, GLORY achieves a recall of 0.83 and can predict at least one known metabolite within the top three ranked positions for 76% of the molecules of a new, manually curated test set. GLORY is freely available as a web server at https://acm.zbh.uni-hamburg.de/glory/, and the datasets and reaction rules are provided in the Supplementary Material
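
The two figures of merit mentioned above (recall, and whether a known metabolite appears among the top three ranked predictions) can be sketched as follows. This is an illustration under assumptions: metabolites are represented by hypothetical string identifiers, and the abstract does not specify how GLORY aggregates recall over the test set.

```python
def recall(predicted: list[str], known: set[str]) -> float:
    """Fraction of known metabolites that appear anywhere in the predictions."""
    if not known:
        raise ValueError("at least one known metabolite is required")
    return len(known & set(predicted)) / len(known)

def hit_in_top_k(ranked_predictions: list[str], known: set[str], k: int = 3) -> bool:
    """True if at least one known metabolite is among the k best-ranked predictions."""
    return any(m in known for m in ranked_predictions[:k])

# Hypothetical example: identifiers stand in for metabolite structures.
ranked = ["M7", "M2", "M9", "M1", "M4"]   # best-scored prediction first
known_metabolites = {"M1", "M2", "M5"}
```

For this hypothetical parent molecule, two of the three known metabolites are predicted, and one of them (M2) falls within the top three ranks.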

    Skin Doctor CP: Conformal Prediction of the Skin Sensitization Potential of Small Organic Molecules

    Skin sensitization potential or potency is an important end point in the safety assessment of new chemicals and new chemical mixtures. Formerly, animal experiments such as the local lymph node assay (LLNA) were the main form of assessment. Today, however, the focus lies on the development of non-animal testing approaches (i.e., in vitro and in chemico assays) and computational models. In this work, we investigate, based on publicly available LLNA data, the ability of aggregated, Mondrian conformal prediction classifiers to differentiate between non-sensitizing and sensitizing compounds as well as between two levels of skin sensitization potential (weak to moderate sensitizers, and strong to extreme sensitizers). The advantage of the conformal prediction framework over other modeling approaches is that it assigns compounds to activity classes only if a defined minimum level of confidence is reached for the individual predictions. This eliminates the need for applicability domain criteria that often are arbitrary in their nature and less flexible. Our new binary classifier, named Skin Doctor CP, differentiates nonsensitizers from sensitizers with a higher reliability-to-efficiency ratio than the corresponding nonconformal prediction workflow that we presented earlier. When tested on a set of 257 compounds at the significance levels of 0.10 and 0.30, the model reached an efficiency of 0.49 and 0.92, and an accuracy of 0.83 and 0.75, respectively. In addition, we developed a ternary classification workflow to differentiate nonsensitizers, weak to moderate sensitizers, and strong to extreme sensitizers.
Although this model achieved satisfactory overall performance (accuracies of 0.90 and 0.73, and efficiencies of 0.42 and 0.90, at significance levels 0.10 and 0.30, respectively), it did not obtain satisfying class-wise results (at a significance level of 0.30, the validities obtained for nonsensitizers, weak to moderate sensitizers, and strong to extreme sensitizers were 0.70, 0.58, and 0.63, respectively). We argue that the model is, in consequence, unable to reliably identify strong to extreme sensitizers and suggest that other ternary models derived from the currently accessible LLNA data might suffer from the same problem. Skin Doctor CP is available via a public web service at https://nerdd.zbh.uni-hamburg.de/skinDoctorII/
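
The trade-off between significance level, efficiency and validity described above can be illustrated with a small sketch. It assumes the usual conformal prediction conventions: a class is included in the prediction set when its p-value exceeds the significance level, efficiency is the fraction of single-class prediction sets, and validity is the fraction of prediction sets that contain the true class. The p-values below are hypothetical and are not taken from Skin Doctor CP.

```python
def prediction_set(p_values: dict[str, float], significance: float) -> set[str]:
    """Classes whose conformal p-value exceeds the significance level."""
    return {label for label, p in p_values.items() if p > significance}

def evaluate(p_value_rows: list[dict[str, float]],
             true_labels: list[str],
             significance: float) -> tuple[float, float]:
    """Return (efficiency, validity) at the given significance level.

    efficiency: fraction of compounds with exactly one predicted class
    validity:   fraction of compounds whose prediction set contains the true class
    """
    sets = [prediction_set(row, significance) for row in p_value_rows]
    n = len(sets)
    efficiency = sum(len(s) == 1 for s in sets) / n
    validity = sum(y in s for s, y in zip(sets, true_labels)) / n
    return efficiency, validity

# Hypothetical p-values for three compounds (binary case).
rows = [
    {"sensitizer": 0.80, "nonsensitizer": 0.05},
    {"sensitizer": 0.40, "nonsensitizer": 0.20},
    {"sensitizer": 0.02, "nonsensitizer": 0.03},
]
truth = ["sensitizer", "nonsensitizer", "nonsensitizer"]
```

Raising the significance level shrinks the prediction sets, so more compounds receive a single class (higher efficiency) at the price of more sets that miss the true class (lower validity), which mirrors the pattern in the numbers reported above.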

    FAME 2: Simple and Effective Machine Learning Model of Cytochrome P450 Regioselectivity

    We report on the further development of FAst MEtabolizer (FAME; J. Chem. Inf. Model. 2013, 53, 2896–2907), a collection of random forest models for the prediction of sites of metabolism (SoMs) of xenobiotics. A broad set of descriptors was explored, from simple 2D descriptors such as those used in FAME, to quantum chemical descriptors employed in some of the most accurate models for SoM prediction currently available. In line with the original FAME approach, our objective was to keep things simple and to come up with accurate and robust models that are based on a small number of 2D descriptors. We found that circular descriptions of atoms and their environments with such descriptors in combination with an extremely randomized trees algorithm can yield models that perform equally well compared to more complex approaches. Thorough evaluation experiments on an independent test set showed that the best of these models obtained a Matthews correlation coefficient, area under the receiver operating characteristic curve, and Top-2 accuracy of 0.57, 0.91 and 94.1%, respectively. Models for the prediction of isoform-specific regioselectivity of CYP 3A4, 2D6, and 2C9 were also developed and showed competitive performance. The best models have been integrated into a newly developed software package (FAME 2), which is available free of charge from the authors.
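
The idea of a "circular description" of an atom and its environment can be sketched as collecting information about neighboring atoms shell by shell, ordered by bond distance from the atom of interest. The example below is a simplified illustration and not the FAME 2 descriptor set: it assumes a molecule given as a hypothetical adjacency list of heavy atoms and uses element counts per shell as the only descriptor.

```python
from collections import Counter, deque

def shell_counts(adjacency: dict[int, list[int]],
                 elements: dict[int, str],
                 center: int,
                 max_depth: int) -> list[Counter]:
    """Count element symbols at each bond distance 0..max_depth from `center`."""
    distance = {center: 0}
    queue = deque([center])
    # Breadth-first search over the molecular graph, stopping at max_depth.
    while queue:
        atom = queue.popleft()
        if distance[atom] == max_depth:
            continue
        for neighbor in adjacency[atom]:
            if neighbor not in distance:
                distance[neighbor] = distance[atom] + 1
                queue.append(neighbor)
    shells = [Counter() for _ in range(max_depth + 1)]
    for atom, d in distance.items():
        shells[d][elements[atom]] += 1
    return shells

# Heavy-atom graph of ethanol: C0 - C1 - O2 (hydrogens omitted).
ethanol_bonds = {0: [1], 1: [0, 2], 2: [1]}
ethanol_elements = {0: "C", 1: "C", 2: "O"}
```

Calling `shell_counts(ethanol_bonds, ethanol_elements, 0, 2)` describes the terminal carbon by itself, its carbon neighbor one bond away, and the oxygen two bonds away; per-atom vectors of this kind could then be fed to a tree-based classifier.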

    Benchmarking Commercial Conformer Ensemble Generators

    We assess and compare the performance of eight commercial conformer ensemble generators (ConfGen, ConfGenX, cxcalc, iCon, MOE LowModeMD, MOE Stochastic, MOE Conformation Import, and OMEGA) and one leading free algorithm, the distance geometry algorithm implemented in RDKit. The comparative study is based on a new version of the Platinum Diverse Dataset, a high-quality benchmarking dataset of 2859 protein-bound ligand conformations extracted from the PDB. Differences in the performance of commercial algorithms are much smaller than those observed for free algorithms in our previous study (J. Chem. Inf. Model. 2017, 57, 529–539). For commercial algorithms, the median minimum root-mean-square deviations measured between protein-bound ligand conformations and ensembles of a maximum of 250 conformers are between 0.46 and 0.61 Å. Commercial conformer ensemble generators are characterized by their high robustness, with at least 99% of all input molecules successfully processed and few or even no substantial geometrical errors detectable in their output conformations. The RDKit distance geometry algorithm (with minimization enabled) appears to be a good free alternative since its performance is comparable to that of the midranked commercial algorithms. Based on a statistical analysis, we elaborate on which algorithms to use and how to parametrize them for best performance in different application scenarios.
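
The "minimum RMSD" metric used in this benchmark is the smallest root-mean-square deviation between the experimental ligand conformation and any conformer in the generated ensemble. The sketch below shows the arithmetic only. It assumes that all conformers are already superposed onto the reference and share its atom ordering; a real evaluation would also need optimal alignment and handling of molecular symmetry, which are omitted here.

```python
import math

Coords = list[tuple[float, float, float]]

def rmsd(a: Coords, b: Coords) -> float:
    """Root-mean-square deviation between two pre-aligned coordinate sets."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(a, b))
    return math.sqrt(squared / len(a))

def min_rmsd(reference: Coords, ensemble: list[Coords]) -> float:
    """Smallest RMSD between the reference and any conformer in the ensemble."""
    return min(rmsd(reference, conformer) for conformer in ensemble)

# Hypothetical two-atom example (coordinates in Å).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
```

Taking the median of this per-ligand minimum over the whole dataset gives a summary figure of the kind quoted above for the commercial generators.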