8 research outputs found

    Comparing the performance of meta-classifiers—a case study on selected imbalanced data sets relevant for prediction of liver toxicity

    Cheminformatics datasets used in classification problems, especially those related to biological or physicochemical properties, are often imbalanced. This presents a major challenge in the development of in silico prediction models, as traditional machine learning algorithms are known to work best on balanced datasets. Class imbalance biases the performance of these algorithms because they favor the majority class. Here, we present a comparison of the performance of seven different meta-classifiers for their ability to handle imbalanced datasets, with Random Forest used as the base classifier. Four different datasets that are directly (cholestasis) or indirectly (via inhibition of organic anion transporting polypeptide 1B1 and 1B3) related to liver toxicity were chosen for this purpose. The imbalance ratio in these datasets ranges between 4:1 and 20:1 for negative and positive classes, respectively. Three different sets of molecular descriptors were used for model development, and their performance was assessed in 10-fold cross-validation and on an independent validation set. Stratified Bagging, MetaCost and CostSensitiveClassifier were found to be the best performing among all the methods. While MetaCost and CostSensitiveClassifier provided better sensitivity values, Stratified Bagging resulted in high balanced accuracies. © The Author(s) 201
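
    The abstract above reports sensitivity and balanced accuracy rather than plain accuracy. A minimal sketch of why that matters on imbalanced data follows; the function names and the toy labels are illustrative assumptions, not taken from the study.

```python
def confusion_counts(y_true, y_pred):
    """Count (tp, fn, tn, fp) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp, fn, tn, fp


def accuracy(y_true, y_pred):
    """Fraction of all predictions that are correct."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fn + tn + fp)


def sensitivity(y_true, y_pred):
    """Fraction of true positives that were recovered."""
    tp, fn, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(y_true, y_pred):
    """Fraction of true negatives that were recovered."""
    _, _, tn, fp = confusion_counts(y_true, y_pred)
    return tn / (tn + fp) if (tn + fp) else 0.0


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity."""
    return (sensitivity(y_true, y_pred) + specificity(y_true, y_pred)) / 2


# Hypothetical toy data at a 20:1 negative-to-positive ratio.
y_true = [0] * 20 + [1]
majority_only = [0] * 21  # a model that always predicts the majority class

print(accuracy(y_true, majority_only))
print(sensitivity(y_true, majority_only))
print(balanced_accuracy(y_true, majority_only))
```

    On this toy set the always-negative model reaches an accuracy of 20/21 but a sensitivity of 0 and a balanced accuracy of 0.5, which is why the two class-aware metrics are the more informative ones here.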

    Structure based classification for bile salt export pump (BSEP) inhibitors using comparative structural modeling of human BSEP

    The bile salt export pump (BSEP) actively transports conjugated monovalent bile acids from the hepatocytes into the bile. This facilitates the formation of micelles and promotes digestion and absorption of dietary fat. Inhibition of BSEP leads to decreased bile flow and accumulation of cytotoxic bile salts in the liver. A number of compounds have been identified to interact with BSEP, which results in drug-induced cholestasis or liver injury. Therefore, in silico approaches for flagging compounds as potential BSEP inhibitors would be of high value in the early stage of the drug discovery pipeline. Up to now, due to the lack of a high-resolution X-ray structure of BSEP, in silico based identification of BSEP inhibitors focused on ligand-based approaches. In this study, we provide a homology model for BSEP, developed using the corrected mouse P-glycoprotein structure (PDB ID: 4M1M). Subsequently, the model was used for docking-based classification of a set of 1212 compounds (405 BSEP inhibitors, 807 non-inhibitors). Using the scoring function ChemScore, a prediction accuracy of 81% on the training set and 73% on two external test sets could be obtained. In addition, the applicability domain of the models was assessed based on Euclidean distance. Further, analysis of the protein–ligand interaction fingerprints revealed certain functional group-amino acid residue interactions that could play a key role for ligand binding. Though ligand-based models, due to their high speed and accuracy, remain the method of choice for classification of BSEP inhibitors, structure-assisted docking models demonstrate reasonably good prediction accuracies while additionally providing information about putative protein–ligand interactions. © The Author(s) 201
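
    The abstract above mentions an applicability domain based on Euclidean distance but does not say how the distance was turned into an in-domain decision. The sketch below shows one common convention as an assumption: distance to the training-set centroid, with a cutoff of mean plus `z` standard deviations of the training distances. The descriptor values, the function names and the choice of `z` are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(rows):
    """Column-wise mean of the training descriptor matrix."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def domain_threshold(train, z=3.0):
    """Return (center, cutoff), where cutoff = mean + z * std of training distances.

    This rule is an assumed convention, not the one used in the study.
    """
    center = centroid(train)
    dists = [euclidean(row, center) for row in train]
    mean = sum(dists) / len(dists)
    std = math.sqrt(sum((d - mean) ** 2 for d in dists) / len(dists))
    return center, mean + z * std


def in_domain(query, center, cutoff):
    """True if the query compound lies within the distance cutoff."""
    return euclidean(query, center) <= cutoff


# Hypothetical two-descriptor training set.
train = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
center, cutoff = domain_threshold(train)

print(in_domain([1.5, 1.0], center, cutoff))  # close to the training data
print(in_domain([5.0, 5.0], center, cutoff))  # far from the training data
```

    Predictions for compounds flagged as out of domain would then be reported with a warning or withheld, depending on how conservative the workflow needs to be.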

    Selectivity profiling of BCRP versus P-gp inhibition: from automated collection of polypharmacology data to multi-label learning

    Background: The human ATP binding cassette transporters Breast Cancer Resistance Protein (BCRP) and Multidrug Resistance Protein 1 (P-gp) are co-expressed in many tissues and barriers, especially at the blood–brain barrier and at the hepatocyte canalicular membrane. Understanding their interplay in affecting the pharmacokinetics of drugs is of prime interest. In silico tools to predict inhibition and substrate profiles towards BCRP and P-gp might serve as early filters in the drug discovery and development process. However, to build such models, pharmacological data must be collected for both targets, which is a tedious task, often involving manual and poorly reproducible steps. Results: Compounds with inhibitory activity measured against BCRP and/or P-gp were retrieved by combining Open Data and manually curated data from literature using a KNIME workflow. After determination of compound overlap, machine learning approaches were used to establish multi-label classification models for BCRP/P-gp. Different ways of addressing multi-label problems are explored and compared: label-powerset, binary relevance and classifiers chain. Label-powerset revealed important molecular features for selective or polyspecific inhibitory activity. In our dataset, only two descriptors (the numbers of hydrophobic and aromatic atoms) were sufficient to separate selective BCRP inhibitors from selective P-gp inhibitors. Also, dual inhibitors share properties with both groups of selective inhibitors. Binary relevance and classifiers chain improve the predictivity of the models. Conclusions: The KNIME workflow proved a useful tool to merge data from diverse sources. It could be used for building multi-label datasets of any set of pharmacological targets for which data are available either in the open domain or in-house. By applying various multi-label learning algorithms, important molecular features driving transporter selectivity could be retrieved. Finally, using the dataset with missing annotations, predictive models can be derived in cases where no accurate dense dataset is available (not enough data overlap or no well-balanced class distribution).
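
    The three multi-label strategies named in the abstract above differ mainly in how the label set is reshaped before an ordinary classifier is trained. The sketch below illustrates only that reshaping step; the compounds, descriptor values and function names are hypothetical and no model is fitted.

```python
def binary_relevance_targets(labels):
    """Binary relevance: one independent 0/1 target vector per label."""
    names = sorted({name for row in labels for name in row})
    return {name: [1 if name in row else 0 for row in labels] for name in names}


def label_powerset_targets(labels):
    """Label powerset: each distinct label combination becomes one class."""
    return ["+".join(sorted(row)) if row else "none" for row in labels]


def classifier_chain_inputs(features, labels, order):
    """Classifier chain: the inputs for each label also carry the earlier labels.

    True labels are appended here, as is done at training time; at prediction
    time the earlier classifiers' predictions would take their place.
    """
    inputs = {}
    for i, name in enumerate(order):
        previous = order[:i]
        inputs[name] = [
            row + [1 if p in lab else 0 for p in previous]
            for row, lab in zip(features, labels)
        ]
    return inputs


# Hypothetical compounds: selective BCRP, selective P-gp, dual, inactive.
labels = [{"BCRP"}, {"P-gp"}, {"BCRP", "P-gp"}, set()]
# Hypothetical descriptor rows (two made-up descriptor values per compound).
features = [[3, 1], [1, 4], [3, 4], [0, 0]]

print(binary_relevance_targets(labels))
print(label_powerset_targets(labels))
print(classifier_chain_inputs(features, labels, ["BCRP", "P-gp"]))
```

    Label powerset keeps the dual-inhibitor class explicit, which fits the abstract's use of it to study selectivity, while binary relevance and the chain each train one model per transporter.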

    Empowering pharmacoinformatics by linked life science data

    With the public availability of large data sources such as ChEMBLdb and the Open PHACTS Discovery Platform, retrieval of data sets for certain protein targets of interest with consistent assay conditions is no longer a time-consuming process. In particular, workflow engines such as KNIME or Pipeline Pilot allow complex queries and make it possible to search for several targets simultaneously. Data can then directly be used as input to various ligand- and structure-based studies. In this contribution, using in-house projects on P-gp inhibition, transporter selectivity, and TRPV1 modulation, we outline how the incorporation of linked life science data in the daily execution of projects allowed us to expand our approaches from conventional Hansch analysis to complex, integrated multilayer models.

    Curated human hyperbilirubinemia data and the respective OATP1B1 and 1B3 inhibition predictions

    Hyperbilirubinemia is a pathological condition, very often indicative of an underlying liver condition, that is characterized by excessive accumulation of conjugated or unconjugated bilirubin in sinusoidal blood. In the literature, there are several indications associating inhibition of the basolateral hepatic transporters organic anion transporting polypeptide 1B1 and 1B3 (OATP1B1 and 1B3) with hyperbilirubinemia. In this article, we present a curated human hyperbilirubinemia dataset and the respective OATP1B1 and 1B3 inhibition predictions obtained from an effort to generate a classification model for hyperbilirubinemia. These data originate from the research article “Linking organic anion transporting polypeptide 1b1 and 1b3 (oatp1b1 and oatp1b3) interaction profiles to hepatotoxicity- the hyperbilirubinemia use case” (E. Kotsampasakou, S.E. Escher, G.F. Ecker, 2017) [1]. We further provide the full list of descriptors used for generating the hyperbilirubinemia classification models as well as the calculated descriptors for each compound of the dataset that was used to build the classification model. © 2017 The Author

    Predicting drug resistance related to ABC transporters using unsupervised Consensus Self-Organizing Maps

    ATP binding cassette (ABC) transporters play a pivotal role in drug elimination, particularly in several types of cancer in which these proteins are overexpressed. Due to their promiscuous ligand recognition, building computational models for substrate classification is quite challenging. This study evaluates the use of modified Self-Organizing Maps (SOM) for predicting drug resistance associated with P-gp, MRP1 and BCRP activity. Herein, we present a novel multi-labelled unsupervised classification model which combines a new clustering algorithm with SOM. It significantly improves the accuracy of substrate classification, catching up with traditional supervised machine learning algorithms. Results can be applied to predict the pharmacological profile of new drug candidates during the drug development process. © The Author(s) 201

    Mutational Analysis of the High-Affinity Zinc Binding Site Validates a Refined Human Dopamine Transporter Homology Model

    The high-resolution crystal structure of the leucine transporter (LeuT) is frequently used as a template for homology models of the dopamine transporter (DAT). Although similar in structure, DAT differs considerably from LeuT in a number of ways: (i) when compared to LeuT, DAT has very long intracellular amino and carboxyl termini; (ii) LeuT and DAT share a rather low overall sequence identity (22%) and (iii) the extracellular loop 2 (EL2) of DAT is substantially longer than that of LeuT. Extracellular zinc binds to DAT and restricts the transporter's movement through the conformational cycle, thereby resulting in a decrease in substrate uptake. Residue H293 in EL2 participates in zinc binding and must be modelled correctly to allow for a full understanding of its effects. We exploited the high-affinity zinc binding site endogenously present in DAT to create a model of the complete transmembrane domain of DAT. The zinc binding site provided a DAT-specific molecular ruler for calibration of the model. Our DAT model places EL2 at the transporter-lipid interface in the vicinity of the zinc binding site. Based on the model, D206 was predicted to represent a fourth co-ordinating residue, in addition to the three previously described zinc binding residues H193, H375 and E396. This prediction was confirmed by mutagenesis: substitution of D206 by lysine and cysteine affected the inhibitory potency of zinc and the maximum inhibition exerted by zinc, respectively. Conversely, the structural changes observed in the model allowed for rationalizing the zinc-dependent regulation of DAT: upon binding, zinc stabilizes the outward-facing state, because its first coordination shell can only be completed in this conformation. Thus, the model provides a validated solution to the long extracellular loop and may be useful to address other aspects of the transport cycle.

    Folding correction of ABC-transporter ABCB1 by pharmacological chaperones: a mechanistic concept

    Point mutations of ATP-binding cassette (ABC) proteins are a common cause of human diseases. Available crystal structures indicate a similarity in the architecture of several members of this protein family. Their molecular architecture makes these proteins vulnerable to mutation when critical structural elements are affected. The latter preferentially involve the two transmembrane domain (TMD)/nucleotide-binding domain (NBD) interfaces (transmission interfaces), formation of which requires engagement of coupling helices of intracellular loops (ICLs) with NBDs. Both formation of the active sites and engagement of the coupling helices are contingent on correct positioning of ICLs 2 and 4, which is thus an important prerequisite for proper folding. Here, we show that active site compounds are capable of rescuing P-glycoprotein (P-gp) mutants ∆Y490 and ∆Y1133 in a concentration-dependent manner. These trafficking-deficient mutations are located at the transmission interface in pseudosymmetric position to each other. In addition, the ability of propafenone analogs to correct folding correlates with their ability to inhibit transport of model substrates. This finding indicates that folding correction and transport inhibition by propafenone analogs are brought about by binding to the active sites. Furthermore, this study demonstrates an asymmetry in folding correction with cis-flupentixol, which reflects the asymmetric binding properties of this modulator to P-gp. Our results suggest a mechanistic model for corrector action in a model ABC transporter based on insights into the molecular architecture of these transporters. © 2017 The Author