2 research outputs found

    Comparing the performance of meta-classifiers—a case study on selected imbalanced data sets relevant for prediction of liver toxicity

    Cheminformatics datasets used in classification problems, especially those related to biological or physicochemical properties, are often imbalanced. This presents a major challenge in the development of in silico prediction models, as traditional machine learning algorithms are known to work best on balanced datasets. Class imbalance biases the performance of these algorithms because they favor the majority class. Here, we compare the performance of seven different meta-classifiers in handling imbalanced datasets, with Random Forest used as the base classifier. Four different datasets that are directly (cholestasis) or indirectly (via inhibition of organic anion transporting polypeptide 1B1 and 1B3) related to liver toxicity were chosen for this purpose. The ratio of negative to positive classes in these datasets ranges from 4:1 to 20:1. Three different sets of molecular descriptors were used for model development, and their performance was assessed in 10-fold cross-validation and on an independent validation set. Stratified Bagging, MetaCost and CostSensitiveClassifier were found to be the best performing among all the methods. While MetaCost and CostSensitiveClassifier provided better sensitivity values, Stratified Bagging resulted in high balanced accuracies. © The Author(s) 201
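The abstract above reports sensitivity and balanced accuracy rather than plain accuracy. A minimal sketch of why that matters on imbalanced data is given below. The confusion-matrix counts are hypothetical and chosen only to mimic a 20:1 negative-to-positive ratio; they are not taken from the paper, and the function names are illustrative.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: fraction of actual positives that are recovered."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn: int, fp: int) -> float:
    """True negative rate: fraction of actual negatives that are recovered."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Plain accuracy: fraction of all predictions that are correct."""
    total = tp + fn + tn + fp
    return (tp + tn) / total if total else 0.0


def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of sensitivity and specificity; each class counts equally."""
    return 0.5 * (sensitivity(tp, fn) + specificity(tn, fp))


# Hypothetical dataset: 2000 negatives, 100 positives (20:1 imbalance).
# A trivial classifier that labels every compound as negative:
majority_only = dict(tp=0, fn=100, tn=2000, fp=0)

# A hypothetical classifier that trades some specificity for sensitivity:
rebalanced = dict(tp=70, fn=30, tn=1700, fp=300)

for name, counts in [("majority_only", majority_only), ("rebalanced", rebalanced)]:
    print(
        name,
        f"accuracy={accuracy(**counts):.3f}",
        f"sensitivity={sensitivity(counts['tp'], counts['fn']):.3f}",
        f"balanced_accuracy={balanced_accuracy(**counts):.3f}",
    )
```

In this sketch the trivial majority-class predictor has the higher plain accuracy but a sensitivity of zero. Balanced accuracy exposes that failure because it weights both classes equally.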

    Curated human hyperbilirubinemia data and the respective OATP1B1 and 1B3 inhibition predictions

    Hyperbilirubinemia is a pathological condition, very often indicative of an underlying liver condition, that is characterized by excessive accumulation of conjugated or unconjugated bilirubin in sinusoidal blood. In the literature there are several indications associating inhibition of the basolateral hepatic transporters organic anion transporting polypeptide 1B1 and 1B3 (OATP1B1 and 1B3) with hyperbilirubinemia. In this article, we present a curated human hyperbilirubinemia dataset and the respective OATP1B1 and 1B3 inhibition predictions obtained from an effort to generate a classification model for hyperbilirubinemia. These data originate from the research article “Linking organic anion transporting polypeptide 1B1 and 1B3 (OATP1B1 and OATP1B3) interaction profiles to hepatotoxicity - the hyperbilirubinemia use case” (E. Kotsampasakou, S.E. Escher, G.F. Ecker, 2017) [1]. We further provide the full list of descriptors used for generating the hyperbilirubinemia classification models, as well as the calculated descriptors for each compound of the dataset that was used to build the classification model. © 2017 The Author