18 research outputs found

    Deep Neural Network Models for Predicting Chemically Induced Liver Toxicity Endpoints From Transcriptomic Responses

    Improving the accuracy of toxicity prediction models for liver injuries is a key element in evaluating the safety of drugs and chemicals. Mechanism-based information derived from expression (transcriptomic) data, in combination with machine-learning methods, promises to improve the accuracy and robustness of current toxicity prediction models. Deep neural networks (DNNs) have the advantage of automatically assembling the relevant features from a large number of input features. This makes them especially suitable for modeling transcriptomic data, which typically contain thousands of features. Here, we gauged gene- and pathway-level feature selection schemes using single- and multi-task DNN approaches in predicting chemically induced liver injuries (biliary hyperplasia, fibrosis, and necrosis) from whole-genome DNA microarray data. The single-task DNN models showed high predictive accuracy and endpoint specificity, with Matthews correlation coefficients for the three endpoints on 10-fold cross validation ranging from 0.56 to 0.89, with an average of 0.74 in the best feature sets. The DNN models outperformed Random Forest models in cross validation and showed better performance than Support Vector Machine models when tested in the external validation datasets. In the cross validation studies, the effect of the feature selection scheme was negligible among the studied feature sets. Further evaluation of the models on their ability to predict the injury phenotype per se for non-chemically induced injuries revealed the robust performance of the DNN models across these additional external testing datasets. Thus, the DNN models learned features specific to the injury phenotype contained in the gene expression data.
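
The abstract above reports model quality as Matthews correlation coefficients (MCC). As a reading aid, the following minimal sketch shows how an MCC is computed from the four cells of a binary confusion matrix. The function name and the example counts are illustrative assumptions and are not taken from the paper.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns a value in [-1, 1]; 0.0 is returned by convention when any
    marginal total is zero and the coefficient is undefined.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for one injury endpoint (not data from the paper).
example_mcc = matthews_corrcoef(tp=45, tn=40, fp=10, fn=5)
print(f"MCC = {example_mcc:.2f}")
```

Unlike plain accuracy, the MCC uses all four cells of the confusion matrix, so it stays informative when injured and uninjured samples are unevenly represented.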

    Identification of the Toxicity Pathways Associated With Thioacetamide-Induced Injuries in Rat Liver and Kidney

    Ingestion of or exposure to chemicals poses a serious health risk. Early detection of cellular changes induced by such events is vital to identify appropriate countermeasures to prevent organ damage. We hypothesize that chemically induced organ injuries are uniquely associated with a set (module) of genes exhibiting significant changes in expression. We have previously identified gene modules specifically associated with organ injuries by analyzing gene expression levels in liver and kidney tissue from rats exposed to diverse chemical insults. Here, we assess and validate our injury-associated gene modules by analyzing gene expression data in liver, kidney, and heart tissues obtained from Sprague-Dawley rats exposed to thioacetamide, a known liver toxicant that promotes fibrosis. The rats were injected intraperitoneally with a low (25 mg/kg) or high (100 mg/kg) dose of thioacetamide for 8 or 24 h, and definite organ injury was diagnosed by histopathology. Injury-associated gene modules indicated organ injury specificity, with the liver being most affected by thioacetamide. The most activated liver gene modules were those associated with inflammatory cell infiltration and fibrosis. Previous studies on thioacetamide toxicity and our histological analyses supported these results, signifying the potential of gene expression data to identify organ injuries.

    vNN Web Server for ADMET Predictions

    In drug development, early assessments of pharmacokinetic and toxic properties are important stepping stones to avoid costly and unnecessary failures. Considerable progress has recently been made in the development of computer-based (in silico) models to estimate such properties. Nonetheless, such models can be further improved in terms of their ability to make predictions more rapidly, easily, and with greater reliability. To address this issue, we have used our vNN method to develop 15 absorption, distribution, metabolism, excretion, and toxicity (ADMET) prediction models. These models quickly assess some of the most important properties of potential drug candidates, including their cytotoxicity, mutagenicity, cardiotoxicity, drug-drug interactions, microsomal stability, and likelihood of causing drug-induced liver injury. Here we summarize the ability of each of these models to predict such properties and discuss their overall performance. All of these ADMET models are publicly available on our website (https://vnnadmet.bhsai.org/), which also offers the capability of using the vNN method to customize and build new models.

    Using the Variable-Nearest Neighbor Method To Identify P‑Glycoprotein Substrates and Inhibitors

    Permeability glycoprotein (Pgp) is an essential membrane-bound transporter that efficiently extracts compounds from a cell. As such, it is a critical determinant of the pharmacokinetic properties of drugs. Multidrug resistance in cancer is often associated with overexpression of Pgp, which increases the efflux of chemotherapeutic agents from the cell. This, in turn, may prevent an effective treatment by reducing the effective intracellular concentrations of such agents. Consequently, identifying compounds that can either be transported out of the cell by Pgp (substrates) or impair Pgp function (inhibitors) is of great interest. Herein, using publicly available data, we developed quantitative structure–activity relationship (QSAR) models of Pgp substrates and inhibitors. These models employed a variable-nearest neighbor (v-NN) method that calculated the structural similarity between molecules and hence possessed an applicability domain, that is, they used all nearest neighbors that met a minimum similarity constraint. The performance characteristics of these v-NN-based models were comparable or at times superior to those of other model constructs. The best v-NN models for identifying either Pgp substrates or inhibitors showed overall accuracies of >80% and κ values of >0.60 when tested on external data sets with candidate Pgp substrates and inhibitors. The v-NN prediction model with a well-defined applicability domain gave accurate and reliable results. The v-NN method is computationally efficient and requires no retraining of the prediction model when new assay information becomes available, an important feature when keeping QSAR models up-to-date and maintaining their performance at high levels.
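
The abstract above describes v-NN as using every training neighbor that meets a minimum similarity constraint, which also defines the applicability domain. The sketch below illustrates that idea only; it is not the authors' implementation. Representing fingerprints as sets of integers, the Tanimoto similarity measure, the Gaussian weighting on distance, and the parameter values `min_similarity` and `h` are all assumptions made for illustration.

```python
import math
from typing import Optional


def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprint bit sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def vnn_predict(query: set, training: list,
                min_similarity: float = 0.6, h: float = 0.3) -> Optional[float]:
    """Weighted average of the labels of all sufficiently similar neighbors.

    Returns None when no training compound meets the similarity constraint,
    i.e. the query lies outside the applicability domain.
    The Gaussian weighting and both parameter values are assumptions.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    n_neighbors = 0
    for fingerprint, label in training:
        similarity = tanimoto(query, fingerprint)
        if similarity >= min_similarity:
            weight = math.exp(-(((1.0 - similarity) / h) ** 2))
            weighted_sum += weight * label
            weight_total += weight
            n_neighbors += 1
    if n_neighbors == 0:
        return None
    return weighted_sum / weight_total


# Hypothetical training set: (fingerprint bits, 1 = substrate, 0 = non-substrate).
training = [({1, 2, 3, 4}, 1), ({1, 2, 3, 5}, 1), ({7, 8, 9}, 0)]
inside = vnn_predict({1, 2, 3, 4}, training)
outside = vnn_predict({20, 21}, training)

# New assay data is simply appended; there is no model to refit.
training.append(({20, 21, 22}, 0))
after_update = vnn_predict({20, 21, 22}, training)
```

Because a prediction is computed directly from the stored compounds, adding new assay results only extends the list, which is consistent with the abstract's point that no retraining is required.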

    General Purpose 2D and 3D Similarity Approach to Identify hERG Blockers

    Screening compounds for human ether-à-go-go-related gene (hERG) channel inhibition is an important component of early stage drug development and assessment. In this study, we developed a high-confidence (p-value < 0.01) hERG prediction model based on a combined two-dimensional (2D) and three-dimensional (3D) modeling approach. We developed a 3D similarity conformation approach (SCA) based on examining a limited fixed number of pairwise 3D similarity scores between a query molecule and a set of known hERG blockers. By combining 3D SCA with 2D similarity ensemble approach (SEA) methods, we achieved a maximum sensitivity in hERG inhibition prediction with an accuracy not achieved by either method separately. The combined model achieved 69% sensitivity and 95% specificity on an independent external data set. Further validation showed that the model correctly picked up documented hERG inhibition or interactions among the Food and Drug Administration-approved drugs with the highest similarity scores, with 18 of 20 correctly identified. The combination of ascertaining 2D and 3D similarity of compounds allowed us to synergistically use 2D fingerprint matching with 3D shape and chemical complementarity matching.

    Critically Assessing the Predictive Power of QSAR Models for Human Liver Microsomal Stability

    To lower the possibility of late-stage failures in the drug development process, an up-front assessment of absorption, distribution, metabolism, elimination, and toxicity is commonly implemented through a battery of in silico and in vitro assays. As in vitro data is accumulated, in silico quantitative structure–activity relationship (QSAR) models can be trained and used to assess compounds even before they are synthesized. Even though it is generally recognized that QSAR model performance deteriorates over time, rigorous independent studies of model performance deterioration are typically hindered by the lack of publicly available large data sets of structurally diverse compounds. Here, we investigated predictive properties of QSAR models derived from an assembly of publicly available human liver microsomal (HLM) stability data using variable nearest neighbor (v-NN) and random forest (RF) methods. In particular, we evaluated the degree of time-dependent model performance deterioration. Our results show that when evaluated by 10-fold cross-validation with all available HLM data randomly distributed among 10 equal-sized validation groups, we achieved high-quality model performance from both machine-learning methods. However, when we developed HLM models based on when the data appeared and tried to predict data published later, we found that neither method produced predictive models and that their applicability was dramatically reduced. On the other hand, when a small percentage of randomly selected compounds from data published later were included in the training set, performance of both machine-learning methods improved significantly. The implication is that (1) QSAR model quality should be analyzed in a time-dependent manner to assess their true predictive power and (2) it is imperative to retrain models with any up-to-date experimental data to ensure maximum applicability.
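
The contrast drawn above, random cross-validation versus training on earlier data and predicting later data, comes down to how the data set is partitioned. The following sketch shows one way a time-dependent split could be set up, including the option of moving a small random fraction of later compounds into the training set. The record layout, the function name `time_split`, and the parameter `leak_fraction` are hypothetical and not taken from the paper.

```python
import random


def time_split(records, cutoff_year, leak_fraction=0.0, seed=0):
    """Split records into (train, test) by publication year.

    Records published up to and including cutoff_year form the training set;
    later records form the test set. A fraction of the later records, chosen
    at random, can be moved into the training set to mimic retraining with
    a small amount of up-to-date data.
    """
    earlier = [r for r in records if r["year"] <= cutoff_year]
    later = [r for r in records if r["year"] > cutoff_year]
    rng = random.Random(seed)
    rng.shuffle(later)
    n_moved = int(round(leak_fraction * len(later)))
    return earlier + later[:n_moved], later[n_moved:]


# Hypothetical records: one compound per year, 2005 through 2014.
records = [{"id": i, "year": 2005 + i} for i in range(10)]

strict_train, strict_test = time_split(records, cutoff_year=2010)
mixed_train, mixed_test = time_split(records, cutoff_year=2010, leak_fraction=0.25)
```

In the strict split no test compound was available when the model was built, which is the situation a deployed model actually faces; the mixed split corresponds to the retraining scenario described in the abstract.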

    Concordance between Thioacetamide-Induced Liver Injury in Rat and Human In Vitro Gene Expression Data

    The immense resources required and the ethical concerns for animal-based toxicological studies have driven the development of in vitro and in silico approaches. Recently, we validated our approach in which the expression of a set of genes is uniquely associated with an organ-injury phenotype (injury module), by using thioacetamide, a known liver toxicant. Here, we sought to explore whether RNA-seq data obtained from human cells (in vitro) treated with thioacetamide-S-oxide (a toxic intermediate metabolite) would correlate across species with the injury responses found in rat cells (in vitro) after exposure to this metabolite as well as in rats exposed to thioacetamide (in vivo). We treated two human cell types with thioacetamide-S-oxide (primary hepatocytes with 0 (vehicle), 0.125 (low dose), or 0.25 (high dose) mM, and renal tubular epithelial cells with 0 (vehicle), 0.25 (low dose), or 1.00 (high dose) mM) and collected RNA-seq data 9 or 24 h after treatment. We found that the liver-injury modules significantly altered in human hepatocytes 24 h after high-dose treatment involved cellular infiltration and bile duct proliferation, which are linked to fibrosis. For high-dose treatments, our modular approach predicted the rat in vivo and in vitro results from human in vitro RNA-seq data with Pearson correlation coefficients of 0.60 and 0.63, respectively, which was not observed for individual genes or KEGG pathways.
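
The cross-species agreement above is summarized with Pearson correlation coefficients. As a reading aid, this minimal sketch computes a Pearson coefficient between two equal-length lists of scores. The idea of comparing per-module activation scores between human and rat samples follows the abstract, but the numbers below are made up for illustration and the function name is an assumption.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)


# Hypothetical activation scores for five injury modules (not data from the paper).
human_in_vitro = [2.1, 0.4, 1.5, 0.2, 0.9]
rat_in_vivo = [1.8, 0.9, 1.1, 0.1, 1.2]
r = pearson(human_in_vitro, rat_in_vivo)
print(f"Pearson r = {r:.2f}")
```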