Deep Neural Network Models for Predicting Chemically Induced Liver Toxicity Endpoints From Transcriptomic Responses
Improving the accuracy of toxicity prediction models for liver injuries is a key element in evaluating the safety of drugs and chemicals. Mechanism-based information derived from expression (transcriptomic) data, in combination with machine-learning methods, promises to improve the accuracy and robustness of current toxicity prediction models. Deep neural networks (DNNs) have the advantage of automatically assembling the relevant features from a large number of input features. This makes them especially suitable for modeling transcriptomic data, which typically contain thousands of features. Here, we gauged gene- and pathway-level feature selection schemes using single- and multi-task DNN approaches in predicting chemically induced liver injuries (biliary hyperplasia, fibrosis, and necrosis) from whole-genome DNA microarray data. The single-task DNN models showed high predictive accuracy and endpoint specificity, with Matthews correlation coefficients for the three endpoints on 10-fold cross validation ranging from 0.56 to 0.89, with an average of 0.74 in the best feature sets. The DNN models outperformed Random Forest models in cross validation and showed better performance than Support Vector Machine models when tested in the external validation datasets. In the cross validation studies, the effect of the feature selection scheme was negligible among the studied feature sets. Further evaluation of the models on their ability to predict the injury phenotype per se for non-chemically induced injuries revealed the robust performance of the DNN models across these additional external testing datasets. Thus, the DNN models learned features specific to the injury phenotype contained in the gene expression data.
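The abstract above reports model quality as Matthews correlation coefficients (MCC). As a reminder of what that metric measures, here is a minimal sketch of the standard binary MCC computed from confusion-matrix counts. The function name `mcc` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for one injury endpoint (illustrative only).
print(round(mcc(tp=45, tn=40, fp=5, fn=10), 2))
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when injured and uninjured samples are imbalanced.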
Identification of the Toxicity Pathways Associated With Thioacetamide-Induced Injuries in Rat Liver and Kidney
Ingestion of or exposure to chemicals poses a serious health risk. Early detection of cellular changes induced by such events is vital to identify appropriate countermeasures to prevent organ damage. We hypothesize that chemically induced organ injuries are uniquely associated with a set (module) of genes exhibiting significant changes in expression. We have previously identified gene modules specifically associated with organ injuries by analyzing gene expression levels in liver and kidney tissue from rats exposed to diverse chemical insults. Here, we assess and validate our injury-associated gene modules by analyzing gene expression data in liver, kidney, and heart tissues obtained from Sprague-Dawley rats exposed to thioacetamide, a known liver toxicant that promotes fibrosis. The rats were injected intraperitoneally with a low (25 mg/kg) or high (100 mg/kg) dose of thioacetamide for 8 or 24 h, and definite organ injury was diagnosed by histopathology. Injury-associated gene modules indicated organ injury specificity, with the liver being most affected by thioacetamide. The most activated liver gene modules were those associated with inflammatory cell infiltration and fibrosis. Previous studies on thioacetamide toxicity and our histological analyses supported these results, signifying the potential of gene expression data to identify organ injuries.
On the Effect of Low-Energy Electron Induced DNA Strand Break in Aqueous Solution: A Theoretical Study Indicating Guanine as a Weak Link in DNA
vNN Web Server for ADMET Predictions
In drug development, early assessments of pharmacokinetic and toxic properties are important stepping stones to avoid costly and unnecessary failures. Considerable progress has recently been made in the development of computer-based (in silico) models to estimate such properties. Nonetheless, such models can be further improved in terms of their ability to make predictions more rapidly, easily, and with greater reliability. To address this issue, we have used our vNN method to develop 15 absorption, distribution, metabolism, excretion, and toxicity (ADMET) prediction models. These models quickly assess some of the most important properties of potential drug candidates, including their cytotoxicity, mutagenicity, cardiotoxicity, drug-drug interactions, microsomal stability, and likelihood of causing drug-induced liver injury. Here we summarize the ability of each of these models to predict such properties and discuss their overall performance. All of these ADMET models are publicly available on our website (https://vnnadmet.bhsai.org/), which also offers the capability of using the vNN method to customize and build new models.
Molecular Dynamics Simulation of 8-Oxoguanine Containing DNA Fragments Reveals Altered Hydration and Ion Binding Patterns
Using the Variable-Nearest Neighbor Method To Identify P‑Glycoprotein Substrates and Inhibitors
Permeability glycoprotein (Pgp) is an essential membrane-bound transporter that efficiently extracts compounds from a cell. As such, it is a critical determinant of the pharmacokinetic properties of drugs. Multidrug resistance in cancer is often associated with overexpression of Pgp, which increases the efflux of chemotherapeutic agents from the cell. This, in turn, may prevent an effective treatment by reducing the effective intracellular concentrations of such agents. Consequently, identifying compounds that can either be transported out of the cell by Pgp (substrates) or impair Pgp function (inhibitors) is of great interest. Herein, using publicly available data, we developed quantitative structure–activity relationship (QSAR) models of Pgp substrates and inhibitors. These models employed a variable-nearest neighbor (v-NN) method that calculated the structural similarity between molecules and hence possessed an applicability domain, that is, they used all nearest neighbors that met a minimum similarity constraint. The performance characteristics of these v-NN-based models were comparable or at times superior to those of other model constructs. The best v-NN models for identifying either Pgp substrates or inhibitors showed overall accuracies of >80% and κ values of >0.60 when tested on external data sets with candidate Pgp substrates and inhibitors. The v-NN prediction model with a well-defined applicability domain gave accurate and reliable results. The v-NN method is computationally efficient and requires no retraining of the prediction model when new assay information becomes available, an important feature when keeping QSAR models up-to-date and maintaining their performance at high levels.
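The idea of predicting from all neighbors that meet a minimum similarity constraint can be illustrated with a short sketch. This is not the authors' implementation. Fingerprints are represented here as plain Python sets of feature identifiers, similarity is the Tanimoto coefficient, and the distance-based exponential weighting with a `smoothing` parameter is an assumption made for illustration. The function names, thresholds, and toy data are all hypothetical.

```python
import math


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of features."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def vnn_predict(query, training, min_similarity=0.5, smoothing=0.3):
    """Similarity-weighted average over every qualifying training neighbor.

    Returns None when no training compound meets min_similarity, i.e. the
    query falls outside the applicability domain.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for fingerprint, label in training:
        similarity = tanimoto(query, fingerprint)
        if similarity >= min_similarity:
            distance = 1.0 - similarity
            # Assumed weighting: closer neighbors count more.
            weight = math.exp(-((distance / smoothing) ** 2))
            weighted_sum += weight * label
            weight_total += weight
    if weight_total == 0.0:
        return None
    return weighted_sum / weight_total


# Toy training set: (fingerprint, label), 1.0 = substrate, 0.0 = non-substrate.
training = [
    ({1, 2, 3, 4}, 1.0),
    ({1, 2, 3, 5}, 1.0),
    ({7, 8, 9}, 0.0),
]

print(vnn_predict({1, 2, 3, 4}, training))  # both similar neighbors are substrates
print(vnn_predict({20, 21}, training))      # no neighbor qualifies
```

Because the prediction is computed directly from the stored compounds, adding new assay results only means appending to `training`. Nothing has to be refit, which matches the no-retraining property described in the abstract.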
General Purpose 2D and 3D Similarity Approach to Identify hERG Blockers
Screening compounds for human ether-à-go-go-related gene (hERG) channel inhibition is an important component of early stage drug development and assessment. In this study, we developed a high-confidence (p-value < 0.01) hERG prediction model based on a combined two-dimensional (2D) and three-dimensional (3D) modeling approach. We developed a 3D similarity conformation approach (SCA) based on examining a limited fixed number of pairwise 3D similarity scores between a query molecule and a set of known hERG blockers. By combining 3D SCA with 2D similarity ensemble approach (SEA) methods, we achieved a maximum sensitivity in hERG inhibition prediction with an accuracy not achieved by either method separately. The combined model achieved 69% sensitivity and 95% specificity on an independent external data set. Further validation showed that the model correctly picked up documented hERG inhibition or interactions among the Food and Drug Administration-approved drugs with the highest similarity scores, with 18 of 20 correctly identified. The combination of ascertaining 2D and 3D similarity of compounds allowed us to synergistically use 2D fingerprint matching with 3D shape and chemical complementarity matching.
Critically Assessing the Predictive Power of QSAR Models for Human Liver Microsomal Stability
To lower the possibility of late-stage failures in the drug development process, an up-front assessment of absorption, distribution, metabolism, elimination, and toxicity is commonly implemented through a battery of in silico and in vitro assays. As in vitro data accumulate, in silico quantitative structure–activity relationship (QSAR) models can be trained and used to assess compounds even before they are synthesized. Even though it is generally recognized that QSAR model performance deteriorates over time, rigorous independent studies of model performance deterioration are typically hindered by the lack of publicly available large data sets of structurally diverse compounds. Here, we investigated predictive properties of QSAR models derived from an assembly of publicly available human liver microsomal (HLM) stability data using variable nearest neighbor (v-NN) and random forest (RF) methods. In particular, we evaluated the degree of time-dependent model performance deterioration. Our results show that when evaluated by 10-fold cross-validation with all available HLM data randomly distributed among 10 equal-sized validation groups, we achieved high-quality model performance from both machine-learning methods. However, when we developed HLM models based on when the data appeared and tried to predict data published later, we found that neither method produced predictive models and that their applicability was dramatically reduced. On the other hand, when a small percentage of randomly selected compounds from data published later were included in the training set, performance of both machine-learning methods improved significantly. The implication is that (1) QSAR model quality should be analyzed in a time-dependent manner to assess the models' true predictive power and (2) it is imperative to retrain models with any up-to-date experimental data to ensure maximum applicability.
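The three evaluation setups contrasted in this abstract (random 10-fold groups, a split by publication date, and a date split with a small share of later compounds moved into training) can be sketched as data-partitioning helpers. This is an illustrative sketch only. The record layout with a `year` field, the function names, the cutoff year, and the 10% fraction are assumptions, not details from the paper.

```python
import random


def random_folds(records, k=10, seed=0):
    """Shuffle the records and deal them into k near-equal validation groups."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def time_split(records, cutoff_year):
    """Train on records published up to cutoff_year; test on later records."""
    train = [r for r in records if r["year"] <= cutoff_year]
    test = [r for r in records if r["year"] > cutoff_year]
    return train, test


def augment_with_later(train, test, fraction, seed=0):
    """Move a random fraction of the later records into the training set."""
    rng = random.Random(seed)
    n_move = round(fraction * len(test))
    moved = set(rng.sample(range(len(test)), n_move))
    new_train = train + [r for i, r in enumerate(test) if i in moved]
    new_test = [r for i, r in enumerate(test) if i not in moved]
    return new_train, new_test


# Hypothetical data set: 100 compounds spread evenly over 2000-2009.
records = [{"id": i, "year": 2000 + i % 10} for i in range(100)]

folds = random_folds(records)
train, test = time_split(records, cutoff_year=2006)
train_aug, test_aug = augment_with_later(train, test, fraction=0.1)
print(len(folds), len(train), len(test), len(train_aug), len(test_aug))
```

Random folds mix old and new compounds in every group, so they can overstate how well a model will do on chemistry published after it was trained. The date split tests that prospective case directly.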
Concordance between Thioacetamide-Induced Liver Injury in Rat and Human In Vitro Gene Expression Data
The immense resources required and the ethical concerns for animal-based toxicological studies have driven the development of in vitro and in silico approaches. Recently, we validated our approach in which the expression of a set of genes is uniquely associated with an organ-injury phenotype (injury module), by using thioacetamide, a known liver toxicant. Here, we sought to explore whether RNA-seq data obtained from human cells (in vitro) treated with thioacetamide-S-oxide (a toxic intermediate metabolite) would correlate across species with the injury responses found in rat cells (in vitro) after exposure to this metabolite as well as in rats exposed to thioacetamide (in vivo). We treated two human cell types with thioacetamide-S-oxide (primary hepatocytes with 0 (vehicle), 0.125 (low dose), or 0.25 (high dose) mM, and renal tubular epithelial cells with 0 (vehicle), 0.25 (low dose), or 1.00 (high dose) mM) and collected RNA-seq data 9 or 24 h after treatment. We found that the liver-injury modules significantly altered in human hepatocytes 24 h after high-dose treatment involved cellular infiltration and bile duct proliferation, which are linked to fibrosis. For high-dose treatments, our modular approach predicted the rat in vivo and in vitro results from human in vitro RNA-seq data with Pearson correlation coefficients of 0.60 and 0.63, respectively, which was not observed for individual genes or KEGG pathways