20 research outputs found
Introducing Uncertainty in Predictive Modeling: Friend or Foe?
Uncertainty was introduced into the chemical descriptors of
16 publicly available data sets, to various degrees and in various
ways, in order to investigate its effect on the predictive performance
of decision tree ensembles, a state-of-the-art method. A number of
strategies for handling uncertainty in decision tree ensembles were evaluated.
The main conclusion of the study is that uncertainty can, to a large extent,
be introduced into chemical descriptors without impairing the predictive
performance of the ensembles to a degree that is significant from a
practical point of view. The investigation
further showed that even when distributions of uncertain values were
provided, the ensemble method could generate equally effective models
from single-point samples drawn from these distributions. Hence, there seems
to be no advantage in using more elaborate methods for handling uncertainty
in chemical descriptors when using decision tree ensembles as a modeling
method, at least for the types of introduced uncertainty considered.
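The single-point sampling strategy mentioned in this abstract can be illustrated with a minimal sketch. It assumes that each uncertain descriptor is described by a Gaussian with a mean and a standard deviation; the abstract does not state which distributions were used, and the function name and example values below are hypothetical.

```python
import random

def sample_descriptors(means, stds, rng):
    """Draw one value per descriptor from an assumed Gaussian distribution."""
    return [rng.gauss(mu, sigma) for mu, sigma in zip(means, stds)]

# Hypothetical descriptor means and uncertainties for one compound.
means = [2.31, 45.0, 0.12]
stds = [0.10, 2.5, 0.0]  # a standard deviation of 0 leaves the value unchanged

rng = random.Random(0)  # fixed seed so the sketch is reproducible
sampled_row = sample_descriptors(means, stds, rng)
```

Each sampled row would then stand in for the full distribution when the ensemble is trained, which is the simpler of the strategies the abstract compares.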
<i>In Silico</i> Categorization of <i>in Vivo</i> Intrinsic Clearance Using Machine Learning
Machine learning has recently become widely used within
the life science research domain, e.g., for finding quantitative structure–activity
relationships (QSARs) between molecular structures and different biological
end points. In the work presented here, we have applied orthogonal
partial least-squares (OPLS), principal component analysis (PCA),
and random forests (RF) methods for classification as well as regression
analysis to a publicly available <i>in vivo</i> data set
in order to assess the intrinsic metabolic clearance (CL<sub>int</sub>) in humans. The derived classification models are able to identify
compounds with CL<sub>int</sub> lower and higher than 1500 mL/min,
respectively, with nearly 80% accuracy. The most relevant descriptors
relate to lipophilicity and charge/polarizability. Furthermore,
the accuracy from a classification model based on regression analysis,
using the 1500 mL/min cutoff, is also around 80%. These results suggest
the usefulness of machine learning techniques to derive robust and
predictive models in the area of <i>in vivo</i> ADMET (absorption,
distribution, metabolism, elimination, and toxicity) modeling.
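Deriving a two-class result from regression output with the 1500 mL/min cutoff can be sketched as follows. This is an illustration only: the clearance values are hypothetical, not data from the study, and assigning a value exactly at the cutoff to the "low" class is an assumption.

```python
CUTOFF_ML_PER_MIN = 1500.0  # cutoff quoted in the abstract

def classify(cl_int):
    """Label a clearance value relative to the cutoff (exactly 1500 counts as 'low' here)."""
    return "high" if cl_int > CUTOFF_ML_PER_MIN else "low"

def accuracy(predicted, observed):
    """Fraction of compounds whose predicted class matches the observed class."""
    matches = sum(classify(p) == classify(o) for p, o in zip(predicted, observed))
    return matches / len(observed)

# Hypothetical regression outputs and measured values (mL/min), for illustration only.
predicted = [300.0, 1800.0, 1200.0, 2500.0, 900.0]
observed = [450.0, 1600.0, 1700.0, 2200.0, 700.0]

acc = accuracy(predicted, observed)  # 4 of the 5 classes agree
```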
MOESM2 of Maximizing gain in high-throughput screening using conformal prediction
Additional file 2. Information about the applied datasets, performance of the predictive models, and evaluation of the gain-cost function for the different datasets and settings.
A Pragmatic Approach Using First-Principle Methods to Address Site of Metabolism with Implications for Reactive Metabolite Formation
A majority of xenobiotics are metabolized by cytochrome P450 (CYP)
enzymes. The discovery of drug candidates with low propensity to form
reactive metabolites and low clearance can be facilitated by understanding
CYP-mediated xenobiotic metabolism. Being able to predict the sites
where reactive metabolites form is beneficial in drug design to produce
drug candidates free of reactive metabolite issues. Herein, we report
a pragmatic protocol using first-principle density functional theory
(DFT) calculations for predicting sites of epoxidation and hydroxylation
of aromatic substrates mediated by CYP. The method is based on the
relative stabilities of the CYP-substrate intermediates or the substrate
epoxides. Consequently, it concerns mainly the electronic reactivity
of the substrates. Compared with experimental findings, the first-ranked
site predicted by the protocol matched the experimental epoxidation site
in 83% of cases, and when the test was extended to CYP-mediated aromatic
hydroxylation, satisfactory results were also obtained (73%). This
indicates that our assumptions are valid and also implies that the
intrinsic reactivities of the substrates are in general more important
than their binding poses in proteins, although the protocol may benefit
from the addition of docking information.
Cell viability of the C17.2 cells during exposure to a wide range of concentrations of four different compounds.
The IC10 concentration was calculated and further used to validate proof of concept of the 30 selected genes. Cells were exposed to a) D-mannitol (negative control), b) acrylamide, c) methylmercury chloride, or d) valproic acid sodium salt. The data are presented as the mean of 3 independent experiments performed in hexaplicate. Results were analyzed using two-way ANOVA followed by Dunnett’s multiple comparisons test. The bars represent the mean ± SEM. *<i>p</i> ≤ 0.05, **<i>p</i> ≤ 0.01, ***<i>p</i> ≤ 0.001 compared to control (cells exposed to only cell medium). The inhibitory concentration 10% (IC10) was determined by nonlinear regression, fitting the data to the log(inhibitor) vs response (variable slope) curve with the Hill slope (slope factor), using the equation Y = Bottom + (Top-Bottom)/(1+10^((LogIC10-X)*HillSlope)) (GraphPad Prism 7.02).
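The variable-slope curve quoted in this legend can be sketched in a few lines. The sketch writes the curve with a generic midpoint parameter (`log_mid`) and then solves for the concentration that gives a 10% drop in response; that conversion step, the function names, and the parameter values are assumptions made for illustration and are not taken from the legend or from GraphPad Prism output.

```python
import math

def hill_response(x, bottom, top, log_mid, hill_slope):
    """Variable-slope curve; x is log10(concentration)."""
    return bottom + (top - bottom) / (1 + 10 ** ((log_mid - x) * hill_slope))

def log_conc_for_inhibition(fraction, log_mid, hill_slope):
    """log10 concentration at which the response has dropped by `fraction` of (top - bottom)."""
    return log_mid - math.log10(fraction / (1 - fraction)) / hill_slope

# Hypothetical fitted parameters; a negative slope gives a falling (inhibition) curve.
bottom, top, log_ic50, slope = 0.0, 100.0, -4.0, -1.2

log_ic10 = log_conc_for_inhibition(0.10, log_ic50, slope)
viability_at_ic10 = hill_response(log_ic10, bottom, top, log_ic50, slope)
```

With these hypothetical parameters, the response at `log_ic10` is 90% of the span between `bottom` and `top`, and `log_ic10` lies below the curve midpoint, as expected for a concentration chosen to be only mildly cytotoxic.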
Whole genome microarray analysis of neural progenitor C17.2 cells during differentiation and validation of 30 neural mRNA biomarkers for estimation of developmental neurotoxicity
Despite its high relevance, developmental neurotoxicity (DNT) is one of the least studied forms of toxicity. Current guidelines for DNT testing are based on <i>in vivo</i> testing and they require extensive resources. Transcriptomic approaches using relevant <i>in vitro</i> models have been suggested as a useful tool for identifying possible DNT-generating compounds. In this study, we performed whole genome microarray analysis on the murine progenitor cell line C17.2 following 5 and 10 days of differentiation. We identified 30 genes that are strongly associated with neural differentiation. The C17.2 cell line can be differentiated into a co-culture of both neurons and neuroglial cells, giving a more relevant picture of the brain than using neuronal cells alone. Among the most highly upregulated genes were genes involved in neurogenesis (CHRDL1), axonal guidance (BMP4), neuronal connectivity (PLXDC2), axonogenesis (RTN4R) and astrocyte differentiation (S100B). The 30 biomarkers were further validated by exposure to non-cytotoxic concentrations of two DNT-inducing compounds (valproic acid and methylmercury) and one neurotoxic chemical possessing a possible DNT activity (acrylamide). Twenty-eight of the 30 biomarkers were altered by at least one of the neurotoxic substances, proving the importance of these biomarkers during differentiation. These results suggest that gene expression profiling using a predefined set of biomarkers could be used as a sensitive tool for initial DNT screening of chemicals. Using a predefined set of mRNA biomarkers, instead of the whole genome, makes this model affordable and high-throughput. The use of such models could help speed up the initial screening of substances, possibly indicating alerts that need to be further studied in more sophisticated models.
Mapping of the 30 genes selected as important for neural differentiation of the C17.2 cell line.
a) Heatmap of the 30 selected genes for three contrasts: 10 days of differentiation (Day 10) vs undifferentiated cells (Day 0), 5 days of differentiation (Day 5) vs undifferentiated cells, and 10 days vs 5 days of differentiation. Genes are ordered according to average log2(fold change) in the contrast Day 10 vs Day 0. b) Map displaying the biological pathways/networks in which the selected genes are involved, according to the IPA database and a manual review of published literature.
PCA plot of independent experimental seed-outs.
The data cluster according to the different contrasts, i.e., 10 days vs 5 days of differentiation, 10 days vs undifferentiated, and 5 days vs undifferentiated, showing robustness of the cell model as well as technical reproducibility. The first two principal components explained 72.5% of the variation in the dataset (PC1: 55.7%; PC2: 16.8%).