A Machine Learning Framework to Improve Rat Clearance Predictions and Inform Physiologically Based Pharmacokinetic Modeling
During drug discovery and development, achieving appropriate pharmacokinetics is key to establishing the efficacy and safety of new drugs. Physiologically based pharmacokinetic (PBPK) models integrating in vitro-to-in vivo extrapolation have become an essential in silico tool to achieve this goal. In this context, the most important and probably most challenging pharmacokinetic parameter to estimate is the clearance. Recent work on high-throughput PBPK modeling during drug discovery has shown that a good estimate of the unbound intrinsic clearance (CLint,u) is the key factor for useful PBPK application. In this work, three different machine learning-based strategies were explored to predict the rat CLint,u as the input into PBPK. To this end, in vivo and in vitro data were collected for a total of 2639 proprietary compounds. The strategies were compared to the standard in vitro bottom-up approach. Using the well-stirred liver model to back-calculate in vivo CLint,u from in vivo rat clearance and then training a machine learning model on this CLint,u led to more accurate clearance predictions (absolute average fold error (AAFE) 3.1 in temporal cross-validation) than the bottom-up approach (AAFE 3.6 to 16, depending on the scaling method) and has the advantage that no experimental in vitro data are needed. However, building a machine learning model on the bias between the back-calculated in vivo CLint,u and the bottom-up scaled in vitro CLint,u also performed well. For example, using unbound hepatocyte scaling, the AAFE in the temporal cross-validation improved from 16 for the bottom-up approach alone to 2.9 when the bias prediction was added. Similarly, the log Pearson r² improved from 0.1 to 0.29. Although this strategy still requires in vitro measurement of CLint,u, using unbound scaling for the bottom-up approach circumvents the need to correct the fu,inc with fu,p data. While the above-described ML models were built on all data points available per approach, the comparison across all approaches could only be performed on a subset because ca. 75% of the molecules had missing or unquantifiable measurements of the fraction unbound in plasma or of the in vitro unbound intrinsic clearance, or they dropped out due to the blood-flow limitation assumed by the well-stirred model. Advantageously, by predicting CLint,u as the input into PBPK, existing workflows can be reused and the prediction of the in vivo clearance and other PK parameters can be improved.
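The back-calculation and the AAFE metric mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the common textbook form of the well-stirred liver model, CL = Q·fu·CLint,u / (Q + fu·CLint,u), with clearance and fraction unbound both referring to blood and all clearance attributed to the liver. The abstract does not state the exact variant, units, or liver blood flow used, so the function names and the numbers in the usage note are hypothetical.

```python
import math


def well_stirred_clearance(clint_u, q_h, fu_b):
    """Hepatic blood clearance from unbound intrinsic clearance (well-stirred model).

    Assumed textbook form: CL = Q * fu * CLint,u / (Q + fu * CLint,u).
    All quantities must share the same units (e.g. mL/min/kg).
    """
    return q_h * fu_b * clint_u / (q_h + fu_b * clint_u)


def back_calculate_clint_u(cl_blood, q_h, fu_b):
    """Invert the well-stirred model to obtain in vivo CLint,u.

    Returns None when the observed clearance reaches or exceeds liver blood
    flow, because the model has no finite solution there (the blood-flow
    limitation that causes compounds to drop out).
    """
    if not 0 < cl_blood < q_h:
        return None
    return cl_blood * q_h / (fu_b * (q_h - cl_blood))


def aafe(predicted, observed):
    """Absolute average fold error: 10 ** mean(|log10(predicted / observed)|)."""
    logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(logs) / len(logs))
```

As a usage illustration with made-up values, `back_calculate_clint_u(30.0, 70.0, 0.1)` gives a CLint,u that `well_stirred_clearance` maps back to a clearance of 30, while an observed clearance above the assumed blood flow of 70 returns `None`. An AAFE of 2 means predictions are off by a factor of two on average, in either direction.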
Rationalizing Tight Ligand Binding through Cooperative Interaction Networks
Small modifications of the molecular structure of a ligand sometimes cause strong gains in binding affinity to a protein target, rendering a weakly active chemical series suddenly attractive for further optimization. Our goal in this study is to better rationalize and predict the occurrence of such interaction hot-spots in receptor binding sites. To this end, we introduce two new concepts into the computational description of molecular recognition. First, we take a broader view of noncovalent interactions and describe protein-ligand binding with a comprehensive set of favorable and unfavorable contact types, including for example halogen bonding and orthogonal multipolar interactions. Second, we go beyond the commonly used pairwise additive treatment of atomic interactions and use a small world network approach to describe how interactions are modulated by their environment. This approach allows us to capture local cooperativity effects and considerably improves the performance of a newly derived empirical scoring function, ScorpionScore. More importantly, however, we demonstrate how an intuitive visualization of key intermolecular interactions, interaction networks, and binding hot-spots supports the identification and rationalization of tight ligand binding.
Machine Learning Estimates of Natural Product Conformational Energies
<div><p>Machine learning has been used for estimation of potential energy surfaces to speed up molecular dynamics simulations of small systems. We demonstrate that this approach is feasible for significantly larger, structurally complex molecules, taking the natural product Archazolid A, a potent inhibitor of vacuolar-type ATPase, from the myxobacterium <i>Archangium gephyra</i> as an example. Our model estimates energies of new conformations by exploiting information from previous calculations via Gaussian process regression. Predictive variance is used to assess whether a conformation is in the interpolation region, allowing a controlled trade-off between prediction accuracy and computational speed-up. For energies of relaxed conformations at the density functional level of theory (implicit solvent, DFT/BLYP-disp3/def2-TZVP), mean absolute errors of less than 1 kcal/mol were achieved. The study demonstrates that predictive machine learning models can be developed for structurally complex, pharmaceutically relevant compounds, potentially enabling considerable speed-ups in simulations of larger molecular structures.</p></div>
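The idea of using predictive variance to decide between a fast prediction and a full calculation can be sketched in plain Python. This is an illustrative toy, not the authors' implementation: the squared-exponential kernel, the zero-mean prior, the `length` and `noise` values, the `max_var` threshold, and the names `TinyGP` and `energy_or_fallback` are all assumptions, and the molecular descriptors used in the study are not given here.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two descriptor vectors (assumed kernel)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * d2 / length ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def solve_lower(low, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    return y


def solve_upper_t(low, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(low[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / low[i][i]
    return x


class TinyGP:
    """Zero-mean Gaussian process regression on a small training set."""

    def __init__(self, xs, ys, length=1.0, noise=1e-6):
        self.xs, self.length = xs, length
        k = [[rbf(a, b, length) + (noise if i == j else 0.0)
              for j, b in enumerate(xs)] for i, a in enumerate(xs)]
        self.low = cholesky(k)
        self.alpha = solve_upper_t(self.low, solve_lower(self.low, ys))

    def predict(self, x):
        """Return (predictive mean, predictive variance) for one input."""
        ks = [rbf(x, a, self.length) for a in self.xs]
        mean = sum(k * a for k, a in zip(ks, self.alpha))
        v = solve_lower(self.low, ks)
        var = max(rbf(x, x, self.length) - sum(t * t for t in v), 0.0)
        return mean, var


def energy_or_fallback(gp, x, reference_calc, max_var=0.05):
    """Use the GP estimate inside the interpolation region, else recompute."""
    mean, var = gp.predict(x)
    if var <= max_var:
        return mean, "predicted"
    return reference_calc(x), "calculated"
```

Raising `max_var` lets more conformations be predicted rather than recalculated, at the cost of accuracy, which is the trade-off the abstract describes. In practice the training energies would be centered first, since the sketch assumes a zero-mean prior.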
Performance of ML models trained separately on each individual MD run and tested on the other MD runs.
<p>RMSE: root mean square error (kJ/mol), MAE: mean absolute error (kJ/mol), MAE (%): MAE as a percentage of the range of training set energy values, <i>R</i><sup>2</sup>: squared Pearson correlation coefficient.</p>
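The four metrics defined in this caption can be computed as in the following minimal Python sketch. The function name `regression_metrics` and its argument layout are hypothetical, and the sketch only restates the definitions given in the caption.

```python
import math


def regression_metrics(y_true, y_pred, train_energies):
    """Return RMSE, MAE, MAE as % of the training energy range, and squared Pearson R."""
    n = len(y_true)
    errors = [p - t for p, t in zip(y_pred, y_true)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # MAE (%) is relative to the spread of the training set energies.
    energy_range = max(train_energies) - min(train_energies)
    mae_pct = 100.0 * mae / energy_range
    # Squared Pearson correlation between calculated and predicted values.
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    r2 = cov ** 2 / (var_t * var_p)
    return {"rmse": rmse, "mae": mae, "mae_pct": mae_pct, "r2": r2}
```

Note that the squared Pearson coefficient measures linear association only, so a model with a constant offset can still reach a high value while RMSE and MAE reveal the error.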
Configuration of the myxobacterial polyketide Archazolid A, a potent inhibitor of vacuolar-type ATPase (V-ATPase).
Performance of ML models trained on randomized subsets of increasing size of the complete MD data.
<p>See <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003400#pcbi-1003400-t001" target="_blank">Table 1</a> for abbreviations.</p>
Learning using predictive variance.
<p>Shown is the trade-off between mean absolute error (MAE, solid line, left scale) and number of predicted conformations (<i>m</i>, dashed line, right scale). Results are averaged over all possible orderings of the four MD runs (4! = 24; standard deviations ca. 0.4 kJ/mol and 35 samples). Squared correlation is <i>R</i><sup>2</sup> = 0.99.</p>
Predicted vs calculated ΔE values of Archazolid A conformations.
<p>All predictions were obtained by stratified 10-fold cross-validation of the complete MD data. NMR-based conformations <i>c5a</i>, <i>c5b</i>, <i>nmr</i> are marked by red circles (external test data).</p>
Reported conformations <i>5a</i>, <i>5b</i> (grey), and <i>nmr</i> (black) of Archazolid A derived from NMR studies [9].
<p>Molecules were superimposed by minimizing root mean square deviation in PyMol (<a href="http://www.pymol.org" target="_blank">www.pymol.org</a>).</p>
Projection of MD conformations of Archazolid A onto two dimensions by principal component analysis.
<p>Shown are the distribution of individual conformations (left) and a smoothed energy landscape generated by LiSARD <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003400#pcbi.1003400-Reutlinger1" target="_blank">[52]</a> (right). Labels indicate reported NMR-motivated structures (A = <i>c5a</i>, B = <i>c5b</i>, P = <i>nmr</i>) and lowest-energy MD conformations (8, 595, 40). Color coding is from lowest (blue) to highest (red) relative energy.</p>