14 research outputs found

    A Machine Learning Framework to Improve Rat Clearance Predictions and Inform Physiologically Based Pharmacokinetic Modeling

    During drug discovery and development, achieving appropriate pharmacokinetics is key to establishing the efficacy and safety of new drugs. Physiologically based pharmacokinetic (PBPK) models integrating in vitro-to-in vivo extrapolation have become an essential in silico tool to achieve this goal. In this context, the most important and probably most challenging pharmacokinetic parameter to estimate is the clearance. Recent work on high-throughput PBPK modeling during drug discovery has shown that a good estimate of the unbound intrinsic clearance (CLint,u) is the key factor for useful PBPK application. In this work, three different machine learning-based strategies were explored to predict the rat CLint,u as the input into PBPK. To this end, in vivo and in vitro data were collected for a total of 2639 proprietary compounds. The strategies were compared to the standard in vitro bottom-up approach. Using the well-stirred liver model to back-calculate in vivo CLint,u from in vivo rat clearance and then training a machine learning model on this CLint,u led to more accurate clearance predictions (absolute average fold error (AAFE) 3.1 in temporal cross-validation) than the bottom-up approach (AAFE 3.6-16, depending on the scaling method) and has the advantage that no experimental in vitro data are needed. However, building a machine learning model on the bias between the back-calculated in vivo CLint,u and the bottom-up scaled in vitro CLint,u also performed well. For example, with unbound hepatocyte scaling, adding the bias prediction improved the AAFE in the temporal cross-validation from 16 (bottom-up alone) to 2.9. Similarly, the log Pearson r² improved from 0.1 to 0.29. Although this strategy still requires in vitro measurement of CLint,u, using unbound scaling for the bottom-up approach circumvents the need to correct fu,inc by fu,p data.
While the above-described ML models were built on all data points available per approach, a comparative evaluation across all approaches could only be performed on a subset, because ca. 75% of the molecules had missing or unquantifiable measurements of the fraction unbound in plasma or of the in vitro unbound intrinsic clearance, or they dropped out due to the blood-flow limitation assumed by the well-stirred model. Advantageously, by predicting CLint,u as the input into PBPK, existing workflows can be reused and the prediction of the in vivo clearance and other PK parameters can be improved.
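
The back-calculation and the error metric mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the textbook form of the well-stirred liver model, CL_h = Q_h · fu · CLint,u / (Q_h + fu · CLint,u), and defines AAFE as 10 raised to the mean absolute log10 fold error. The function names, the use of a single unbound fraction `fu` (ignoring any blood-to-plasma conversion), and the numeric inputs are illustrative assumptions, not values or code from the study.

```python
import math

def well_stirred_clearance(clint_u, fu, q_h):
    """Hepatic clearance predicted from unbound intrinsic clearance (well-stirred model)."""
    return q_h * fu * clint_u / (q_h + fu * clint_u)

def back_calculate_clint_u(cl_h, fu, q_h):
    """Invert the well-stirred model to obtain CLint,u from an observed clearance.

    Returns None when the clearance is not below the hepatic blood flow q_h,
    because the model has no finite solution there (blood-flow limitation).
    """
    if not 0 < cl_h < q_h:
        return None
    return q_h * cl_h / (fu * (q_h - cl_h))

def aafe(predicted, observed):
    """Absolute average fold error: 10 ** mean(|log10(predicted / observed)|)."""
    logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(logs) / len(logs))

# Illustrative placeholder values only (all in the same flow units, e.g. mL/min/kg).
q_h = 70.0   # assumed hepatic blood flow
fu = 0.1     # assumed unbound fraction
cl_obs = well_stirred_clearance(100.0, fu, q_h)
print(back_calculate_clint_u(cl_obs, fu, q_h))  # recovers the CLint,u used above
print(back_calculate_clint_u(80.0, fu, q_h))    # clearance above q_h: no solution
print(aafe([2.0, 0.5], [1.0, 1.0]))             # both predictions are off by 2-fold
```

Returning `None` for clearances at or above the blood flow mirrors the drop-out of compounds described above; a real workflow would also have to handle unit conversions and the plasma-to-blood correction, which this sketch leaves out.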

    Rationalizing Tight Ligand Binding through Cooperative Interaction Networks

    Small modifications of the molecular structure of a ligand sometimes cause strong gains in binding affinity to a protein target, rendering a weakly active chemical series suddenly attractive for further optimization. Our goal in this study is to better rationalize and predict the occurrence of such interaction hot-spots in receptor binding sites. To this end, we introduce two new concepts into the computational description of molecular recognition. First, we take a broader view of noncovalent interactions and describe protein–ligand binding with a comprehensive set of favorable and unfavorable contact types, including for example halogen bonding and orthogonal multipolar interactions. Second, we go beyond the commonly used pairwise additive treatment of atomic interactions and use a small world network approach to describe how interactions are modulated by their environment. This approach allows us to capture local cooperativity effects and considerably improves the performance of a newly derived empirical scoring function, ScorpionScore. More importantly, however, we demonstrate how an intuitive visualization of key intermolecular interactions, interaction networks, and binding hot-spots supports the identification and rationalization of tight ligand binding.

    Machine Learning Estimates of Natural Product Conformational Energies

    Machine learning has been used for estimation of potential energy surfaces to speed up molecular dynamics simulations of small systems. We demonstrate that this approach is feasible for significantly larger, structurally complex molecules, taking the natural product Archazolid A, a potent inhibitor of vacuolar-type ATPase, from the myxobacterium Archangium gephyra as an example. Our model estimates energies of new conformations by exploiting information from previous calculations via Gaussian process regression. Predictive variance is used to assess whether a conformation is in the interpolation region, allowing a controlled trade-off between prediction accuracy and computational speed-up. For energies of relaxed conformations at the density functional level of theory (implicit solvent, DFT/BLYP-disp3/def2-TZVP), mean absolute errors of less than 1 kcal/mol were achieved. The study demonstrates that predictive machine learning models can be developed for structurally complex, pharmaceutically relevant compounds, potentially enabling considerable speed-ups in simulations of larger molecular structures.
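
The idea of using predictive variance to decide between a cheap prediction and an expensive calculation can be sketched in a few lines of Python. This is a toy illustration only: it assumes a one-dimensional input, a zero-mean Gaussian process with a unit-variance RBF kernel, and a hypothetical variance threshold `max_var`. The descriptors, kernel, and threshold used in the study are not given in the abstract, and all function names here are made up for the example.

```python
import math

def rbf(a, b, length=1.0):
    """Unit-variance RBF kernel on scalar inputs (an assumed choice)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L x = b."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x

def solve_lower_transposed(L, b):
    """Back substitution: solve L^T x = b."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(x_train, y_train, x_new, noise=1e-6, length=1.0):
    """Posterior mean and variance of a zero-mean GP at a single new input."""
    n = len(x_train)
    K = [[rbf(x_train[i], x_train[j], length) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_lower_transposed(L, solve_lower(L, y_train))
    k_star = [rbf(x, x_new, length) for x in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = rbf(x_new, x_new, length) - sum(vi * vi for vi in v)
    return mean, max(var, 0.0)

def energy_or_fallback(x_new, x_train, y_train, expensive, max_var=0.05):
    """Use the GP mean inside the interpolation region, else run the costly calculation."""
    mean, var = gp_predict(x_train, y_train, x_new)
    if var <= max_var:
        return mean, "predicted"
    return expensive(x_new), "calculated"

x_train = [0.0, 1.0, 2.0]
y_train = [0.0, 1.0, 4.0]
print(energy_or_fallback(1.0, x_train, y_train, expensive=lambda x: x * x))
print(energy_or_fallback(10.0, x_train, y_train, expensive=lambda x: x * x))
```

Raising `max_var` lets more conformations through as predictions (more speed-up, larger errors), while lowering it sends more of them to the expensive calculation, which is the trade-off the abstract describes.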

    Performance of ML models trained separately on each individual MD run and tested on the other MD runs.

    RMSE: root mean square error (kJ/mol), MAE: mean absolute error (kJ/mol), MAE (%): MAE as a percentage of the range of training set energy values, R²: squared Pearson correlation coefficient.
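
As a rough illustration of these four quantities, the following Python sketch computes them from lists of reference and predicted energies. The function name and the example numbers are hypothetical and serve only to show the definitions given in the caption.

```python
import math

def regression_metrics(y_true, y_pred, train_energies):
    """RMSE, MAE, MAE as % of the training energy range, and squared Pearson correlation."""
    n = len(y_true)
    errors = [p - t for p, t in zip(y_pred, y_true)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # MAE relative to the spread of energies seen during training.
    mae_pct = 100.0 * mae / (max(train_energies) - min(train_energies))
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    r2 = cov * cov / (var_t * var_p)
    return {"rmse": rmse, "mae": mae, "mae_pct": mae_pct, "r2": r2}

# Made-up energies in kJ/mol, for illustration only.
print(regression_metrics([0.0, 10.0, 20.0], [1.0, 9.0, 21.0], [0.0, 20.0]))
```

Note that the squared Pearson correlation measures linear association only, so a model with a constant offset can still reach R² = 1 while having a non-zero MAE.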

    Performance of ML models trained on randomized subsets of increasing size of the complete MD data.

    See Table 1 (http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003400#pcbi-1003400-t001) for abbreviations.

    Learning using predictive variance.

    Shown is the trade-off between mean absolute error (MAE, solid line, left scale) and number of predicted conformations (m, dashed line, right scale). Results are averaged over all possible orderings of the four MD runs (4! = 24; standard deviations ca. 0.4 kJ/mol and 35 samples). Squared correlation is R² = 0.99.

    Predicted vs calculated ΔE values of Archazolid A conformations.

    All predictions were obtained by stratified 10-fold cross-validation of the complete MD data. NMR-based conformations c5a, c5b, nmr are marked by red circles (external test data).

    Reported conformations 5a, 5b (grey), and nmr (black) of Archazolid A derived from NMR studies [9].

    Molecules were superimposed by minimizing root mean square deviation in PyMOL (www.pymol.org).

    Projection of MD conformations of Archazolid A onto two dimensions by principal component analysis.

    Shown are the distribution of individual conformations (left) and the smoothed energy landscape generated by LiSARD [52] (right). Labels indicate reported NMR-motivated structures (A = c5a, B = c5b, P = nmr) and lowest-energy MD conformations (8, 595, 40). Color coding is from lowest (blue) to highest (red) relative energy.