11 research outputs found

    Capturing the Crystal: Prediction of Enthalpy of Sublimation, Crystal Lattice Energy, and Melting Points of Organic Compounds

    Accurate computational prediction of melting points and aqueous solubilities of organic compounds would be very useful but is notoriously difficult. Predicting the lattice energies of compounds is key to understanding and predicting their melting behavior and ultimately their solubility behavior. We report robust, predictive, quantitative structure–property relationship (QSPR) models for enthalpies of sublimation, crystal lattice energies, and melting points for a very large and structurally diverse set of small organic compounds. Sparse Bayesian feature selection and machine learning methods were employed to select the most relevant molecular descriptors for the model and to generate parsimonious quantitative models. The final enthalpy of sublimation model is a four-parameter multilinear equation that has an r<sup>2</sup> value of 0.96 and an average absolute error of 7.9 ± 0.3 kJ·mol<sup>–1</sup>. The melting point model can predict this property with a standard error of 45 ± 1 K and an r<sup>2</sup> value of 0.79. Given the size and diversity of the training data, these conceptually transparent and accurate models can be used to predict sublimation enthalpy, lattice energy, and melting points of organic compounds in general.
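
    As a rough illustration of what a small multilinear QSPR equation involves, the sketch below fits an ordinary least-squares model and reports its average absolute error. It is not the authors' method: the abstract describes sparse Bayesian feature selection, whereas this uses plain least squares, and the descriptors, target values, and function names (`fit_multilinear`, `predict`, `mean_absolute_error`) are all hypothetical.

```python
def fit_multilinear(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    X is a list of descriptor rows; an intercept column is added here.
    Returns the coefficients as [intercept, b1, b2, ...].
    """
    rows = [[1.0] + list(r) for r in X]
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for k in range(col + 1, n):
            f = A[k][col] / A[col][col]
            for j in range(col, n + 1):
                A[k][j] -= f * A[col][j]
    # Back substitution.
    b = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * b[j] for j in range(i + 1, n))
        b[i] = (A[i][n] - tail) / A[i][i]
    return b


def predict(b, row):
    """Evaluate the fitted multilinear equation for one descriptor row."""
    return b[0] + sum(c * x for c, x in zip(b[1:], row))


def mean_absolute_error(y, y_hat):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(a - p) for a, p in zip(y, y_hat)) / len(y)


# Hypothetical descriptors and targets generated from y = 2 + 3*x1 - x2
# (illustrative only; not data from the paper).
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [2.0, 5.0, 1.0, 4.0, 7.0]

coeffs = fit_multilinear(X, y)
mae = mean_absolute_error(y, [predict(coeffs, r) for r in X])
```

    Because the toy data are noise-free, the fit recovers the generating coefficients and the error is essentially zero; real descriptor data would leave a residual error of the kind the abstract reports.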

    Aqueous Solubility Prediction: Do Crystal Lattice Interactions Help?

    Aqueous solubility is a very important physical property of small molecule drugs and drug candidates but also one of the most difficult to predict accurately. Aqueous solubility plays a major role in drug delivery and pharmacokinetics. It is believed that crystal lattice interactions are important in solubility and that including them in solubility models should improve the accuracy of the models. We used calculated values for lattice energy and sublimation enthalpy of organic molecules as descriptors to determine whether these would improve the accuracy of the aqueous solubility models. Multiple linear regression employing an expectation maximization algorithm and a sparse prior (MLREM) method and a nonlinear Bayesian regularized artificial neural network with a Laplacian prior (BRANNLP) were used to derive optimal predictive models of aqueous solubility of a large and highly diverse data set of 4558 organic compounds over a normal ambient temperature range of 20–30 °C (293–303 K). A randomly selected test set and compounds from a solubility challenge were used to estimate the predictive ability of the models. The BRANNLP method showed the best statistical results with squared correlation coefficients of 0.90 and standard errors of 0.645–0.665 log(<i>S</i>) for training and test sets. Surprisingly, including descriptors that captured crystal lattice interactions did not significantly improve the quality of these aqueous solubility models.

    Beware of <i>R</i><sup>2</sup>: Simple, Unambiguous Assessment of the Prediction Accuracy of QSAR and QSPR Models

    The statistical metrics used to characterize the external predictivity of a model, i.e., how well it predicts the properties of an independent test set, have proliferated over the past decade. This paper clarifies some apparent confusion over the use of the coefficient of determination, <i>R</i><sup>2</sup>, as a measure of model fit and predictive power in QSAR and QSPR modeling. <i>R</i><sup>2</sup> (or <i>r</i><sup>2</sup>) has been used in various contexts in the literature in conjunction with training and test data for both ordinary linear regression and regression through the origin as well as with linear and nonlinear regression models. We analyze the widely adopted model fit criteria suggested by Golbraikh and Tropsha (J. Mol. Graphics Modell. 2002, 20, 269−276) in a strict statistical manner. Shortcomings in these criteria are identified, and a clearer and simpler alternative method to characterize model predictivity is provided. The intent is not to repeat the well-documented arguments for model validation using test data but rather to guide the application of <i>R</i><sup>2</sup> as a model fit statistic. Examples are used to illustrate both correct and incorrect uses of <i>R</i><sup>2</sup>. Reporting the root-mean-square error or equivalent measures of dispersion, which are typically of more practical importance than <i>R</i><sup>2</sup>, is also encouraged, and important challenges in addressing the needs of different categories of users such as computational chemists, experimental scientists, and regulatory decision support specialists are outlined.
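
    To make the distinction between fit statistics concrete, here is a minimal sketch that computes the coefficient of determination, the squared Pearson correlation, and the root-mean-square error for a small test set. The abstract does not give the paper's formulas, so the definitions below are the standard textbook ones, and the data and function names (`r_squared`, `pearson_r2`, `rmse`) are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    SS_tot is taken about the mean of the observed values being scored.
    """
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def pearson_r2(observed, predicted):
    """Squared Pearson correlation; insensitive to offset and scale errors."""
    mo = sum(observed) / len(observed)
    mp = sum(predicted) / len(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


# Hypothetical test-set values (illustrative only, not from the paper):
# every prediction is too high by a constant 0.5.
y_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.5, 2.5, 3.5, 4.5, 5.5]

print("R^2 (1 - SS_res/SS_tot):", r_squared(y_obs, y_pred))
print("squared Pearson r:      ", pearson_r2(y_obs, y_pred))
print("RMSE:                   ", rmse(y_obs, y_pred))
```

    In this toy case the squared correlation is perfect even though every prediction is biased, while the coefficient of determination and the RMSE both register the error. That is one reason a dispersion measure in the property's own units is useful alongside any <i>R</i><sup>2</sup> value.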

    Modeling the Influence of Fatty Acid Incorporation on Mesophase Formation in Amphiphilic Therapeutic Delivery Systems

    Dispersed amphiphile-fatty acid systems are of great interest in drug delivery and gene therapies because of their potential for triggered release of their payload. The mesophase behavior of these systems is extremely complex and is affected by environmental factors such as drug loading, percentage and nature of incorporated fatty acids, temperature, pH, and so forth. It is important to study phase behavior of amphiphilic materials as the mesophases directly influence the release rate of the incorporated drugs. We describe a robust machine learning method for predicting the phase behavior of these systems. We have developed models for each mesophase that simultaneously and reliably model the effects of amphiphile and fatty acid structure, concentration, and temperature and that make accurate predictions of these mesophases for conditions not used to train the models.

    Quantitative Structure–Property Relationship Modeling of Diverse Materials Properties

    Quantitative Structure–Property Relationship Modeling of Diverse Materials Properties

    Predicting the Complex Phase Behavior of Self-Assembling Drug Delivery Nanoparticles

    Amphiphilic lyotropic liquid crystalline self-assembled nanomaterials have important applications in the delivery of therapeutic and imaging agents. However, little is known about the effect of the incorporated drug on the structure of nanoparticles. Predicting these properties is widely considered intractable. We present computational models for three drug delivery carriers, loaded with 10 drugs at six concentrations and two temperatures. These models predicted phase behavior for 11 new drugs. Subsequent synchrotron small-angle X-ray scattering experiments validated the predictions.

    Competitive Inhibition Mechanism of Acetylcholinesterase without Catalytic Active Site Interaction: Study on Functionalized C<sub>60</sub> Nanoparticles via in Vitro and in Silico Assays

    Acetylcholinesterase (AChE) activity regulation by chemical agents or, potentially, nanomaterials is important for both toxicology and pharmacology. Competitive inhibition via direct catalytic active site (CAS) binding or noncompetitive inhibition through interference with substrate and product entering and exiting has been recognized previously as an AChE-inhibition mechanism for bespoke nanomaterials. The competitive inhibition by peripheral anionic site (PAS) interaction without CAS binding remains unexplored. Here, we proposed and verified the occurrence of a presumed competitive inhibition of AChE without CAS binding for hydrophobically functionalized C<sub>60</sub> nanoparticles (NPs) by employing both experimental and computational methods. The kinetic inhibition analysis distinguished six competitive inhibitors, probably targeting the PAS, from the pristine and hydrophilically modified C<sub>60</sub> NPs. A simple quantitative nanostructure–activity relationship (QNAR) model relating the pocket accessible length of the substituent to inhibition capacity was then established to reveal how the geometry of the surface group determines the differences among NPs in AChE inhibition. Molecular docking identified the PAS as the potential binding site interacting with the NPs via a T-shaped plug-in mode. Specifically, the fullerene core covered the enzyme gorge as a lid through π–π stacking with Tyr72 and Trp286 in the PAS, while the hydrophobic ligands on the fullerene surface inserted into the AChE active site to provide further stability for the complexes. The modeling predicted that inhibition would be severely compromised by Tyr72 and Trp286 deletions, and the subsequent site-directed mutagenesis experiments confirmed this prediction. Our results demonstrate competitive inhibition of AChE by NPs without CAS participation and provide further understanding of both the neurotoxicity and the curative effect of NPs.

    Accurate and interpretable nanoSAR models from genetic programming-based decision tree construction approaches

    The number of engineered nanomaterials (ENMs) being exploited commercially is growing rapidly, due to the novel properties they exhibit. Clearly, it is important to understand and minimize any risks to health or the environment posed by the presence of ENMs. Data-driven models that decode the relationships between the biological activities of ENMs and their physicochemical characteristics provide an attractive means of maximizing the value of scarce and expensive experimental data. Although such structure–activity relationship (SAR) methods have become very useful tools for modelling nanotoxicity endpoints (nanoSAR), they have limited robustness and predictivity and, most importantly, interpretation of the models they generate is often very difficult. New computational modelling tools or new ways of using existing tools are required to model the relatively sparse and sometimes lower quality data on the biological effects of ENMs. The most commonly used SAR modelling methods work best with large datasets, are not particularly good at feature selection, can be relatively opaque to interpretation, and may not account for nonlinearity in the structure–property relationships. To overcome these limitations, we describe the application of a novel algorithm, a genetic programming-based decision tree construction tool (GPTree) to nanoSAR modelling. We demonstrate the use of GPTree in the construction of accurate and <i>interpretable</i> nanoSAR models by applying it to four diverse literature datasets. We describe the algorithm and compare model results across the four studies. We show that GPTree generates models with accuracies equivalent to or superior to those of prior modelling studies on the same datasets. GPTree is a robust, automatic method for generation of accurate nanoSAR models with important advantages that it works with small datasets, automatically selects descriptors, and provides significantly improved interpretability of models.

    Predicting the Effect of Lipid Structure on Mesophase Formation during in Meso Crystallization

    Bicontinuous cubic lipidic materials are increasingly used as crystallization media for in meso crystallization of membrane proteins (MPs). Varying the lipid architecture may assist with encapsulation of larger proteins and promote crystal growth. However, not all lipids are compatible with the components of typical crystallization screens, and compatibility must therefore be checked prior to crystallization trials. The method currently used, high-throughput small-angle X-ray scattering (HT SAXS), may be time-consuming and is costly in valuable MP. We have therefore employed a modeling approach using Bayesian regularized neural networks to accurately predict the complex phase behavior of lipid materials under the influence of the PACT crystallization screen and determine the lipid characteristics that allow a lipid to retain a cubic phase under the multiple components required during an in meso crystallization trial. This information will be used to select robust lipids for use in crystallization trials and may allow for the rational design of new lipids, specifically for in meso crystallization.