    Transferability in Machine Learning for Electronic Structure via the Molecular Orbital Basis

    We present a machine learning (ML) method for predicting electronic structure correlation energies using Hartree–Fock input. The total correlation energy is expressed in terms of individual and pair contributions from occupied molecular orbitals, and Gaussian process regression is used to predict these contributions from a feature set that is based on molecular orbital properties, such as Fock, Coulomb, and exchange matrix elements. With the aim of maximizing transferability across chemical systems and compactness of the feature set, we avoid the usual specification of ML features in terms of atom- or geometry-specific information, such as atom/element types, bond types, or local molecular structure. ML predictions of MP2 and CCSD energies are presented for a range of systems, demonstrating that the method maintains accuracy while providing transferability both within and across chemical families; this includes predictions for molecules with atom types and elements that are not included in the training set. The method holds promise both in its current form and as a proof-of-principle for the use of ML in the design of generalized density-matrix functionals.
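
The decomposition described in this abstract can be written schematically as follows. This is only a sketch: the symbols are assumptions introduced here and do not appear in the abstract. The indices i and j run over occupied molecular orbitals, f_i and f_ij stand for the orbital-based feature vectors (built from quantities such as Fock, Coulomb, and exchange matrix elements), and ε_d and ε_o denote the regressed individual and pair contributions.

```latex
E_{\mathrm{c}} \;\approx\; \sum_{i \in \mathrm{occ}} \epsilon_{d}\!\left[\mathbf{f}_{i}\right]
\;+\; \sum_{\substack{i,\,j \in \mathrm{occ} \\ i \neq j}} \epsilon_{o}\!\left[\mathbf{f}_{ij}\right]
```

In this reading, Gaussian process regression supplies the two maps ε_d and ε_o, and because the features refer only to molecular orbitals, no atom types or bond types enter the sum.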

    Chemical Interactions of Polyethylene Glycols (PEGs) and Glycerol with Protein Functional Groups: Applications to Effects of PEG and Glycerol on Protein Processes

    In this work, we obtain the data needed to predict chemical interactions of polyethylene glycols (PEGs) and glycerol with proteins and related organic compounds and thereby interpret or predict chemical effects of PEGs on protein processes. To accomplish this, we determine interactions of glycerol and tetraEG with >30 model compounds displaying the major C, N, and O functional groups of proteins. Analysis of these data yields coefficients (α values) that quantify interactions of glycerol, tetraEG, and PEG end (-CH₂OH) and interior (-CH₂OCH₂-) groups with these groups, relative to interactions with water. TetraEG (strongly) and glycerol (weakly) interact favorably with aromatic C, amide N, and cationic N, but unfavorably with amide O, carboxylate O, and salt ions. Strongly unfavorable O and salt anion interactions help make both small and large PEGs effective protein precipitants. Interactions of tetraEG and PEG interior groups with aliphatic C are quite favorable, while interactions of glycerol and PEG end groups with aliphatic C are not. Hence, tetraEG and PEG300 favor unfolding of the DNA-binding domain of lac repressor (lacDBD), while glycerol and di- and monoethylene glycol are stabilizers. Favorable interactions with aromatic and aliphatic C explain why PEG400 greatly increases the solubility of aromatic hydrocarbons and steroids. PEG400–steroid interactions are unusually favorable, presumably because of simultaneous interactions of multiple PEG interior groups with the fused ring system of the steroid. Using α values reported here, chemical contributions to PEG m-values can be predicted or interpreted in terms of changes in water-accessible surface area (ΔASA) and separated from excluded volume effects.
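
The last sentence of this abstract suggests an additive bookkeeping, which the sketch below illustrates. It assumes that the chemical contribution to an m-value is a sum over functional-group types of α times ΔASA; the abstract does not state this formula. The function name `chemical_m_value`, the group labels, and all numbers are hypothetical placeholders, not values from the paper, and units are left arbitrary.

```python
from typing import Mapping


def chemical_m_value(alpha: Mapping[str, float],
                     delta_asa: Mapping[str, float]) -> float:
    """Sum alpha * delta-ASA over functional-group types.

    Assumption (not stated in the abstract): contributions are additive,
    so the chemical part of the m-value is sum_k alpha[k] * delta_asa[k].
    """
    # Every group with an ASA change must have an alpha value.
    missing = set(delta_asa) - set(alpha)
    if missing:
        raise KeyError(f"no alpha value for groups: {sorted(missing)}")
    return sum(alpha[group] * delta_asa[group] for group in delta_asa)


# Placeholder inputs for illustration only (NOT data from the paper).
example_alpha = {"aliphatic_C": -0.5, "amide_O": 2.0}
example_delta_asa = {"aliphatic_C": 100.0, "amide_O": 20.0}

result = chemical_m_value(example_alpha, example_delta_asa)
print(result)
```

Keeping the α values and the ΔASA values in separate mappings mirrors the separation in the abstract: the α values characterize the solute, while ΔASA characterizes the protein process.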