Transferable atomic multipole machine learning models for small organic molecules
Accurate representation of the molecular electrostatic potential, which is
often expanded in distributed multipole moments, is crucial for an efficient
evaluation of intermolecular interactions. Here we introduce a machine learning
model for multipole coefficients of atom types H, C, O, N, S, F, and Cl in any
molecular conformation. The model is trained on quantum chemical results for
atoms in varying chemical environments drawn from thousands of organic
molecules. Multipoles in systems with neutral, cationic, and anionic molecular
charge states are treated with individual models. The models' predictive
accuracy and applicability are illustrated by evaluating intermolecular
interaction energies of nearly 1,000 dimers and the cohesive energy of the
benzene crystal. Comment: 11 pages, 6 figures
Non-covalent interactions across organic and biological subsets of chemical space: Physics-based potentials parametrized from machine learning
Classical intermolecular potentials typically require an extensive
parametrization procedure for any new compound considered. To do away with
prior parametrization, we propose a combination of physics-based potentials
with machine learning (ML), coined IPML, which is transferable across small
neutral organic and biologically-relevant molecules. ML models provide
on-the-fly predictions for environment-dependent local atomic properties:
electrostatic multipole coefficients (with significantly reduced error compared
to previously reported results), the population and decay rate of valence atomic
densities, and polarizabilities across conformations and chemical compositions
of H, C, N, and O atoms. These parameters enable accurate calculations of
intermolecular contributions---electrostatics, charge penetration, repulsion,
induction/polarization, and many-body dispersion. Unlike other potentials, this
model is transferable in its ability to handle new molecules and conformations
without explicit prior parametrization: All local atomic properties are
predicted from ML, leaving only eight global parameters---optimized once and
for all across compounds. We validate IPML on various gas-phase dimers at and
away from equilibrium separation, where we obtain mean absolute errors between
0.4 and 0.7 kcal/mol for several chemically and conformationally diverse
datasets representative of non-covalent interactions in biologically-relevant
molecules. We further focus on hydrogen-bonded complexes---essential but
challenging due to their directional nature---where datasets of DNA base pairs
and amino acids yield an extremely encouraging 1.4 kcal/mol error. Finally, and
as a first look, we consider IPML in denser systems: water clusters,
supramolecular host-guest complexes, and the benzene crystal. Comment: 15 pages, 9 figures
Extension of the B3LYP - Dispersion-Correcting Potential Approach to the Accurate Treatment of both Inter- and Intramolecular Interactions
We recently showed that dispersion-correcting potentials (DCPs),
atom-centered Gaussian-type functions developed for use with B3LYP (J. Phys.
Chem. Lett. 2012, 3, 1738-1744) greatly improved the ability of the underlying
functional to predict non-covalent interactions. However, the application of
B3LYP-DCP to the β-scission of the cumyloxyl radical led to a calculated
barrier height that was over-estimated by ca. 8 kcal/mol. We show in the
present work that this error arises from the previously developed
carbon atom DCPs, which erroneously alter the electron density in the C-C
covalent-bonding region. In this work, we present a new C-DCP with a form that
was expected to influence the electron density farther from the nucleus. Tests
of the new C-DCP, combined with the previously published H-, N- and O-DCPs, using
B3LYP-DCP/6-31+G(2d,2p) on the S66, S22B, HSG-A, and HC12 databases of
non-covalently interacting dimers showed that it is one of the most accurate
methods available for treating intermolecular interactions, giving mean
absolute errors (MAEs) of 0.19, 0.27, 0.16, and 0.18 kcal/mol, respectively.
Additional testing on the S12L database of complexation systems gave an MAE of
2.6 kcal/mol, showing that the B3LYP-DCP/6-31+G(2d,2p) approach is one of the
best-performing and feasible methods for treating large systems dominated by
non-covalent interactions. Finally, we showed that C-C bond making/breaking
chemistry is well predicted using the newly developed DCPs. In addition to
predicting a barrier height for the β-scission of the cumyloxyl radical
that is within 1.7 kcal/mol of the high-level value, application of
B3LYP-DCP/6-31+G(2d,2p) to 10 databases that include reaction barrier heights
and energies, isomerization energies and relative conformation energies gives
performance that is amongst the best of all available dispersion-corrected
density-functional theory approaches.
Problems, successes and challenges for the application of dispersion-corrected density-functional theory combined with dispersion-based implicit solvent models to large-scale hydrophobic self-assembly and polymorphism
© 2015 Taylor & Francis. The recent advent of dispersion-corrected density-functional theory (DFT) methods allows for quantitative modelling of molecular self-assembly processes, and we consider what is required to develop applications to the formation of large self-assembled monolayers (SAMs) on hydrophobic surfaces from organic solution. The focus is on applying the D3 dispersion correction of Grimme, combined with the solvent dispersion model of Floris, Tomasi and Pascual-Ahuir, to simulate observed scanning-tunnelling microscopy (STM) images of various polymorphs of tetraalkylporphyrin SAMs on highly oriented pyrolytic graphite surfaces. The most significant problem is identified as the need to treat SAM structures that are incommensurate with those of the substrate, which challenges the use of traditional periodic-imaging boundary techniques. Using nearby commensurate lattices introduces non-systematic errors into calculated lattice constants and free energies of SAM formation that are larger than experimental uncertainties and polymorph differences. Developing non-periodic methods for polymorph interface simulation also remains a challenge. Despite these problems, existing methods can be used to interpret STM images and SAM atomic structures, distinguishing between multiple feasible polymorph types. They also provide critical insight into the factors controlling polymorphism. All this stems from a delicate balance that the intermolecular D3 and solvent Floris, Tomasi and Pascual-Ahuir corrections provide. Combined optimised treatments should yield fully quantitative approaches in the future.