613 research outputs found
Alchemical and structural distribution based representation for improved QML
We introduce a representation of any atom in any chemical environment for the
generation of efficient quantum machine learning (QML) models of common
electronic ground-state properties. The representation is based on scaled
distribution functions explicitly accounting for elemental and structural
degrees of freedom. Resulting QML models afford very favorable learning curves
for properties of out-of-sample systems including organic molecules,
non-covalently bonded protein side-chains, water clusters, as well as
diverse crystals. The elemental components help to lower the learning curves,
and, through interpolation across the periodic table, even enable "alchemical
extrapolation" to covalent bonding between elements not part of training, as
evinced for single, double, and triple bonds among main-group elements
Operator quantum machine learning: Navigating the chemical space of response properties
The identification and use of structure property relationships lies at the
heart of the chemical sciences. Quantum mechanics forms the basis for the
unbiased virtual exploration of chemical compound space (CCS), imposing
substantial compute needs if chemical accuracy is to be reached. In order to
accelerate predictions of quantum properties without compromising accuracy, our
lab has been developing quantum machine learning (QML) based models which can
be applied throughout CCS. Here, we briefly explain, review, and discuss the
recently introduced operator formalism which substantially improves the data
efficiency for QML models of common response properties
Water on hexagonal boron nitride from diffusion Monte Carlo
Despite a recent flurry of experimental and simulation studies, an accurate
estimate of the interaction strength of water molecules with hexagonal boron
nitride is lacking. Here we report quantum Monte Carlo results for the
adsorption of a water monomer on a periodic hexagonal boron nitride sheet,
which yield a water monomer interaction energy of -84 +/- 5 meV. We use the
results to evaluate the performance of several widely used density functional
theory (DFT) exchange correlation functionals, and find that they all deviate
substantially. Differences in interaction energies between different adsorption
sites are however better reproduced by DFT
Exploring water adsorption on isoelectronically doped graphene using alchemical derivatives
The design and production of novel 2-dimensional materials has seen great
progress in the last decade, prompting further exploration of the chemistry of
such materials. Doping and hydrogenating graphene is an experimentally realised
method of changing its surface chemistry, but there is still a great deal to be
understood on how doping impacts on the adsorption of molecules. Developing
this understanding is key to unlocking the potential applications of these
materials. High throughput screening methods can provide particularly effective
ways to explore vast chemical compositions of materials. Here, alchemical
derivatives are used as a method to screen the dissociative adsorption energy
of water molecules on various BN doped topologies of hydrogenated graphene. The
predictions from alchemical derivatives are assessed by comparison to density
functional theory. This screening method is found to predict dissociative
adsorption energies that span a range of more than 2 eV, with a mean absolute
error eV. In addition, we show that the quality of such predictions can
be readily assessed by examination of the Kohn-Sham highest occupied molecular
orbital in the initial states. In this way, the root mean square error in the
dissociative adsorption energies of water is reduced by almost an order of
magnitude (down to eV) after filtering out poor predictions. The
findings point the way towards a reliable use of first order alchemical
derivatives for efficient screening procedures
Geometry Relaxation and Transition State Search throughout Chemical Compound Space with Quantum Machine Learning
We use energies and forces predicted within response operator based quantum
machine learning (OQML) to perform geometry optimization and transition state
search calculations with legacy optimizers. For randomly sampled initial
coordinates of small organic query molecules we report systematic improvement
of equilibrium and transition state geometry output as training set sizes
increase. Out-of-sample SN2 reactant complexes and transition state
geometries have been predicted using the LBFGS and the QST2 algorithm with an
RMSD of 0.16 and 0.4 Å, after training on up to 200 reactant complexes
relaxations and transition state search trajectories from the QMrxn20 data-set,
respectively. For geometry optimizations, we have also considered relaxation
paths up to 5'500 constitutional isomers with sum formula C7H10O2
from the QM9-database. Using the resulting OQML models with an LBFGS optimizer
reproduces the minimum geometry with an RMSD of 0.14 Å. For converged
equilibrium and transition state geometries subsequent vibrational normal mode
frequency analysis indicates deviation from MP2 reference results by on average
14 and 26 cm⁻¹, respectively. While the numerical cost for OQML
predictions is negligible in comparison to DFT or MP2, the number of steps
until convergence is typically larger in either case. The success rate for
reaching convergence, however, improves systematically with training set size,
underscoring OQML's potential for universal applicability
FCHL revisited: Faster and more accurate quantum machine learning
We introduce the FCHL19 representation for atomic environments in molecules
or condensed-phase systems. Machine learning models based on FCHL19 are able to
yield predictions of atomic forces and energies of query compounds with
chemical accuracy on the scale of milliseconds. FCHL19 is a revision of our
previous work [Faber et al. 2018] where the representation is discretized and
the individual features are rigorously optimized using Monte Carlo
optimization. Combined with a Gaussian kernel function that incorporates
elemental screening, chemical accuracy is reached for energy learning on the
QM7b and QM9 datasets after training for minutes and hours, respectively. The
model also shows good performance for non-bonded interactions in the condensed
phase for a set of water clusters with an MAE binding energy error of less than
0.1 kcal/mol/molecule after training on 3,200 samples. For force learning on
the MD17 dataset, our optimized model similarly displays state-of-the-art
accuracy with a regressor based on Gaussian process regression. When the
revised FCHL19 representation is combined with the operator quantum machine
learning regressor, forces and energies can be predicted in only a few
milliseconds per atom. The model presented herein is fast and lightweight
enough for use in general chemistry problems as well as molecular dynamics
simulations
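The regression step mentioned in this abstract (a Gaussian kernel combined with a kernel-based regressor) can be illustrated with a minimal kernel ridge regression sketch. This is not the FCHL19 model: the one-dimensional toy features, the function names `gaussian_kernel`, `krr_fit`, and `krr_predict`, and the values of `sigma` and `lam` are all assumptions made for illustration, and the elemental screening described above is omitted.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    # Pairwise squared Euclidean distances, clipped at zero against round-off.
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def krr_fit(X, y, sigma=1.0, lam=1e-6):
    """Solve (K + lam * I) alpha = y for the regression weights alpha."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_query, sigma=1.0):
    """Predict the property of query points from the trained weights."""
    return gaussian_kernel(X_query, X_train, sigma) @ alpha

# Toy data (assumption): a 1D feature standing in for a real representation.
rng = np.random.default_rng(0)
X_train = rng.uniform(-3.0, 3.0, size=(200, 1))
y_train = np.sin(X_train[:, 0])

alpha = krr_fit(X_train, y_train)
X_test = np.linspace(-2.5, 2.5, 50)[:, None]
mae = np.mean(np.abs(krr_predict(X_train, alpha, X_test) - np.sin(X_test[:, 0])))
print(f"out-of-sample MAE: {mae:.2e}")
```

Repeating the fit for increasing training set sizes and plotting the out-of-sample MAE against size on log-log axes gives a learning curve of the kind these abstracts refer to.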
ML Models of Vibrating H2CO: Comparing Reproducing Kernels, FCHL and PhysNet
Machine Learning (ML) has become a promising tool for improving the quality
of atomistic simulations. Using formaldehyde as a benchmark system for
intramolecular interactions, a comparative assessment of ML models based on
state-of-the-art variants of deep neural networks (NN), reproducing kernel
Hilbert space (RKHS+F), and kernel ridge regression (KRR) is presented.
Learning curves for energies and atomic forces indicate rapid convergence
towards excellent predictions for B3LYP, MP2, and CCSD(T)-F12 reference results
for modestly sized (in the hundreds) training sets. Typically, learning curve
off-sets decay as one goes from NN (PhysNet) to RKHS+F to KRR (FCHL).
Conversely, the predictive power for extrapolation of energies towards new
geometries increases in the same order with RKHS+F and FCHL performing almost
equally. For harmonic vibrational frequencies, the picture is less clear, with
PhysNet and FCHL yielding flat learning at 1 and 0.2 cm⁻¹, respectively,
no matter which reference method, while RKHS+F models level off for
B3LYP, and exhibit continued improvements for MP2 and CCSD(T)-F12.
Finite-temperature molecular dynamics (MD) simulations with the same initial
conditions yield indistinguishable infrared spectra with good performance
compared with experiment except for the high-frequency modes involving hydrogen
stretch motion which is a known limitation of MD for vibrational spectroscopy.
For sufficiently large training set sizes all three models can detect
insufficient convergence ("noise") of the reference electronic structure
calculations in that the learning curves level off. Transfer learning (TL) from
B3LYP to CCSD(T)-F12 with PhysNet indicates that additional improvements in
data efficiency can be achieved
Constant Size Molecular Descriptors For Use With Machine Learning
A set of molecular descriptors whose length is independent of molecular size
is developed for machine learning models that target thermodynamic and
electronic properties of molecules. These features are evaluated by monitoring
performance of kernel ridge regression models on well-studied data sets of
small organic molecules. The features include connectivity counts, which
require only the bonding pattern of the molecule, and encoded distances, which
summarize distances between both bonded and non-bonded atoms and so require the
full molecular geometry. In addition to having constant size, these features
summarize information regarding the local environment of atoms and bonds, such
that models can take advantage of similarities resulting from the presence of
similar chemical fragments across molecules. Combining these two types of
features leads to models whose performance is comparable to or better than the
current state of the art. The features introduced here have the advantage of
leading to models that may be trained on smaller molecules and then used
successfully on larger molecules.
Comment: 18 pages, 5 figures
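The idea of connectivity counts whose length does not depend on molecular size can be sketched as follows. This is only an illustration under stated assumptions: the fixed bond-type vocabulary `BOND_TYPES`, the function name `connectivity_counts`, and the hand-written bond lists are hypothetical and are not the descriptor set defined in the paper.

```python
from collections import Counter

# Hypothetical fixed vocabulary of element pairs (each pair sorted alphabetically).
# The descriptor length equals len(BOND_TYPES), whatever the molecule size.
BOND_TYPES = [("C", "C"), ("C", "H"), ("C", "O"), ("H", "O")]

def connectivity_counts(elements, bonds):
    """Count bonds of each type using only the bonding pattern (no geometry)."""
    counts = Counter()
    for i, j in bonds:
        pair = tuple(sorted((elements[i], elements[j])))
        counts[pair] += 1
    return [counts[bond_type] for bond_type in BOND_TYPES]

# Methanol, CH3OH: atom list plus bonds given as index pairs.
methanol = (["C", "O", "H", "H", "H", "H"],
            [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5)])

# Ethanol, CH3CH2OH.
ethanol = (["C", "C", "O", "H", "H", "H", "H", "H", "H"],
           [(0, 1), (1, 2), (0, 3), (0, 4), (0, 5), (1, 6), (1, 7), (2, 8)])

print(connectivity_counts(*methanol))  # [0, 3, 1, 1]
print(connectivity_counts(*ethanol))   # [1, 5, 1, 1]
```

Both molecules map to vectors of the same length, so a single kernel or regression model can compare them directly even though they differ in atom count.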
Tuning dissociation using isoelectronically doped graphene and hexagonal boron nitride: water and other small molecules
Novel uses for 2-dimensional materials like graphene and hexagonal boron
nitride (h-BN) are frequently being discovered, especially for membrane and
catalysis applications. However, a great deal remains to be understood
about the interaction of environmentally and industrially relevant molecules
such as water with these materials. Taking inspiration from advances in
hybridising graphene and h-BN, we explore, using density functional theory, the
dissociation of water, hydrogen, methane, and methanol on graphene, h-BN, and
their isoelectronic doped counterparts: BN doped graphene and C doped h-BN. We
find that doped surfaces are considerably more reactive than their pristine
counterparts and by comparing the reactivity of several small molecules we
develop a general framework for dissociative adsorption. From this a
particularly attractive consequence of isoelectronic doping emerges: substrates
can be doped to enhance their reactivity specifically towards either polar or
non-polar adsorbates. As such, these substrates are potentially viable
candidates for selective catalysts and membranes, with the implication that a
range of tuneable materials can be designed