97 research outputs found
Machine Learning, Quantum Mechanics, and Chemical Compound Space
We review recent studies dealing with the generation of machine learning
models of molecular and solid properties. The models are trained and validated
using standard quantum chemistry results obtained for organic molecules and
materials selected from chemical space at random
Improving the Accuracy of Density Functional Theory (DFT) Calculation for Homolysis Bond Dissociation Energies of Y-NO Bond: Generalized Regression Neural Network Based on Grey Relational Analysis and Principal Component Analysis
We propose a generalized regression neural network (GRNN) approach based on grey relational analysis (GRA) and principal component analysis (PCA) (GP-GRNN) to improve the accuracy of density functional theory (DFT) calculations of homolysis bond dissociation energies (BDE) of the Y-NO bond. As a demonstration, this combination of quantum chemistry calculation with the GP-GRNN approach has been applied to evaluate the homolysis BDE of 92 Y-NO organic molecules. The results show that the full-descriptor GRNN without GRA and PCA (F-GRNN) and the GRNN with GRA (G-GRNN) reduce the root-mean-square (RMS) error of the calculated homolysis BDE of the 92 organic molecules from 5.31 to 0.49 and 0.39 kcal mol⁻¹, respectively, for the B3LYP/6-31G(d) calculation. The newly developed GP-GRNN approach further reduces the RMS error to 0.31 kcal mol⁻¹. Thus, the GP-GRNN correction on top of B3LYP/6-31G(d) can improve the accuracy of calculated homolysis BDE in quantum chemistry and can predict homolysis BDE that cannot be obtained experimentally
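As a rough illustration of the correction idea in the abstract above, a GRNN in its usual textbook form is a Gaussian-kernel-weighted average of training targets. The sketch below learns the residual between a reference value and a DFT value from descriptors and adds it back to a new DFT value. It is a minimal sketch, not the authors' GP-GRNN: the GRA and PCA descriptor-selection steps are omitted, and the function name `grnn_predict`, the smoothing width `sigma`, and all numbers are hypothetical.

```python
import math

def grnn_predict(train_x, train_y, query, sigma=1.0):
    """Textbook GRNN: Gaussian-kernel-weighted average of training targets."""
    weights = []
    for x in train_x:
        # Squared Euclidean distance between a training descriptor and the query.
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # All kernels underflowed; fall back to the mean target.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total

# Hypothetical toy data: one descriptor per molecule and the residual
# (reference BDE minus DFT BDE) that the network is asked to learn.
descriptors = [(0.0,), (1.0,), (2.0,)]
residuals = [1.0, 2.0, 3.0]

# Correct a hypothetical DFT value for a new molecule with descriptor 1.0.
dft_bde = 30.0
corrected_bde = dft_bde + grnn_predict(descriptors, residuals, (1.0,), sigma=0.1)
```

The only tunable parameter in this form is `sigma`; a small value makes the prediction follow the nearest training points, while a large value approaches the mean of all targets.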
Representations of molecules and materials for interpolation of quantum-mechanical simulations via machine learning
Computational study of molecules and materials from first principles is a cornerstone of physics, chemistry and materials science, but limited by the cost of accurate and precise simulations. In settings involving many simulations, machine learning can reduce these costs, sometimes by orders of magnitude, by interpolating between reference simulations. This requires representations that describe any molecule or material and support interpolation. We review, discuss and benchmark state-of-the-art representations and relations between them, including smooth overlap of atomic positions, many-body tensor representation, and symmetry functions. For this, we use a unified mathematical framework based on many-body functions, group averaging and tensor products, and compare energy predictions for organic molecules, binary alloys and Al-Ga-In sesquioxides in numerical experiments controlled for data distribution, regression method and hyper-parameter optimization
Toward Fast and Reliable Potential Energy Surfaces for Metallic Pt Clusters by Hierarchical Delta Neural Networks.
Data-driven machine learning force fields (MLFs) are increasingly popular in atomistic simulations and exploit machine learning methods to predict energies and forces for unknown structures based on the knowledge learned from an existing reference database. The latter usually comes from density functional theory calculations. One main drawback of MLFs is that physical laws are not incorporated in the machine learning models, and instead, MLFs are designed to be very flexible to simulate complex quantum chemistry potential energy surfaces (PES). In general, MLFs have poor transferability, and hence, a very large training set is required to span all the target feature space to get a reliable MLF. This procedure becomes more troublesome when the PES is complicated, with a large number of degrees of freedom, in which case building a large database is inevitable and very expensive, especially when accurate but costly exchange-correlation functionals have to be used. In this manuscript, we exploit a high-dimensional neural network potential (HDNNP) on Pt clusters of sizes from 6 to 20 as one example. Our standard level of energy calculation is DFT GGA (PBE) using a plane wave basis set. We introduce an approximate but fast level with the PBE functional and a minimal atomic orbital basis set, and then, a more accurate but expensive level, using a hybrid functional or nonlocal vdW functional and a plane wave basis set, is reliably predicted by learning the difference with HDNNP. The results show that such a differential approach (named ΔHDNNP) can deliver very accurate predictions (error <10 meV/atom) in reference to converged basis set energies as well as more accurate but expensive xc functionals. The overall speedup can be as large as 900 for a 20 atom Pt cluster.
More importantly, ΔHDNNP shows much better transferability due to the intrinsic smoothness of the delta potential energy surface, and accordingly, one can use a much smaller training set to obtain better accuracy than the conventional HDNNP. A multilayer ΔHDNNP is thus proposed to obtain very accurate predictions versus expensive nonlocal vdW functional calculations in which the required training set is further reduced. The approach can be easily generalized to any other machine learning methods and opens a path to study the structure and dynamics of Pt clusters and nanoparticles
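The differential idea described in this abstract can be sketched in a few lines: fit a model to the difference between an expensive and a cheap energy, then predict the expensive energy as the cheap energy plus the learned difference. The sketch below is only an illustration under stated assumptions: a one-dimensional least-squares line stands in for the HDNNP, and the functions `cheap_energy` and `expensive_energy` are hypothetical toy surfaces, not DFT results.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def cheap_energy(x):
    # Hypothetical stand-in for the fast, approximate level of theory.
    return x ** 2

def expensive_energy(x):
    # Hypothetical stand-in for the accurate, costly level of theory.
    return x ** 2 + 0.5 * x + 0.1

# Train only on the difference between the two levels at a few geometries.
train_x = [0.0, 1.0, 2.0]
deltas = [expensive_energy(x) - cheap_energy(x) for x in train_x]
slope, intercept = fit_line(train_x, deltas)

def delta_predict(x):
    """Cheap energy plus the learned correction."""
    return cheap_energy(x) + slope * x + intercept
```

In this toy case the difference is much simpler (linear) than either energy surface (quadratic), which mirrors the abstract's argument that the smoother delta surface needs fewer training points.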
Unraveling the Reaction Mechanism of Industrial Processes in Zeolite Catalysis: a Quantum Chemical Approach
Even though acidic zeolites form a crucial catalyst for many petrochemical processes, much of their fundamental reactive behavior is only superficially understood. Most often, catalysts are proposed on an 'ad hoc' basis, without a detailed understanding of their functioning on an atomic scale. It can indeed be difficult to identify the elementary steps of complex reaction networks from a purely experimental basis. For these issues, quantum chemical molecular modeling techniques provide an excellent complementary tool to laboratory data. This relatively new field of research has seen an enormous surge in popularity, mainly because of the rapid increase in computer power and the development of sufficiently accurate theoretical methods, which together now make it possible to model complex industrial processes. In this thesis, we use these modeling techniques for a detailed study of elementary reaction steps in zeolite catalysis. This summary gives only a very short overview of the work, and the interested reader is referred to the more elaborate full text or, for even more detail, to the research articles on which it is based, which are also included at the end of each relevant chapter.
In a preparatory chapter, several general terms and methods used throughout the thesis are introduced. First, two fundamental characteristics of zeolites that are vital in industrial catalysis - the topologically induced shape selectivity and the isomorphic substitution leading to a Brønsted acid site - are briefly explained. Then, the practical aspects of quantum chemical modeling of zeolites are discussed, with special attention given to the model space approximations that are necessary for such extended systems. Chemical reactions need to be modeled by computationally very demanding quantum chemical methods if we are to describe the changes in electronic binding pattern appropriately. Different approximations are possible, with an increase in accuracy usually accompanied by an increase in computational cost. Since zeolites are extended materials with a large number of atoms, a complete and accurate quantum chemical description of the entire system is not only extraordinarily demanding but also, at the moment at least, simply not feasible. This issue has, however, recently led to the development of some advanced techniques that do allow an accurate description of at least the chemically active part of the system. Finally, since in this thesis the most important conclusions are based on rate coefficients, the basics of chemical kinetics are also introduced, describing the molecular-scale calculation of macroscopic quantities using transition state theory.
Subsequently one of the most intriguing substantive problems in heterogeneous catalysis is tackled: the reaction mechanism of the methanol-to-olefin process (MTO). First, a whole class of reaction mechanisms, the so-called direct mechanisms, are investigated, for which initial C-C coupling is taken to occur from C1 species only. Earlier theoretical studies tended to be fragmentary, typically investigating only a single reaction step rather than a complete pathway. Nevertheless, the existence of these individual reaction steps was often considered theoretical evidence for the direct proposal, even though no one had succeeded in defining a complete low-energy pathway. To resolve this complex issue, an extensive reaction scheme is presented in this thesis, including all the possible pathways and their constituent elementary reaction steps on a consistent basis. By combining the individual steps, it is demonstrated that the direct mechanism concept cannot explain the initial C-C coupling. Three bottlenecks are identified:
- the instability of ylide and carbene intermediates,
- the extremely slow conversion of a methane/formaldehyde mixture to ethanol, and
- the excessively high energy barriers for concerted C-C coupling steps.
Any alternative proposal, like the up-and-coming 'hydrocarbon pool' hypothesis, needs to provide C-C coupling steps that circumvent these bottlenecks.
The hydrocarbon pool model states that organic species trapped in the zeolite pores serve as building platforms, to which C1 species can attach methyl groups. The methylated species subsequently undergoes specific rearrangements and/or additional methylation steps, to finally split off light olefins. The original molecule is then regenerated by additional methylation steps. This way, the highly activated steps of the direct mechanisms could be bypassed. In this thesis, the initiating methylation (and at the same time C-C coupling) step is investigated. The results shed new light on the role of the zeolite framework in this process, and on how the organic species and the inorganic zeolite cooperate as a supramolecular catalyst. The supramolecular picture is extended here by the explicit inclusion of previously omitted aspects like transition state shape selectivity and electronic stabilization of vital cationic intermediates by the zeolite framework. We should definitely look beyond pure geometrical aspects since electronic embedding plays an equally important role.
Additional insight into the hydrocarbon pool hypothesis is, however, required for a guided optimization of the catalyst. A first step to catalyst improvement has already been made by investigating the effect that small organic groups built into the catalyst might have on the elementary reaction steps. Two such modifications - methylene and amine moieties that are iso-electronic with oxygen - are theoretically investigated here. The methylene moiety is one of the simplest organic groups that fits perfectly as a bridge between two silicon atoms to form the functional Si-CH2-Si group. Even though such mesoporous organosilicate materials have been successfully synthesized before, only recently has a research team been able to synthesize methylene-substituted alumino-silicate zeolites. They failed, though, to explain the observed framework defects, such as the presence of end-standing Si-CH3 groups. In this thesis the influence of the methylene moiety on fundamental adsorption properties is discussed for both neutral probe molecules and charge compensating cations. Additionally, we demonstrate how the combination of aluminum atoms (plus a Brønsted acid proton) with a methylene moiety will inevitably lead to protonation of the organic group and subsequent cleavage of the framework.
For similar amine-functionalized zeolites, this thesis also shows that protonation of the amine group will not necessarily lead to cleavage of the zeolite structure. Furthermore, Si-NH-Si moieties will provide additional basic sites, comparable to traditional Al-O-Si sites but not constrained to the aluminum tetrahedron. This enables more proton locations as well as the possibility of more favorable transition state geometries. This can result in a drastic reduction in energy barrier for those reactions which would otherwise have a highly strained transition state. Summarizing, we demonstrate how small organic modifications to the zeolite framework can have a considerable effect on the fundamental catalytic properties and MTO-related reactivity. However, neither methylene nor amine groups can be located on the aluminum tetrahedron without being automatically protonated, which in the case of methylene-modified zeolites even results in cleavage of the framework.
This thesis shows very clearly how theoretical modeling is capable of providing new insights into zeolite catalysis. The applications presented here are already located near the limits of what is currently feasible, considering computer power, method development and the current lack of insights into the possible supramolecular character of the system. The rapid evolution in this field of research, even within the timescale of this thesis, makes it as good as certain that further significant advances will soon be within reach, and the thesis closes with the identification of our high-priority research goals for the immediate future. Especially in identification of elementary reaction steps and optimization of the catalyst, there are still quite some challenges ahead
Development and application of statistical and quantum mechanical methods for modelling molecular ensembles
The development of new quantum chemical methods requires
extensive benchmarking to establish the accuracy and limitations
of a method. Current benchmarking practices in computational
chemistry use test sets that are subject to human biases and as
such can be fundamentally flawed. This work presents a thorough
benchmark of diffusion Monte Carlo (DMC) for a range of systems
and properties as well as a novel
method for developing new, unbiased test sets using multivariate
statistical techniques. Firstly, the hydrogen abstraction of
methanol is used as a test system to develop a more efficient
protocol that minimises the computational cost of DMC without
compromising accuracy. This protocol is then applied to three
test sets of reaction energies, including 43 radical
stabilisation energies, 14 Diels-Alder reactions and 76 barrier
heights of hydrogen and non-hydrogen transfer reactions. The
average mean absolute error for all three databases is just 0.9
kcal/mol.
The accuracy of the explicitly correlated trial wavefunction used
in DMC is demonstrated using the ionisation potentials and
electron affinities of first- and second-row atoms. A
multi-determinant trial wavefunction reduces the errors for
systems with strong multi-configuration character, as well as
for predominantly single-reference systems. It is shown that the
use of pseudopotentials in place of all-electron basis sets
slightly increases the error for these systems. DMC is then
tested with a set of eighteen challenging reactions.
Incorporating more determinants in the trial wavefunction reduced
the errors for most systems but results are highly dependent on
the active space used in the CISD wavefunction. The accuracy of
multi-determinant DMC for strongly multi-reference systems is
tested for the isomerisation of diazene. In this case no method
was capable of reducing the error of the strongly-correlated
rotational transition state.
Finally, an improved method for selecting test sets is presented
using multivariate statistical techniques. Bias-free test sets
are constructed by selecting archetypes and prototypes based on
numerical representations of molecules. Descriptors based on the
one-, two- and three-dimensional structures of a molecule are
tested. These new test sets are then used to benchmark a number
of methods
Pharmacophore derivation using DISCOtech and comparison of semi-empirical, ab initio and density functional CoMFA studies for sigma 1 and sigma 2 receptor-ligands
This study describes the development of pharmacophore and CoMFA models for sigma receptor ligands. CoMFA studies were performed for 48 bioactive sigma 1 receptor ligands using [3H](+)-pentazocine as the radioligand, for 30 PCP derivatives as sigma 1 receptor ligands using [3H](+)SK-F 10047 as the radioligand, and for 24 bioactive sigma 2 receptor ligands using the radioligand [3H](+)DTG in the presence of pentazocine. Distance Comparisons (DISCOtech) was used as the starting point for the CoMFA studies. The conformers derived by DISCOtech were optimized using AM1 or HF/3-21G* in Gaussian 98. The optimized geometries were aligned with the pharmacophore derived using DISCOtech. Atomic charges were calculated using the AM1, HF/3-21G*, B3LYP/3-21G* and MP2/3-21G* methods in Gaussian 98. The CoMFA maps developed using Sybyl 6.9 were compared on steric and electrostatic field differences. The optimal numbers of components were determined by leave-one-out cross-validation. Using these numbers of optimal components, a non-cross-validated analysis was performed on the training set. Evaluation on a test set showed that CoMFA models derived from HF/3-21G* optimized geometries were more reliable in predicting bioactivities than CoMFA models derived from AM1 optimized geometries