89 research outputs found
Machine Learning, Quantum Mechanics, and Chemical Compound Space
We review recent studies dealing with the generation of machine learning
models of molecular and solid properties. The models are trained and validated
using standard quantum chemistry results obtained for organic molecules and
materials selected from chemical space at random
Improving the Accuracy of Density Functional Theory (DFT) Calculation for Homolysis Bond Dissociation Energies of Y-NO Bond: Generalized Regression Neural Network Based on Grey Relational Analysis and Principal Component Analysis
We propose a generalized regression neural network (GRNN) approach based on grey relational analysis (GRA) and principal component analysis (PCA) (GP-GRNN) to improve the accuracy of density functional theory (DFT) calculations of homolysis bond dissociation energies (BDE) of the Y-NO bond. As a demonstration, this combination of quantum chemistry calculation and the GP-GRNN approach has been applied to evaluate the homolysis BDE of 92 Y-NO organic molecules. The results show that the full-descriptor GRNN without GRA and PCA (F-GRNN) and the GRNN with GRA (G-GRNN) reduce the root-mean-square (RMS) error of the calculated homolysis BDE of the 92 organic molecules from 5.31 to 0.49 and 0.39 kcal mol−1, respectively, for the B3LYP/6-31G(d) calculation. The newly developed GP-GRNN approach further reduces the RMS error to 0.31 kcal mol−1. Thus, the GP-GRNN correction on top of B3LYP/6-31G(d) can improve the accuracy of calculated homolysis BDE in quantum chemistry and can predict homolysis BDE that cannot be obtained experimentally
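As an illustration of the idea in the abstract above, a GRNN can be written as a Gaussian-weighted average of training targets (Nadaraya-Watson regression) and used to predict the error of a DFT value. This is a minimal sketch, not the authors' GP-GRNN: the GRA and PCA descriptor-selection steps are omitted, and the descriptors, error values, `sigma`, and the function names are hypothetical placeholders.

```python
import math

def grnn_predict(x, train_x, train_y, sigma=0.5):
    """GRNN estimate: Gaussian-kernel weighted mean of the training targets."""
    weights = []
    for xi in train_x:
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))  # squared Euclidean distance
        weights.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # All kernels underflowed; fall back to the plain mean of the targets.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total

# Made-up placeholder data: two descriptors per molecule and the
# corresponding error (experiment minus DFT) in kcal/mol.
descriptors = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]]
dft_errors = [4.0, -2.0, 6.0]

def corrected_bde(dft_bde, x, sigma=0.5):
    """DFT bond dissociation energy plus the GRNN-predicted correction."""
    return dft_bde + grnn_predict(x, descriptors, dft_errors, sigma)
```

With a small `sigma` the prediction at a training point reproduces that point's error; with a very large `sigma` it tends to the mean of all training errors, which is why the smoothing width is the key parameter to tune in this kind of model.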
The X1 method for accurate and efficient prediction of heats of formation
We propose the X1 method which combines the density functional theory method with a neural network (NN) correction for an accurate yet efficient prediction of heats of formation. It calculates the final energy by using B3LYP/6-311+G(3df,2p) at the B3LYP/6-311+G(d,p) optimized geometry to obtain the B3LYP standard heats of formation at 298 K with the unscaled zero-point energy and thermal corrections at the latter basis set. The NN parameters cover 15 elements of H, Li, Be, B, C, N, O, F, Na, Mg, Al, Si, P, S, and Cl. The performance of X1 is close to the Gn theories, giving a mean absolute deviation of 1.43 kcal/mol for the G3/99 set of 223 molecules up to 10 nonhydrogen atoms and 1.48 kcal/mol for the X1/07 set of 393 molecules up to 32 nonhydrogen atoms
Extending the reliability and applicability of B3LYP
B3LYP is by far the most popular density functional in chemistry. Nevertheless, there is growing evidence that B3LYP (1) degrades as the system becomes larger, (2) underestimates reaction barrier heights, (3) yields bond dissociation enthalpies that are too low, (4) gives improper isomer energy differences, and (5) fails to bind van der Waals systems, etc. NSFC [20525311, 20923004, 10774126, 20973138]; Ministry of Science and Technology [2007CB815206
Improving the B3LYP bond energies by using the X1 method
Recently, we proposed the X1 method, which combines the density functional theory method (B3LYP) with a neural network correction for an accurate yet efficient prediction of heats of formation [J. M. Wu and X. Xu, J. Chem. Phys. 127, 214105 (2007)]. In the present work, we examine the performance of X1 in calculating bond energies. We use 32 radicals and 115 molecules to set up 142 bond dissociation reactions. For the total of 147 heats of formation and 142 bond energies, B3LYP leads to mean absolute deviations of 4.54 and 6.26 kcal/mol, respectively, while X1 reduces the corresponding errors to 1.41 and 2.45 kcal/mol. (C) 2008 American Institute of Physics. [DOI: 10.1063/1.2998231] NSFC [20525311, 20533030, 20423002, 10774126]; Ministry of Science and Technology [2007CB815206, 2004CB719902
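The two quantities reported in the abstract above can be sketched in a few lines, assuming the bond energy is taken as the reaction enthalpy of A-B → A + B computed from heats of formation, and that the error measure is the mean absolute deviation from reference values. The function names and all numbers below are hypothetical placeholders, not data from the paper.

```python
def bond_dissociation_energy(hf_parent, hf_fragment_a, hf_fragment_b):
    """Reaction enthalpy of A-B -> A + B from heats of formation (same units)."""
    return hf_fragment_a + hf_fragment_b - hf_parent

def mean_absolute_deviation(predicted, reference):
    """Average of |predicted - reference| over paired values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)

# Placeholder heats of formation in kcal/mol (illustrative only).
bde = bond_dissociation_energy(hf_parent=-20.0, hf_fragment_a=35.0, hf_fragment_b=50.0)

# Placeholder predicted and reference bond energies in kcal/mol.
mad = mean_absolute_deviation([100.0, 90.0], [105.0, 88.0])
```

Because a bond energy combines three heats of formation, errors in the individual values can add up or cancel, which is consistent with the abstract reporting separate deviations for heats of formation and bond energies.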
MLatom 3: Platform for machine learning-enhanced computational chemistry simulations and workflows
Machine learning (ML) is increasingly becoming a common tool in computational
chemistry. At the same time, the rapid development of ML methods requires a
flexible software framework for designing custom workflows. MLatom 3 is a
program package designed to leverage the power of ML to enhance typical
computational chemistry simulations and to create complex workflows. This
open-source package provides plenty of choice to the users who can run
simulations with the command line options, input files, or with scripts using
MLatom as a Python package, both on their computers and on the online XACS
cloud computing at XACScloud.com. Computational chemists can calculate energies
and thermochemical properties, optimize geometries, run molecular and quantum
dynamics, and simulate (ro)vibrational, one-photon UV/vis absorption, and
two-photon absorption spectra with ML, quantum mechanical, and combined models.
The users can choose from an extensive library of methods containing
pre-trained ML models and quantum mechanical approximations such as AIQM1
approaching coupled-cluster accuracy. The developers can build their own models
using various ML algorithms. The great flexibility of MLatom is largely due to
the extensive use of the interfaces to many state-of-the-art software packages
and libraries
Host–guest interactions in framework materials: Insight from modeling
The performance of metal–organic and covalent organic framework materials in sought-after applications (capture, storage, and delivery of gases and molecules, and separation of their mixtures) heavily depends on the host–guest interactions established inside the pores of these materials. Computational modeling provides information about the structures of these host–guest complexes and the strength and nature of the interactions present at a level of detail and precision that is often unobtainable from experiment. In this Review, we summarize the key simulation techniques, spanning from molecular dynamics and Monte Carlo methods to correlated ab initio approaches and energy, density, and wavefunction partitioning schemes. We provide illustrative literature examples of their uses in analyzing and designing organic framework hosts. We also describe modern approaches to the high-throughput screening of thousands of existing and hypothetical metal–organic frameworks (MOFs) and covalent organic frameworks (COFs) and emerging machine learning techniques for predicting their properties and performances. Finally, we discuss the key methodological challenges on the path toward computation-driven design and reliable prediction of high-performing MOF and COF adsorbents and catalysts and suggest possible solutions and future directions in this exciting field of computational materials science
Determination of thermodynamic properties of fatty acid esterification reactions from molecular modeling
The esterification reactions of fatty acids, in the context of biodiesel production, are a relevant
chemical system to study, given the importance of biodiesel as an alternative,
renewable and low-polluting fuel with an efficiency comparable to that of petroleum
diesel. Thermodynamic studies of these reactions are scarce in the literature. Moreover,
thermochemical data for the molecules most commonly found in raw materials for biodiesel
production are equally scarce. Molecular modeling allows the accurate calculation of
thermochemical quantities of molecules in general, but there is a trade-off between accuracy
and computational cost: the higher the accuracy, the greater the cost. For molecules with many
atoms, like most fatty acids, accurate calculations require long computation times, and
sometimes the calculation is not even possible because of the lack of computer resources. In
order to overcome these difficulties, this work presents some semi-empirical techniques for
calculating enthalpies of formation in the gas phase for large molecules, such as fatty acids and
esters. These techniques consist of defining parametric models that fit well-known experimental
data of the proposed set of molecules, which are the fatty acids and the methyl esters formed
from the esterification reaction in methanol. These models add corrections to the
enthalpies of formation calculated with the B3LYP/6-311+G(d,p) model. The correction parameters, such
as the number of carbons, hydrogens and double bonds in each molecule, are chosen to account
for the growth of the systematic errors of the B3LYP method as the size of the
simulated molecules increases. The models were fitted to the available experimental data by
applying two different techniques: least squares method and neural networks. With the
corrected enthalpies of formation, we can calculate the Gibbs free energy and the equilibrium
constant of the reactions, to determine the information about the viability and the energetic
conditions required by such reactions. The proposed correction models significantly decreased
the deviation between the experimental data and the B3LYP-calculated enthalpies of formation,
and the required precision of 1 kcal mol-1 was achieved. Thus, the application of these models
enabled the accurate calculation of enthalpies of formation at reasonable computational cost.
The neural network correction yielded enthalpies of formation
with higher precision than the least squares correction. Using the
corrected enthalpies of formation, we verified the expected behavior of
esterification reactions for biodiesel production, namely that the reaction is favored by a
temperature increase. In addition, for a better description of the esterification reaction, the SMD
solvation method with the M06-2X/cc-pVTZ model was used to simulate the reaction
conditions in solution, with the reagent methanol as the solvent, since it is usually used in excess
in esterification reactions to shift the equilibrium towards ester formation. The
application of the proposed model showed that the esterification of acetic acid is not favored
by the temperature increase but it is favored by the methanol excess. In a general sense, the
molecular modeling proved to be an important tool, and in spite of the limitations of the
available computational resources, it provided, together with semi-empirical correction
techniques, reliable results regarding the studied systems.
Thesis (Doctorate)
The esterification reactions of fatty acids, in the context of the production of biodiesel esters,
are a relevant chemical system to study, given the importance of biodiesel as an
alternative, renewable, low-polluting fuel with an efficiency comparable to that of petroleum
diesel. Studies of these reactions from a thermodynamic point of view are scarce in the
literature. Thermochemical data for the molecules most commonly found in the raw materials
for biodiesel production are equally scarce. Molecular modeling allows
thermochemical quantities of molecules in general to be calculated with relative accuracy; however, there is
always a trade-off between the accuracy of the calculation and the computational cost it
requires: the higher the accuracy, the greater the cost. For molecules with many atoms, like
most fatty acids, accuracy requires considerable computation time, and sometimes the calculation
is not even possible for lack of computer processing capacity. To overcome
these difficulties, this work presents some semi-empirical techniques for calculating
gas-phase enthalpies of formation of fatty acids and esters. These techniques consist of
defining parametric models that fit known experimental data for the set of
molecules studied, which are the fatty acids and the methyl esters formed from the esterification
reaction in methanol. These models aim to add corrections to the enthalpies of formation
calculated with the B3LYP/6-311+G(d,p) model by including parameters, such as the number of
carbons, hydrogens and double bonds in the molecules, which were chosen to account
for the growth of the systematic errors of the B3LYP method as the size
of the simulated molecules increases. The models were fitted to the available experimental data by
two different techniques: least squares and neural networks. From the corrected enthalpies of
formation, we calculate the Gibbs free energy and the equilibrium constant of the reactions to
determine the viability of such reactions and the energetic conditions they
require. The proposed correction models, both those based on the least squares method
and those based on neural networks, significantly decreased the deviation between the experimental
enthalpy of formation and the value calculated by the B3LYP method, reaching a precision of the order of 1 kcal
mol-1. Thus, with these models it is possible to predict the enthalpies of formation
of the molecules of interest with high precision and at reasonable computational cost. The
neural network correction method yielded enthalpies of formation with
higher precision than the least squares correction method. Using these corrected
thermodynamic data, we verified that the esterification of fatty acids
for biodiesel production is favored by an increase in temperature, which is the
behavior observed experimentally in the literature. In addition, for a better
description of the esterification reaction, the SMD solvation method was applied together
with the M06-2X/cc-pVTZ model to simulate the reaction conditions in solution, with
the reagent methanol as the solvent, since it is usually used in excess when running the reactions to
shift the equilibrium towards ester formation. The application of the proposed model
showed that the esterification of acetic acid is not favored by an increase in
temperature, and that the equilibrium constants, which are comparatively larger for the reaction
in excess methanol, indicate that in this case the equilibrium is shifted further towards
product formation, as expected. In general, molecular modeling proved to be
an important tool and, despite the limitations of the available computational resources,
it provided, together with the semi-empirical correction techniques, reliable results
regarding the systems studied
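The step from corrected enthalpies to reaction viability described in the thesis abstract above can be sketched with the standard relations ΔG = ΔH − TΔS and K = exp(−ΔG/RT). This is a minimal sketch that assumes ΔH and ΔS are independent of temperature; the reaction values and function names are hypothetical placeholders, not results from the thesis.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def gibbs_energy(delta_h, delta_s, temperature):
    """Reaction Gibbs energy (kcal/mol) from enthalpy and entropy changes."""
    return delta_h - temperature * delta_s

def equilibrium_constant(delta_g, temperature):
    """Equilibrium constant K = exp(-dG / RT) for dG in kcal/mol, T in K."""
    return math.exp(-delta_g / (R_KCAL * temperature))

# Placeholder endothermic reaction: dH = +2.0 kcal/mol, dS = +0.008 kcal/(mol K).
delta_h, delta_s = 2.0, 0.008

k_298 = equilibrium_constant(gibbs_energy(delta_h, delta_s, 298.15), 298.15)
k_350 = equilibrium_constant(gibbs_energy(delta_h, delta_s, 350.0), 350.0)
```

For an endothermic reaction such as this placeholder, `k_350` comes out larger than `k_298`, which is the "favored by a temperature increase" behavior the abstract refers to; an exothermic reaction would show the opposite trend.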
PHYSICOCHEMICAL, SPECTROSCOPIC PROPERTIES, AND DIFFUSION MECHANISMS OF SMALL HYDROCARBON MOLECULES IN MOF-74-MG/ZN: A QUANTUM CHEMICAL INVESTIGATION
In petroleum refining industries, the fracturing process allows for the cracking of long-chain hydrocarbons into a mixture of small olefin and paraffin molecules that are then separated via the energetically and monetarily demanding cryogenic distillation process. In an attempt to reduce both energy and capital consumption, selective sorption of light hydrocarbons by tunable sorbents, such as metal-organic frameworks (MOFs), appears to be the most promising alternative for a more efficient gas separation process. MOFs are novel porous materials assembled from inorganic bricks connected by organic linkers. From a crystal engineering standpoint, MOFs are advantageous in creating a range of microporous (0.2–2.0 nm) to mesoporous (>50 nm) void cavities, presenting unique opportunities for the functionalization of both the organic linkers and the void. Of significant importance is the MOF-74-M family (M = metal), characterized by a high density of open metal sites, that is, metal centers that are not fully coordinated. This family of MOFs is also known as CPO-27-M. MOF-74 materials have demonstrated more separation potential than other known MOFs and zeolites. Density functional theory (DFT), as implemented within a linear combination of atomic orbitals (LCAO) approach, has been used to investigate the selective sorption of C1-C4 hydrocarbons in MOF-74-Mg/Zn. The study was first carried out with a molecular cluster approach, and later with periodic boundary conditions (PBC). While both modeling approaches agree in showing significant differences in binding energies between olefins and paraffins adsorbed at the MOFs’ open metal sites, results obtained at the molecular cluster level are underestimated compared to those obtained at the PBC level. The use of PBC models allows the binding energies to be corrected for basis set superposition error (BSSE), molecular lateral interaction (LI), zero-point energy (ZPE), and thermal energy (TE) contributions. 
As such, results obtained at the PBC level are directly comparable to experimental calorimetric values (i.e., heats of adsorption). This work discusses, for the first time, the origin of the fictitious agreement between binding energies obtained with molecular clusters and experimental heats of adsorption, attributing it to a compensation of errors. Spectroscopy studies based on the intensities and frequency shifts with respect to the molecules in the gas phase are presented as a further investigation of the interaction of the small hydrocarbons (C1-C2) with the open metal sites in MOF-74-Mg. In an attempt to provide a more comprehensive description of the behavior of the hydrocarbon molecules, results from diffusion mechanism studies are also presented. The investigations of the diffusion mechanisms are based on climbing-image nudged elastic band (CI-NEB) simulations, coupled with a van der Waals functional (vdW-DF) and ultra-soft pseudopotentials as implemented within the plane-wave (PW) DFT approach. The CI-NEB studies showed that diffusion within and along the cavity of MOF-74-Mg is energetically more favorable for paraffin molecules than for their olefin counterparts
Toward Fast and Reliable Potential Energy Surfaces for Metallic Pt Clusters by Hierarchical Delta Neural Networks.
Data-driven machine learning force fields (MLFs) are more and more popular in atomistic simulations and exploit machine learning methods to predict energies and forces for unknown structures based on the knowledge learned from an existing reference database. The latter usually comes from density functional theory calculations. One main drawback of MLFs is that physical laws are not incorporated in the machine learning models, and instead, MLFs are designed to be very flexible to simulate complex quantum chemistry potential energy surface (PES). In general, MLFs have poor transferability, and hence, a very large trainset is required to span all the target feature space to get a reliable MLF. This procedure becomes more troublesome when the PES is complicated, with a large number of degrees of freedom, in which building a large database is inevitable and very expensive, especially when accurate but costly exchange-correlation functionals have to be used. In this manuscript, we exploit a high-dimensional neural network potential (HDNNP) on Pt clusters of sizes from 6 to 20 as one example. Our standard level of energy calculation is DFT GGA (PBE) using a plane wave basis set. We introduce an approximate but fast level with the PBE functional and a minimal atomic orbital basis set, and then, a more accurate but expensive level, using a hybrid functional or nonlocal vdW functional and a plane wave basis set, is reliably predicted by learning the difference with HDNNP. The results show that such a differential approach (named ΔHDNNP) can deliver very accurate predictions (error <10 meV/atom) in reference to converged basis set energies as well as more accurate but expensive xc functionals. The overall speedup can be as large as 900 for a 20 atom Pt cluster. 
More importantly, ΔHDNNP shows much better transferability due to the intrinsic smoothness of the delta potential energy surface, and accordingly, one can use much smaller trainset data to obtain better accuracy than the conventional HDNNP. A multilayer ΔHDNNP is thus proposed to obtain very accurate predictions versus expensive nonlocal vdW functional calculations in which the required trainset is further reduced. The approach can be easily generalized to any other machine learning methods and opens a path to study the structure and dynamics of Pt clusters and nanoparticles
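The layered delta idea in the abstract above can be sketched as a cheap energy plus two learned differences, one from the cheap to the standard level and one from the standard to the expensive level. This is only a toy sketch: a one-dimensional least-squares line stands in for the neural network potential, cluster size is used as a hypothetical single feature, and all energies are made-up placeholders.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns a predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("need at least two distinct feature values")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

# Placeholder training data: one feature per structure and three energy levels.
features = [6.0, 10.0, 14.0, 20.0]
e_cheap = [-30.0, -52.0, -74.0, -107.0]      # fast, approximate level
e_standard = [-30.6, -53.0, -75.4, -109.0]   # standard level
e_expensive = [-30.3, -52.5, -74.7, -108.0]  # accurate, costly level

# Layer 1 learns standard minus cheap; layer 2 learns expensive minus standard.
delta_1 = fit_line(features, [s - c for s, c in zip(e_standard, e_cheap)])
delta_2 = fit_line(features, [e - s for e, s in zip(e_expensive, e_standard)])

def predict_expensive(feature, cheap_energy):
    """Cheap energy plus both learned corrections."""
    return cheap_energy + delta_1(feature) + delta_2(feature)
```

Each layer only has to model a difference between two levels rather than a full energy surface, so in principle the second layer can be trained on fewer expensive reference calculations than the first.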