Discrete Fourier Transform Improves the Prediction of the Electronic Properties of Molecules in Quantum Machine Learning
High-throughput approximations of quantum mechanics calculations and
combinatorial experiments have been traditionally used to reduce the search
space of possible molecules, drugs and materials. However, the interplay of
structural and chemical degrees of freedom introduces enormous complexity,
which the current state-of-the-art tools are not yet designed to handle. The
availability of large molecular databases generated by quantum mechanics (QM)
computations using first principles opens new avenues for data science to
accelerate the discovery of new compounds. In recent years, models that combine
QM with machine learning (ML), known as QM/ML models, have been successful at
delivering the accuracy of QM at the speed of ML. The goals are to develop a
framework that will accelerate the extraction of knowledge and to get insights
from quantitative process-structure-property-performance relationships hidden
in materials data via a better search of the chemical compound space, and to
infer new materials with targeted properties. In this study, we show that by
integrating well-known signal processing techniques such as discrete Fourier
transform in the QM/ML pipeline, the outcomes can be significantly improved in
some cases. We also show that the spectrogram of a molecule may represent an
interesting molecular visualization tool.
Comment: 4 pages, 3 figures, 2 tables. Accepted to present at 32nd IEEE
Canadian Conference in Electrical Engineering and Computer Scienc
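The abstract above does not say which molecular representation the discrete Fourier transform is applied to, or how. As a minimal sketch, assuming the input is a square matrix descriptor (such as a Coulomb matrix) and that only the spectrum magnitudes are kept as features, the step could look like this. The function name `dft_magnitude_features` is hypothetical and not taken from the paper.

```python
import numpy as np

def dft_magnitude_features(M):
    """Return the flattened magnitude of the 2-D DFT of a matrix descriptor.

    Assumption: M is a real, square molecular descriptor. Phases are
    discarded; whether the authors do the same is not stated in the abstract.
    """
    spectrum = np.fft.fft2(np.asarray(M, dtype=float))
    return np.abs(spectrum).ravel()

# Toy input: a constant 2x2 matrix has all its energy in the zero-frequency bin.
features = dft_magnitude_features(np.ones((2, 2)))
print(features)
```

The resulting vector has the same length as the flattened input, so it can replace or be concatenated with the original descriptor before regression.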
Prediction of the Atomization Energy of Molecules Using Coulomb Matrix and Atomic Composition in a Bayesian Regularized Neural Networks
Exact calculation of electronic properties of molecules is a fundamental step
for the intelligent and rational design of compounds and materials. The intrinsically
graph-like and non-vectorial nature of molecular data generates a unique and
challenging machine learning problem. In this paper we embrace a learning from
scratch approach where the quantum mechanical electronic properties of
molecules are predicted directly from the raw molecular geometry, similar to
some recent works. But, unlike these previous endeavors, our study suggests a
benefit from combining molecular geometry embedded in the Coulomb matrix with
the atomic composition of molecules. Using the new combined features in a
Bayesian regularized neural network, we improve on well-known results
from the literature on the QM7 dataset, reducing the mean absolute error from
3.51 kcal/mol to 3.0 kcal/mol.
Comment: Under review ICANN 201
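A rough illustration of the combined features described above, assuming the standard Coulomb matrix definition (diagonal 0.5 Z_i^2.4, off-diagonal Z_i Z_j / |R_i - R_j|, positions in atomic units) and a simple element-count vector for the composition. The function names, the element list (H, C, N, O, S, the elements found in QM7) and the choice to concatenate the upper triangle with the counts are assumptions for this sketch, not details confirmed by the abstract.

```python
import numpy as np

def coulomb_matrix(Z, R):
    """Coulomb matrix from nuclear charges Z (n,) and positions R (n, 3)."""
    Z = np.asarray(Z, dtype=float)
    R = np.asarray(R, dtype=float)
    dist = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)           # placeholder to avoid dividing by zero
    C = np.outer(Z, Z) / dist             # off-diagonal: Z_i * Z_j / |R_i - R_j|
    np.fill_diagonal(C, 0.5 * Z ** 2.4)   # diagonal: 0.5 * Z_i ** 2.4
    return C

def composition_vector(Z, elements=(1, 6, 7, 8, 16)):
    """Count how many atoms of each listed element the molecule contains."""
    Z = np.asarray(Z)
    return np.array([np.sum(Z == e) for e in elements], dtype=float)

def combined_features(Z, R, elements=(1, 6, 7, 8, 16)):
    """Concatenate the Coulomb matrix upper triangle with the element counts."""
    C = coulomb_matrix(Z, R)
    upper = C[np.triu_indices(len(C))]
    return np.concatenate([upper, composition_vector(Z, elements)])

# Toy example: two hydrogen atoms one length unit apart.
print(combined_features([1, 1], [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]))
```

For a real dataset the matrices would also need padding to a common size and a fixed atom ordering, which this sketch leaves out.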
Unified Representation of Molecules and Crystals for Machine Learning
Accurate simulations of atomistic systems from first principles are limited
by computational cost. In high-throughput settings, machine learning can
potentially reduce these costs significantly by accurately interpolating
between reference calculations. For this, kernel learning approaches crucially
require a single Hilbert space accommodating arbitrary atomistic systems. We
introduce a many-body tensor representation that is invariant to translations,
rotations and nuclear permutations of same elements, unique, differentiable,
can represent molecules and crystals, and is fast to compute. Empirical
evidence is presented for energy prediction errors below 1 kcal/mol for 7k
organic molecules and 5 meV/atom for 11k elpasolite crystals. Applicability is
demonstrated for phase diagrams of Pt-group/transition-metal binary systems.
Comment: Revised version, minor changes throughou
Crystal Structure Representations for Machine Learning Models of Formation Energies
We introduce and evaluate a set of feature vector representations of crystal
structures for machine learning (ML) models of formation energies of solids. ML
models of atomization energies of organic molecules have been successful using
a Coulomb matrix representation of the molecule. We consider three ways to
generalize such representations to periodic systems: (i) a matrix where each
element is related to the Ewald sum of the electrostatic interaction between
two different atoms in the unit cell repeated over the lattice; (ii) an
extended Coulomb-like matrix that takes into account a number of neighboring
unit cells; and (iii) an Ansatz that mimics the periodicity and the basic
features of the elements in the Ewald sum matrix by using a sine function of
the crystal coordinates of the atoms. The representations are compared for a
Laplacian kernel with Manhattan norm, trained to reproduce formation energies
using a data set of 3938 crystal structures obtained from the Materials
Project. For training sets consisting of 3000 crystals, the generalization
error in predicting formation energies of new structures corresponds to (i)
0.49, (ii) 0.64, and (iii) 0.37 eV/atom for the respective representations.
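The Laplacian kernel with Manhattan norm mentioned above is k(x, x') = exp(-||x - x'||_1 / sigma). The sketch below pairs it with kernel ridge regression, which is an assumption here since the abstract does not name the regression method. The function names, the length scale `sigma` and the regularization strength `lam` are illustrative, and the toy data are not from the paper.

```python
import numpy as np

def laplacian_kernel(X, Y, sigma):
    """Laplacian kernel matrix exp(-||x - y||_1 / sigma) between rows of X and Y."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    Y = np.atleast_2d(np.asarray(Y, dtype=float))
    l1 = np.abs(X[:, None, :] - Y[None, :, :]).sum(axis=-1)  # Manhattan distances
    return np.exp(-l1 / sigma)

def fit_krr(X, y, sigma, lam=1e-8):
    """Solve (K + lam * I) alpha = y for the regression weights alpha."""
    K = laplacian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(K)), np.asarray(y, dtype=float))

def predict_krr(X_train, alpha, X_new, sigma):
    """Predict targets for X_new as kernel-weighted sums over training points."""
    return laplacian_kernel(X_new, X_train, sigma) @ alpha

# Toy data: three one-dimensional "descriptors" with made-up target values.
X = np.array([[0.0], [1.0], [2.0]])
y = np.array([0.0, 1.0, 4.0])
alpha = fit_krr(X, y, sigma=1.0)
print(predict_krr(X, alpha, np.array([[1.5]]), sigma=1.0))
```

In practice `sigma` and `lam` would be chosen by cross-validation, and each row of `X` would be a flattened crystal descriptor rather than a single number.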
SchNet: A continuous-filter convolutional neural network for modeling quantum interactions
Deep learning has the potential to revolutionize quantum chemistry as it is
ideally suited to learn representations for structured data and speed up the
exploration of chemical space. While convolutional neural networks have proven
to be the first choice for images, audio and video data, the atoms in molecules
are not restricted to a grid. Instead, their precise locations contain
essential physical information that would be lost if discretized. Thus, we
propose to use continuous-filter convolutional layers to be able to model local
correlations without requiring the data to lie on a grid. We apply those layers
in SchNet: a novel deep learning architecture modeling quantum interactions in
molecules. We obtain a joint model for the total energy and interatomic forces
that follows fundamental quantum-chemical principles. This includes
rotationally invariant energy predictions and a smooth, differentiable
potential energy surface. Our architecture achieves state-of-the-art
performance for benchmarks of equilibrium molecules and molecular dynamics
trajectories. Finally, we introduce a more challenging benchmark with chemical
and structural variations that suggests the path for further work.
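To make the idea of a continuous-filter convolution concrete, here is a simplified NumPy sketch: each atom's features are updated by summing its neighbors' features, weighted element-wise by a filter generated from the interatomic distance. Several choices are assumptions made for brevity rather than the published architecture: the filter generator is a single linear map over Gaussian radial basis functions, self-interaction is excluded, and the names `rbf_expand`, `cfconv`, `W_filter`, `centers` and `gamma` are hypothetical.

```python
import numpy as np

def rbf_expand(dist, centers, gamma=10.0):
    """Expand distances (n, n) into Gaussian radial basis features (n, n, K)."""
    return np.exp(-gamma * (dist[..., None] - centers) ** 2)

def cfconv(x, R, W_filter, centers, gamma=10.0):
    """Continuous-filter convolution over atoms.

    x: atom features (n, F); R: positions (n, 3); W_filter: (K, F) linear
    filter generator (a simplification of a learned filter network).
    """
    R = np.asarray(R, dtype=float)
    dist = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)  # (n, n)
    filters = rbf_expand(dist, centers, gamma) @ W_filter          # (n, n, F)
    mask = 1.0 - np.eye(len(R))          # assumption: skip self-interaction
    # out[i, f] = sum over j != i of filters[i, j, f] * x[j, f]
    return np.einsum("ij,ijf,jf->if", mask, filters, x)

rng = np.random.default_rng(0)
R = rng.normal(size=(4, 3))              # four atoms at random positions
x = rng.normal(size=(4, 5))              # five features per atom
centers = np.linspace(0.0, 5.0, 8)       # eight radial basis centers
W = rng.normal(size=(8, 5))
print(cfconv(x, R, W, centers).shape)
```

Because the filters depend only on interatomic distances, rotating or translating all positions leaves the output unchanged, which is consistent with the rotational invariance the abstract describes.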
Representations of molecules and materials for interpolation of quantum-mechanical simulations via machine learning
Computational study of molecules and materials from first principles is a cornerstone of physics, chemistry and materials science, but is limited by the cost of accurate and precise simulations. In settings involving many simulations, machine learning can reduce these costs, sometimes by orders of magnitude, by interpolating between reference simulations. This requires representations that describe any molecule or material and support interpolation. We review, discuss and benchmark state-of-the-art representations and relations between them, including smooth overlap of atomic positions, many-body tensor representation, and symmetry functions. For this, we use a unified mathematical framework based on many-body functions, group averaging and tensor products, and compare energy predictions for organic molecules, binary alloys and Al-Ga-In sesquioxides in numerical experiments controlled for data distribution, regression method and hyper-parameter optimization.