Chemical Properties from Graph Neural Network-Predicted Electron Densities
According to density functional theory, any chemical property can be inferred
from the electron density, making it the most informative attribute of an
atomic structure. In this work, we demonstrate the use of established physical
methods to obtain important chemical properties from model-predicted electron
densities. We introduce graph neural network architectural choices that provide
physically relevant and useful electron density predictions. Despite not
training to predict atomic charges, the model is able to predict atomic charges
with an order of magnitude lower error than a sum of atomic charge densities.
Similarly, the model predicts dipole moments with half the error of the sum of
atomic charge densities method. We demonstrate that larger data sets lead to
more useful predictions in these tasks. These results pave the way for an
alternative path in atomistic machine learning, where data-driven approaches
and existing physical methods are used in tandem to obtain a variety of
chemical properties in an explainable and self-consistent manner.
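As an illustration of the kind of "established physical method" the abstract refers to, a dipole moment can be computed by straightforward numerical quadrature over any electron density sampled on a grid, whether DFT-derived or model-predicted. The sketch below is a generic numpy illustration under assumed array shapes, not code from the paper.

```python
import numpy as np

def dipole_moment(rho, grid, nuclear_charges, nuclear_positions, voxel_volume):
    """Dipole moment (atomic units) from an electron density on a grid.

    mu = sum_i Z_i * R_i - integral r * rho(r) d^3r

    rho:  (N,) electron density at each grid point
    grid: (N, 3) Cartesian coordinates of the grid points
    """
    rho = np.asarray(rho)
    grid = np.asarray(grid)
    # Electrons carry negative charge, hence the minus sign on the integral.
    electronic = -(grid * rho[:, None]).sum(axis=0) * voxel_volume
    nuclear = (np.asarray(nuclear_charges)[:, None]
               * np.asarray(nuclear_positions)).sum(axis=0)
    return nuclear + electronic
```

For example, a unit nuclear charge at the origin with one electron of density displaced along +z yields a dipole pointing in -z, as the formula requires.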
From Molecules to Materials: Pre-training Large Generalizable Models for Atomic Property Prediction
Foundation models have been transformational in machine learning fields such
as natural language processing and computer vision. Similar success in atomic
property prediction has been limited due to the challenges of training
effective models across multiple chemical domains. To address this, we
introduce Joint Multi-domain Pre-training (JMP), a supervised pre-training
strategy that simultaneously trains on multiple datasets from different
chemical domains, treating each dataset as a unique pre-training task within a
multi-task framework. Our combined training dataset consists of 120M
systems from OC20, OC22, ANI-1x, and Transition-1x. We evaluate performance and
generalization by fine-tuning over a diverse set of downstream tasks and
datasets including: QM9, rMD17, MatBench, QMOF, SPICE, and MD22. JMP
demonstrates an average improvement of 59% over training from scratch, and
matches or sets state-of-the-art on 34 out of 40 tasks. Our work highlights the
potential of pre-training strategies that utilize diverse data to advance
property prediction across chemical domains, especially for low-data tasks.
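When pre-training jointly on datasets as differently sized as OC20, OC22, ANI-1x, and Transition-1x, a common design choice in multi-task setups is temperature-based dataset sampling, which keeps small datasets from being drowned out by large ones. The helper below is a generic sketch of that idea, not the authors' implementation, and the temperature value is an illustrative assumption.

```python
import numpy as np

def sampling_weights(dataset_sizes, temperature=2.0):
    """Per-dataset sampling probabilities for multi-dataset pre-training.

    Raising sizes to 1/temperature flattens the distribution:
    temperature=1 reproduces size-proportional sampling, and larger
    values move the probabilities toward uniform.
    """
    sizes = np.asarray(dataset_sizes, dtype=float)
    w = sizes ** (1.0 / temperature)
    return w / w.sum()
```

At each pre-training step one dataset is drawn according to these probabilities and a batch is taken from it, treating each dataset as its own task.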
AdsorbML: A Leap in Efficiency for Adsorption Energy Calculations using Generalizable Machine Learning Potentials
Computational catalysis is playing an increasingly significant role in the
design of catalysts across a wide range of applications. A common task for many
computational methods is the need to accurately compute the adsorption energy
for an adsorbate and a catalyst surface of interest. Traditionally, the
identification of low energy adsorbate-surface configurations relies on
heuristic methods and researcher intuition. As the desire to perform
high-throughput screening increases, it becomes challenging to use heuristics
and intuition alone. In this paper, we demonstrate machine learning potentials
can be leveraged to identify low energy adsorbate-surface configurations more
accurately and efficiently. Our algorithm provides a spectrum of trade-offs
between accuracy and efficiency, with one balanced option finding the lowest
energy configuration 87.36% of the time, while achieving a 2000x speedup in
computation. To standardize benchmarking, we introduce the Open Catalyst Dense
dataset containing nearly 1,000 diverse surfaces and 100,000 unique
configurations.
Comment: 26 pages, 7 figures. Submitted to npj Computational Materials.
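The accuracy-efficiency trade-off described above can be pictured as a single knob k: relax every candidate configuration with the ML potential, then pass only the k lowest-energy candidates to DFT for verification. The function below is a schematic of that selection step; the configuration objects and energy callable are placeholders, not the released AdsorbML code.

```python
def select_candidates(configs, ml_energy, k):
    """Rank adsorbate-surface configurations by ML-predicted energy and
    return the k lowest-energy candidates for DFT verification.

    configs:   iterable of configuration objects (placeholders here)
    ml_energy: callable mapping a configuration to its ML-relaxed energy
    k:         number of candidates verified with DFT; larger k trades
               speed for a higher chance of capturing the true minimum
    """
    ranked = sorted(configs, key=ml_energy)
    return ranked[:k]
```

Because the expensive DFT step runs on only k configurations instead of all of them, most of the reported speedup comes directly from this filtering.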
Machine-Learning Methods Enable Exhaustive Searches for Active Bimetallic Facets and Reveal Active Site Motifs for CO_2 Reduction
Bimetallic catalysts are promising for the most difficult thermal and electrochemical reactions, but modeling the many diverse active sites on polycrystalline samples is an open challenge. We present a general framework for addressing this complexity in a systematic and predictive fashion. Active sites for every stable low-index facet of a bimetallic crystal are enumerated and cataloged, yielding hundreds of possible active sites. The activity of these sites is explored in parallel using a neural-network-based surrogate model to share information between the many density functional theory (DFT) relaxations, resulting in activity estimates with an order of magnitude fewer explicit DFT calculations. Sites with interesting activity were found and provide targets for follow-up calculations. This process was applied to the electrochemical reduction of CO_2 on nickel gallium bimetallics and indicated that most facets had similar activity to Ni surfaces, but a few exposed Ni sites with a very favorable on-top CO configuration. This motif emerged naturally from the predictive modeling and represents a class of intermetallic CO_2 reduction catalysts. These sites rationalize recent experimental reports of nickel gallium activity and why previous materials screens missed this exciting material. Most importantly, these methods suggest that bimetallic catalysts will be discovered by studying facet reactivity and diversity of active sites more systematically.
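Surrogate-assisted searches of this kind typically choose which catalogued site gets the next explicit DFT relaxation by balancing predicted activity against the surrogate's uncertainty. The acquisition rule below is a generic upper-confidence-bound sketch of that loop, not the paper's exact criterion; the beta parameter is an illustrative assumption.

```python
import numpy as np

def next_site_to_compute(pred_activity, pred_uncertainty, already_computed,
                         beta=1.0):
    """Pick the next active site for an explicit DFT relaxation.

    Scores each catalogued site by predicted activity plus beta times
    the surrogate's uncertainty, masks out sites that already have DFT
    results, and returns the index of the best remaining site.
    """
    score = np.asarray(pred_activity, dtype=float) \
        + beta * np.asarray(pred_uncertainty, dtype=float)
    score[list(already_computed)] = -np.inf  # never recompute a site
    return int(np.argmax(score))
```

Each new DFT result is fed back into the surrogate, which is how information is shared between the many relaxations with far fewer explicit calculations.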
Adapting OC20-trained EquiformerV2 Models for High-Entropy Materials
Computational high-throughput studies, especially in research on high-entropy
materials and catalysts, are hampered by high-dimensional composition spaces
and myriad structural microstates. They present bottlenecks to the conventional
use of density functional theory calculations, and consequently, the use of
machine-learned potentials is becoming increasingly prevalent in atomic
structure simulations. In this communication, we show the results of adjusting
and fine-tuning the pretrained EquiformerV2 model from the Open Catalyst
Project to infer adsorption energies of *OH and *O on the out-of-domain
high-entropy alloy Ag-Ir-Pd-Pt-Ru. By applying an energy filter based on the
local environment of the binding site, the zero-shot inference is markedly
improved, and through few-shot fine-tuning the model yields state-of-the-art
accuracy. It is also found that EquiformerV2, assuming the role of a general
machine-learning potential, is able to inform a smaller, more focused direct
inference model. This knowledge distillation setup boosts performance on
complex binding sites. Collectively, this shows that foundational knowledge
learned from ordered intermetallic structures can be extrapolated to the
highly disordered structures of solid solutions. With the vastly accelerated
computational throughput of these models, hitherto infeasible research in the
high-entropy material space is now readily accessible.
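An energy filter keyed to the binding site's local environment can be pictured as per-environment outlier rejection: predicted adsorption energies that deviate strongly from the typical value for the same local motif are discarded before zero-shot results are reported. The sketch below illustrates that idea with an assumed grouping key and cutoff; it is not the authors' filter.

```python
from collections import defaultdict
from statistics import median

def filter_by_environment(predictions, cutoff):
    """Discard predicted adsorption energies that deviate from the
    median of their binding-site environment by more than `cutoff` (eV).

    predictions: list of (environment_key, energy) pairs, where the key
                 encodes the elements surrounding the binding site
    """
    groups = defaultdict(list)
    for env, energy in predictions:
        groups[env].append(energy)
    return [(env, energy) for env, energy in predictions
            if abs(energy - median(groups[env])) <= cutoff]
```

Grouping by local composition matters here because in a five-element solid solution the same adsorbate can see very different binding-site ensembles.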
Heterogeneous Catalysts in Grammar School
The discovery of new catalytically active materials is one of the holy
grails of computational chemistry, as it has the potential to accelerate the
adoption of renewable energy sources and reduce the energy consumption of the
chemical industry. Indeed, heterogeneous catalysts are essential for the
production of synthetic fuels and many commodity chemicals. Consequently,
novel catalysts with higher activity and selectivity, increased
sustainability and longevity, or improved prospects for rejuvenation and
cyclability are needed for a diverse range of processes. Unfortunately,
computational catalyst discovery is a daunting task, among other reasons
because it is often unclear whether a proposed material is stable or
synthesizable. This perspective proposes a new approach to this challenge,
namely the use of generative grammars. We outline how grammars can guide the
search for stable catalysts in a large chemical space and sketch out several
research directions that would make this technology applicable to real
materials.