New methods for carbon dioxide removal are urgently needed to combat global
climate change. Direct air capture (DAC) is an emerging technology to capture
carbon dioxide directly from ambient air. Metal-organic frameworks (MOFs) have
been widely studied as potentially customizable adsorbents for DAC. However,
discovering promising MOF sorbents for DAC is challenging because of the vast
chemical space to explore and the need to understand how materials behave as
functions of humidity and temperature. We explore a computational approach that benefits from
recent innovations in machine learning (ML) and present a dataset named Open
DAC 2023 (ODAC23) consisting of more than 38M density functional theory (DFT)
calculations on more than 8,400 MOF materials containing adsorbed CO2 and/or
H2O. ODAC23 is by far the largest dataset of MOF adsorption calculations at
the DFT level of accuracy currently available. In addition to probing
properties of adsorbed molecules, the dataset is a rich source of information
on structural relaxation of MOFs, which will be useful in many contexts beyond
specific applications for DAC. A large number of MOFs with promising properties
for DAC are identified directly in ODAC23. We also trained state-of-the-art ML
models on this dataset to approximate calculations at the DFT level. This
open-source dataset and our initial ML models will provide an important
baseline for future efforts to identify MOFs for a wide range of applications,
including DAC.