
    AdsorbML: A Leap in Efficiency for Adsorption Energy Calculations using Generalizable Machine Learning Potentials

    Full text link
    Computational catalysis is playing an increasingly significant role in the design of catalysts across a wide range of applications. A common task for many computational methods is the need to accurately compute the adsorption energy for an adsorbate and a catalyst surface of interest. Traditionally, the identification of low-energy adsorbate-surface configurations relies on heuristic methods and researcher intuition. As the desire to perform high-throughput screening increases, it becomes challenging to rely on heuristics and intuition alone. In this paper, we demonstrate that machine learning potentials can be leveraged to identify low-energy adsorbate-surface configurations more accurately and efficiently. Our algorithm provides a spectrum of trade-offs between accuracy and efficiency, with one balanced option finding the lowest-energy configuration 87.36% of the time while achieving a 2000x speedup in computation. To standardize benchmarking, we introduce the Open Catalyst Dense dataset containing nearly 1,000 diverse surfaces and 100,000 unique configurations. Comment: 26 pages, 7 figures. Submitted to npj Computational Materials.
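
    The screening idea described above can be sketched in a few lines: relax every candidate adsorbate placement with a cheap ML potential, keep only the k lowest-energy results, and spend expensive DFT on that short list. The sketch below assumes hypothetical ml_relax and dft_energy callables standing in for an ML-potential relaxation and a DFT evaluation; it illustrates the idea rather than reproducing the paper's exact AdsorbML algorithm.

        # Relax every candidate placement with a cheap ML potential, keep the
        # k lowest-energy results, and verify only that short list with DFT.
        # `ml_relax` and `dft_energy` are hypothetical stand-ins.
        from typing import Callable, Dict, List, Tuple


        def screen_configurations(
            candidates: List[Dict],
            ml_relax: Callable[[Dict], Tuple[Dict, float]],
            dft_energy: Callable[[Dict], float],
            k: int = 5,
        ) -> Tuple[Dict, float]:
            """Return (configuration, DFT energy) for the best of the top-k ML picks."""
            # Step 1: cheap ML relaxations for all candidate adsorbate placements.
            ml_results = [ml_relax(c) for c in candidates]      # (relaxed_config, E_ml)

            # Step 2: keep the k configurations with the lowest ML energies.
            ml_results.sort(key=lambda pair: pair[1])
            shortlist = [config for config, _ in ml_results[:k]]

            # Step 3: verify the shortlist with DFT and report the best found.
            dft_results = [(config, dft_energy(config)) for config in shortlist]
            return min(dft_results, key=lambda pair: pair[1])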

    Predicting Intermetallic Surface Energies with High-Throughput DFT and Convolutional Neural Networks

    No full text
    The surface energy of inorganic crystals is crucial in understanding experimentally relevant surface properties and is therefore important in designing materials for many applications, including catalysis. Predictive methods and datasets exist for the surface energies of monometallic crystals, but predicting these properties for bimetallic or more complicated surfaces is an open challenge. Here we present a workflow for predicting surface energies ab initio using high-throughput DFT and a machine learning framework. We calculated the surface energies of 3,285 intermetallic alloys spanning combinations of 36 elements and 47 space groups. We used this high-throughput workflow to seed a database of surface energies, which we used to train a crystal graph convolutional neural network (CGCNN). The CGCNN model was able to predict surface energies with a mean absolute test error of 0.0082 eV/Å² and can qualitatively reproduce nanoparticle surface distributions (Wulff constructions). Our workflow provides quantitative insights into which surfaces are more stable and therefore more realistic. It allows us to down-select interesting candidates that we can study with robust theoretical and experimental methods for applications such as catalyst screening and nanomaterials synthesis.
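
    As a point of reference for the quantity being predicted, the sketch below shows the standard slab-based surface energy used in high-throughput workflows of this kind: the excess energy of a slab over the same number of bulk atoms, divided by the two exposed faces. The numbers in the example are made up, and the paper's workflow may use a refined expression (e.g. fitted bulk reference energies).

        def surface_energy(e_slab: float, e_bulk_per_atom: float,
                           n_atoms: int, area: float) -> float:
            """gamma = (E_slab - N * E_bulk_per_atom) / (2 * A), in eV/Angstrom^2.

            e_slab          : total energy of the relaxed slab (eV)
            e_bulk_per_atom : bulk reference energy per atom (eV/atom)
            n_atoms         : number of atoms in the slab
            area            : cross-sectional area of one slab face (Angstrom^2)
            """
            return (e_slab - n_atoms * e_bulk_per_atom) / (2.0 * area)


        # Hypothetical 48-atom slab with a 60 Angstrom^2 cross-section.
        gamma = surface_energy(e_slab=-303.6, e_bulk_per_atom=-6.55, n_atoms=48, area=60.0)
        print(f"surface energy ~ {gamma:.4f} eV/Angstrom^2")   # ~0.0900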

    Multi-fidelity Sequential Learning for Accelerated Materials Discovery

    No full text
    We introduce a new agent-based framework for materials discovery that combines multi-fidelity modeling and sequential learning to lower the number of expensive data acquisitions while maximizing discovery. We demonstrate the framework's capability by simulating a materials discovery campaign using experimental and DFT band gap data. Using these simulations, we determine how different machine learning models and acquisition strategies influence the overall rate of discovery of materials per experiment. The framework demonstrates that including lower fidelity (DFT) data, whether as a priori knowledge or using in-tandem acquisition, increases the discovery rate of materials suitable for solar photoabsorption. We also show that the performance of a given agent depends on data size, model selection, and acquisition strategy. As such, our framework provides a tool that enables materials scientists to test various acquisition and model hyperparameters to maximize the discovery rate of their own multi-fidelity sequential learning campaigns for materials discovery.
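
    A minimal sequential-learning loop of the kind the framework simulates is sketched below: a surrogate model is retrained each round, an acquisition rule ranks the unlabeled candidates, and a fixed batch of "experiments" is acquired per round, with discoveries counted as band gaps inside a solar-absorption window. The random-forest surrogate, uncertainty-based acquisition, and 1.0-1.8 eV window are illustrative assumptions, not the paper's actual agents.

        import numpy as np
        from sklearn.ensemble import RandomForestRegressor


        def sequential_learning(X, y_true, n_init=10, n_rounds=20, batch=3, seed=0):
            """Simulate a discovery campaign; return indices of discovered materials."""
            rng = np.random.default_rng(seed)
            labeled = list(rng.choice(len(X), size=n_init, replace=False))
            discovered = []

            for _ in range(n_rounds):
                # Retrain the surrogate on everything acquired so far.
                model = RandomForestRegressor(n_estimators=200, random_state=seed)
                model.fit(X[labeled], y_true[labeled])

                # Acquisition: rank unlabeled candidates by ensemble spread.
                pool = [i for i in range(len(X)) if i not in labeled]
                preds = np.stack([t.predict(X[pool]) for t in model.estimators_])
                picks = [pool[i] for i in np.argsort(-preds.std(axis=0))[:batch]]

                labeled.extend(picks)
                # Count "discoveries": band gaps inside a solar-absorber window (eV).
                discovered.extend(i for i in picks if 1.0 <= y_true[i] <= 1.8)

            return discovered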

    Open Challenges in Developing Generalizable Large Scale Machine Learning Models for Catalyst Discovery

    Full text link
    The development of machine learned potentials for catalyst discovery has predominantly been focused on very specific chemistries and material compositions. While effective in interpolating between available materials, these approaches struggle to generalize across chemical space. The recent curation of large-scale catalyst datasets has offered the opportunity to build a universal machine learning potential, spanning chemical and composition space. If accomplished, such a potential could accelerate the catalyst discovery process across a variety of applications (CO2 reduction, NH3 production, etc.) without the additional specialized training efforts that are currently required. The release of the Open Catalyst 2020 (OC20) dataset has begun just that, pushing the heterogeneous catalysis and machine learning communities towards building more accurate and robust models. In this perspective, we discuss some of the challenges and findings of recent developments on OC20. We examine the performance of current models across different materials and adsorbates to identify notably underperforming subsets. We then discuss some of the modeling efforts surrounding energy conservation, approaches to finding and evaluating local minima, and augmentation with off-equilibrium data. To complement the community's ongoing developments, we end with an outlook on some of the important challenges that have yet to be thoroughly explored for large-scale catalyst discovery. Comment: submitted to ACS Catalysis.
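
    One of the modeling concerns mentioned above, energy conservation, can be checked directly: for a conservative potential the predicted forces should equal the negative gradient of the predicted energy. The sketch below compares predicted forces against central finite differences of the predicted energy; energy_fn and forces_fn are hypothetical callables wrapping an ML potential, not part of any specific OC20 model API.

        import numpy as np


        def force_consistency_error(energy_fn, forces_fn, positions, eps=1e-4):
            """Max |F_pred - (-dE/dx)| over all atoms and components (eV/Angstrom)."""
            positions = np.asarray(positions, dtype=float)   # shape (n_atoms, 3)
            numeric_forces = np.zeros_like(positions)

            for i in range(positions.shape[0]):
                for j in range(3):
                    plus, minus = positions.copy(), positions.copy()
                    plus[i, j] += eps
                    minus[i, j] -= eps
                    # Central difference of the predicted energy: F = -dE/dx.
                    numeric_forces[i, j] = -(energy_fn(plus) - energy_fn(minus)) / (2 * eps)

            return np.max(np.abs(forces_fn(positions) - numeric_forces))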

    Materials cartography: A forward-looking perspective on materials representation and devising better maps

    No full text
    Machine learning (ML) is gaining popularity as a tool for materials scientists to accelerate computation, automate data analysis, and predict materials properties. The representation of input material features is critical to the accuracy, interpretability, and generalizability of data-driven models for scientific research. In this Perspective, we discuss a few central challenges faced by ML practitioners in developing meaningful representations, including handling the complexity of real-world, industry-relevant materials, combining theory and experimental data sources, and describing scientific phenomena across timescales and length scales. We present several promising directions for future research: devising representations of varied experimental conditions and observations, finding ways to integrate machine learning into laboratory practices, and building multi-scale informatics toolkits to bridge the gaps between atoms, materials, and devices.