465 research outputs found

    Tailoring complexity for catalyst discovery using physically motivated machine learning

    Big-Data Science in Porous Materials: Materials Genomics and Machine Learning

    By combining metal nodes with organic linkers, we can potentially synthesize millions of possible metal-organic frameworks (MOFs). At present, we have libraries of over ten thousand synthesized materials and millions of in-silico predicted materials. The fact that we have so many materials opens many exciting avenues to tailor-make a material that is optimal for a given application. However, from an experimental and computational point of view, we simply have too many materials to screen using brute-force techniques. In this review, we show that having so many materials allows us to use big-data methods as a powerful technique to study these materials and to discover complex correlations. The first part of the review gives an introduction to the principles of big-data science. We emphasize the importance of data collection, methods to augment small data sets, and how to select appropriate training sets. An important part of this review is the different approaches that are used to represent these materials in feature space. The review also includes a general overview of the different machine learning (ML) techniques, but as most applications in porous materials use supervised ML, our review is focused on the different approaches for supervised ML. In particular, we review the different methods to optimize the ML process and how to quantify the performance of the different methods. In the second part, we review how the different approaches of ML have been applied to porous materials. In particular, we discuss applications in the field of gas storage and separation, the stability of these materials, their electronic properties, and their synthesis. The range of topics illustrates the large variety of problems that can be studied with big-data science. Given the increasing interest of the scientific community in ML, we expect this list to rapidly expand in the coming years. Comment: Editorial changes (typos fixed, minor adjustments to figures

    Development of predictive models for catalyst development

    Abstract. This work was done as part of the BioSPRINT project, which aims to improve biorefinery operations through process intensification and to replace fossil-based polymers with new bio-based products. The goal was to identify machine learning (ML) models that accelerate catalyst identification with high-throughput (HTP) screening methods, identify non-obvious formulations, and allow catalyst tuning for different feedstock compositions. Maximum activity for the conversion of complex sugar mixtures, with optimal selectivity towards the key products of interest, is desired. In the literature part of the thesis, ML was studied in general, with the focus on different variable selection methods and modeling techniques, more specifically on data-driven modeling. Furthermore, modeling in catalysis was discussed with a focus on ML in catalysis. Catalyst screening and selection, descriptor modeling and selection, and predictive modeling in catalysis were studied. In the experimental part, the focus was on developing ML models that predict catalyst performance with relevant descriptors. A dataset for the hydrogenation of 5-ethoxymethylfurfural with simple bimetallic catalysts, including main metals and promoters, was used as ML model input, with the addition of catalyst descriptors found in the literature. Four different responses were used in the experiments: selectivity and conversion, each with two different solvents. The methods used in the experimental part were discussed in detail, covering data collection, preprocessing, variable selection, modeling, and model validation. Reference models without variable selection were identified first. Secondly, regularization algorithms were used to identify models. Finally, models were identified using the variable subsets obtained with the regularization algorithms. The effect of cross-validation was also studied. In general, good modeling results were obtained with boosted ensemble tree methods, support vector machine (SVM) methods, and Gaussian process regression (GPR) methods. Lasso regression turned out to be the best variable selection method. Good results were obtained with the descriptors found in the literature. It was also shown that fairly good results can be obtained with only two variables in the studied case. The variable selection algorithms ranked promoter variables as far less important than main-metal variables. Even though the modeling results were good, the variable selection methods were almost purely data-driven, and the actual relevance of the variables cannot be guaranteed. In future work, optimization should be studied with the goal of finding catalysts that maximize catalyst performance values based on the model predictions. The extrapolation capabilities of the models also need to be studied and improved. The studied methods can easily be applied to other, similar datasets. In the BioSPRINT project, experimental results related to the dehydration reaction of C5 and C6 sugars with simple metal catalysts will be obtained and used with the studied methods.
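    The abstract above reports that Lasso regression worked best for variable selection and that two variables were enough for fairly good results. The following is a minimal sketch of how Lasso selects variables, not the thesis's implementation: it uses a hand-written coordinate-descent solver and a synthetic, noise-free two-level factorial dataset. The feature indices, coefficients, and the penalty `alpha` are all assumptions chosen for illustration.

```python
from itertools import product


def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_sweeps=100):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic updates."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(w[k] * X[i][k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n)
    return w


# Synthetic data (assumption): four two-level features in a full factorial
# design. The response depends strongly on features 0 and 1, weakly on
# feature 2, and not at all on feature 3.
X = [list(row) for row in product((-1.0, 1.0), repeat=4)]
y = [3.0 * r[0] - 2.0 * r[1] + 0.2 * r[2] for r in X]

weights = lasso_coordinate_descent(X, y, alpha=0.5)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("weights:", weights)
print("selected feature indices:", selected)
```

    With this penalty the solver keeps only features 0 and 1 and sets the weak and irrelevant features exactly to zero, which is the behaviour that makes Lasso usable as a variable selection step. As the abstract cautions, such a selection is purely data-driven and does not by itself establish physical relevance.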

    Understanding the Interaction of CO and O2 with MgO(001) and Supported Metal Atoms: Towards Single-Atom Catalysis

    This thesis contributes to the fundamental understanding of the interactions of a single gold atom supported on a defective and a defect-free MgO(001) surface in a mixed CO/O2 atmosphere. Using cluster models and point-charge embedding within a density functional theory framework, the CO oxidation reaction on a single gold atom is simulated at differently charged oxygen vacancies of MgO(001) to rationalise its experimentally observed lack of catalytic activity. The results show that only the F0 colour centre promotes electron redistribution towards an adsorbed oxygen molecule and weakens the oxygen bond sufficiently, as required for a sustainable catalytic cycle. The moderate adsorption energy of the gold atom, however, cannot prevent the insertion of an oxygen atom into the vacancy, which remains after the formation of the first CO2 molecule. The surface is invariably repaired, which shifts the focus to the chemistry on a defect-free MgO(001) surface. To contribute to the field of heterogeneous single-atom catalysis, various analysis tools are used to shed light on the bonding of supported group 11 metal atoms to the defect-free substrate and to both CO and O2 molecules. Cooperative effects are found to enhance the stability of CO upon co-adsorption with O2 for all three metal centres. The results give further insight into the lack of catalytic activity for CO oxidation under thermal conditions, which arises from a competition between OC-O2 bond activation and surface diffusion leading to metal atom agglomeration. For the simulation of surface dynamics, an accurate description of the potential energy surface is achieved for CO on a defect-free MgO(001) surface by parametrizing a reactive bond-order force field to a new set of ab initio data. The non-reactive scattering of CO from the surface is investigated theoretically by performing quasi-classical scattering dynamics simulations. The scattering behaviour is evaluated for several incidence energies and different initial ro-vibrational states of the impinging CO, which illustrates the role of surface atom motion in energy transfer processes. The analysis of time-of-flight spectra and scattering angle distributions reveals two different scattering channels, which become particularly noticeable at low incidence energies owing to the weak interaction potential of CO with MgO(001). The scattering process is strongly influenced by the anisotropy of the potential energy surface for CO impinging in upward and downward alignment. Overall, the observations agree with the established Baule model, especially for the distinct scattering features at low incidence energies.
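    The abstract above compares the simulated scattering with the Baule model. As a reminder of what that model predicts, here is a minimal sketch assuming the textbook hard-sphere form, in which a gas particle of mass m_g hits a surface atom of mass m_s initially at rest, head-on, and transfers the fraction 4μ/(1+μ)² of its kinetic energy, with μ = m_g/m_s. The effective surface mass used below (one MgO unit) is an assumption for illustration only; the thesis may use a different effective mass or a refined variant of the model.

```python
def baule_energy_transfer_fraction(m_gas, m_surface):
    """Fraction of incidence energy transferred in a head-on binary collision.

    Hard-sphere Baule estimate: 4*mu / (1 + mu)**2 with mu = m_gas / m_surface.
    The surface atom is assumed to be initially at rest.
    """
    mu = m_gas / m_surface
    return 4.0 * mu / (1.0 + mu) ** 2


# Atomic masses in unified atomic mass units (u).
M_C, M_O, M_MG = 12.011, 15.999, 24.305
m_co = M_C + M_O        # CO projectile
m_surface = M_MG + M_O  # assumption: one MgO unit as effective surface mass

fraction = baule_energy_transfer_fraction(m_co, m_surface)
print(f"mass ratio mu = {m_co / m_surface:.3f}")
print(f"estimated energy transfer fraction = {fraction:.3f}")
```

    The estimate depends only on the mass ratio, which is why it serves as a simple reference for energy transfer when the gas-surface attraction is weak. It ignores the molecular orientation and internal states that the simulations resolve.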

    Representations of Materials for Machine Learning

    High-throughput data generation methods and machine learning (ML) algorithms have given rise to a new era of computational materials science by learning relationships among composition, structure, and properties and by exploiting such relations for design. However, to build these connections, materials data must be translated into a numerical form, called a representation, that can be processed by a machine learning model. Datasets in materials science vary in format (ranging from images to spectra), size, and fidelity. Predictive models vary in scope and in the properties of interest. Here, we review context-dependent strategies for constructing representations that enable the use of materials as inputs or outputs of machine learning models. Furthermore, we discuss how modern ML techniques can learn representations from data and transfer chemical and physical information between tasks. Finally, we outline high-impact questions that have not been fully resolved and thus require further investigation. Comment: 20 pages, 5 figures, To Appear in Annual Review of Materials Research 5
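    To make the idea of a representation concrete, the sketch below turns a chemical formula into a fixed-length vector of elemental fractions, one of the simplest composition-based encodings. It is an illustrative assumption rather than a method taken from the review: the element vocabulary, the function names, and the tiny parser (which handles only flat formulas such as `TiO2`, with no parentheses or charges) are all hypothetical.

```python
import re

# Matches an element symbol followed by an optional numeric count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")


def parse_formula(formula):
    """Return {element: count} for a flat formula such as 'TiO2' or 'MgO'."""
    counts = {}
    for element, number in _TOKEN.findall(formula):
        amount = float(number) if number else 1.0
        counts[element] = counts.get(element, 0.0) + amount
    return counts


def composition_vector(formula, vocabulary):
    """Encode a formula as elemental fractions ordered by vocabulary."""
    counts = parse_formula(formula)
    unknown = set(counts) - set(vocabulary)
    if unknown:
        raise ValueError(f"elements not in vocabulary: {sorted(unknown)}")
    total = sum(counts.values())
    return [counts.get(element, 0.0) / total for element in vocabulary]


# Hypothetical vocabulary; a real model would cover all relevant elements.
VOCABULARY = ["Mg", "Ti", "Cu", "O"]

for formula in ("TiO2", "MgO", "Cu2O"):
    print(formula, composition_vector(formula, VOCABULARY))
```

    A vector like this discards all structural information, so two polymorphs of the same compound receive the same encoding. That limitation is one reason structure-aware and learned representations are also used.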

    Conversion of Carbon Dioxide to Fuels using Dispersed Atomic-Size Catalysts

    Record-high CO2 emissions into the atmosphere and the need to find alternative energy sources to fossil fuels are major global challenges. Conversion of CO2 into useful fuels like methanol and methane can, in principle, tackle both of these environmental and energy concerns. One route to convert CO2 into useful fuels is to use a supported metal catalyst. Specifically, metal atoms or clusters (a few atoms in size) supported on oxide materials are promising catalysts. Experiments have successfully converted CO2 to products like methanol using TiO2-supported Cu atoms or clusters. How this catalyst works and how CO2 conversion could be improved is an area of much research. We used a quantum mechanical method, density functional theory (DFT), to obtain atomic- and electronic-level insights into the CO2 reduction processes on TiO2-supported metal atoms and clusters. We modeled small Cu clusters on the TiO2 surface that are experimentally synthesizable. Our results show that the interfacial sites in TiO2-supported Cu are able to activate CO2 into a bent configuration that can be further reduced. The Cu dimer was found to be the most reactive for CO2 activation but was an unstable catalyst. Following Cu, we also identified other potential metal atoms that can activate CO2. In contrast to expensive and rare elements like Pt, Au, and Ir, we found several early and mid transition metals to be potentially active catalysts for CO2 reduction. Because supported metal atoms and clusters are reactive catalysts, under reaction conditions they tend to undergo aggregation and/or oxidation to form larger, less active catalysts. We chose Co, Ni, and Cu group elements to study their catalyst stability under oxidizing reaction conditions. Based on the thermodynamics of Cu oxidation and the kinetics of O2 dissociation, we found that a TiO2-supported Cu atom or a larger Cu tetramer cluster was the likely species observed in experiments. Our work provides valuable atomic-level insights into improving CO2 reduction activity and predicts potential catalysts for CO2 reduction to valuable fuels.

    Multiscale Models Of Interfacial Mechanics In Low Dimensional Systems

    Crucial thrusts in modern technology, from electronic information processing to engineering cellular systems, require manipulation and control of materials on smaller and smaller scales to succeed. A simple and successful way to break conventional material property limitations or design multifunctional devices is to interface two different materials together. At small length scales, the surface to bulk ratio of each component material increases, to the point that the interfacial physics can dominate the properties of the engineered system. Simultaneously, the combinatorial space of possible interfaces between materials and/or molecules is far too vast to explore by trial-and-error experimentation alone. Intuitive theoretical models can greatly improve our ability to navigate such large search spaces by providing insight on how two materials are likely to interact. The goal of this thesis is to develop predictive physical models which explain emergent phenomena at material interfaces across multiple length and time scales. A variety of state-of-the-art tools were applied to realize this goal, including analytical mathematics, quantum mechanical simulations, finite element methods, and deep neural networks. At the electron scale, a continuum model parametrized by first-principles simulations was employed to develop design criteria for confined quantum states in lateral heterostructures of two-dimensional materials. At the atomic scale, a chemo-mechanical model incorporating long-range electrostatics was developed to explain synthesizability trends in composite heterostructures of inorganic perovskites and organic molecules. A machine learning graph neural network model was developed and applied to predict the impact of general surface strains on the adsorption energy of small molecule intermediates on catalyst surfaces. Finally, at the microscale, a nonlinear kinetic model was developed to explain how cells acquire and retain memory of the mechanical properties of their surroundings across multiple timescales, which can lead to irreversible adaptation and differentiation. The methods and results presented in this thesis can improve our understanding of physical phenomena arising at interfaces and provide a blueprint for future applications of multiscale computational modeling to science and engineering problems.