968 research outputs found
Recommended from our members
Discovery of high-entropy ceramics via machine learning
Abstract. Although high-entropy materials are attracting considerable interest due to a combination of useful properties and promising applications, predicting their formation remains a hindrance to the rational discovery of new systems. Experimental approaches are based on physical intuition and/or expensive trial-and-error strategies. Most computational methods rely on the availability of sufficient experimental data and computational power. Machine learning (ML) applied to materials science can accelerate development and reduce costs. In this study, we propose an ML method that leverages thermodynamic and compositional attributes of a given material to predict the synthesizability (i.e., entropy-forming ability) of disordered metal carbides. The relative importance of the thermodynamic and compositional features for the predictions is then explored. The approach's suitability is demonstrated by comparing values calculated with density functional theory to ML predictions. Finally, the model is employed to predict the entropy-forming ability of 70 new compositions; several predictions are validated by additional density functional theory calculations and experimental synthesis, corroborating the method's effectiveness in exploring vast compositional spaces in a high-throughput manner. Importantly, seven compositions are selected specifically because they contain all three of the Group VI elements (Cr, Mo, and W), which do not form room-temperature-stable rock-salt monocarbides. Incorporating the Group VI elements into the rock-salt structure provides further opportunity for tuning the electronic structure and, potentially, material performance.
Development of predictive models for catalyst development
Abstract. This work was done as a part of the BioSPRINT project, which aims to improve biorefinery operations through process intensification and to replace fossil-based polymers with new bio-based products. The goal was to identify machine learning (ML) models that accelerate catalyst identification with high-throughput (HTP) screening methods, identify non-obvious formulations, and allow catalyst tuning for different feedstock compositions. The desired outcome is maximum activity for the conversion of complex sugar mixtures with optimal selectivity towards the key products of interest.
In the literature part of the thesis, ML was studied in general, where the focus was on different variable selection methods and modeling techniques, more specifically on data-driven modeling. Furthermore, modeling in catalysis was discussed with focus on ML in catalysis. Catalyst screening and selection, descriptor modeling and selection, and predictive modeling in catalysis were studied.
In the experimental part, the focus was on developing ML models that predict catalyst performance with relevant descriptors. A dataset for the hydrogenation of 5-ethoxymethylfurfural with simple bimetallic catalysts, including main metals and promoters, was used as ML model input, supplemented with catalyst descriptors found in the literature. Four different responses were used in the experiments: selectivity and conversion with two different solvents. The methods used in the experimental part were discussed in detail, covering data collection, preprocessing, variable selection, modeling, and model validation. Reference models without variable selection were identified first. Secondly, regularization algorithms were used to identify models. Finally, models were identified using the variable subsets obtained with the regularization algorithms. The effect of cross-validation was also studied.
In general, good modeling results were obtained with boosted ensemble tree methods, support vector machine (SVM) methods, and Gaussian process regression (GPR) methods. Lasso regression turned out to be the best variable selection method. Good results were obtained with the descriptors found in the literature. It was also shown that fairly good results can be obtained with only two variables in the studied case. The variable selection algorithms ranked promoter variables as far less important than main-metal variables. Even though the modeling results were good, the variable selection methods were almost purely data-driven, and the actual relevance of the variables cannot be guaranteed.
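The lasso-based variable selection mentioned above can be illustrated with a minimal sketch. This is not the thesis code: it assumes a lasso objective solved by cyclic coordinate descent, and the three-column toy design matrix, the response values, and the penalty `alpha = 0.5` are made-up placeholders rather than data from the study. The function names are likewise hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=50):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent.

    X is a list of rows and y a list of targets; both are assumed to be centered.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of column j with the residual that ignores column j
            norm = 0.0  # squared norm of column j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (norm / n) if norm > 0 else 0.0
    return w


def selected_variables(weights, tol=1e-8):
    """Indices of variables whose lasso coefficient is non-zero."""
    return [j for j, w in enumerate(weights) if abs(w) > tol]


# Made-up toy data: three centered, mutually orthogonal descriptor columns.
# The response depends strongly on columns 0 and 1 and only weakly on column 2.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [5.1, 0.9, -1.1, -4.9]

weights = lasso_coordinate_descent(X, y, alpha=0.5)
selected = selected_variables(weights)
print(selected)
```

With this toy input the weakly related third column is shrunk exactly to zero, so only two variables survive. A reduced model could then be refit on the surviving columns, mirroring the "variable subsets obtained with regularization algorithms" step described in the abstract.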
In future work, optimization should be studied with the goal of finding catalysts that maximize catalyst performance values based on the model predictions. The extrapolation capabilities of the models also need to be studied and improved. The studied methods can easily be applied to other datasets. In the BioSPRINT project, experimental results related to the dehydration reaction of C5 and C6 sugars with simple metal catalysts will be obtained and used with the studied methods.
Development of predictive models to make catalyst preparation more efficient. Abstract. This work was carried out as part of the BioSPRINT project, which aims to develop biorefinery operations by improving their process efficiency and to replace fossil-based polymers with new bio-based products. The goal of the work was to build, using machine learning, models that speed up the discovery of optimal catalysts through high-throughput (HTP) screening, help identify hard-to-find catalyst combinations, and enable catalyst selection for different feedstock compositions. The aim is to maximize the conversion of complex sugar compounds and the selectivity towards the desired products.
In the literature part of the work, machine learning was reviewed at a general level, with the main emphasis on variable selection methods and data-driven modeling methods. The literature part also examined the use of modeling in catalysis, with the main emphasis on the use of machine learning. The work also covered catalyst screening and selection, the definition and selection of computational variables (descriptors), and the use of predictive modeling in catalysis.
In the experimental part, the focus was on building machine learning models that predict catalyst performance with relevant descriptors. The dataset consisted of results for the hydrogenation of 5-ethoxymethylfurfural with simple two-component metal catalysts containing a main metal and a promoter. The dataset was supplemented with catalyst descriptors found in the literature and used as input to the machine learning models. Four different response variables were used in the study: selectivity and conversion with two different solvents. The methods used in the experimental part were covered thoroughly, taking into account data collection, preprocessing, variable selection, modeling, and model validation. First, reference models were identified. After this, modeling was carried out with regularization algorithms. Finally, modeling was carried out using variable subsets selected with the regularization algorithms. The effect of cross-validation was also studied.
In general, good modeling results were achieved with the boosted ensemble tree technique, the support vector machine, and Gaussian process regression. The lasso method was found to be the best variable selection algorithm. Good results were achieved with the descriptors found in the literature. The study also found that good modeling results can be achieved in this case with as few as two variables. The significance of the variables describing the main metals was found to be much greater than that of the corresponding promoter variables. When examining the modeling results, it must be noted that the variable selection was almost entirely data-driven and the actual significance of the variables cannot be guaranteed.
In the future, model predictions can be used in optimization, where the goal is to find the catalyst combination that maximizes catalyst performance. The extrapolation capability of the model must also be studied and improved. The studied methods are easily applicable to other similar datasets. In the future, the BioSPRINT project will provide a dataset based on the dehydration of five- and six-carbon sugars with simple metal catalysts, which will be used in follow-up studies.
Machine learning modeling of superconducting critical temperature
Superconductivity has been the focus of enormous research effort since its
discovery more than a century ago. Yet, some features of this unique phenomenon
remain poorly understood; prime among these is the connection between
superconductivity and chemical/structural properties of materials. To bridge
the gap, several machine learning schemes are developed herein to model the
critical temperatures (T_c) of the 12,000+ known superconductors
available via the SuperCon database. Materials are first divided into two
classes based on their T_c values, above and below 10 K, and a
classification model predicting this label is trained. The model uses
coarse-grained features based only on the chemical compositions. It shows
strong predictive power, with out-of-sample accuracy of about 92%. Separate
regression models are developed to predict the values of T_c for
cuprate, iron-based, and "low-T_c" compounds. These models also
demonstrate good performance, with learned predictors offering potential
insights into the mechanisms behind superconductivity in different families of
materials. To improve the accuracy and interpretability of these models, new
features are incorporated using materials data from the AFLOW Online
Repositories. Finally, the classification and regression models are combined
into a single integrated pipeline and employed to search the entire Inorganic
Crystallographic Structure Database (ICSD) for potential new superconductors.
We identify more than 30 non-cuprate and non-iron-based oxides as candidate
materials. Comment: 17 pages, 7 figures
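As a rough illustration of the two-stage idea in this abstract, the sketch below labels materials by the 10 K boundary, scores a classifier on held-out labels, and then passes only the predicted high-T_c candidates to a regressor. The gating arrangement is an assumption, since the abstract does not spell out how the models are combined. The T_c values, the predicted labels, the `descriptor` feature, and both stand-in models are invented placeholders, not SuperCon data or the authors' models.

```python
T_C_THRESHOLD_K = 10.0  # class boundary stated in the abstract


def label_by_threshold(t_c_values, threshold=T_C_THRESHOLD_K):
    """Return 1 for materials with T_c above the threshold, otherwise 0."""
    return [1 if t_c > threshold else 0 for t_c in t_c_values]


def accuracy(true_labels, predicted_labels):
    """Fraction of held-out materials whose class was predicted correctly."""
    matches = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return matches / len(true_labels)


def screen_candidates(candidates, classify, regress):
    """Keep candidates the classifier places above the threshold and attach a predicted T_c."""
    return {
        name: regress(features)
        for name, features in candidates.items()
        if classify(features) == 1
    }


# Placeholder held-out set (values are invented for illustration only).
held_out_t_c = [1.5, 8.0, 12.0, 35.0, 90.0]
true_labels = label_by_threshold(held_out_t_c)
predicted_labels = [0, 1, 1, 1, 1]  # pretend classifier output with one mistake
held_out_accuracy = accuracy(true_labels, predicted_labels)


def toy_classify(features):
    """Hypothetical stand-in for the trained classifier."""
    return 1 if features["descriptor"] > 0.5 else 0


def toy_regress(features):
    """Hypothetical stand-in for a trained T_c regressor (kelvin)."""
    return 100.0 * features["descriptor"]


candidates = {
    "material_a": {"descriptor": 0.25},
    "material_b": {"descriptor": 0.75},
}
shortlist = screen_candidates(candidates, toy_classify, toy_regress)
print(held_out_accuracy, shortlist)
```

Running the classifier first keeps the regressor from being applied to materials outside the class it was trained on, which is one plausible reason to chain the two stages.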
Discovery of Materials Through Applied Machine Learning
Advances in artificial intelligence technology, specifically machine learning, have created opportunities in the materials sciences to accelerate material discovery and gain fundamental understanding of the interaction between the constituent elements of a material and the properties expressed by that material. Application of machine learning to experimental materials discovery is slow due to the monetary and temporal cost of experimental data, but parallel techniques such as continuous compositional gradients or high-throughput characterization setups are capable of generating larger amounts of data than the typical experimental process, and are therefore suitable for combination with machine learning. A random forest machine learning algorithm has been applied to two different materials discovery challenges, the discovery of new metallic-glass-forming ternary compositions and the discovery of novel ammonia decomposition catalysts, and has led to accelerated discovery of high-performing materials.
Meta-Analysis of Vaterite Secondary Data Revealed the Synthesis Conditions for Polymorphic Control
Acknowledgements: This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. Peer reviewed. Postprint.
Benchmarking Materials Property Prediction Methods: The Matbench Test Set and Automatminer Reference Algorithm
We present a benchmark test suite and an automated machine learning procedure
for evaluating supervised machine learning (ML) models for predicting
properties of inorganic bulk materials. The test suite, Matbench, is a set of
13 ML tasks that range in size from 312 to 132k samples and contain data from
10 density functional theory-derived and experimental sources. Tasks include
predicting optical, thermal, electronic, thermodynamic, tensile, and elastic
properties given a materials composition and/or crystal structure. The
reference algorithm, Automatminer, is a highly-extensible, fully-automated ML
pipeline for predicting materials properties from materials primitives (such as
composition and crystal structure) without user intervention or hyperparameter
tuning. We test Automatminer on the Matbench test suite and compare its
predictive power with state-of-the-art crystal graph neural networks and a
traditional descriptor-based Random Forest model. We find Automatminer achieves
the best performance on 8 of 13 tasks in the benchmark. We also show our test
suite is capable of exposing predictive advantages of each algorithm - namely,
that crystal graph methods appear to outperform traditional machine learning
methods given ~10^4 or greater data points. The pre-processed, ready-to-use
Matbench tasks and the Automatminer source code are open source and available
online (http://hackingmaterials.lbl.gov/automatminer/). We encourage evaluating
new materials ML algorithms on the Matbench benchmark and comparing them
against the latest version of Automatminer. Comment: Main text, supplemental inf
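The "best performance on 8 of 13 tasks" style of comparison can be sketched as a simple tally of per-task winners. The snippet below is only an illustration: the task names and error values are invented placeholders (they are not Matbench tasks or published scores), lower error is assumed to be better for every task, and `count_wins` is a hypothetical helper rather than part of Automatminer.

```python
def count_wins(errors_by_task):
    """Count, for each algorithm, the tasks on which it has the lowest error."""
    wins = {}
    for scores in errors_by_task.values():
        for algorithm in scores:
            wins.setdefault(algorithm, 0)
        best = min(scores, key=scores.get)  # lower error is assumed to be better
        wins[best] += 1
    return wins


# Invented placeholder errors for three hypothetical tasks.
errors_by_task = {
    "task_a": {"automatminer": 0.20, "crystal_graph_nn": 0.30, "random_forest": 0.25},
    "task_b": {"automatminer": 0.50, "crystal_graph_nn": 0.40, "random_forest": 0.60},
    "task_c": {"automatminer": 0.10, "crystal_graph_nn": 0.15, "random_forest": 0.12},
}

wins = count_wins(errors_by_task)
print(wins)
```

A tally like this hides how large each margin is, so it is usually read alongside the per-task errors themselves.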