44 research outputs found

Unified Representation of Molecules and Crystals for Machine Learning

Authors: Huo Haoyan; Rupp Matthias
Publication date: 02/01/2018

Accurate simulations of atomistic systems from first principles are limited by computational cost. In high-throughput settings, machine learning can potentially reduce these costs significantly by accurately interpolating between reference calculations. For this, kernel learning approaches crucially require a single Hilbert space accommodating arbitrary atomistic systems. We introduce a many-body tensor representation that is invariant to translations, rotations and nuclear permutations of same elements, unique, differentiable, can represent molecules and crystals, and is fast to compute. Empirical evidence is presented for energy prediction errors below 1 kcal/mol for 7k organic molecules and 5 meV/atom for 11k elpasolite crystals. Applicability is demonstrated for phase diagrams of Pt-group/transition-metal binary systems. Comment: Revised version, minor changes throughout

arXiv.org e-Print Archive

KOPS - The Institutional Repository of the University of Konstanz

MPG.PuRe

Author Correction: Text-mined dataset of inorganic materials synthesis recipes.

Authors: Botari Tiago; Ceder Gerbrand; He Tanjin; Huo Haoyan; Kononova Olga; Rong Ziqin; Sun Wenhao; Tshitoyan Vahe
Publication venue: eScholarship, University of California
Publication date: 01/11/2019

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

eScholarship - University of California

Machine-learning rationalization and prediction of solid-state synthesis conditions

Authors: Bartel Christopher J.; Ceder Gerbrand; Dunn Alexander; He Tanjin; Huo Haoyan; Jain Anubhav; Ouyang Bin; Trewartha Amalie
Publication date: 17/04/2022

There currently exist no quantitative methods to determine the appropriate conditions for solid-state synthesis. This not only hinders the experimental realization of novel materials but also complicates the interpretation and understanding of solid-state reaction mechanisms. Here, we demonstrate a machine-learning approach that predicts synthesis conditions using large solid-state synthesis datasets text-mined from scientific journal articles. Using feature importance ranking analysis, we discovered that optimal heating temperatures have strong correlations with the stability of precursor materials quantified using melting points and formation energies ($\Delta G_f$, $\Delta H_f$). In contrast, features derived from the thermodynamics of synthesis-related reactions did not directly correlate to the chosen heating temperatures. This correlation between optimal solid-state heating temperature and precursor stability extends Tamman's rule from intermetallics to oxide systems, suggesting the importance of reaction kinetics in determining synthesis conditions. Heating times are shown to be strongly correlated with the chosen experimental procedures and instrument setups, which may be indicative of human bias in the dataset. Using these predictive features, we constructed machine-learning models with good performance and general applicability to predict the conditions required to synthesize diverse chemical systems. Codes and data used in this work can be found at: https://github.com/CederGroupHub/s4

arXiv.org e-Print Archive

PubMed Central

eScholarship - University of California

Semi-supervised machine-learning classification of materials synthesis procedures

Author: Huo Haoyan
Publication date: 22/08/2019

Ezid

Text-mining and machine-learning solid-state synthesis from the scientific literature

Author: Huo Haoyan
Publication date: 01/05/2022

Ezid

Machine-Learning Rationalization and Prediction of Solid-State Synthesis Conditions.

Author: Huo Haoyan
Publication date: 25/10/2022

Ezid

Text-mining and machine-learning solid-state synthesis from the scientific literature

Author: Huo Haoyan
Publication date: 01/01/2022

Innovations of novel materials often involve synthesizing new compounds with better materials properties. However, computationally designing synthesis methods for these new compounds remains an uncharted area of research. This thesis proposes to use machine-learning approaches to predict materials synthesis routes by training on synthesis information from the published scientific literature. However, most inorganic materials synthesis information in the scientific literature is locked up in written natural language and must be parsed using natural language processing and information retrieval techniques. Therefore, this thesis aims to achieve two objectives: 1) constructing a text-mining pipeline that extracts solid-state synthesis datasets from scientific papers, and 2) implementing an interpretable machine-learning method to predict solid-state synthesis conditions.

Training information retrieval systems usually requires large manually labeled datasets, which are not widely available in materials informatics. To alleviate the lack of labeled datasets, we demonstrate a semi-supervised machine-learning method (Chapter 3), which is implemented for the classification of paragraphs in papers. Without any human labeling efforts, latent Dirichlet allocation can cluster keywords into topics corresponding to specific experimental synthesis steps. Guided by a small amount of annotation, supervised training methods, such as random forest, can then associate these steps with different synthesis methods, such as solid-state or hydrothermal synthesis. Using the topic modeling results, we also show a Markov chain representation of the order of experimental steps, which reconstructs a flowchart of synthesis procedures.

To fulfill the first objective, we have extracted a dataset of "codified recipes" for solid-state synthesis using an automated text-mining pipeline (Chapter 4). The dataset currently consists of over 30,000 solid-state synthesis entries. Every entry contains synthesis information including input materials, target materials, experimental operations, the associated processing parameters and synthesis conditions, and the balanced synthesis reaction equation. This dataset is the first-ever collection of machine-readable solid-state synthesis experiments and enables data mining of various aspects of inorganic materials synthesis.

To fulfill the second objective, we have built a machine-learning approach that predicts solid-state synthesis conditions (heating temperature and heating time) using the above-mentioned dataset (Chapter 5). We used dominance importance ranking analysis and discovered that optimal heating temperatures have strong correlations with the stability of precursor materials. This correlation extends Tamman's rule from intermetallics to oxide systems, suggesting the importance of reaction kinetics in solid-state synthesis. Heating times are shown to be strongly correlated with the chosen experimental procedures and instrument setups, which may be indicative of the selection bias in the dataset. Our machine-learning models achieve good synthesis prediction performance and general applicability for diverse chemical systems. While focusing particularly on solid-state synthesis, this thesis demonstrates a scalable framework to unlock the large amount of inorganic materials synthesis information from the literature, and machine-learn robust and interpretable synthesis predictors. At the end of this thesis, we outline several interesting future research topics which expand the work into a broader context of materials informatics and synthesis science.

Ezid

eScholarship - University of California

Unified representation of molecules and crystals for machine learning

Authors: Huo Haoyan; Rupp Matthias
Publication venue: IOP Publishing
Publication date: 01/01/2022

Accurate simulations of atomistic systems from first principles are limited by computational cost. In high-throughput settings, machine learning can reduce these costs significantly by accurately interpolating between reference calculations. For this, kernel learning approaches crucially require a representation that accommodates arbitrary atomistic systems. We introduce a many-body tensor representation that is invariant to translations, rotations, and nuclear permutations of same elements, unique, differentiable, can represent molecules and crystals, and is fast to compute. Empirical evidence for competitive energy and force prediction errors is presented for changes in molecular structure, crystal chemistry, and molecular dynamics using kernel regression and symmetric gradient-domain machine learning as models. Applicability is demonstrated for phase diagrams of Pt-group/transition-metal binary systems.

KOPS - The Institutional Repository of the University of Konstanz

Hydrophobicity Improvement of Cement-Based Materials Incorporated with Ionic Paraffin Emulsions (IPEs)

Authors: Haoyan Guo; Jinyang Huo; Yongfeng Wei; Zhenjun Wang
Publication venue: MDPI AG
Publication date: 20/07/2020

Cement-based materials are non-uniform porous materials that are easily permeated by harmful substances, which deteriorates their structural durability. In this work, three ionic paraffin emulsions (IPEs) were prepared: an anionic paraffin emulsion (APE), a cationic paraffin emulsion (CPE), and a non-ionic paraffin emulsion (NPE). The effects of incorporating IPEs into cement-based materials on hydrophobicity improvement were investigated by environmental scanning electron microscopy (ESEM), Fourier transform infrared (FTIR) spectroscopy, transmission and reflection polarizing microscope (TRPM) tests and correlation analyses, as well as by compressive strength, impermeability, and apparent contact angle tests. Finally, the optimal type and the recommended dose of IPEs were suggested. Results reveal that the impermeability pressure and apparent contact angle value of cement-based materials incorporated with IPEs are significantly higher than those of the control group. Thus, the hydrophobicity of cement-based materials is significantly improved. However, IPEs adversely affect the compressive strength of cement-based materials. The apparent contact angle mainly affects impermeability. These three IPEs impart hydrophobicity to cement-based materials. In addition, the optimal NPE dose can significantly improve the hydrophobicity of cement-based materials.

Multidisciplinary Digital Publishing Institute

Synthetic accessibility and stability rules of NASICONs

Authors: Bartel Christopher J.; Ceder Gerbrand; He Tanjin; Huo Haoyan; Kim Haegyeom; Lacivita Valentina; Ouyang Bin; Wang Jingyang; Wang Yan
Publication venue: Springer Science and Business Media LLC
Publication date: 09/09/2021

In this paper we develop the stability rules for NASICON-structured materials, as an example of compounds with complex bond topology and composition. By applying machine learning to the ab-initio computed phase stability of 3881 potential NASICONs, we can extract a simple two-dimensional descriptor that is extremely good at separating stable from unstable NASICONs. This machine-learned "tolerance factor" contains information on the Na content, the radii and electronegativities of the elements, and the Madelung energy. We test the predictive capability of this approach by selecting six predicted NASICON compositions. Five out of the six resulted in a phase-pure NASICON, while the sixth composition led to a NASICON that coexisted with other phases, validating the efficacy of this approach. This work not only provides tools to understand the synthetic accessibility of NASICON-type materials, but also demonstrates an efficient paradigm for discovering new materials with complicated composition and atomic structure.

arXiv.org e-Print Archive

eScholarship - University of California