44 research outputs found

    Unified Representation of Molecules and Crystals for Machine Learning

    Accurate simulations of atomistic systems from first principles are limited by computational cost. In high-throughput settings, machine learning can potentially reduce these costs significantly by accurately interpolating between reference calculations. For this, kernel learning approaches crucially require a single Hilbert space accommodating arbitrary atomistic systems. We introduce a many-body tensor representation that is invariant to translations, rotations and nuclear permutations of same elements, unique, differentiable, can represent molecules and crystals, and is fast to compute. Empirical evidence is presented for energy prediction errors below 1 kcal/mol for 7k organic molecules and 5 meV/atom for 11k elpasolite crystals. Applicability is demonstrated for phase diagrams of Pt-group/transition-metal binary systems.

    Machine-learning rationalization and prediction of solid-state synthesis conditions

    There currently exist no quantitative methods to determine the appropriate conditions for solid-state synthesis. This not only hinders the experimental realization of novel materials but also complicates the interpretation and understanding of solid-state reaction mechanisms. Here, we demonstrate a machine-learning approach that predicts synthesis conditions using large solid-state synthesis datasets text-mined from scientific journal articles. Using feature importance ranking analysis, we discovered that optimal heating temperatures have strong correlations with the stability of precursor materials quantified using melting points and formation energies ($\Delta G_f$, $\Delta H_f$). In contrast, features derived from the thermodynamics of synthesis-related reactions did not directly correlate with the chosen heating temperatures. This correlation between optimal solid-state heating temperature and precursor stability extends Tammann's rule from intermetallics to oxide systems, suggesting the importance of reaction kinetics in determining synthesis conditions. Heating times are shown to be strongly correlated with the chosen experimental procedures and instrument setups, which may be indicative of human bias in the dataset. Using these predictive features, we constructed machine-learning models with good performance and general applicability to predict the conditions required to synthesize diverse chemical systems. Codes and data used in this work can be found at: https://github.com/CederGroupHub/s4

    Text-mining and machine-learning solid-state synthesis from the scientific literature

    Innovations in novel materials often involve synthesizing new compounds with better materials properties. However, computationally designing synthesis methods for these new compounds remains an uncharted area of research. This thesis proposes to use machine-learning approaches to predict materials synthesis routes by training on synthesis information from the published scientific literature. However, most inorganic materials synthesis information in the scientific literature is locked up in written natural language and must be parsed using natural language processing and information retrieval techniques. Therefore, this thesis aims to achieve two objectives: 1) constructing a text-mining pipeline that extracts solid-state synthesis datasets from scientific papers, and 2) implementing an interpretable machine-learning method to predict solid-state synthesis conditions. Training information retrieval systems usually requires large manually labeled datasets, which are not widely available in materials informatics. To alleviate the lack of labeled datasets, we demonstrate a semi-supervised machine-learning method (Chapter 3), which is implemented for the classification of paragraphs in papers. Without any human labeling efforts, latent Dirichlet allocation can cluster keywords into topics corresponding to specific experimental synthesis steps. Guided by a small amount of annotation, supervised training methods, such as random forest, can then associate these steps with different synthesis methods, such as solid-state or hydrothermal synthesis. Using the topic modeling results, we also show a Markov chain representation of the order of experimental steps, which reconstructs a flowchart of synthesis procedures. To fulfill the first objective, we have extracted a dataset of "codified recipes" for solid-state synthesis using an automated text-mining pipeline (Chapter 4). The dataset currently consists of over 30,000 solid-state synthesis entries. Every entry contains synthesis information including input materials, target materials, experimental operations, the associated processing parameters and synthesis conditions, and the balanced synthesis reaction equation. This dataset is the first-ever collection of machine-readable solid-state synthesis experiments and enables data mining of various aspects of inorganic materials synthesis. To fulfill the second objective, we have built a machine-learning approach that predicts solid-state synthesis conditions (heating temperature and heating time) using the above-mentioned dataset (Chapter 5). We used dominance importance ranking analysis and discovered that optimal heating temperatures have strong correlations with the stability of precursor materials. This correlation extends Tammann's rule from intermetallics to oxide systems, suggesting the importance of reaction kinetics in solid-state synthesis. Heating times are shown to be strongly correlated with the chosen experimental procedures and instrument setups, which may be indicative of selection bias in the dataset. Our machine-learning models achieve good synthesis prediction performance and general applicability for diverse chemical systems. While focusing particularly on solid-state synthesis, this thesis demonstrates a scalable framework to unlock the large amount of inorganic materials synthesis information in the literature and to machine-learn robust and interpretable synthesis predictors. At the end of this thesis, we outline several future research topics that expand the work into the broader context of materials informatics and synthesis science.

    Unified representation of molecules and crystals for machine learning

    Accurate simulations of atomistic systems from first principles are limited by computational cost. In high-throughput settings, machine learning can reduce these costs significantly by accurately interpolating between reference calculations. For this, kernel learning approaches crucially require a representation that accommodates arbitrary atomistic systems. We introduce a many-body tensor representation that is invariant to translations, rotations, and nuclear permutations of same elements, unique, differentiable, can represent molecules and crystals, and is fast to compute. Empirical evidence for competitive energy and force prediction errors is presented for changes in molecular structure, crystal chemistry, and molecular dynamics using kernel regression and symmetric gradient-domain machine learning as models. Applicability is demonstrated for phase diagrams of Pt-group/transition-metal binary systems.

    Hydrophobicity Improvement of Cement-Based Materials Incorporated with Ionic Paraffin Emulsions (IPEs)

    Cement-based materials are non-uniform porous materials that are easily permeated by harmful substances, which deteriorates their structural durability. In this work, three ionic paraffin emulsions (IPEs), namely anionic paraffin emulsion (APE), cationic paraffin emulsion (CPE), and non-ionic paraffin emulsion (NPE), were prepared. The effects of incorporating IPEs into cement-based materials on hydrophobicity improvement were investigated by environmental scanning electron microscopy (ESEM), Fourier transform infrared (FTIR) spectroscopy, transmission and reflection polarizing microscope (TRPM) tests and correlation analyses, as well as by compressive strength, impermeability, and apparent contact angle tests. Finally, the optimal type and the recommended dose of IPEs were suggested. Results reveal that the impermeability pressure and apparent contact angle value of cement-based materials incorporated with IPEs are significantly higher than those of the control group. Thus, the hydrophobicity of cement-based materials is significantly improved. However, IPEs adversely affect the compressive strength of cement-based materials. The apparent contact angle mainly affects impermeability. All three IPEs impart hydrophobicity to cement-based materials. In addition, the optimal NPE dose can significantly improve the hydrophobicity of cement-based materials.

    Synthetic accessibility and stability rules of NASICONs

    In this paper we develop stability rules for NASICON-structured materials, as an example of compounds with complex bond topology and composition. By applying machine learning to the ab-initio computed phase stability of 3881 potential NASICONs, we extract a simple two-dimensional descriptor that is extremely good at separating stable from unstable NASICONs. This machine-learned "tolerance factor" contains information on the Na content, the radii and electronegativities of the elements, and the Madelung energy. We test the predictive capability of this approach by selecting six predicted NASICON compositions. Five out of the six resulted in a phase-pure NASICON, while the sixth composition led to a NASICON that coexisted with other phases, validating the efficacy of this approach. This work not only provides tools to understand the synthetic accessibility of NASICON-type materials, but also demonstrates an efficient paradigm for discovering new materials with complicated composition and atomic structure.