10,006 research outputs found

    Band gap prediction for large organic crystal structures with machine learning

    Machine-learning models are capable of capturing the structure-property relationship from a dataset of computationally demanding ab initio calculations. Over the past two years, the Organic Materials Database (OMDB) has hosted a growing number of calculated electronic properties of previously synthesized organic crystal structures. The complexity of the organic crystals contained within the OMDB, which have on average 82 atoms per unit cell, makes this database a challenging platform for machine learning applications. In this paper, the focus is on predicting the band gap, which represents one of the basic properties of a crystalline material. With this aim, a consistent dataset of 12 500 crystal structures and their corresponding DFT band gaps is released, freely available for download at https://omdb.mathub.io/dataset. An ensemble of two state-of-the-art models reaches a mean absolute error (MAE) of 0.388 eV, which corresponds to a percentage error of 13% for an average band gap of 3.05 eV. Finally, the trained models are employed to predict the band gap for 260 092 materials contained within the Crystallography Open Database (COD) and made available online so that the predictions can be obtained for any arbitrary crystal structure uploaded by a user. Comment: 10 pages, 6 figures
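    The error figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the toy band gaps and the two "models" are hypothetical, and averaging the two predictions is an assumed combination rule, since the abstract does not say how the ensemble is formed.

```python
def mean_absolute_error(predicted, reference):
    """Average of |prediction - reference| over all samples."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def ensemble_average(*model_predictions):
    """Per-sample mean of several models' predictions (assumed ensemble rule)."""
    return [sum(values) / len(values) for values in zip(*model_predictions)]


def relative_error_percent(mae, mean_value):
    """MAE expressed as a percentage of the mean target value."""
    return 100.0 * mae / mean_value


# Figures quoted in the abstract: MAE 0.388 eV, average band gap 3.05 eV.
reported = relative_error_percent(0.388, 3.05)  # about 12.7, i.e. roughly 13%

# Hypothetical toy data (eV), not taken from the OMDB dataset.
reference = [2.0, 3.0, 4.0]
model_a = [2.4, 2.8, 4.6]
model_b = [1.8, 3.6, 3.8]
ensemble = ensemble_average(model_a, model_b)

mae_a = mean_absolute_error(model_a, reference)
mae_b = mean_absolute_error(model_b, reference)
mae_ensemble = mean_absolute_error(ensemble, reference)
```

    In this toy case the errors of the two models partly cancel, so the averaged prediction has a lower MAE than either model alone. That is the usual motivation for an ensemble, though it is not guaranteed in general.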

    11th German Conference on Chemoinformatics (GCC 2015) : Fulda, Germany. 8-10 November 2015.


    Representations of molecules and materials for interpolation of quantum-mechanical simulations via machine learning

    Computational study of molecules and materials from first principles is a cornerstone of physics, chemistry and materials science, but is limited by the cost of accurate and precise simulations. In settings involving many simulations, machine learning can reduce these costs, sometimes by orders of magnitude, by interpolating between reference simulations. This requires representations that describe any molecule or material and support interpolation. We review, discuss and benchmark state-of-the-art representations and relations between them, including smooth overlap of atomic positions, many-body tensor representation, and symmetry functions. For this, we use a unified mathematical framework based on many-body functions, group averaging and tensor products, and compare energy predictions for organic molecules, binary alloys and Al-Ga-In sesquioxides in numerical experiments controlled for data distribution, regression method and hyper-parameter optimization.
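    To make "a representation that supports interpolation" concrete, the sketch below evaluates one radial symmetry function of the Behler-Parrinello type for a single atom: a sum of Gaussians in the neighbor distances, damped by a cosine cutoff. It is a minimal illustration, not the benchmark code of the paper. The parameter names (`eta`, `r_shift`, `r_cut`) and the water-like coordinates are assumptions chosen for the example.

```python
import math


def cutoff(r, r_cut):
    """Cosine cutoff: 1 at r = 0, falling smoothly to 0 at r = r_cut."""
    if r >= r_cut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)


def radial_symmetry_function(positions, center, eta, r_shift, r_cut):
    """Sum over neighbors j of exp(-eta * (r_ij - r_shift)^2) * cutoff(r_ij)."""
    total = 0.0
    for j, other in enumerate(positions):
        if j == center:
            continue
        r = math.dist(positions[center], other)
        total += math.exp(-eta * (r - r_shift) ** 2) * cutoff(r, r_cut)
    return total


# Hypothetical water-like geometry in angstrom; atom 0 is the central atom.
positions = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
g = radial_symmetry_function(positions, 0, eta=1.0, r_shift=1.0, r_cut=3.0)

# Same molecule, translated and with the two neighbors listed in swapped order.
shift = (1.5, -2.0, 0.5)
moved = [tuple(c + s for c, s in zip(positions[i], shift)) for i in (0, 2, 1)]
g_moved = radial_symmetry_function(moved, 0, eta=1.0, r_shift=1.0, r_cut=3.0)
```

    Because the function depends only on interatomic distances and sums over neighbors, `g` and `g_moved` agree up to floating-point rounding. This invariance to translation and to the ordering of neighbors is one of the properties such representations are designed to have.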

    Study of Acid Suppressed Thickener Technology Using Density Functional Theory and Machine Learning Techniques

    Hydrophobically modified ethylene oxide urethane (HEUR) rheology modifiers, which are water-based polyurethane formulations manufactured by Dow Coating Materials, a division of the Dow Chemical Company, are often added to interior and exterior water-based latex paint formulations to control their viscosity. The thickening efficiency of the HEUR rheology modifier is controlled by the pH of the solvent, as this affects the protonation-deprotonation equilibrium of the amine hydrophobe group at the end of the rheology modifier polymer chain. The principal quantity characterizing this equilibrium is the acid dissociation constant (pKa) of the hydrophobe group, which identifies the transition between high and low viscosity of the suspension. To gain a better understanding of the functioning of the hydrophobe molecular groups, and to develop novel hydrophobes that meet specific performance characteristics, it is important to accurately predict the pKa based on first-principles calculations, and use it as a first evaluation criterion for a rapid screening of candidate hydrophobe molecules. A main source of error in the pKa calculation is the value of the solvation free energy of the molecule in its charged state. We therefore develop new methods to increase the accuracy of the solvation free energy calculation for charged species without excessively increasing the computational expense. This includes a hybrid cluster-continuum model approach, where explicit solvent molecules are added to the traditionally employed continuum solvation model, and a molecular dynamics (MD) sampling procedure that eliminates the costly energy minimization step. Using test molecules for pKa calculations, we systematically examine the convergence behavior in terms of the number of explicit water molecules that need to be included in the cluster-continuum model, the influence of the dielectric constant attributed to the continuum, and the placement of a counter ion for charge neutrality for the accurate calculation of the solvation free energy. We establish that the MD sampling method yields results comparable to the energy minimization procedure during density functional theory (DFT) calculations, but at 100 times the speed. When calculating the solvation free energy and the pKa of a known hydrophobe, ethoxylated bis(2-ethylhexyl)amine, we find that including explicit water molecules and a fragment of the latex polymer in its local environment both significantly improve the results. Finally, we develop an informatics-based approach that employs a transferable machine learning (ML) model, trained and validated on a limited amount of experimental data, to predict the solvation free energies of new ionic species at a reasonable computational cost. We compare three different ML methods (linear ridge regression, support vector regression and random forest regression) and find that the model trained by the random forest regression method yields the predictions with the lowest mean absolute error. A feature selection analysis shows that the atomic fraction feature, which reflects the chemical constitution of the hydrophobe, plays the most important role in the solvation free energy prediction. Adding the Wiener index, a measure of the molecular topology, and the solvent accessible surface area of the molecules further improves the performance of the model. Accordingly, our ML model predicts the solvation energies of ionic species, including our test hydrophobe molecule, with accuracy similar to that of atomistic modeling using first-principles calculations. PhD, Materials Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/145967/1/wuwenkun_1.pd
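    Two of the descriptors named in this abstract are simple to compute. The sketch below is an illustration, not the thesis code. It assumes that "atomic fraction" means the share of each element among the atoms of a molecule, and it uses the standard definition of the Wiener index: the sum of shortest-path lengths, counted in bonds, over all pairs of atoms in the molecular graph, conventionally with hydrogens left out. The example molecules are chosen for illustration only.

```python
from collections import Counter, deque


def atomic_fractions(elements):
    """Fraction of atoms of each element (assumed meaning of 'atomic fraction')."""
    counts = Counter(elements)
    return {element: count / len(elements) for element, count in counts.items()}


def wiener_index(adjacency):
    """Sum of shortest-path lengths over all unordered pairs of atoms."""
    total = 0
    for source in adjacency:
        # Breadth-first search gives bond-count distances from `source`.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        if len(distance) != len(adjacency):
            raise ValueError("molecular graph must be connected")
        total += sum(distance.values())
    return total // 2  # every pair was counted once from each end


# Carbon skeletons (hydrogens omitted): a 4-atom chain and a branched isomer.
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
w_chain = wiener_index(n_butane)      # 10
w_branched = wiener_index(isobutane)  # 9

# Methylamine, CH3NH2: one C, five H, one N.
methylamine = ["C", "H", "H", "H", "N", "H", "H"]
fractions = atomic_fractions(methylamine)
```

    The two butane isomers have identical atomic fractions but different Wiener indices (10 for the chain, 9 for the branched form). This shows why a topological descriptor can add information beyond chemical composition.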

    Hierarchical Visualization of Materials Space with Graph Convolutional Neural Networks

    The combination of high-throughput computation and machine learning has led to a new paradigm in materials design by allowing for the direct screening of vast portions of structural, chemical, and property space. The use of these powerful techniques leads to the generation of enormous amounts of data, which in turn calls for new techniques to efficiently explore and visualize the materials space to help identify underlying patterns. In this work, we develop a unified framework to hierarchically visualize the compositional and structural similarities between materials in an arbitrary material space with representations learned from different layers of graph convolutional neural networks. We demonstrate the potential for such a visualization approach by showing that patterns emerge automatically that reflect similarities at different scales in three representative classes of materials: perovskites, elemental boron, and general inorganic crystals, covering material spaces of different compositions, structures, and both. For perovskites, elemental similarities are learned that reflect multiple aspects of atom properties. For elemental boron, structural motifs emerge automatically showing characteristic boron local environments. For inorganic crystals, the similarity and stability of local coordination environments are shown combining different center and neighbor atoms. The method could help transition to a data-centered exploration of materials space in automated materials design. Comment: 22 + 7 pages, 6 + 5 figures