10,006 research outputs found
Band gap prediction for large organic crystal structures with machine learning
Machine-learning models are capable of capturing the structure-property
relationship from a dataset of computationally demanding ab initio
calculations. Over the past two years, the Organic Materials Database (OMDB)
has hosted a growing number of calculated electronic properties of previously
synthesized organic crystal structures. The complexity of the organic crystals
contained within the OMDB, which have on average 82 atoms per unit cell, makes
this database a challenging platform for machine learning applications. In this
paper, the focus is on predicting the band gap, which represents one of the
basic properties of a crystalline material. With this aim, a consistent
dataset of 12 500 crystal structures and their corresponding DFT band gaps is
released, freely available for download at https://omdb.mathub.io/dataset. An
ensemble of two state-of-the-art models reaches a mean absolute error (MAE) of
0.388 eV, which corresponds to a percentage error of 13% for an average band
gap of 3.05 eV. Finally, the trained models are employed to predict the band
gap for 260 092 materials contained within the Crystallography Open Database
(COD) and made available online so that the predictions can be obtained for any
arbitrary crystal structure uploaded by a user.
Comment: 10 pages, 6 figures
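The 13% figure quoted in this abstract follows from dividing the MAE by the average band gap. A minimal Python sketch of that arithmetic is given here; the function names are hypothetical and are not taken from the paper or its released code.

```python
def mean_absolute_error(predicted, reference):
    """Average of |prediction - reference| over paired values."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def relative_error_percent(mae, mean_value):
    """Express an absolute error as a percentage of a mean value."""
    return 100.0 * mae / mean_value


# Values quoted in the abstract: MAE = 0.388 eV, average band gap = 3.05 eV.
print(round(relative_error_percent(0.388, 3.05), 1))  # 12.7, reported as 13%
```

The abstract rounds the result to the nearest whole percent.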
Machine Learning, Quantum Mechanics, and Chemical Compound Space
We review recent studies dealing with the generation of machine learning
models of molecular and solid properties. The models are trained and validated
using standard quantum chemistry results obtained for organic molecules and
materials selected from chemical space at random.
Representations of molecules and materials for interpolation of quantum-mechanical simulations via machine learning
Computational study of molecules and materials from first principles is a cornerstone of physics, chemistry and materials science, but is limited by the cost of accurate and precise simulations. In settings involving many simulations, machine learning can reduce these costs, sometimes by orders of magnitude, by interpolating between reference simulations. This requires representations that describe any molecule or material and support interpolation. We review, discuss and benchmark state-of-the-art representations and relations between them, including smooth overlap of atomic positions, many-body tensor representation, and symmetry functions. For this, we use a unified mathematical framework based on many-body functions, group averaging and tensor products, and compare energy predictions for organic molecules, binary alloys and Al-Ga-In sesquioxides in numerical experiments controlled for data distribution, regression method and hyper-parameter optimization.
Study of Acid Suppressed Thickener Technology Using Density Functional Theory and Machine Learning Techniques
Hydrophobically modified ethylene oxide urethane (HEUR) rheology modifiers, which are water-based polyurethane formulations manufactured by Dow Coating Materials, a division of the Dow Chemical Company, are often added to interior and exterior water-based latex paint formulations to control their viscosity. The thickening efficiency of the HEUR rheology modifier is controlled by the pH of the solvent, as this affects the protonation-deprotonation equilibrium of the amine hydrophobe group at the end of the rheology modifier polymer chain. The principal quantity characterizing this equilibrium is the acid dissociation constant (pKa) of the hydrophobe group, which identifies the transition between high and low viscosity of the suspension. To gain a better understanding of the functioning of the hydrophobe molecular groups, and to develop novel hydrophobes that meet specific performance characteristics, it is important to accurately predict the pKa based on first principles calculations, and use it as a first evaluation criterion for a rapid screening of candidate hydrophobe molecules.
A main source of error in the pKa calculation is the value of the solvation free energy of the molecule in its charged state. We therefore develop new methods to increase the accuracy of the solvation free energy calculation for charged species without excessively increasing the computational expense. This includes a hybrid cluster-continuum model approach, where explicit solvent molecules are added to the traditionally employed continuum solvation model, and a molecular dynamics (MD) sampling procedure that eliminates the costly energy minimization step. Using test molecules for pKa calculations, we systematically examine the convergence behavior in terms of the number of explicit water molecules that need to be included in the cluster-continuum model, the influence of the dielectric constant attributed to the continuum, and the placement of a counter ion for charge neutrality for the accurate calculation of the solvation free energy. We establish that the MD sampling method yields results comparable to the energy minimization procedure during density functional theory (DFT) calculations, but at 100 times the speed. When calculating the solvation free energy and the pKa of a known hydrophobe, ethoxylated bis(2-ethylhexyl)amine, we find that including explicit water molecules and a fragment of the latex polymer in its local environment both significantly improve the results.
Finally, we develop an informatics-based approach that employs a transferable machine learning (ML) model, trained and validated on a limited amount of experimental data, to predict the solvation free energies of new ionic species at a reasonable computational cost. We compare three ML methods (linear ridge regression, support vector regression and random forest regression) and find that the model trained by the random forest regression method yields the predictions with the lowest mean absolute error. A feature selection analysis shows that the atomic fraction feature, which reflects the chemical constitution of the hydrophobe, plays the most important role in the solvation free energy prediction. Adding the Wiener index, a measure of the molecular topology, and the solvent accessible surface area of the molecules further improves the performance of the model. Accordingly, our ML model predicts the solvation energies of ionic species, including our test hydrophobe molecule, with accuracy similar to that of atomistic modeling using first-principles calculations.
PhD, Materials Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/145967/1/wuwenkun_1.pd
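The Wiener index named in this abstract is, by its standard definition, the sum of shortest-path distances over all pairs of atoms in the molecular graph. A minimal Python sketch of that definition is given here. It assumes a hydrogen-suppressed graph stored as an adjacency dictionary; the function name and the two example molecules are illustrative choices, not details taken from the thesis.

```python
from collections import deque


def wiener_index(adjacency):
    """Sum of shortest-path lengths over all unordered pairs of vertices.

    `adjacency` maps each atom label to a list of bonded neighbours.
    The graph is assumed to be connected and undirected.
    """
    total = 0
    for source in adjacency:
        # Breadth-first search gives shortest path lengths in an unweighted graph.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        total += sum(distance.values())
    # Every pair was counted twice, once from each end.
    return total // 2


# Carbon skeletons only (hydrogens suppressed); illustrative molecules.
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

print(wiener_index(n_butane))   # 10
print(wiener_index(isobutane))  # 9
```

The branched isomer has the smaller index, which is the sense in which the descriptor encodes molecular topology.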
Hierarchical Visualization of Materials Space with Graph Convolutional Neural Networks
The combination of high throughput computation and machine learning has led
to a new paradigm in materials design by allowing for the direct screening of
vast portions of structural, chemical, and property space. The use of these
powerful techniques leads to the generation of enormous amounts of data, which
in turn calls for new techniques to efficiently explore and visualize the
materials space to help identify underlying patterns. In this work, we develop
a unified framework to hierarchically visualize the compositional and
structural similarities between materials in an arbitrary material space with
representations learned from different layers of graph convolutional neural
networks. We demonstrate the potential for such a visualization approach by
showing that patterns emerge automatically that reflect similarities at
different scales in three representative classes of materials: perovskites,
elemental boron, and general inorganic crystals, covering material spaces of
different compositions, structures, and both. For perovskites, elemental
similarities are learned that reflect multiple aspects of atom properties. For
elemental boron, structural motifs emerge automatically showing characteristic
boron local environments. For inorganic crystals, the similarity and stability
of local coordination environments are shown combining different center and
neighbor atoms. The method could help transition to a data-centered exploration
of materials space in automated materials design.
Comment: 22 + 7 pages, 6 + 5 figures