Constraint Reasoning and Kernel Clustering for Pattern Decomposition with Scaling
Abstract. Motivated by an important and challenging task encountered in material discovery, we consider the problem of finding K basis patterns of numbers that jointly compose N observed patterns while enforcing additional spatial and scaling constraints. We propose a Constraint Programming (CP) model which captures the exact problem structure yet fails to scale in the presence of noisy data about the patterns. We alleviate this issue by employing Machine Learning (ML) techniques, namely kernel methods and clustering, to decompose the problem into smaller ones based on a global data-driven view, and then stitch the partial solutions together using a global CP model. Combining the complementary strengths of CP and ML techniques yields a more accurate and scalable method than the few found in the literature for this complex problem.
Perspective: Composition–structure–property mapping in high-throughput experiments: Turning data into knowledge
With their ability to rapidly elucidate composition-structure-property relationships, high-throughput experimental studies have revolutionized how materials are discovered, optimized, and commercialized. It is now possible to synthesize and characterize high-throughput libraries that systematically address thousands of individual cuts of fabrication parameter space. Transforming structural characterization data into phase mappings remains an unresolved issue. This difficulty is related to the complex information present in diffraction and spectroscopic data and its variation with composition and processing. We review the field of automated phase diagram attribution and discuss the impact that emerging computational approaches will have in the generation of phase diagrams and beyond.
Fast and interpretable classification of small X-ray diffraction datasets using data augmentation and deep neural networks
X-ray diffraction (XRD) data acquisition and analysis is among the most
time-consuming steps in the development cycle of novel thin-film materials. We
propose a machine-learning-enabled approach to predict crystallographic
dimensionality and space group from a limited number of thin-film XRD patterns.
We overcome the scarce-data problem intrinsic to novel materials development by
coupling a supervised machine learning approach with a model-agnostic,
physics-informed data augmentation strategy using simulated data from the
Inorganic Crystal Structure Database (ICSD) and experimental data. As a test
case, 115 thin-film metal halides spanning 3 dimensionalities and 7
space-groups are synthesized and classified. After testing various algorithms,
we develop and implement an all-convolutional neural network, with
cross-validated accuracies for dimensionality and space-group classification of 93%
and 89%, respectively. We propose average class activation maps, computed from
a global average pooling layer, to allow high model interpretability by human
experimentalists, elucidating the root causes of misclassification. Finally, we
systematically evaluate the maximum XRD pattern step size (data acquisition
rate) before loss of predictive accuracy occurs, and determine it to be
0.16°, which enables an XRD pattern to be obtained and classified in 5.5
minutes or less.
Comment: Accepted with minor revisions in npj Computational Materials.
Presented at the NIPS 2018 Workshop: Machine Learning for Molecules and Material
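The class activation maps mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard construction in which a global average pooling layer feeds a single dense layer, so that the map for one class is the weighted sum of the last convolutional feature maps. Reading "average" as a mean over several patterns of the same class is an assumption, and the function names, array shapes, and random inputs are hypothetical stand-ins for a trained network.

```python
import numpy as np


def class_activation_map(features, class_weights):
    """Weighted sum of feature maps for one class.

    features:      (n_channels, n_positions) output of the last conv layer
    class_weights: (n_channels,) dense-layer weights for the class of interest
    returns:       (n_positions,) activation along the 2-theta axis
    """
    return class_weights @ features


def average_class_activation_map(feature_batch, class_weights):
    """Mean of the per-pattern maps over a batch of same-class patterns."""
    maps = np.stack([class_activation_map(f, class_weights) for f in feature_batch])
    return maps.mean(axis=0)


# Hypothetical inputs: 3 patterns, 4 feature channels, 50 positions.
rng = np.random.default_rng(0)
feature_batch = rng.random((3, 4, 50))
class_weights = rng.normal(size=4)

cam = class_activation_map(feature_batch[0], class_weights)
avg_cam = average_class_activation_map(feature_batch, class_weights)

# With global average pooling, the class score (ignoring the bias term)
# equals the mean of the map over positions.
logit = class_weights @ feature_batch[0].mean(axis=1)
```

Because the class score is the mean of the map, positions with large values are the ones that push a pattern toward that class, which is what makes the map readable against known peak positions.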
Automated Phase Mapping with AgileFD and its Application to Light Absorber Discovery in the V-Mn-Nb Oxide System
Rapid construction of phase diagrams is a central tenet of combinatorial materials science, yet accelerated materials discovery efforts are often hampered by challenges in interpreting combinatorial x-ray diffraction datasets. We address this by developing AgileFD, an artificial intelligence algorithm that enables rapid phase mapping from a combinatorial library of x-ray diffraction patterns. AgileFD models alloying-based peak shifting through a novel expansion of convolutional nonnegative matrix factorization, which not only improves the identification of constituent phases but also maps their concentration and lattice parameter as a function of composition. By incorporating Gibbs’ phase rule into the algorithm, physically meaningful phase maps are obtained with unsupervised operation, and more refined solutions are attained by injecting expert knowledge of the system. The algorithm is demonstrated through investigation of the V-Mn-Nb oxide system where decomposition of eight oxide phases, including two with substantial alloying, provides the first phase map for this pseudo-ternary system. This phase map enables interpretation of high-throughput band gap data, leading to the discovery of new solar light absorbers and the alloying-based tuning of the direct-allowed band-gap energy of MnV2O6. The open-source family of AgileFD algorithms can be implemented into a broad range of high-throughput workflows to accelerate materials discovery.
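As a rough illustration of the factorization idea behind this abstract, the sketch below applies plain nonnegative matrix factorization with multiplicative updates to synthetic diffraction-like patterns. It is not AgileFD: it omits the convolutional peak-shift modeling and the Gibbs' phase rule constraint. The function name `nmf`, the synthetic peaks, and all parameter values are assumptions made for the example.

```python
import numpy as np


def nmf(X, k, n_iter=1000, seed=0, eps=1e-9):
    """Factor a nonnegative matrix X (n_angles x n_samples) as W @ H.

    W (n_angles x k) holds candidate basis patterns and H (k x n_samples)
    holds their nonnegative weights in each sample. Uses multiplicative
    updates for the Frobenius-norm objective.
    """
    rng = np.random.default_rng(seed)
    n_angles, n_samples = X.shape
    W = rng.random((n_angles, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
    return W, H


# Synthetic data (assumed): two "phases", each a sum of Gaussian peaks.
two_theta = np.linspace(10.0, 60.0, 200)


def peak(center, width=0.5):
    return np.exp(-0.5 * ((two_theta - center) / width) ** 2)


basis = np.stack([peak(20) + 0.5 * peak(35), peak(28) + 0.7 * peak(50)], axis=1)
weights = np.array([[1.0, 0.7, 0.3, 0.0],
                    [0.0, 0.3, 0.7, 1.0]])
X = basis @ weights  # four mixed patterns, one per column

W, H = nmf(X, k=2)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

In this toy setting, the columns of `W` play the role of constituent phase patterns and the columns of `H` give per-sample phase amounts. Handling peak shifts from alloying would require the convolutional extension the abstract describes.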