SISSO: a compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates
The lack of reliable methods for identifying descriptors - the sets of
parameters capturing the underlying mechanisms of a materials property - is one
of the key factors hindering efficient materials development. Here, we propose
a systematic approach for discovering descriptors for materials properties,
within the framework of compressed-sensing based dimensionality reduction.
SISSO (sure independence screening and sparsifying operator) tackles immense
and correlated feature spaces, and converges to the optimal solution from a
combination of features relevant to the materials' property of interest. In
addition, SISSO also gives stable results with small training sets. The
methodology is benchmarked with the quantitative prediction of the ground-state
enthalpies of octet binary materials (using ab initio data) and applied to the
showcase example of predicting the metal/insulator classification of binaries
(with experimental data). Accurate, predictive models are found in both cases.
For the metal-insulator classification model, the predictive capability is
tested beyond the training data: it rediscovers the available pressure-induced
insulator->metal transitions and it allows for the prediction of yet unknown
transition candidates, ripe for experimental validation. As a step forward with
respect to previous model-identification methods, SISSO can become an effective
tool for automatic materials development.
Comment: 11 pages, 5 figures, in press in Phys. Rev. Materials
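The two stages named in the acronym can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the candidate feature matrix `X` is already built, that screening ranks features by absolute correlation with the current residual, and that the sparsifying operator is an exhaustive least-squares search over small feature tuples. The function names (`sis`, `least_squares`, `sisso_sketch`) and the parameters `max_dim` and `k` are hypothetical.

```python
import itertools
import numpy as np


def sis(X, target, k):
    """Sure independence screening: rank features by |correlation| with target."""
    Xc = X - X.mean(axis=0)
    tc = target - target.mean()
    norms = np.linalg.norm(Xc, axis=0)
    norms[norms == 0.0] = 1.0  # constant columns get score 0 instead of NaN
    scores = np.abs(Xc.T @ tc) / norms
    return [int(j) for j in np.argsort(scores)[::-1][:k]]


def least_squares(X, y, cols):
    """Fit y ~ X[:, cols] plus an intercept; return coefficients and residual."""
    A = np.column_stack([X[:, list(cols)], np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, y - A @ coef


def sisso_sketch(X, y, max_dim=2, k=5):
    """Return (feature indices, coefficients, squared error) of the best model."""
    pool = []            # union of all screened feature subspaces
    residual = y.copy()  # the first screening is against the property itself
    best = None
    for dim in range(1, max_dim + 1):
        # Screening step: add the k not-yet-selected features closest to the residual.
        ranked = sis(X, residual, k + len(pool))
        pool.extend([j for j in ranked if j not in pool][:k])
        # Sparsifying step (assumed): exhaustive search over all dim-tuples in the pool.
        best = None
        for cols in itertools.combinations(pool, dim):
            coef, res = least_squares(X, y, cols)
            err = float(res @ res)
            if best is None or err < best[2]:
                best = (cols, coef, err)
                residual = res  # residual of the best model seeds the next screening
    return best
```

Screening keeps the exhaustive search tractable: with `k` features added per dimension, the search at dimension `d` covers tuples drawn from at most `k * d` features rather than from the full candidate space.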
Learning physical descriptors for materials science by compressed sensing
The availability of big data in materials science offers new routes for
analyzing materials properties and functions and achieving scientific
understanding. Finding structure in these data that is not directly visible with
standard tools, and exploiting the scientific information, requires new and
dedicated methodology based on approaches from statistical learning, compressed
sensing, and other recent methods from applied mathematics, computer science,
statistics, signal processing, and information science. In this paper, we
explain and demonstrate a compressed-sensing based methodology for feature
selection, specifically for discovering physical descriptors, i.e., physical
parameters that describe the material and its properties of interest, and
associated equations that explicitly and quantitatively describe those relevant
properties. As a showcase application and proof of concept, we describe how to
build a physical model for the quantitative prediction of the crystal structure
of binary compound semiconductors.
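The abstract does not name a specific algorithm, so the following is only a sketch of one common compressed-sensing style approach to feature selection: an l1-penalized least-squares fit (LASSO), solved here by iterative soft thresholding. The choice of LASSO, the function names `lasso_ista` and `select_features`, and the penalty `lam` are assumptions made for illustration.

```python
import numpy as np


def lasso_ista(X, y, lam, n_iter=2000):
    """Minimize (1/2n)||y - Xb||^2 + lam * ||b||_1 by iterative soft thresholding."""
    n, p = X.shape
    # Step size 1/L, where L = sigma_max(X)^2 / n bounds the gradient's Lipschitz constant.
    step = n / np.linalg.norm(X, 2) ** 2
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        z = b - step * grad
        # Soft thresholding sets small coefficients exactly to zero.
        b = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return b


def select_features(X, y, lam):
    """Standardize the columns, run the LASSO, return indices with nonzero weight."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    b = lasso_ista(Xs, y - y.mean(), lam)
    return [int(j) for j in np.flatnonzero(b)]
```

Calling `select_features(X, y, lam)` on a matrix of candidate descriptors returns the column indices that survive the penalty. Larger values of `lam` give sparser selections, at the cost of shrinking the retained coefficients toward zero.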