Perovskite materials have become ubiquitous in many technologically relevant
applications, ranging from catalysts in solid oxide fuel cells to light
absorbing layers in solar photovoltaics. The thermodynamic phase stability is a
key parameter that broadly governs whether the material is expected to be
synthesizable, and whether it may degrade under certain operating conditions.
Phase stability can be calculated using Density Functional Theory (DFT), but
the significant computational cost makes such calculations potentially
prohibitive when screening large numbers of possible compounds. In this work,
we developed machine learning models to predict the thermodynamic phase
stability of perovskite oxides using a dataset of more than 1900 DFT-calculated
perovskite oxide energies. The phase stability was determined using convex hull
analysis, with the energy above the convex hull (Ehull) providing a direct
measure of the stability. We generated a set of 791 features based on elemental
property data to correlate with the Ehull value of each perovskite compound.
For classification, the extra trees algorithm achieved the best prediction
accuracy of 0.93 (+/- 0.02), with an F1 score of 0.88 (+/- 0.03). For
regression, leave-out 20% cross-validation tests with kernel ridge regression
achieved the lowest root mean square error (RMSE) of 28.5 (+/- 7.5) meV/atom
between cross-validation predicted Ehull values and DFT calculations, with the
mean absolute error (MAE) in cross-validation energies of 16.7 (+/- 2.3)
meV/atom. We further validated our model by predicting the stability of
compounds not present in the training set, and demonstrated that our machine
learning models are a fast and effective means of obtaining qualitatively
useful guidance on the stability of a wide range of perovskite oxides,
potentially impacting materials design choices in a variety of technological
applications.

Comment: 32 pages, 6 figures, 5 tables
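
The leave-out 20% cross-validation test with kernel ridge regression described above can be illustrated with a minimal sketch. It is not the code used in this work: it assumes scikit-learn, uses synthetic data in place of the elemental-property features and DFT Ehull values, and the function name `leave_out_20_cv`, the number of splits, and the kernel hyperparameters are hypothetical choices made only for illustration.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import ShuffleSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def leave_out_20_cv(X, y, n_splits=20, seed=0):
    """Return per-split RMSE and MAE over repeated random 80/20 splits."""
    splitter = ShuffleSplit(n_splits=n_splits, test_size=0.2, random_state=seed)
    rmse, mae = [], []
    for train_idx, test_idx in splitter.split(X):
        # Hyperparameters are illustrative assumptions, not tuned values.
        model = make_pipeline(
            StandardScaler(),
            KernelRidge(kernel="rbf", alpha=1e-2, gamma=0.1),
        )
        model.fit(X[train_idx], y[train_idx])
        err = model.predict(X[test_idx]) - y[test_idx]
        rmse.append(np.sqrt(np.mean(err ** 2)))
        mae.append(np.mean(np.abs(err)))
    return np.array(rmse), np.array(mae)


# Synthetic stand-in data: 200 "compounds" with 5 features each, and a
# made-up nonnegative target playing the role of Ehull in meV/atom.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 50.0 * np.abs(X[:, 0]) + 10.0 * X[:, 1] ** 2 + rng.normal(scale=2.0, size=200)

rmse, mae = leave_out_20_cv(X, y)
print(f"RMSE: {rmse.mean():.1f} (+/- {rmse.std():.1f}) meV/atom")
print(f"MAE:  {mae.mean():.1f} (+/- {mae.std():.1f}) meV/atom")
```

Reporting the mean and standard deviation over the random splits mirrors the "value (+/- spread)" form of the errors quoted above; the numbers printed here come from the synthetic data and carry no physical meaning.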