A machine-learning model is developed that can accurately predict
the band gap of inorganic solids based only on composition. This method
uses support vector classification to first separate metals from nonmetals,
followed by quantitatively predicting the band gap of the nonmetals
using support vector regression. The high accuracy of the regression
model is obtained by using a training set composed entirely of experimentally
measured band gaps and utilizing only compositional descriptors. In
fact, because of the unique training set of experimental data, the
machine-learning-predicted band gaps are significantly closer to the
experimentally reported values than DFT (PBE-level) calculated band
gaps. The resulting tool not only predicts the band gap of any composition
accurately; its versatility and speed, which follow from relying on
composition alone, also make it a valuable resource for screening
inorganic phase space and directing the development of functional
inorganic materials.
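
The two-stage workflow described in this abstract (classify metals versus nonmetals, then regress the band gap of the nonmetals only) can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy, and it is not the authors' implementation. The function names `fit_two_stage` and `predict_band_gap`, the RBF kernels, the feature scaling, and the convention that metals carry a gap of 0 eV are all assumptions made for illustration. The descriptors and gaps are random placeholders, not experimental data.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, SVR


def fit_two_stage(X, band_gaps):
    """Fit a metal/nonmetal classifier, then a band-gap regressor on nonmetals."""
    X = np.asarray(X, dtype=float)
    band_gaps = np.asarray(band_gaps, dtype=float)
    # Assumption: metals are recorded with a band gap of exactly 0 eV.
    is_nonmetal = band_gaps > 0.0

    # Stage 1: support vector classification on all compounds.
    classifier = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    classifier.fit(X, is_nonmetal)

    # Stage 2: support vector regression trained on nonmetals only.
    regressor = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
    regressor.fit(X[is_nonmetal], band_gaps[is_nonmetal])
    return classifier, regressor


def predict_band_gap(classifier, regressor, X):
    """Return 0 eV for predicted metals and the regressed gap for nonmetals."""
    X = np.asarray(X, dtype=float)
    nonmetal = classifier.predict(X).astype(bool)
    gaps = np.zeros(len(X))
    if nonmetal.any():
        gaps[nonmetal] = regressor.predict(X[nonmetal])
    return gaps


# Synthetic placeholder data: 200 "compounds" with 4 hypothetical
# compositional descriptors each. These values have no physical meaning.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4))
gaps_demo = np.where(X_demo[:, 0] > 0, 1.0 + np.abs(X_demo[:, 1]), 0.0)

clf_demo, reg_demo = fit_two_stage(X_demo, gaps_demo)
predicted = predict_band_gap(clf_demo, reg_demo, X_demo)
print(predicted[:5])
```

Training the regressor only on nonmetals keeps the many zero-gap metals from pulling its predictions toward zero. That is one plausible reading of why the classification step comes first.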