Interpretable and explainable machine learning has seen a recent surge of
interest. We focus on safety as a key motivation behind the surge and make the
relationship between interpretability and safety more quantitative. Toward
assessing safety, we introduce the concept of maximum deviation via an
optimization problem to find the largest deviation of a supervised learning
model from a reference model regarded as safe. We then show how
interpretability facilitates this safety assessment. For models including
decision trees, generalized linear models, and generalized additive models, the maximum deviation
can be computed exactly and efficiently. For tree ensembles, which are not
regarded as interpretable, discrete optimization techniques can still provide
informative bounds. For a broader class of piecewise Lipschitz functions, we
leverage the multi-armed bandit literature to show that interpretability
produces tighter (regret) bounds on the maximum deviation. We present case
studies, including one on mortgage approval, to illustrate our methods and the
insights about models that may be obtained from deviation maximization.
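
As a minimal sketch in our own notation (the symbols $f$, $f_0$, $\mathcal{X}$ and the absolute-difference discrepancy are assumptions for illustration, not necessarily the paper's exact formulation), the maximum deviation of a supervised model $f$ from a reference model $f_0$ regarded as safe, over an input domain $\mathcal{X}$, can be posed as the optimization problem
$$
d_{\max}(f, f_0) \;=\; \max_{x \in \mathcal{X}} \; \bigl\lvert f(x) - f_0(x) \bigr\rvert,
$$
where the absolute difference may be replaced by any task-appropriate discrepancy. Interpretable structure in $f$, such as a tree partition or an additive form, is what allows this maximization to be solved exactly and efficiently; for less interpretable models such as tree ensembles, one instead settles for informative bounds.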