2 research outputs found

    A robust algorithm for explaining unreliable machine learning survival models using the Kolmogorov-Smirnov bounds

    A new robust algorithm based on the explanation method SurvLIME, called SurvLIME-KS, is proposed for explaining machine learning survival models. The algorithm is designed to remain robust when the training data are scarce or the survival data contain outliers. The first idea behind SurvLIME-KS is to apply the Cox proportional hazards model to approximate the black-box survival model in a local area around a test example, because the covariates enter the Cox model linearly. The second idea is to incorporate the well-known Kolmogorov-Smirnov bounds for constructing sets of predicted cumulative hazard functions. As a result, a robust maximin strategy is used, which aims to minimize the average distance between the cumulative hazard functions of the explained black-box model and of the approximating Cox model, and to maximize the distance over all cumulative hazard functions in the interval produced by the Kolmogorov-Smirnov bounds. The maximin optimization problem is reduced to a quadratic program. Various numerical experiments with synthetic and real datasets demonstrate the efficiency of SurvLIME-KS.
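To make the idea of Kolmogorov-Smirnov bounds on cumulative hazard functions concrete, the following is a minimal sketch and not the SurvLIME-KS implementation. It assumes uncensored event times, uses the Dvoretzky-Kiefer-Wolfowitz approximation of the band half-width in place of exact Kolmogorov-Smirnov critical values, and converts the band on the empirical distribution function F into a band on the cumulative hazard via H(t) = -ln(1 - F(t)). The function names `ks_band` and `cumulative_hazard_band` and the parameter `alpha` are hypothetical and chosen only for illustration.

```python
import numpy as np


def ks_band(samples, alpha=0.05):
    """Band around the empirical CDF of uncensored event times.

    Assumption: the half-width uses the DKW approximation
    sqrt(ln(2 / alpha) / (2 n)) rather than exact KS critical values.
    """
    times = np.sort(np.asarray(samples, dtype=float))
    n = times.size
    ecdf = np.arange(1, n + 1) / n
    half_width = np.sqrt(np.log(2.0 / alpha) / (2.0 * n))
    lower = np.clip(ecdf - half_width, 0.0, 1.0)
    upper = np.clip(ecdf + half_width, 0.0, 1.0)
    return times, ecdf, lower, upper


def cumulative_hazard_band(cdf_lower, cdf_upper, tiny=1e-12):
    """Map CDF bounds to cumulative hazard bounds, H = -ln(1 - F).

    The survival function is clipped at `tiny` so the logarithm stays finite
    where the upper CDF bound reaches 1.
    """
    hazard_lower = -np.log(np.clip(1.0 - cdf_lower, tiny, 1.0))
    hazard_upper = -np.log(np.clip(1.0 - cdf_upper, tiny, 1.0))
    return hazard_lower, hazard_upper


# Illustrative usage with synthetic exponential event times.
rng = np.random.default_rng(0)
event_times = rng.exponential(scale=2.0, size=100)
times, ecdf, cdf_lo, cdf_hi = ks_band(event_times, alpha=0.05)
haz_lo, haz_hi = cumulative_hazard_band(cdf_lo, cdf_hi)
```

The pair `haz_lo`, `haz_hi` delimits a set of candidate cumulative hazard functions. A maximin explanation method of the kind described above would look for the worst-case function inside such a set; that optimization step is not shown here.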

    An Imprecise SHAP as a Tool for Explaining the Class Probability Distributions under Limited Training Data

    One of the most popular methods for explaining machine learning predictions is the SHapley Additive exPlanations (SHAP) method. An imprecise SHAP is proposed as a modification of the original SHAP for cases when the class probability distributions are imprecise and represented by sets of distributions. The first idea behind the imprecise SHAP is a new approach for computing the marginal contribution of a feature, which fulfils the important efficiency property of Shapley values. The second idea is an attempt to consider a general approach to calculating and reducing interval-valued Shapley values, which is similar to the idea of reachable probability intervals in imprecise probability theory. A simple special implementation of the general approach in the form of linear optimization problems is proposed, which is based on the Kolmogorov-Smirnov distance and imprecise contamination models. Numerical examples with synthetic and real data illustrate the imprecise SHAP.
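The efficiency property mentioned above says that the Shapley values of all features sum to the value of the full feature set minus the value of the empty set. The sketch below illustrates this with the classical, precise Shapley formula on a toy value function; it is not the imprecise SHAP of the abstract. The names `shapley_values` and `toy_value`, and the weights and interaction bonus, are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_features):
    """Exact Shapley values by enumerating all feature subsets.

    `value` maps a frozenset of feature indices to a number.
    The cost grows exponentially, so this suits only small n_features.
    """
    features = list(range(n_features))
    phi = [0.0] * n_features
    for i in features:
        others = [j for j in features if j != i]
        for size in range(len(others) + 1):
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (value(coalition | {i}) - value(coalition))
    return phi


def toy_value(coalition):
    """Hypothetical value function: additive weights plus one interaction."""
    weights = [1.0, 2.0, 3.0]
    total = sum(weights[j] for j in coalition)
    if {0, 1} <= coalition:
        total += 4.0  # bonus when features 0 and 1 are both present
    return total


phi = shapley_values(toy_value, 3)
full = toy_value(frozenset({0, 1, 2}))
empty = toy_value(frozenset())
```

Here the interaction bonus is split equally between features 0 and 1, and the values in `phi` sum to `full - empty`, which is the efficiency property.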