5 research outputs found
A weighted random survival forest
A weighted random survival forest is presented in the paper. It can be
regarded as a modification of the random forest improving its performance. The
main idea underlying the proposed model is to replace the standard procedure of
averaging used for estimation of the random survival forest hazard function by
weighted averaging, where the weights are assigned to every tree and can be
viewed as training parameters computed in an optimal way by solving a
standard quadratic optimization problem maximizing Harrell's C-index. Numerical
examples with real data illustrate that the proposed model outperforms the
original random survival forest.
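The weighted aggregation step can be sketched as follows. This is a minimal illustration assuming each tree yields a cumulative hazard function on a shared time grid; the weights shown are placeholders, whereas the paper computes them by solving a quadratic program maximizing Harrell's C-index.

```python
# Sketch of weighted aggregation of survival-tree hazard estimates.
# Assumes each tree t yields a cumulative hazard function H_t evaluated
# on a shared time grid; the weights here are illustrative placeholders.
import numpy as np

def weighted_chf(tree_chfs, weights):
    """Combine per-tree cumulative hazard functions using tree weights."""
    tree_chfs = np.asarray(tree_chfs, dtype=float)  # shape (n_trees, n_times)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()               # keep weights on the simplex
    return weights @ tree_chfs                      # weighted average over trees

# Three toy trees on a grid of four time points.
chfs = [[0.1, 0.3, 0.6, 1.0],
        [0.2, 0.4, 0.5, 0.9],
        [0.0, 0.2, 0.7, 1.1]]
uniform = weighted_chf(chfs, [1, 1, 1])        # standard RSF averaging
weighted = weighted_chf(chfs, [0.5, 0.3, 0.2]) # tree-specific weights
```

With uniform weights this reduces to the standard random survival forest average; the model's gain comes entirely from choosing the weights optimally.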
A robust algorithm for explaining unreliable machine learning survival models using the Kolmogorov-Smirnov bounds
A new robust algorithm based on the explanation method SurvLIME, called
SurvLIME-KS, is proposed for explaining machine learning survival models. The
algorithm is developed to ensure robustness in cases with a small amount of
training data or with outliers in the survival data. The first idea behind SurvLIME-KS
is to apply the Cox proportional hazards model to approximate the black-box
survival model at the local area around a test example due to the linear
relationship of covariates in the model. The second idea is to incorporate the
well-known Kolmogorov-Smirnov bounds for constructing sets of predicted
cumulative hazard functions. As a result, the robust maximin strategy is used,
which aims to minimize the average distance between cumulative hazard functions
of the explained black-box model and of the approximating Cox model, and to
maximize the distance over all cumulative hazard functions in the interval
produced by the Kolmogorov-Smirnov bounds. The maximin optimization problem is
reduced to a quadratic program. Various numerical experiments with synthetic
and real datasets demonstrate the efficiency of SurvLIME-KS.
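The Kolmogorov-Smirnov bounds mentioned above can be illustrated with the standard distribution-free band around an estimated survival function; this is a generic sketch of such a band, not the paper's optimization code, and the inputs `s_hat` and `n` are toy values.

```python
# Sketch of a Kolmogorov-Smirnov style confidence band around an
# estimated survival function; s_hat and n are illustrative inputs.
import math

def ks_band(s_hat, n, alpha=0.05):
    """Band of level 1 - alpha around a survival function estimated
    from n observations, clipped to the valid range [0, 1]."""
    eps = math.sqrt(math.log(2.0 / alpha) / (2.0 * n))
    lower = [max(0.0, s - eps) for s in s_hat]
    upper = [min(1.0, s + eps) for s in s_hat]
    return lower, upper

# Toy estimated survival function on a grid of time points.
s_hat = [1.0, 0.8, 0.55, 0.3, 0.1]
lower, upper = ks_band(s_hat, n=50)
```

The maximin strategy in the paper then optimizes over all cumulative hazard functions consistent with such a band, rather than trusting the point estimate alone.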
SurvLIME-Inf: A simplified modification of SurvLIME for explanation of machine learning survival models
A new modification of the explanation method SurvLIME called SurvLIME-Inf for
explaining machine learning survival models is proposed. The basic idea behind
SurvLIME as well as SurvLIME-Inf is to apply the Cox proportional hazards model
to approximate the black-box survival model at the local area around a test
example. The Cox model is used due to the linear relationship of covariates. In
contrast to SurvLIME, the proposed modification uses the L∞-norm for
defining distances between the approximating and approximated cumulative hazard
functions. This leads to a simple linear programming problem for determining
important features and for explaining the black-box model prediction. Moreover,
SurvLIME-Inf outperforms SurvLIME when the training set is very small.
Numerical experiments with synthetic and real datasets demonstrate the
SurvLIME-Inf efficiency.
Comment: arXiv admin note: substantial text overlap with arXiv:2003.08371,
arXiv:2005.0224
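The L∞ (Chebyshev) distance between two cumulative hazard functions on a common time grid, the quantity whose minimization SurvLIME-Inf reduces to a linear program, can be sketched as follows (a toy illustration, not the paper's code):

```python
# Sketch of the L-infinity distance between two cumulative hazard
# functions evaluated on the same time grid.
def linf_distance(chf_a, chf_b):
    """Chebyshev (L-infinity) distance: the largest pointwise gap."""
    return max(abs(a - b) for a, b in zip(chf_a, chf_b))

# In a linear program this maximum is handled with the epigraph trick:
# minimize t subject to -t <= chf_a[i] - chf_b[i] <= t for all i.
d = linf_distance([0.1, 0.4, 0.9], [0.2, 0.35, 1.1])
```

Because the maximum of affine functions is linear-programming friendly, this choice of norm is what makes the resulting optimization problem simpler than SurvLIME's quadratic one.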
SurvLIME: A method for explaining machine learning survival models
A new method called SurvLIME for explaining machine learning survival models
is proposed. It can be viewed as an extension or modification of the well-known
method LIME. The main idea behind the proposed method is to apply the Cox
proportional hazards model to approximate the survival model at the local area
around a test example. The Cox model is used because it considers a linear
combination of the example covariates such that coefficients of the covariates
can be regarded as quantitative impacts on the prediction. Another idea is to
approximate cumulative hazard functions of the explained model and the Cox
model by using a set of perturbed points in a local area around the point of
interest. The method is reduced to solving an unconstrained convex optimization
problem. A lot of numerical experiments demonstrate the SurvLIME efficiency
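Since the Cox model makes the logarithm of the cumulative hazard linear in the covariates, the local fit can be sketched as a least-squares problem over perturbed neighbours of the test example. Everything below (the coefficients, the perturbation scale, the synthetic black box) is illustrative; the paper's actual formulation is a convex program over the whole time grid.

```python
# Sketch of a local Cox-style surrogate fit: log H(t|x) = log H0(t) + b.x,
# so b can be recovered by least squares over perturbed neighbours.
# true_b and log_H0 stand in for an unknown black-box model.
import numpy as np

rng = np.random.default_rng(0)
x0 = np.array([1.0, -0.5, 2.0])                 # test example to explain
X = x0 + 0.1 * rng.standard_normal((100, 3))    # perturbed neighbours

true_b = np.array([0.8, -1.2, 0.3])             # hypothetical black box
log_H0 = -0.5
y = log_H0 + X @ true_b                         # black-box log-CHF at one time

# Fit the surrogate: solve least squares for the intercept and b.
A = np.hstack([np.ones((len(X), 1)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
```

The recovered coefficients `coef[1:]` play the role of feature importances: each one quantifies how a covariate shifts the predicted hazard near the point of interest.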
Counterfactual explanation of machine learning survival models
A method for counterfactual explanation of machine learning survival models
is proposed. One of the difficulties of solving the counterfactual explanation
problem is that the classes of examples are implicitly defined through outcomes
of a machine learning survival model in the form of survival functions. A
condition that establishes the difference between survival functions of the
original example and the counterfactual is introduced. This condition is based
on using a distance between mean times to event. It is shown that the
counterfactual explanation problem can be reduced to a standard convex
optimization problem with linear constraints when the explained black-box model
is the Cox model. For other black-box models, it is proposed to apply the
well-known Particle Swarm Optimization algorithm. Numerous numerical
experiments with real and synthetic data demonstrate the proposed method.
Comment: arXiv admin note: text overlap with arXiv:2005.0224
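The mean time to event underlying the counterfactual condition can be approximated from a survival curve via E[T] = ∫ S(t) dt, here with the trapezoidal rule; the curves and the margin below are toy values for illustration.

```python
# Sketch of the mean-time-to-event distance used in the counterfactual
# condition; the survival curves below are toy values.
def mean_time_to_event(times, surv):
    """Approximate E[T] = integral of S(t) dt by the trapezoidal rule."""
    total = 0.0
    for i in range(1, len(times)):
        total += 0.5 * (surv[i] + surv[i - 1]) * (times[i] - times[i - 1])
    return total

# The condition requires the mean times to event of the original example
# and the counterfactual to differ by at least some margin r.
t_orig = mean_time_to_event([0, 1, 2, 3], [1.0, 0.5, 0.25, 0.0])
t_cf = mean_time_to_event([0, 1, 2, 3], [1.0, 0.9, 0.7, 0.4])
gap = abs(t_cf - t_orig)
```

A counterfactual search then looks for the closest feasible example whose `gap` exceeds the required margin, by convex optimization for the Cox model or by Particle Swarm Optimization for other black boxes.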