Rule-based surrogate models are an effective and interpretable way to
approximate a Deep Neural Network's (DNN) decision boundaries, allowing humans
to easily understand deep learning models. Current state-of-the-art
decompositional methods, which consider the DNN's latent space to extract more
exact rule sets, derive rule sets with high accuracy.
However, they a) do not guarantee that the surrogate model has learned from the
same variables as the DNN (alignment), b) only allow optimisation for a single
objective, such as accuracy, which can result in excessively large rule sets
(complexity), and c) use decision tree algorithms as intermediate models, which
can produce different explanations for the same DNN (stability). This paper
introduces CGX (Column Generation eXplainer) to address these limitations: a
decompositional method that uses dual linear programming to extract rules from
the hidden representations of the DNN. This approach allows optimisation over
any number of objectives and empowers users to tailor the explanation model to
their needs. We evaluate our results on a wide variety of tasks and show that
CGX meets all three criteria: its exact reproducibility of the explanation
model guarantees stability, and it reduces the rule set size by >80%
(complexity) at equivalent or improved accuracy and fidelity across tasks
(alignment).

Comment: Accepted at ICLR 2023 Workshop on Trustworthy Machine Learning for
Healthcare