Supersparse Linear Integer Models for Optimized Medical Scoring Systems
Scoring systems are linear classification models that only require users to
add, subtract and multiply a few small numbers in order to make a prediction.
These models are in widespread use by the medical community, but are difficult
to learn from data because they need to be accurate and sparse, have coprime
integer coefficients, and satisfy multiple operational constraints. We present
a new method for creating data-driven scoring systems called a Supersparse
Linear Integer Model (SLIM). SLIM scoring systems are built by solving an
integer program that directly encodes measures of accuracy (the 0-1 loss) and
sparsity (the $\ell_0$-seminorm) while restricting coefficients to coprime
integers. SLIM can seamlessly incorporate a wide range of operational
constraints related to accuracy and sparsity, and can produce highly tailored
models without parameter tuning. We provide bounds on the testing and training
accuracy of SLIM scoring systems, and present a new data reduction technique
that can improve scalability by eliminating a portion of the training data
beforehand. Our paper includes results from a collaboration with the
Massachusetts General Hospital Sleep Laboratory, where SLIM was used to create
a highly tailored scoring system for sleep apnea screening.
Comment: This version reflects our findings on SLIM as of January 2016
(arXiv:1306.5860 and arXiv:1405.4047 are out-of-date). The final published
version of this article is available at http://www.springerlink.co
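As a rough illustration of the objective described in the abstract above (0-1 loss plus a penalty on the number of non-zero coefficients, over small integer coefficients), the following toy sketch enumerates every candidate model by brute force. It is not the authors' integer program and does not use an IP solver; the function names, the coefficient ranges, the penalty weight `c0`, and the omission of the coprimality constraint are all assumptions made for this example.

```python
from itertools import product


def zero_one_loss(weights, intercept, X, y):
    """Fraction of examples misclassified; labels are +1 or -1."""
    errors = 0
    for features, label in zip(X, y):
        score = intercept + sum(w * f for w, f in zip(weights, features))
        predicted = 1 if score > 0 else -1
        errors += predicted != label
    return errors / len(y)


def fit_toy_scoring_system(X, y, coef_range=range(-2, 3),
                           intercept_range=range(-3, 4), c0=0.01):
    """Exhaustively search small integer models (hypothetical helper).

    Minimizes: 0-1 loss + c0 * (number of non-zero coefficients).
    Coprimality of the coefficients is NOT enforced in this sketch.
    Returns (objective, weights, intercept) for the first best model found.
    """
    best = None
    n_features = len(X[0])
    for intercept in intercept_range:
        for weights in product(coef_range, repeat=n_features):
            l0 = sum(w != 0 for w in weights)  # sparsity term
            objective = zero_one_loss(weights, intercept, X, y) + c0 * l0
            if best is None or objective < best[0]:
                best = (objective, weights, intercept)
    return best
```

For example, with three binary features where the label is +1 only when the first two are both 1, calling `fit_toy_scoring_system(X, y)` should return a model that uses exactly those two features and ignores the third. Exhaustive enumeration grows exponentially with the number of features, which is why a practical method would rely on an integer programming solver instead.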
Learning Optimal Fair Scoring Systems for Multi-Class Classification
Machine Learning models are increasingly used for decision making, in
particular in high-stakes applications such as credit scoring, medicine or
recidivism prediction. However, there are growing concerns about these models
with respect to their lack of interpretability and the undesirable biases they
can generate or reproduce. While the concepts of interpretability and fairness
have been extensively studied by the scientific community in recent years, few
works have tackled the general multi-class classification problem under
fairness constraints, and none of them proposes to generate fair and
interpretable models for multi-class classification. In this paper, we use
Mixed-Integer Linear Programming (MILP) techniques to produce inherently
interpretable scoring systems under sparsity and fairness constraints, for the
general multi-class classification setup. Our work generalizes the SLIM
(Supersparse Linear Integer Models) framework that was proposed by Rudin and
Ustun to learn optimal scoring systems for binary classification. The use of
MILP techniques allows for an easy integration of diverse operational
constraints (such as, but not restricted to, fairness or sparsity), but also
for the building of certifiably optimal models (or sub-optimal models with
bounded optimality gap).
Interpretability and Explainability: A Machine Learning Zoo Mini-tour
In this review, we examine the problem of designing interpretable and
explainable machine learning models. Interpretability and explainability lie at
the core of many machine learning and statistical applications in medicine,
economics, law, and natural sciences. Although interpretability and
explainability have escaped a clear universal definition, many techniques
motivated by these properties have been developed over the past 30 years, with
the focus currently shifting towards deep learning methods. In this review, we
emphasise the divide between interpretability and explainability and illustrate
these two different research directions with concrete examples of the
state-of-the-art. The review is intended for a general machine learning
audience with interest in exploring the problems of interpretation and
explanation beyond logistic regression or random forest variable importance.
This work is not an exhaustive literature survey, but rather a primer focusing
selectively on certain lines of research which the authors found interesting or
informative.