365 research outputs found
Nested conformal prediction and quantile out-of-bag ensemble methods
Conformal prediction is a popular tool for providing valid prediction sets
for classification and regression problems, without relying on any
distributional assumptions on the data. While the traditional description of
conformal prediction starts with a nonconformity score, we provide an alternative
(but equivalent) view that starts with a sequence of nested sets and calibrates
them to find a valid prediction set. The nested framework subsumes all
nonconformity scores, including recent proposals based on quantile regression
and density estimation. While these ideas were originally derived based on
sample splitting, our framework seamlessly extends them to other aggregation
schemes like cross-conformal, jackknife+ and out-of-bag methods. We use the
framework to derive a new algorithm (QOOB, pronounced cube) that combines four
ideas: quantile regression, cross-conformalization, ensemble methods and
out-of-bag predictions. We develop a computationally efficient implementation
of cross-conformal, which is also used by QOOB. In a detailed numerical
investigation, QOOB performs either the best or close to the best on all
simulated and real datasets.
Comment: 38 pages, 5 figures, 8 tables
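To make the nested view concrete, here is a minimal sketch of its split-conformal special case with a CQR-style score: a family of intervals [q_lo(x) - t, q_hi(x) + t] is calibrated by picking the smallest t that covers enough calibration points. The model choice and function name are illustrative assumptions, not the paper's QOOB implementation.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def cqr_split_conformal(X_train, y_train, X_cal, y_cal, X_test, alpha=0.1):
    # Fit lower and upper quantile regressors on the proper training set.
    lo = GradientBoostingRegressor(loss="quantile", alpha=alpha / 2).fit(X_train, y_train)
    hi = GradientBoostingRegressor(loss="quantile", alpha=1 - alpha / 2).fit(X_train, y_train)
    # Nonconformity score = smallest t such that the nested set
    # [q_lo(x) - t, q_hi(x) + t] contains y.
    scores = np.maximum(lo.predict(X_cal) - y_cal, y_cal - hi.predict(X_cal))
    n = len(scores)
    # Finite-sample conformal quantile (capped at 1 for tiny calibration sets).
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    t = np.quantile(scores, level, method="higher")
    return lo.predict(X_test) - t, hi.predict(X_test) + t

Roughly, replacing the single calibration split with out-of-bag scores from an ensemble is the step from this sketch toward QOOB.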
Calibrated Explanations for Regression
Artificial Intelligence (AI) is often an integral part of modern decision
support systems (DSSs). The best-performing predictive models used in AI-based
DSSs lack transparency. Explainable Artificial Intelligence (XAI) aims to
create AI systems that can explain their rationale to human users. Local
explanations in XAI can provide information about the causes of individual
predictions in terms of feature importance. However, a critical drawback of
existing local explanation methods is their inability to quantify the
uncertainty associated with a feature's importance. This paper introduces an
extension of a feature importance explanation method, Calibrated Explanations
(CE), previously only supporting classification, with support for standard
regression and probabilistic regression, i.e., the probability that the target
is above an arbitrary threshold. The extension for regression retains all the
benefits of CE, such as calibration of the underlying model's predictions with
confidence intervals and uncertainty quantification of feature importance, and
it allows both factual and counterfactual explanations. CE for standard
regression provides fast, reliable, stable, and robust explanations. CE for
probabilistic regression provides an entirely new way of creating probabilistic
explanations from any ordinary regression model and with a dynamic selection of
thresholds. In terms of stability and speed, the performance of CE for
probabilistic regression is comparable to LIME. The method is model-agnostic,
with easily understood conditional rules. An implementation in Python is freely
available on GitHub and for installation using pip, making the results in this
paper easily replicable.
Comment: 30 pages, 11 figures (replaced due to omitted author, which is the only change made)
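To illustrate the idea behind probabilistic regression explanations (not the API of the authors' package), the following sketch turns any point-prediction regressor into an estimate of P(y > threshold | x) by placing calibration residuals around the new prediction, in the spirit of conformal predictive systems; all names are hypothetical.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

def prob_above_threshold(model, X_cal, y_cal, x, threshold):
    # Residuals on a held-out calibration set.
    residuals = y_cal - model.predict(X_cal)
    # Candidate outcomes for x: the prediction shifted by each residual.
    candidates = model.predict(x.reshape(1, -1))[0] + residuals
    # The fraction of candidates above the threshold approximates
    # the calibrated probability that the target exceeds it.
    return float(np.mean(candidates > threshold))

# Hypothetical usage: model = RandomForestRegressor().fit(X_train, y_train),
# then prob_above_threshold(model, X_cal, y_cal, x_new, 50.0).

Roughly speaking, CE's feature-importance rules report how calibrated estimates of this kind change when a feature is varied.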
Conformal Rule-Based Multi-label Classification
We advocate the use of conformal prediction (CP) to enhance rule-based
multi-label classification (MLC). In particular, we highlight the mutual
benefit of CP and rule learning: rules naturally provide the (non-)conformity
scores that CP requires, while CP suggests a way to
calibrate the assessment of candidate rules, thereby supporting better
predictions and more elaborate decision making. We illustrate the potential
usefulness of calibrated conformity scores in a case study on lazy multi-label
rule learning.
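As a concrete (and simplified) reading of this idea, the sketch below treats the aggregated confidence of the rules covering an instance as a conformity score and calibrates it per label; the score matrix and the label-conditional calibration are assumptions for illustration, not the paper's exact procedure.

import numpy as np

def conformal_label_sets(conf_cal, y_cal, conf_test, alpha=0.1):
    # conf_cal, conf_test: (n, L) aggregated rule confidences per label;
    # y_cal: (n, L) 0/1 relevance matrix for the calibration set.
    L = conf_cal.shape[1]
    prediction_sets = []
    for row in conf_test:
        labels = []
        for l in range(L):
            # Conformity scores of calibration instances that carry label l.
            cal = conf_cal[y_cal[:, l] == 1, l]
            # Conformal p-value: how typical is the test confidence
            # among the confidences of true positives for this label?
            p = (np.sum(cal <= row[l]) + 1) / (len(cal) + 1)
            if p > alpha:  # keep the label unless its confidence is atypically low
                labels.append(l)
        prediction_sets.append(labels)
    return prediction_sets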
- …