A Flexible and Adaptive Framework for Abstention Under Class Imbalance
In practical applications of machine learning, it is often desirable to
identify and abstain on examples where the model's predictions are likely to be
incorrect. Much of the prior work on this topic has focused on out-of-distribution
detection or on performance metrics such as top-k accuracy. Comparatively little
attention has been given to metrics such as the area under the curve or Cohen's
kappa, which are highly relevant for imbalanced datasets. Abstention strategies
aimed at top-k accuracy can produce poor results on these metrics when applied
to imbalanced datasets, even when all examples are in-distribution. We propose
a framework to address this gap. Our framework leverages the insight that
calibrated probability estimates can be used as a proxy for the true class
labels, thereby allowing us to estimate the change in an arbitrary metric if an
example were abstained on. Using this framework, we derive computationally
efficient metric-specific abstention algorithms for optimizing the sensitivity
at a target specificity level, the area under the ROC, and the weighted Cohen's
Kappa. Because our method relies only on calibrated probability estimates, we
further show that by leveraging recent work on domain adaptation under label
shift, we can generalize to test-set distributions that may have a different
class imbalance compared to the training set distribution. On various
experiments involving medical imaging, natural language processing, computer
vision and genomics, we demonstrate the effectiveness of our approach. Source
code available at https://github.com/blindauth/abstention. Colab notebooks
reproducing results available at
https://github.com/blindauth/abstention_experiments
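The core idea, using calibrated probabilities as a stand-in for the unknown true labels when scoring abstention decisions, can be sketched for the binary case as follows. This is a minimal illustration, not the authors' implementation; the function names and the greedy least-confidence strategy are assumptions made here:

```python
import numpy as np

def expected_accuracy_after_abstention(probs, abstain_mask):
    """Proxy for accuracy on the retained examples: with calibrated
    binary probabilities p, the expected correctness of predicting the
    argmax class is max(p, 1 - p), so no true labels are needed."""
    retained = probs[~abstain_mask]
    return np.maximum(retained, 1.0 - retained).mean()

def greedy_abstain(probs, n_abstain):
    """Abstain on the examples closest to the decision boundary,
    i.e. those whose expected contribution to the metric is lowest."""
    order = np.argsort(np.abs(probs - 0.5))  # least confident first
    mask = np.zeros(len(probs), dtype=bool)
    mask[order[:n_abstain]] = True
    return mask
```

For metrics such as the area under the ROC or weighted Cohen's kappa the same proxy idea applies, but the expected change from abstaining on an example is metric-specific, which is what the paper's per-metric algorithms work out.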
ReG-Rules: an explainable rule-based ensemble learner for classification
The learning of classification models to predict class labels of new and previously unseen data instances is one of the most essential tasks in data mining. A popular approach to classification is ensemble learning, where a combination of several diverse and independent classification models is used to predict class labels. Ensemble models are important as they tend to improve the average classification accuracy over any member of the ensemble. However, classification models are also often required to be explainable to reduce the risk of irreversible wrong classifications. Explainability of classification models is needed in many critical applications such as stock market analysis, credit risk evaluation, intrusion detection, etc. Unfortunately, ensemble learning decreases the level of explainability of the classification, as the analyst would have to examine many decision models to gain insights about the causality of the prediction. The aim of the research presented in this paper is to create an ensemble method that is explainable in the sense that it presents the human analyst with a conditioned view of the most relevant model aspects involved in the prediction. To achieve this aim the authors developed a rule-based explainable ensemble classifier termed Ranked ensemble G-Rules (ReG-Rules), which gives the analyst an extract of the most relevant classification rules for each individual prediction. ReG-Rules was evaluated in terms of its theoretical computational complexity, empirically on benchmark datasets, and qualitatively with respect to the complexity and readability of the induced rule sets. The results show that ReG-Rules scales linearly and delivers high accuracy while producing a compact and manageable set of rules describing the predictions made.
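The general pattern of predicting with a rule ensemble while surfacing only the top-ranked rules that fired can be sketched as follows. This is an illustrative sketch under assumed structures, not the ReG-Rules algorithm itself; the rule representation, scoring, and weighted vote are assumptions made here:

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

@dataclass
class Rule:
    name: str
    condition: Callable[[Dict[str, Any]], bool]  # predicate over a record
    label: str
    score: float  # quality/rank of the rule within the ensemble

def predict_and_explain(rules, record, top_k=3):
    """Weighted vote over the rules that fire for this record, returned
    together with the top-ranked firing rules as a compact explanation."""
    fired = sorted((r for r in rules if r.condition(record)),
                   key=lambda r: r.score, reverse=True)
    if not fired:
        return None, []
    votes = {}
    for r in fired:
        votes[r.label] = votes.get(r.label, 0.0) + r.score
    label = max(votes, key=votes.get)
    return label, [(r.name, r.label, r.score) for r in fired[:top_k]]
```

The analyst then inspects only a handful of rules per prediction rather than every model in the ensemble.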
Machine Learning with a Reject Option: A survey
Machine learning models always make a prediction, even when it is likely to
be inaccurate. This behavior should be avoided in many decision support
applications, where mistakes can have severe consequences. Although already
studied as early as 1970, machine learning with rejection has recently regained interest. This
machine learning subfield enables machine learning models to abstain from
making a prediction when likely to make a mistake.
This survey aims to provide an overview of machine learning with rejection.
We introduce the conditions leading to two types of rejection, ambiguity and
novelty rejection, which we carefully formalize. Moreover, we review and
categorize strategies to evaluate a model's predictive and rejective quality.
Additionally, we define the existing architectures for models with rejection
and describe the standard techniques for learning such models. Finally, we
provide examples of relevant application domains and show how machine learning
with rejection relates to other machine learning research areas.
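The two rejection types the survey formalizes can be illustrated with a minimal sketch: ambiguity rejection triggers near the decision boundary, while novelty rejection triggers on inputs unlike the training data. The thresholds and the z-score novelty check below are assumptions for illustration, not techniques from the survey:

```python
import numpy as np

REJECT = "reject"

def predict_with_rejection(probs, x, train_mean, train_std,
                           conf_threshold=0.8, novelty_z=3.0):
    """Return a class index, or REJECT for ambiguous or novel inputs."""
    # Novelty rejection: abstain when the input looks unlike the
    # training data (simple per-feature z-score check, illustrative only).
    z = np.abs((x - train_mean) / train_std)
    if np.any(z > novelty_z):
        return REJECT
    # Ambiguity rejection: abstain when the top-class probability is
    # too low, i.e. the model is likely to make a mistake.
    if probs.max() < conf_threshold:
        return REJECT
    return int(probs.argmax())
```

Real systems replace the z-score check with a proper novelty detector (e.g. a density model), but the two-gate structure is the same.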
Information-Theoretic Measures for Objective Evaluation of Classifications
This work presents a systematic study of objective evaluations of abstaining
classifications using Information-Theoretic Measures (ITMs). First, we define
objective measures as those that do not depend on any free parameter. This
definition makes it technically straightforward to examine the "objectivity" or
"subjectivity" of classification evaluations directly. Second, we propose
twenty-four normalized ITMs, derived from mutual information, divergence, or
cross-entropy, for investigation. Contrary to conventional
performance measures that apply empirical formulas based on users' intuitions
or preferences, the ITMs are theoretically more sound for realizing objective
evaluations of classifications. We apply them to distinguish "error types" and
"reject types" in binary classifications without requiring cost terms as
input. Third, to better understand and select the ITMs, we suggest three
desirable features for classification assessment measures, which appear more
crucial and appealing from the viewpoint of classification applications. Using
these features as "meta-measures", we can reveal the advantages and limitations
of ITMs from a higher level of evaluation knowledge. Numerical examples are
given to corroborate our claims and compare the differences among the proposed
measures. The best measure is selected in terms of the meta-measures, and its
specific properties regarding error types and reject types are analytically
derived.
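As a concrete instance of such a measure, mutual information between the true labels and the classifier's outputs, with rejection treated uniformly as one extra output column of the confusion matrix, can be normalized by the label entropy. This is an illustrative sketch only; the paper's twenty-four ITMs differ in their exact normalizations:

```python
import numpy as np

def normalized_mutual_information(confusion):
    """NMI between true labels (rows) and classifier outputs (columns).
    A 'reject' outcome is simply an extra output column, so errors and
    rejects are scored without any user-supplied cost terms."""
    c = np.asarray(confusion, dtype=float)
    c = c / c.sum()                      # joint distribution
    py = c.sum(axis=1, keepdims=True)    # marginal over true labels
    pt = c.sum(axis=0, keepdims=True)    # marginal over outputs
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(c > 0, c * np.log2(c / (py * pt)), 0.0)
    mi = terms.sum()
    hy = -np.sum(py[py > 0] * np.log2(py[py > 0]))  # label entropy
    return mi / hy
```

A perfect classifier scores 1, a random one 0, and routing hard examples to a reject column changes the score smoothly rather than requiring separate error/reject costs.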