A t-distribution based operator for enhancing out of distribution robustness of neural network classifiers
Neural Network (NN) classifiers can assign extreme probabilities to samples
that have not appeared during training (out-of-distribution samples) resulting
in erroneous and unreliable predictions. One of the causes for this unwanted
behaviour lies in the use of the standard softmax operator, which pushes the
posterior probabilities towards either zero or unity, hence failing to model
uncertainty. The statistical derivation of the softmax operator relies on the
assumption that the distributions of the latent variables for a given class are
Gaussian with known variance. However, it is possible to use different
assumptions in the same derivation and obtain operators from other families of
distributions as well. This allows the derivation of novel operators with more
favourable properties. Here, a novel operator is proposed that is derived using
t-distributions, which are capable of providing a better description of
uncertainty. It is shown that classifiers that adopt this novel operator can be
more robust to out-of-distribution samples, often outperforming NNs that use
the standard softmax operator. These enhancements can be reached with minimal
changes to the NN architecture.
Comment: 5 pages, 5 figures, to be published in IEEE Signal Processing
Letters; reproducible code: https://github.com/idiap/tsoftma
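The abstract does not give the operator's exact form, so the following is a minimal, hedged sketch rather than the paper's definition. It contrasts the standard softmax with a Student-t style alternative that swaps the exponential kernel exp(-d) for the heavier-tailed kernel (1 + d/nu)^(-(nu+1)/2) over squared distances d to per-class centroids; the name t_softmax, the distance-based inputs, and the parameter nu are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    """Standard softmax: exponential (Gaussian-like) kernel over class scores."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # shift for numerical stability
    return e / e.sum(axis=-1, keepdims=True)

def t_softmax(d, nu=1.0):
    """Hypothetical t-distribution based operator (illustrative only).

    Replaces exp(-d) with the heavier-tailed Student-t kernel
    (1 + d/nu)^(-(nu+1)/2), where d holds non-negative squared
    distances to each class. Far from all classes (large d), the
    kernel decays polynomially rather than exponentially, so the
    posterior stays away from hard 0/1 values.
    """
    k = (1.0 + d / nu) ** (-(nu + 1.0) / 2.0)
    return k / k.sum(axis=-1, keepdims=True)

# An out-of-distribution point far from every class centroid:
d_far = np.array([50.0, 55.0, 60.0])
print(softmax(-d_far))   # ~[0.993, 0.007, 0.000]: near-certain despite being far
print(t_softmax(d_far))  # ~[0.36, 0.33, 0.31]: probabilities stay moderate
```

Because the t kernel decays polynomially, a point far from every class keeps a near-uniform posterior instead of collapsing to a one-hot prediction, which is the robustness property the abstract describes.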
A Flexible and Adaptive Framework for Abstention Under Class Imbalance
In practical applications of machine learning, it is often desirable to
identify and abstain on examples where the model's predictions are likely to be
incorrect. Much of the prior work on this topic has focused on
out-of-distribution detection or on performance metrics such as top-k accuracy.
Comparatively little attention has been given to metrics such as the area under
the ROC curve or Cohen's kappa, which are highly relevant for imbalanced
datasets. Abstention strategies
aimed at top-k accuracy can produce poor results on these metrics when applied
to imbalanced datasets, even when all examples are in-distribution. We propose
a framework to address this gap. Our framework leverages the insight that
calibrated probability estimates can be used as a proxy for the true class
labels, thereby allowing us to estimate the change in an arbitrary metric if an
example were abstained on. Using this framework, we derive computationally
efficient metric-specific abstention algorithms for optimizing the sensitivity
at a target specificity level, the area under the ROC curve, and the weighted
Cohen's kappa. Because our method relies only on calibrated probability estimates, we
further show that by leveraging recent work on domain adaptation under label
shift, we can generalize to test-set distributions that may have a different
class imbalance compared to the training set distribution. In experiments
spanning medical imaging, natural language processing, computer vision, and
genomics, we demonstrate the effectiveness of our approach. Source code is
available at https://github.com/blindauth/abstention, and Colab notebooks
reproducing the results are available at
https://github.com/blindauth/abstention_experiments.
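The key idea, using calibrated probability estimates as a stand-in for the unknown true labels when estimating how a metric would change under abstention, can be sketched generically. The sketch below is an illustrative assumption, not the paper's algorithms: it forms an expected confusion matrix from calibrated P(y=1|x), scores each example by the estimated gain in a chosen metric from abstaining on it, and abstains greedily. The names (expected_confusion, greedy_abstain) and the choice of balanced accuracy as the example metric are hypothetical; the paper instead derives computationally efficient, metric-specific procedures.

```python
import numpy as np

def expected_confusion(p, yhat, keep):
    """Expected confusion-matrix entries, using calibrated probabilities
    p = P(y=1|x) as a proxy for the unknown true labels."""
    p, yhat = p[keep], yhat[keep]
    tp = np.sum(p * yhat)              # predicted 1, expected truly 1
    fp = np.sum((1 - p) * yhat)        # predicted 1, expected truly 0
    fn = np.sum(p * (1 - yhat))        # predicted 0, expected truly 1
    tn = np.sum((1 - p) * (1 - yhat))  # predicted 0, expected truly 0
    return tp, fp, fn, tn

def balanced_accuracy(tp, fp, fn, tn):
    """One imbalance-aware metric expressible from the confusion matrix."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))

def greedy_abstain(p, yhat, metric, n_abstain):
    """Abstain on the n_abstain examples whose removal most improves the
    expected metric. A generic greedy loop for illustration only; the
    paper derives efficient metric-specific algorithms instead."""
    keep = np.ones(len(p), dtype=bool)
    for _ in range(n_abstain):
        base = metric(*expected_confusion(p, yhat, keep))
        gains = np.full(len(p), -np.inf)
        for i in np.flatnonzero(keep):
            keep[i] = False  # tentatively abstain on example i
            gains[i] = metric(*expected_confusion(p, yhat, keep)) - base
            keep[i] = True   # undo the tentative abstention
        keep[gains.argmax()] = False
    return ~keep  # boolean mask of abstained examples

rng = np.random.default_rng(0)
probs = rng.uniform(size=200)        # calibrated P(y=1|x) per example
preds = (probs > 0.5).astype(float)  # hard predictions
abstained = greedy_abstain(probs, preds, balanced_accuracy, n_abstain=20)
print(abstained.sum(), "examples abstained on")
```

Under label shift, the same machinery would apply once the probabilities are re-calibrated to the test-time class prior, which is the generalization the abstract describes via domain adaptation under label shift.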