Deep neural networks (DNNs) have made great strides in pushing the
state-of-the-art in several challenging domains. However, recent studies reveal that they are prone to making overconfident predictions, which greatly undermines trust in their outputs, especially in safety-critical applications.
Early work on improving model calibration employs post-processing techniques, which rely on only a few parameters and require a hold-out set. Some recent train-time calibration methods, which involve all model parameters, can outperform these post-processing methods.
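As a concrete reference point for the post-processing class of methods, temperature scaling (Guo et al., 2017) rescales logits by a single scalar fitted on the hold-out set. Below is a minimal PyTorch sketch; the helper name `fit_temperature` and the optimizer settings are our own illustrative choices, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def fit_temperature(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Fit a single temperature T on hold-out logits by minimizing NLL.

    Illustrative sketch of post-processing calibration (temperature
    scaling); names and optimizer settings are assumptions.
    """
    log_t = torch.zeros(1, requires_grad=True)  # optimize log T so T stays positive
    optimizer = torch.optim.LBFGS([log_t], lr=0.1, max_iter=50)

    def closure():
        optimizer.zero_grad()
        loss = F.cross_entropy(logits / log_t.exp(), labels)
        loss.backward()
        return loss

    optimizer.step(closure)
    return log_t.exp().item()

# Usage on a hold-out set: calibrated probabilities are softmax(logits / T).
holdout_logits = torch.randn(256, 10)
holdout_labels = torch.randint(0, 10, (256,))
T = fit_temperature(holdout_logits, holdout_labels)
```

Note that only the single scalar T is learned here, which illustrates why such methods "rely on only a few parameters and require a hold-out set."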
Motivated by this, we propose a new train-time calibration method, which features a simple, plug-and-play auxiliary loss known as multi-class alignment of predictive mean confidence and predictive certainty
(MACC). It is based on the observation that a model's miscalibration is directly linked to its predictive certainty: a larger gap between the mean confidence and the certainty indicates poorer calibration, both for in-distribution and out-of-distribution predictions. Armed with this insight, our proposed loss explicitly encourages a confident (or underconfident) model to also produce a low (or high) spread in its pre-softmax (logit) distribution.
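To make the alignment idea concrete, here is a minimal PyTorch sketch of an auxiliary loss of this flavor. It is an assumption-laden illustration, not the paper's exact MACC formulation: we take a class's mean confidence to be its batch-mean softmax probability and derive a certainty score from the spread of its pre-softmax logits across the batch.

```python
import torch
import torch.nn.functional as F

def macc_style_loss(logits: torch.Tensor) -> torch.Tensor:
    """Illustrative auxiliary loss aligning mean confidence with certainty.

    Hypothetical sketch, NOT the paper's exact MACC loss. Assumptions:
      * confidence = batch-mean softmax probability per class
      * certainty  = 1 / (1 + per-class std of the pre-softmax logits
                     across the batch), so low spread -> high certainty
    The loss is the mean absolute gap between the two.
    """
    probs = F.softmax(logits, dim=1)       # (batch, classes)
    mean_conf = probs.mean(dim=0)          # per-class mean confidence

    spread = logits.std(dim=0)             # per-class logit spread, (classes,)
    certainty = 1.0 / (1.0 + spread)       # map spread into (0, 1]

    # Penalize disagreement between mean confidence and certainty.
    return (mean_conf - certainty).abs().mean()

# Usage: add the auxiliary term to the task loss during training.
logits = torch.randn(32, 10, requires_grad=True)  # dummy batch of logits
targets = torch.randint(0, 10, (32,))
loss = F.cross_entropy(logits, targets) + 0.5 * macc_style_loss(logits)
loss.backward()
```

In practice the auxiliary term would be weighted against the task loss with a hyperparameter, as in the dummy usage above; the 0.5 weight is arbitrary.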
Extensive experiments on ten challenging datasets, covering in-domain, out-of-domain, non-visual recognition, and medical image classification scenarios, show that our method achieves state-of-the-art calibration performance for both in-domain and
out-of-domain predictions. Our code and models will be publicly released.

Comment: Accepted at GCPR 2023