Collision Cross-entropy for Soft Class Labels and Deep Clustering
We propose "collision cross-entropy" as a robust alternative to Shannon's
cross-entropy (CE) loss when class labels are represented by soft categorical
distributions y. In general, soft labels can naturally represent ambiguous
targets in classification. They are particularly relevant for self-labeled
clustering methods, where latent pseudo-labels are jointly estimated with the
model parameters and uncertainty is prevalent. In the case of soft labels,
Shannon's CE teaches the model predictions to reproduce the uncertainty in each
training example, which inhibits the model's ability to learn and generalize
from these examples. As an alternative loss, we propose the negative log of
the "collision probability"; minimizing it maximizes the chance of equality
between two random variables: the predicted class and the unknown true class.
We show that it has
the properties of a generalized CE. The proposed collision CE agrees with
Shannon's CE for one-hot labels, but the training from soft labels differs. For
example, unlike Shannon's CE, data points where y is a uniform distribution
have zero contribution to the training. Collision CE significantly improves
classification supervised by soft, uncertain targets. Unlike Shannon's CE,
collision CE is symmetric in y and the network predictions, which is particularly
relevant when both distributions are estimated in the context of self-labeled
clustering. Focusing on discriminative deep clustering where self-labeling and
entropy-based losses are dominant, we show that the use of collision CE
improves the state of the art. We also derive an efficient EM algorithm that
significantly speeds up the pseudo-label estimation with collision CE.
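As a quick illustration of the properties claimed in the abstract, here is a minimal numerical sketch, assuming the usual definition of collision probability between two independent categorical variables (the predicted class drawn from the model's distribution p and the true class drawn from the soft label y), so the loss is -log Σ_k y_k p_k. The function names and toy numbers are illustrative, not taken from the paper's code.

```python
# Sketch (not the authors' implementation): Shannon CE vs. collision CE on soft labels.
import numpy as np

def shannon_ce(y, p, eps=1e-12):
    """Shannon cross-entropy H(y, p) = -sum_k y_k * log p_k."""
    return -np.sum(y * np.log(p + eps))

def collision_ce(y, p, eps=1e-12):
    """Negative log of the collision probability sum_k y_k * p_k,
    i.e. -log P(class sampled from y equals class sampled from p)."""
    return -np.log(np.sum(y * p) + eps)

p = np.array([0.7, 0.2, 0.1])            # model's predicted class distribution

# One-hot label: both losses reduce to -log p_true = -log 0.7.
one_hot = np.array([1.0, 0.0, 0.0])
print(shannon_ce(one_hot, p), collision_ce(one_hot, p))

# Uniform (fully ambiguous) label: collision CE equals log K for any p,
# so such points give zero gradient; Shannon CE still depends on p and
# would pull the prediction toward the uniform target.
uniform = np.full(3, 1.0 / 3.0)
print(collision_ce(uniform, p))          # log(3) ~ 1.0986, independent of p
print(shannon_ce(uniform, p))            # varies with p
```

Note also that collision_ce(y, p) equals collision_ce(p, y), the symmetry between labels and predictions mentioned in the abstract, whereas Shannon's CE is not symmetric in its arguments.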