Hard Regularization to Prevent Collapse in Online Deep Clustering without Data Augmentation
Online deep clustering refers to the joint use of a feature extraction
network and a clustering model to assign cluster labels to each new data point
or batch as it is processed. While faster and more versatile than offline
methods, online clustering can easily reach the collapsed solution where the
encoder maps all inputs to the same point and all points are assigned to a single
cluster. Successful existing models have employed various techniques to avoid
this problem, most of which require data augmentation or aim to make the
average soft assignment across the dataset the same for each cluster. We
propose a method that does not require data augmentation and that, unlike
existing methods, regularizes the hard assignments. Using a Bayesian
framework, we derive an intuitive optimization objective that can be
straightforwardly included in the training of the encoder network. Experiments
on four image datasets show that our method avoids collapse more consistently
than other methods and leads to more accurate clustering. We also conduct
further experiments and analyses that justify our choice to regularize the
hard cluster assignments.
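
The abstract does not give the paper's Bayesian hard-assignment objective, so as orientation the following is a minimal PyTorch sketch of the contrasting baseline it describes: online clustering with a regularizer that pushes the batch-averaged soft assignment towards uniform. The toy encoder, temperature tau, weight lam, and the per-point entropy loss are illustrative assumptions, not details from the paper.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F

K, D = 10, 32  # number of clusters and embedding size (arbitrary toy values)

# Toy encoder standing in for the feature extraction network.
encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, D))
centroids = nn.Parameter(torch.randn(K, D))  # learnable cluster centres

opt = torch.optim.Adam(list(encoder.parameters()) + [centroids], lr=1e-3)


def train_step(x, tau=0.1, lam=1.0):
    z = F.normalize(encoder(x), dim=1)                  # embed the batch
    logits = z @ F.normalize(centroids, dim=1).T / tau  # cosine similarities
    soft = logits.softmax(dim=1)                        # soft assignments

    # Per-point sharpening loss: make each soft assignment confident.
    point_entropy = -(soft * soft.clamp_min(1e-8).log()).sum(dim=1).mean()

    # Soft-balancing regularizer: minimize KL(avg || uniform) so the
    # batch-averaged soft assignment is the same for each cluster.
    # Without a term like this, the trivial optimum sends every input
    # to a single cluster.
    avg = soft.mean(dim=0)
    balance = (avg * (avg.clamp_min(1e-8).log() + math.log(K))).sum()

    loss = point_entropy + lam * balance
    opt.zero_grad()
    loss.backward()
    opt.step()

    # Hard assignments, used here only to monitor collapse.
    counts = torch.bincount(soft.argmax(dim=1), minlength=K)
    return loss.item(), counts


loss, counts = train_step(torch.randn(128, 784))
print(loss, counts)  # counts piling into one bin signals collapse
```

Note the design point the abstract contrasts: this baseline constrains the average of the soft probabilities, whereas the proposed method regularizes the hard (argmax) labels, which the sketch above only uses for monitoring.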