Recent work has observed an intriguing "Neural Collapse" phenomenon in
well-trained neural networks, where the last-layer representations of training
samples with the same label collapse into each other. This appears to suggest
that the last-layer representations are completely determined by the labels,
and do not depend on the intrinsic structure of the input distribution. We provide
evidence that this is not a complete description, and that the apparent
collapse hides important fine-grained structure in the representations.
Specifically, even when representations apparently collapse, the small amount
of remaining variation can still faithfully and accurately capture the
intrinsic structure of the input distribution. As an example, if we train on
CIFAR-10 using only 5 coarse-grained labels (by merging each pair of classes
into one super-class) until convergence, we can reconstruct the original 10-class labels
from the learned representations via unsupervised clustering. The reconstructed
labels achieve 93% accuracy on the CIFAR-10 test set, nearly matching the
normal CIFAR-10 accuracy for the same architecture. We also provide an initial
theoretical result showing the fine-grained representation structure in a
simplified synthetic setting. Our results show concretely how the structure of
the input data can play a significant role in determining the fine-grained
structure of neural representations, going beyond what Neural Collapse
predicts.

Comment: This paper has been accepted as a conference paper at ICML 202
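As an illustration of the coarse-label experiment described above, the following is a minimal, hypothetical sketch (not the authors' released code): it trains a ResNet-18 on CIFAR-10 with the 10 classes merged into 5 super-classes, extracts penultimate-layer features after training, and runs k-means with 2 clusters inside each super-class to recover candidate fine-grained labels. The architecture, optimizer settings, epoch count, the class pairing, and the choice of k-means are illustrative assumptions; the abstract only specifies "unsupervised clustering."

import numpy as np
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T
from sklearn.cluster import KMeans

device = "cuda" if torch.cuda.is_available() else "cpu"

# CIFAR-10 with the 10 fine labels remapped to 5 coarse super-classes
# (here: fine classes {0,1} -> super-class 0, {2,3} -> 1, etc.; the exact
# pairing is an assumption made for illustration).
transform = T.Compose([T.ToTensor(),
                       T.Normalize((0.4914, 0.4822, 0.4465),
                                   (0.2470, 0.2435, 0.2616))])
train_set = torchvision.datasets.CIFAR10("data", train=True, download=True,
                                         transform=transform)
fine_labels = np.array(train_set.targets)        # true 10-way labels, kept only
                                                 # for later cluster-to-class matching
train_set.targets = (fine_labels // 2).tolist()  # 5 coarse training labels
loader = torch.utils.data.DataLoader(train_set, batch_size=256, shuffle=True)

# Backbone producing penultimate-layer features, plus a 5-way linear head.
backbone = torchvision.models.resnet18(num_classes=5)
head = backbone.fc
backbone.fc = nn.Identity()
backbone, head = backbone.to(device), head.to(device)

opt = torch.optim.SGD(list(backbone.parameters()) + list(head.parameters()),
                      lr=0.1, momentum=0.9, weight_decay=5e-4)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(200):  # train far past zero training error ("convergence")
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        opt.zero_grad()
        loss_fn(head(backbone(x)), y).backward()
        opt.step()

# Extract penultimate-layer features and cluster *within* each super-class:
# 2 clusters per coarse class yields 10 recovered fine-grained groups.
backbone.eval()
feats, coarse = [], []
with torch.no_grad():
    for x, y in torch.utils.data.DataLoader(train_set, batch_size=512):
        feats.append(backbone(x.to(device)).cpu())
        coarse.append(y)
feats = torch.cat(feats).numpy()
coarse = torch.cat(coarse).numpy()

recovered = np.full(len(coarse), -1)
for c in range(5):
    idx = coarse == c
    km = KMeans(n_clusters=2, n_init=10).fit(feats[idx])
    recovered[idx] = 2 * c + km.labels_  # provisional fine label in {0, ..., 9}

Note that reproducing the reported 93% test accuracy would additionally require matching the recovered clusters to the ground-truth fine classes (e.g., via a Hungarian assignment on the training set) and then scoring on held-out data; that evaluation step is omitted here for brevity.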