In computer vision, it is often observed that formulating regression problems
as classification tasks yields better performance. We investigate this
curious phenomenon and provide a derivation to show that classification, with
the cross-entropy loss, outperforms regression with a mean squared error loss
in its ability to learn high-entropy feature representations. Based on the
analysis, we propose an ordinal entropy loss that encourages higher-entropy
feature spaces while maintaining ordinal relationships, thereby improving the
performance of regression tasks. Experiments on synthetic and real-world
regression tasks demonstrate the importance and benefits of increasing entropy
for regression.

Comment: Accepted to ICLR 2023. Project page:
https://github.com/needylove/OrdinalEntrop