Many real-world recognition problems suffer from an imbalanced or long-tailed
label distribution. Such distributions make representation learning more
challenging due to limited generalization over the tail classes. If the test
distribution differs from the training distribution, e.g., uniform versus
long-tailed, this distribution shift also needs to be addressed. To this end,
recent works have extended softmax cross-entropy with margin
modifications, inspired by Bayes' theorem. In this paper, we generalize several
approaches with a Balanced Product of Experts (BalPoE), which combines a family
of models with different test-time target distributions to tackle the imbalance
in the data. The proposed experts are trained in a single stage, either jointly
or independently, and fused seamlessly into a BalPoE. We show that BalPoE is
Fisher consistent for minimizing the balanced error and perform extensive
experiments to validate the effectiveness of our approach. Finally, we
investigate the effect of Mixup in this setting, discovering that
regularization is a key ingredient for learning calibrated experts. Our
experiments show that a regularized BalPoE performs remarkably well in terms of
both test accuracy and calibration, achieving state-of-the-art results on
CIFAR-100-LT, ImageNet-LT, and iNaturalist-2018 datasets. The code will be made
publicly available upon paper acceptance.