Real-world data often exhibit imbalanced label distributions. Existing
studies on data imbalance focus on single-domain settings, i.e., all samples are
drawn from the same data distribution. However, natural data can originate from
distinct domains, where a minority class in one domain could have abundant
instances from other domains. We formalize the task of Multi-Domain Long-Tailed
Recognition (MDLT), which learns from multi-domain imbalanced data, addresses
label imbalance, domain shift, and divergent label distributions across
domains, and generalizes to all domain-class pairs. We first develop the
domain-class transferability graph, and show that such transferability governs
the success of learning in MDLT. We then propose BoDA, a theoretically grounded
learning strategy that tracks the upper bound of transferability statistics,
and ensures balanced alignment and calibration across imbalanced domain-class
distributions. We curate five MDLT benchmarks based on widely-used multi-domain
datasets, and compare BoDA to twenty algorithms that span different learning
strategies. Extensive and rigorous experiments verify the superior performance
of BoDA. Further, as a byproduct, BoDA establishes a new state of the art on
Domain Generalization benchmarks, highlighting the importance of addressing
data imbalance across domains, which can be crucial for improving
generalization to unseen domains. Code and data are available at:
https://github.com/YyzHarry/multi-domain-imbalance.
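
To make the balanced-alignment idea concrete, the sketch below shows one possible way to compute per-(domain, class) feature centroids and pull each sample toward same-class centroids from other domains, with inverse-frequency weighting to counter imbalance. This is an illustrative approximation only, not the paper's exact BoDA objective or the released code: the function name, the temperature parameter, and the inverse-count weighting are assumptions made for this sketch.

```python
# Illustrative sketch (assumed helper, not the released BoDA implementation):
# a count-balanced alignment loss over (domain, class) feature centroids.
import torch
import torch.nn.functional as F

def balanced_domain_class_alignment(features, domains, labels, temperature=1.0):
    """features: (N, D) embeddings; domains, labels: (N,) integer ids."""
    # Identify each sample's (domain, class) pair.
    pairs = torch.stack([domains, labels], dim=1)
    unique_pairs, pair_idx = torch.unique(pairs, dim=0, return_inverse=True)
    n_pairs = unique_pairs.size(0)

    # Per-pair sample counts and feature centroids.
    counts = torch.zeros(n_pairs, dtype=features.dtype, device=features.device)
    counts.scatter_add_(0, pair_idx, torch.ones_like(pair_idx, dtype=features.dtype))
    centroids = torch.zeros(n_pairs, features.size(1),
                            dtype=features.dtype, device=features.device)
    centroids.index_add_(0, pair_idx, features)
    centroids = centroids / counts.unsqueeze(1)

    # Distance from every sample to every (domain, class) centroid.
    dists = torch.cdist(features, centroids)  # (N, n_pairs)
    log_prob = F.log_softmax(-dists / temperature, dim=1)

    # Positives: centroids of the same class but a different domain.
    same_class = labels.unsqueeze(1) == unique_pairs[:, 1].unsqueeze(0)
    other_domain = domains.unsqueeze(1) != unique_pairs[:, 0].unsqueeze(0)
    positives = (same_class & other_domain).float()

    # Balance: down-weight samples from frequent (domain, class) pairs.
    sample_weight = 1.0 / counts[pair_idx]
    pos_log_prob = (log_prob * positives).sum(1) / positives.sum(1).clamp(min=1)
    return -(sample_weight * pos_log_prob).sum() / sample_weight.sum()
```

In the actual method, the balancing term is derived from the upper bound of the transferability statistics; the sketch above only conveys the general structure of count-aware cross-domain alignment.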