Test-Time Adaptation (TTA) aims to adapt a source-domain model to test data at
the inference stage, and has demonstrated success in adapting to unseen corruptions.
However, these attempts may fail under more challenging real-world scenarios.
Existing works mainly consider real-world test-time adaptation under a non-i.i.d.
data stream and continual domain shift. In this work, we first complement the
existing real-world TTA protocol with a globally class-imbalanced testing set.
We demonstrate that combining all these settings poses new challenges to
existing methods. We argue that the failure of state-of-the-art methods stems
first from indiscriminately adapting normalization layers to imbalanced testing
data. To remedy this shortcoming, we propose a balanced batchnorm layer to swap
out the regular batchnorm at inference stage. The new batchnorm layer is
capable of adapting without biasing towards majority classes. We are further
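The following is a minimal sketch of how such a class-balanced normalization statistic could be maintained, assuming hard pseudo-labels are available for each test sample; the layer name, update rule, and hyperparameters are illustrative assumptions, not the exact TRIBE implementation:

```python
# Illustrative sketch of class-balanced normalization statistics
# (assumed names and update rule; not the paper's exact implementation).
import torch

class BalancedNorm1d(torch.nn.Module):
    """Keeps per-class running statistics and averages them with equal
    weight per class, so majority classes cannot dominate the estimate."""

    def __init__(self, num_features, num_classes, momentum=0.1, eps=1e-5):
        super().__init__()
        self.momentum, self.eps = momentum, eps
        # One running mean/variance per (class, feature) pair.
        self.register_buffer("mean", torch.zeros(num_classes, num_features))
        self.register_buffer("var", torch.ones(num_classes, num_features))

    def forward(self, x, pseudo_labels):
        # x: (batch, features); pseudo_labels: (batch,) hard class predictions.
        with torch.no_grad():  # running statistics are not differentiated
            for c in pseudo_labels.unique():
                xc = x[pseudo_labels == c]
                self.mean[c] = (1 - self.momentum) * self.mean[c] + self.momentum * xc.mean(0)
                self.var[c] = (1 - self.momentum) * self.var[c] + self.momentum * xc.var(0, unbiased=False)
        # Aggregate with equal weight per class, regardless of class frequency.
        mu, var = self.mean.mean(0), self.var.mean(0)
        return (x - mu) / torch.sqrt(var + self.eps)
```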
We are further inspired by the success of self-training~(ST) in learning from
unlabeled data and adapt ST to test-time adaptation. However, ST alone is prone
to over-adaptation, which is responsible for the poor performance under continual
domain shift. Hence, we propose to improve self-training under continual domain
shift by regularizing model updates with an anchored loss.
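Below is a minimal sketch of one adaptation step combining self-training with such an anchored regularizer, here realized as a KL term towards a frozen copy of the source model; the loss form, confidence threshold, and weighting are assumptions for illustration, not the paper's exact objective:

```python
# Illustrative self-training step with an anchored regularizer
# (loss form and hyperparameters are assumptions, not the paper's objective).
import torch
import torch.nn.functional as F

def anchored_self_training_step(model, anchor, x, optimizer,
                                conf_thresh=0.9, lam=1.0):
    """One adaptation step: pseudo-label confident samples, and pull the
    adapted predictions towards a frozen anchor (source) model to limit
    over-adaptation under continual domain shift.
    Typical setup: anchor = copy.deepcopy(source_model).eval()."""
    logits = model(x)
    with torch.no_grad():
        anchor_logits = anchor(x)  # frozen source copy, never updated
    probs = logits.softmax(dim=1)
    conf, pseudo = probs.max(dim=1)
    mask = conf > conf_thresh  # self-train only on confident samples
    st_loss = (F.cross_entropy(logits[mask], pseudo[mask])
               if mask.any() else logits.sum() * 0.0)
    # Anchored regularizer: KL between adapted and anchor predictions.
    anchor_loss = F.kl_div(logits.log_softmax(dim=1),
                           anchor_logits.softmax(dim=1),
                           reduction="batchmean")
    loss = st_loss + lam * anchor_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Keeping the anchor frozen bounds how far the adapted model can drift from the source model, which is one plausible way a regularizer could stabilize self-training across a sequence of domains.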
The final TTA model, termed TRIBE, is built upon a tri-net architecture with
balanced batchnorm layers. We evaluate TRIBE on four datasets representing
real-world TTA settings. TRIBE consistently achieves state-of-the-art
performance across multiple evaluation protocols. The code is available at
\url{https://github.com/Gorilla-Lab-SCUT/TRIBE}.