In this paper, our goal is to adapt a pre-trained convolutional neural
network to domain shifts at test time. We do so continually with the incoming
stream of test batches, without labels. The existing literature mostly operates
on artificial shifts obtained via adversarial perturbations of a test image.
Motivated by this gap, we evaluate the state of the art on two realistic and
challenging sources of domain shifts, namely contextual and semantic shifts.
Contextual shifts correspond to the environment types, for example, a model
pre-trained on indoor context has to adapt to the outdoor context on CORe-50.
Semantic shifts correspond to the capture types, for example, a model
pre-trained on natural images has to adapt to cliparts, sketches, and paintings
on DomainNet. We include in our analysis recent techniques such as
Prediction-Time Batch Normalization (BN), Test Entropy Minimization (TENT) and
Continual Test-Time Adaptation (CoTTA). Our findings are three-fold: i)
test-time adaptation methods perform better and forget less on contextual
shifts than on semantic shifts, ii) TENT outperforms other methods on
short-term adaptation, whereas CoTTA outperforms other methods on long-term
adaptation, iii) BN is the most reliable and robust. Our code is available at
https://github.com/tommiekerssies/Evaluating-Continual-Test-Time-Adaptation-for-Contextual-and-Semantic-Domain-Shifts