Unsupervised Domain Adaptation (UDA) aims to classify unlabeled target
images by leveraging labeled source images. In this work, we consider the Partial
Domain Adaptation (PDA) variant, where the source domain contains extra classes not present
in the target domain. Most successful algorithms use model selection strategies
that rely on target labels to find the best hyper-parameters and/or models
during training. However, these strategies violate the main assumption in PDA:
only unlabeled target domain samples are available. Moreover, there are also
inconsistencies in the experimental settings (architecture, hyper-parameter
tuning, number of runs), yielding unfair comparisons. The main goal of this
work is to provide a realistic evaluation of PDA methods with different
model selection strategies under a consistent evaluation protocol. We evaluate
7 representative PDA algorithms on 2 different real-world datasets using 7
different model selection strategies. Our two main findings are: (i) without
target labels for model selection, the accuracy of the methods decreases by up to
30 percentage points; (ii) only one method and model selection pair performs
well on both datasets. Experiments were performed with our PyTorch framework,
BenchmarkPDA, which we open source.

Comment: 17 pages, 13 tables