Unsupervised Domain Adaptation~(UDA) has attracted a surge of interest over
the past decade but remains difficult to deploy in real-world applications.
Motivated by privacy and security concerns, in this
work we study the practical problem of Source-Free Domain Adaptation (SFDA),
which eliminates the reliance on annotated source data. Current SFDA methods
focus on extracting domain knowledge from the source-trained model but neglect
the intrinsic structure of the target domain. Moreover, they typically utilize
pseudo labels for self-training in the target domain, but suffer from the
notorious error accumulation problem. To address these issues, we propose a new
SFDA framework, called Region-to-Pixel Adaptation Network~(RPANet), which
learns the region-level and pixel-level discriminative representations through
coarse-to-fine self-supervision. The proposed RPANet consists of two modules,
Foreground-aware Contrastive Learning (FCL) and Confidence-Calibrated
Pseudo-Labeling (CCPL), which explicitly address the key challenges of ``how to
distinguish'' and ``how to refine''. Specifically, FCL introduces a
supervised contrastive learning paradigm at the region level that contrasts
region centroids across different target images, which efficiently
exploits all pseudo labels while remaining robust to noisy samples. CCPL designs a novel
fusion strategy to reduce the overconfidence problem of pseudo labels by fusing
two different target predictions without introducing any additional network
modules. Extensive experiments on three cross-domain polyp segmentation tasks
show that RPANet, without access to source data, significantly outperforms
state-of-the-art SFDA and UDA methods, demonstrating the potential of SFDA in
medical applications.

Comment: Accepted by IPMI 202
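
To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch and not the RPANet implementation. It assumes that region centroids are obtained by masked average pooling of a feature map under a binary pseudo label, that the region-level contrast uses a standard supervised contrastive loss over those centroids, and that the two target predictions are fused by simple averaging. The abstract does not specify any of these details, so the function names (`region_centroids`, `supervised_contrastive_loss`, `fuse_predictions`), the temperature, and the threshold are all hypothetical.

```python
import numpy as np


def region_centroids(features, pseudo_mask):
    """Masked average pooling (assumed way to form region centroids).

    features: (C, H, W) feature map; pseudo_mask: (H, W) binary pseudo label.
    Returns the foreground and background centroids, each of shape (C,).
    """
    fg = pseudo_mask.astype(features.dtype)
    bg = 1.0 - fg
    eps = 1e-8  # guards against empty regions
    fg_centroid = (features * fg).sum(axis=(1, 2)) / (fg.sum() + eps)
    bg_centroid = (features * bg).sum(axis=(1, 2)) / (bg.sum() + eps)
    return fg_centroid, bg_centroid


def supervised_contrastive_loss(centroids, labels, temperature=0.1):
    """Standard supervised contrastive loss over region centroids.

    centroids: (N, C) centroids gathered from several target images.
    labels: (N,) region labels (e.g. 1 = foreground, 0 = background).
    """
    labels = np.asarray(labels)
    z = centroids / (np.linalg.norm(centroids, axis=1, keepdims=True) + 1e-8)
    sim = z @ z.T / temperature
    not_self = ~np.eye(len(labels), dtype=bool)
    # Log-softmax over all other centroids, shifted for numerical stability.
    shifted = sim - sim.max(axis=1, keepdims=True)
    exp = np.exp(shifted) * not_self
    log_prob = shifted - np.log(exp.sum(axis=1, keepdims=True))
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    valid = pos_count > 0  # anchors with at least one positive
    if not valid.any():
        return 0.0
    per_anchor = -(log_prob * positives).sum(axis=1)[valid] / pos_count[valid]
    return float(per_anchor.mean())


def fuse_predictions(prob_a, prob_b, threshold=0.5):
    """Average two foreground-probability maps (assumed fusion rule)
    and threshold the result to obtain a refined pseudo label."""
    fused = 0.5 * (prob_a + prob_b)
    return fused, (fused > threshold).astype(np.uint8)
```

In this sketch, averaging pulls down a pixel that only one of the two predictions is confident about, which is one simple way to temper overconfident pseudo labels without adding network modules. The fusion strategy actually used by CCPL may differ.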