Bi-directional Contrastive Learning for Domain Adaptive Semantic Segmentation
We present a novel unsupervised domain adaptation method for semantic
segmentation that generalizes a model trained with source images and
corresponding ground-truth labels to a target domain. A key to domain adaptive
semantic segmentation is to learn domain-invariant and discriminative features
without target ground-truth labels. To this end, we propose a bi-directional
pixel-prototype contrastive learning framework that minimizes intra-class
variations of features for the same object class, while maximizing inter-class
variations for different ones, regardless of domains. Specifically, our
framework aligns pixel-level features in target images with a prototype of the
same object class from source images (i.e., positive pairs), while setting them
apart from prototypes of different classes (i.e., negative pairs), and performs
the same alignment and separation in the opposite direction, with pixel-level
features in source images and prototypes from target images. The cross-domain
matching encourages domain-invariant feature representations, while the
bidirectional pixel-prototype correspondences aggregate features for the same
object class, providing discriminative features. To establish training pairs
for contrastive learning, we propose to generate dynamic pseudo labels for
target images using non-parametric label transfer, namely pixel-prototype
correspondences across domains. We also present a calibration method that
gradually compensates for class-wise domain biases of prototypes during
training.
Comment: Accepted to ECCV 202
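The bidirectional pixel-prototype objective and the non-parametric label transfer described above can be sketched as follows. This is a minimal NumPy illustration under assumptions, not the authors' implementation: the mean-based prototype estimator, the InfoNCE-style loss, the temperature value, and all names and dimensions are hypothetical.

```python
import numpy as np

def l2n(a, eps=1e-8):
    # L2-normalize rows so dot products become cosine similarities
    return a / (np.linalg.norm(a, axis=1, keepdims=True) + eps)

def class_prototypes(feats, labels, num_classes):
    # Prototype = mean feature of each class (a simplified estimator)
    protos = np.zeros((num_classes, feats.shape[1]))
    for c in range(num_classes):
        if (labels == c).any():
            protos[c] = feats[labels == c].mean(axis=0)
    return protos

def pixel_prototype_contrastive(feats, labels, protos, tau=0.1):
    # InfoNCE-style loss: pull each pixel toward its class prototype
    # (positive pair), push it away from the other prototypes (negatives).
    logits = l2n(feats) @ l2n(protos).T / tau
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()

rng = np.random.default_rng(0)
C, D = 4, 8
src_feats = rng.normal(size=(128, D))
src_labels = rng.integers(0, C, 128)
tgt_feats = rng.normal(size=(128, D))

# Non-parametric label transfer: pseudo label = nearest source prototype
src_protos = class_prototypes(src_feats, src_labels, C)
tgt_pseudo = (l2n(tgt_feats) @ l2n(src_protos).T).argmax(axis=1)

# Bidirectional loss: target pixels vs. source prototypes, and the reverse
tgt_protos = class_prototypes(tgt_feats, tgt_pseudo, C)
loss = (pixel_prototype_contrastive(tgt_feats, tgt_pseudo, src_protos)
        + pixel_prototype_contrastive(src_feats, src_labels, tgt_protos))
```

In practice the features would come from a segmentation backbone and the prototypes would be calibrated over training, as the abstract notes; random features here only exercise the loss shape.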
Guided Curriculum Model Adaptation and Uncertainty-Aware Evaluation for Semantic Nighttime Image Segmentation
Most progress in semantic segmentation reports on daytime images taken under
favorable illumination conditions. We instead address the problem of semantic
segmentation of nighttime images and improve the state of the art by adapting
daytime models to nighttime without using nighttime annotations. Moreover, we
design a new evaluation framework to address the substantial uncertainty of
semantics in nighttime images. Our central contributions are: 1) a curriculum
framework to gradually adapt semantic segmentation models from day to night via
labeled synthetic images and unlabeled real images, both for progressively
darker times of day, which exploits cross-time-of-day correspondences for the
real images to guide the inference of their labels; 2) a novel
uncertainty-aware annotation and evaluation framework and metric for semantic
segmentation, designed for adverse conditions, which incorporates image regions
beyond human recognition capability into the evaluation in a principled
fashion;
3) the Dark Zurich dataset, which comprises 2416 unlabeled nighttime and 2920
unlabeled twilight images with correspondences to their daytime counterparts
plus a set of 151 nighttime images with fine pixel-level annotations created
with our protocol, which serves as a first benchmark to perform our novel
evaluation. Experiments show that our guided curriculum adaptation
significantly outperforms state-of-the-art methods on real nighttime sets both
for standard metrics and our uncertainty-aware metric. Furthermore, our
uncertainty-aware evaluation reveals that selective invalidation of predictions
can lead to better results on data with ambiguous content, such as our
nighttime benchmark, and can benefit safety-oriented applications that must
handle invalid inputs.
Comment: ICCV 2019 camera-ready
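Contribution 1) above, adapting through progressively darker stages with each stage's model supplying pseudo labels for the next, can be illustrated with a deliberately simple 1-D toy. Everything here is an assumption for illustration only, not the paper's method: the "model" is a single decision threshold, and the synthetic domains stand in for day, twilight, and night.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_domain(shift, n=200):
    # Two well-separated clusters whose location drifts with `shift`
    # (a stand-in for a progressively darker domain).
    x = rng.normal(loc=shift, scale=1.0, size=n) + rng.integers(0, 2, n) * 3.0
    y = (x > shift + 1.5).astype(int)
    return x, y

def fit_threshold(x, y):
    # "Model" = midpoint between the two class means.
    return (x[y == 0].mean() + x[y == 1].mean()) / 2

def predict(th, x):
    return (x > th).astype(int)

# Labeled source (day) and unlabeled targets for darker times of day
day_x, day_y = make_domain(0.0)
stages = [make_domain(1.0), make_domain(2.0)]  # twilight, night (labels kept for evaluation only)

th_day = fit_threshold(day_x, day_y)  # supervised on day
th = th_day
for x_t, _ in stages:
    pseudo = predict(th, x_t)    # previous stage guides label inference
    th = fit_threshold(x_t, pseudo)  # adapt on pseudo labels

night_x, night_y = stages[-1]
acc = (predict(th, night_x) == night_y).mean()
acc_direct = (predict(th_day, night_x) == night_y).mean()  # day model applied to night
```

Going through the intermediate twilight stage keeps each pseudo-labeling step close to a domain the current model already handles, which is the intuition behind the curriculum; jumping straight from day to night gives the model a much larger gap to bridge at once.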
ML-BPM: Multi-teacher Learning with Bidirectional Photometric Mixing for Open Compound Domain Adaptation in Semantic Segmentation
Open compound domain adaptation (OCDA) considers the target domain as the
compound of multiple unknown homogeneous subdomains. The goal of OCDA is to
minimize the domain gap between the labeled source domain and the unlabeled
compound target domain, which benefits the model generalization to the unseen
domains. Current OCDA methods for semantic segmentation adopt manual domain
separation and employ a single model to adapt to all target subdomains
simultaneously. However, adapting to one target subdomain may hinder adaptation
to other, dissimilar subdomains, which limits performance. In this work, we
introduce a multi-teacher framework with
bidirectional photometric mixing to separately adapt to every target subdomain.
First, we present an automatic domain separation to find the optimal number of
subdomains. On this basis, we propose a multi-teacher framework in which each
teacher model uses bidirectional photometric mixing to adapt to one target
subdomain. Furthermore, we conduct an adaptive distillation to learn a student
model and apply consistency regularization to improve the student
generalization. Experimental results on benchmark datasets show the efficacy of
the proposed approach for both the compound domain and the open domains against
existing state-of-the-art approaches.
Comment: Accepted to ECCV 202
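Photometric mixing between domains can be approximated with a simple per-channel mean/std transfer (Reinhard-style color transfer) blended with the original image. This is a minimal stand-in for the paper's bidirectional photometric mixing, not its exact formulation; the function name, the blending parameter `alpha`, and the toy images are assumptions.

```python
import numpy as np

def photometric_mix(content, style, alpha=0.5):
    # Shift `content`'s per-channel statistics toward `style`'s, then
    # alpha-blend the restyled image with the original.
    c_mean, c_std = content.mean((0, 1)), content.std((0, 1)) + 1e-6
    s_mean, s_std = style.mean((0, 1)), style.std((0, 1)) + 1e-6
    restyled = (content - c_mean) / c_std * s_std + s_mean
    return alpha * restyled + (1 - alpha) * content

rng = np.random.default_rng(0)
src = rng.uniform(0.4, 0.9, size=(8, 8, 3))  # bright "source" image
tgt = rng.uniform(0.0, 0.3, size=(8, 8, 3))  # dark "target subdomain" image

# Bidirectional: style source as target and target as source
src_to_tgt = photometric_mix(src, tgt, alpha=1.0)
tgt_to_src = photometric_mix(tgt, src, alpha=1.0)
```

In the multi-teacher setting, each teacher would see source images mixed toward its own subdomain (and vice versa), so the appearance gap it must bridge is reduced; one plausible way to pick the number of subdomains automatically is to cluster such image-level statistics, though the paper's separation procedure may differ.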