Despite the recent success of deep-learning-based semantic segmentation, a
pre-trained road scene segmenter deployed to a city whose images are not
present in the training set typically fails to achieve satisfactory
performance due to dataset bias. Instead of collecting a large number of annotated images of
each city of interest to train or refine the segmenter, we propose an
unsupervised learning approach to adapt road scene segmenters across different
cities. By utilizing Google Street View and its time-machine feature, we can
collect unannotated images for each road scene at different times, so that the
associated static-object priors can be extracted accordingly. By advancing a
joint global and class-specific domain adversarial learning framework,
adaptation of pre-trained segmenters to that city can be achieved without the
need for any user annotation or interaction. We show that our method improves
the performance of semantic segmentation in multiple cities across continents,
while it performs favorably against state-of-the-art approaches requiring
annotated training data.
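
As a rough sketch of the joint global and class-specific adversarial
objective described above (the notation and weighting terms below are our own
illustration, not taken verbatim from the paper), the adapted segmenter can be
viewed as minimizing a supervised segmentation loss on the source city
together with a global and a set of per-class domain-confusion losses:

\[
\min_{\theta} \;\; \mathcal{L}_{\mathrm{seg}}(\theta; \mathcal{S})
\;+\; \lambda_{g}\, \mathcal{L}_{\mathrm{adv}}^{\mathrm{global}}(\theta; \mathcal{S}, \mathcal{T})
\;+\; \lambda_{c} \sum_{c \in \mathcal{C}} \mathcal{L}_{\mathrm{adv}}^{c}(\theta; \mathcal{S}, \mathcal{T}),
\]

where \(\mathcal{S}\) denotes the annotated source-city images,
\(\mathcal{T}\) the unannotated target-city images, \(\mathcal{C}\) the set of
semantic classes, and each adversarial term is optimized in a minimax fashion
against a domain discriminator so that source and target feature distributions
become indistinguishable at both the image level and the class level.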