Learning to Generalize: Meta-Learning for Domain Generalization
Domain shift refers to the well-known problem that a model trained on one
source domain performs poorly when applied to a target domain with different
statistics. Domain generalization (DG) techniques attempt to alleviate this
issue by producing models that, by design, generalize well to novel testing
domains. We propose a novel meta-learning method for domain generalization.
Rather than designing a specific model that is robust to domain shift, as in
most previous DG work, we propose a model-agnostic training procedure for DG.
Our algorithm simulates train/test domain shift during training by synthesizing
virtual testing domains within each mini-batch. The meta-optimization objective
requires that steps taken to improve training-domain performance also improve
testing-domain performance. This meta-learning procedure trains models with
good generalization ability to novel domains. We evaluate our method and
achieve state-of-the-art results on a recent cross-domain image classification
benchmark, as well as demonstrating its potential on two classic reinforcement
learning tasks.

Comment: 8 pages, 2 figures, under review at AAAI 2018
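To make the meta-optimization objective concrete, here is a minimal PyTorch-style sketch of one such training step, assuming a single meta-train and meta-test batch per step and a buffer-free model; `mldg_step`, `inner_lr`, and `beta` are illustrative names and hyperparameters, not the authors' released code.

```python
import torch
from torch.func import functional_call

def mldg_step(model, loss_fn, optimizer, meta_train, meta_test,
              inner_lr=1e-3, beta=1.0):
    """One meta-learning step: a gradient step that helps the virtual
    training domains should also help the held-out virtual testing domain."""
    params = dict(model.named_parameters())

    # Loss on the meta-train (virtual source) domains.
    x_tr, y_tr = meta_train
    train_loss = loss_fn(functional_call(model, params, (x_tr,)), y_tr)

    # Simulated inner update, kept differentiable via create_graph=True.
    grads = torch.autograd.grad(train_loss, tuple(params.values()),
                                create_graph=True)
    adapted = {name: p - inner_lr * g
               for (name, p), g in zip(params.items(), grads)}

    # Loss on the meta-test (virtual target) domain at the adapted weights.
    x_te, y_te = meta_test
    test_loss = loss_fn(functional_call(model, adapted, (x_te,)), y_te)

    # Meta-objective: perform well now AND after the virtual update.
    optimizer.zero_grad()
    (train_loss + beta * test_loss).backward()
    optimizer.step()
    return train_loss.item(), test_loss.item()
```

The `create_graph=True` flag is what lets the meta-test loss, evaluated at the virtually updated parameters, backpropagate into the original parameters as a second-order gradient.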
Deeper, Broader and Artier Domain Generalization
The problem of domain generalization is to learn from multiple training
domains, and extract a domain-agnostic model that can then be applied to an
unseen domain. Domain generalization (DG) has a clear motivation in contexts
where there are target domains with distinct characteristics, yet sparse data
for training. Consider, for example, recognition in sketch images, which are
distinctly more abstract and rarer than photos. Nevertheless, DG methods have
primarily been evaluated on photo-only benchmarks focused on alleviating
dataset bias, where both problems, domain distinctiveness and data sparsity,
can be minimal. We argue that these benchmarks are overly straightforward, and
show that simple deep learning baselines perform surprisingly well on them. In this
paper, we make two main contributions. First, we build on the favorable
domain-shift-robust properties of deep learning methods and develop a low-rank
parameterized CNN model for end-to-end DG learning. Second, we develop a DG
benchmark dataset covering photo, sketch, cartoon and painting domains; this
benchmark is both more practically relevant and harder (it exhibits bigger
domain shift) than existing ones. The results show that our method outperforms
existing DG alternatives, and that our dataset provides a more significant DG
challenge to drive future research.

Comment: 9 pages, 4 figures, ICCV 2017
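As a rough illustration of the low-rank parameterization idea: the paper factorizes full CNN weight tensors across domains, but the same principle can be sketched on a single linear layer, with a shared, domain-agnostic term plus a low-rank, domain-specific correction. All names here are hypothetical, not the authors' model.

```python
import torch
import torch.nn as nn

class LowRankDomainLinear(nn.Module):
    """Illustrative layer: each domain d sees the weight
    W_d = W_shared + U @ diag(s_d) @ V, where U, V are shared low-rank
    factors and s_d is a small per-domain mixing vector."""
    def __init__(self, in_dim, out_dim, num_domains, rank=4):
        super().__init__()
        self.shared = nn.Linear(in_dim, out_dim)            # domain-agnostic part
        self.U = nn.Parameter(0.01 * torch.randn(out_dim, rank))
        self.V = nn.Parameter(0.01 * torch.randn(rank, in_dim))
        self.scales = nn.Parameter(torch.ones(num_domains, rank))

    def forward(self, x, domain=None):
        out = self.shared(x)
        if domain is not None:  # training: add the low-rank per-domain term
            delta = self.U @ torch.diag(self.scales[domain]) @ self.V
            out = out + x @ delta.t()
        return out              # domain=None: unseen test domain, shared only
```

The design intent is that at test time on an unseen domain only the shared part is used, while the low-rank per-domain corrections absorb domain-specific variation during training.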
Robust Target Training for Multi-Source Domain Adaptation
Given multiple labeled source domains and a single target domain, most
existing multi-source domain adaptation (MSDA) models are trained on data from
all domains jointly in one step. Such a one-step approach limits their ability
to adapt to the target domain, because the training set is dominated by the
more numerous labeled source-domain data. This source-domain bias can
potentially be alleviated by introducing a second training step, in which the
model is fine-tuned on the unlabeled target-domain data alone, using pseudo
labels as supervision. However, the pseudo labels are inevitably noisy and,
when used unchecked, can negatively impact model performance. To address this
problem, we propose a novel Bi-level Optimization based Robust Target Training
(BORT) method for MSDA. Given any existing fully-trained one-step MSDA model,
BORT turns it into a labeling function that generates pseudo labels for the
target data, and then trains a target model using the pseudo-labeled target
data only. Crucially, the target model is a stochastic CNN designed to be
intrinsically robust to the label noise produced by the labeling function.
This stochastic CNN models each target instance feature as a Gaussian
distribution, with an entropy-maximization regularizer deployed to measure
label uncertainty, which is in turn exploited to alleviate the negative impact
of noisy pseudo labels. Training the labeling function and the target model
poses a nested bi-level optimization problem, for which we formulate an
elegant solution based on implicit differentiation. Extensive experiments
demonstrate that our method achieves state-of-the-art performance on three
MSDA benchmarks, including the large-scale DomainNet dataset. Our code will be
available at \url{https://github.com/Zhongying-Deng/BORT2}.

Comment: Accepted to BMVC 2022
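A minimal sketch of the stochastic-feature idea, assuming a diagonal Gaussian per instance sampled via the reparameterization trick; `GaussianFeature` and the loss weighting noted below are hypothetical illustrations of the mechanism described in the abstract, not the released BORT code (which additionally handles the bi-level training via implicit differentiation).

```python
import math
import torch
import torch.nn as nn

class GaussianFeature(nn.Module):
    """Hypothetical stochastic feature head: each instance's feature is a
    diagonal Gaussian N(mu, sigma^2), sampled with the reparameterization
    trick; the Gaussian's entropy serves as a per-instance uncertainty score."""
    def __init__(self, in_dim, feat_dim):
        super().__init__()
        self.mu = nn.Linear(in_dim, feat_dim)
        self.log_var = nn.Linear(in_dim, feat_dim)

    def forward(self, x):
        mu, log_var = self.mu(x), self.log_var(x)
        # Reparameterized sample: z = mu + sigma * eps, eps ~ N(0, I).
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * log_var)
        # Entropy of a diagonal Gaussian: 0.5 * sum_i (log(2*pi*e) + log_var_i).
        entropy = 0.5 * (math.log(2 * math.pi * math.e) + log_var).sum(dim=1)
        return z, entropy
```

During target training, the entropy term would enter the objective with a maximization sign (e.g. `loss = ce(head(z), pseudo_y) - lam * entropy.mean()`), discouraging overconfident fits to noisy pseudo labels; exactly how BORT weights individual instances by their uncertainty is beyond what the abstract specifies.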