1,105 research outputs found

    Learning to Generalize: Meta-Learning for Domain Generalization

    Domain shift refers to the well-known problem that a model trained in one source domain performs poorly when applied to a target domain with different statistics. Domain Generalization (DG) techniques attempt to alleviate this issue by producing models which, by design, generalize well to novel testing domains. We propose a novel meta-learning method for domain generalization. Rather than designing a specific model that is robust to domain shift, as in most previous DG work, we propose a model-agnostic training procedure for DG. Our algorithm simulates train/test domain shift during training by synthesizing virtual testing domains within each mini-batch. The meta-optimization objective requires that steps taken to improve training-domain performance should also improve testing-domain performance. This meta-learning procedure trains models with good generalization ability to novel domains. We evaluate our method and achieve state-of-the-art results on a recent cross-domain image classification benchmark, as well as demonstrating its potential on two classic reinforcement learning tasks.
    Comment: 8 pages, 2 figures, under review for AAAI 2018
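    The meta-objective described above can be made concrete with a short sketch. Below is a minimal PyTorch illustration, assuming a plain linear model; the function names and hyperparameters are illustrative, not taken from the paper's released code. Each step takes a virtual gradient update on the meta-train domains, then requires the adapted parameters to also do well on a held-out virtual test domain.

    import torch
    import torch.nn.functional as F

    def loss_fn(w, b, x, y):
        # Cross-entropy loss of a plain linear model with parameters (w, b).
        return F.cross_entropy(x @ w + b, y)

    def meta_step(w, b, meta_train, meta_test, alpha=0.01, beta=1.0):
        # MLDG-style objective: a step that reduces the meta-train loss
        # should also reduce the loss on the held-out virtual test domain.
        x_tr, y_tr = meta_train
        x_te, y_te = meta_test
        train_loss = loss_fn(w, b, x_tr, y_tr)
        # Virtual gradient step; keep the graph so the meta-test loss can
        # differentiate through the update.
        gw, gb = torch.autograd.grad(train_loss, (w, b), create_graph=True)
        test_loss = loss_fn(w - alpha * gw, b - alpha * gb, x_te, y_te)
        return train_loss + beta * test_loss

    # Toy usage: three source domains; each step holds one out as the
    # virtual test domain.
    torch.manual_seed(0)
    w = torch.randn(16, 4, requires_grad=True)
    b = torch.zeros(4, requires_grad=True)
    domains = [(torch.randn(32, 16), torch.randint(0, 4, (32,)))
               for _ in range(3)]
    opt = torch.optim.SGD([w, b], lr=0.1)
    for step in range(100):
        k = step % len(domains)
        x_tr = torch.cat([d[0] for i, d in enumerate(domains) if i != k])
        y_tr = torch.cat([d[1] for i, d in enumerate(domains) if i != k])
        opt.zero_grad()
        meta_step(w, b, (x_tr, y_tr), domains[k]).backward()
        opt.step()

    Differentiating the meta-test loss through the virtual update (create_graph=True) is what couples the two objectives; a first-order approximation would drop that second-order term.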

    Deeper, Broader and Artier Domain Generalization

    The problem of domain generalization is to learn from multiple training domains and extract a domain-agnostic model that can then be applied to an unseen domain. Domain generalization (DG) has a clear motivation in contexts where there are target domains with distinct characteristics yet sparse data for training, for example, recognition in sketch images, which are distinctly more abstract and rarer than photos. Nevertheless, DG methods have primarily been evaluated on photo-only benchmarks focusing on alleviating dataset bias, where both problems of domain distinctiveness and data sparsity can be minimal. We argue that these benchmarks are overly straightforward, and show that simple deep learning baselines perform surprisingly well on them. In this paper, we make two main contributions. Firstly, we build upon the favorable domain-shift-robust properties of deep learning methods and develop a low-rank parameterized CNN model for end-to-end DG learning. Secondly, we develop a DG benchmark dataset covering photo, sketch, cartoon and painting domains. This is both more practically relevant and harder (bigger domain shift) than existing benchmarks. The results show that our method outperforms existing DG alternatives, and our dataset provides a more significant DG challenge to drive future research.
    Comment: 9 pages, 4 figures, ICCV 2017
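    The low-rank parameterization idea can be sketched as follows. This is an illustrative simplification, assuming each domain's layer weight is built from shared low-rank factors plus a small domain-specific code; it is not the paper's exact tensor decomposition, and the class name is hypothetical.

    import torch
    import torch.nn as nn

    class LowRankDomainLinear(nn.Module):
        # Each training domain's weight matrix is W_d = U diag(code_d) V,
        # so most parameters (U, V) are shared across domains and the rank
        # bounds how far any single domain can deviate.
        def __init__(self, in_dim, out_dim, num_domains, rank=4):
            super().__init__()
            self.U = nn.Parameter(torch.randn(out_dim, rank) * 0.1)   # shared
            self.V = nn.Parameter(torch.randn(rank, in_dim) * 0.1)    # shared
            self.codes = nn.Parameter(torch.ones(num_domains, rank))  # per-domain
            self.bias = nn.Parameter(torch.zeros(out_dim))

        def weight_for(self, d):
            return self.U @ torch.diag(self.codes[d]) @ self.V

        def forward(self, x, d):
            return x @ self.weight_for(d).T + self.bias

        def agnostic_weight(self):
            # One plausible domain-agnostic weight for an unseen domain:
            # average the per-domain codes.
            return self.U @ torch.diag(self.codes.mean(0)) @ self.V

    Because the per-domain capacity is limited to the rank-sized codes, the shared factors are pushed to capture what is common across domains, which is the property a DG model needs at test time.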

    Robust Target Training for Multi-Source Domain Adaptation

    Given multiple labeled source domains and a single target domain, most existing multi-source domain adaptation (MSDA) models are trained on data from all domains jointly in one step. Such a one-step approach limits their ability to adapt to the target domain, because the training set is dominated by the more numerous labeled source-domain data. This source-domain bias can potentially be alleviated by introducing a second training step in which the model is fine-tuned on the unlabeled target-domain data only, using pseudo-labels as supervision. However, the pseudo-labels are inevitably noisy and, when used unchecked, can negatively impact model performance. To address this problem, we propose a novel Bi-level Optimization based Robust Target Training (BORT$^2$) method for MSDA. Given any existing fully-trained one-step MSDA model, BORT$^2$ turns it into a labeling function to generate pseudo-labels for the target data, and trains a target model using pseudo-labeled target data only. Crucially, the target model is a stochastic CNN designed to be intrinsically robust against the label noise generated by the labeling function. Such a stochastic CNN models each target instance feature as a Gaussian distribution, with an entropy maximization regularizer deployed to measure label uncertainty, which is further exploited to alleviate the negative impact of noisy pseudo-labels. Training the labeling function and the target model poses a nested bi-level optimization problem, for which we formulate an elegant solution based on implicit differentiation. Extensive experiments demonstrate that our proposed method achieves state-of-the-art performance on three MSDA benchmarks, including the large-scale DomainNet dataset. Our code will be available at \url{https://github.com/Zhongying-Deng/BORT2}
    Comment: Accepted to BMVC 2022
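    A minimal PyTorch sketch of the noise-robustness ingredients named in the abstract follows: a stochastic head that models each instance feature as a Gaussian, an entropy regularizer, and an uncertainty weight that down-weights likely-noisy pseudo-labels. The class and function names are hypothetical, and this is one plausible reading of the abstract, not the released BORT$^2$ implementation (which additionally solves the bi-level problem via implicit differentiation).

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class StochasticHead(nn.Module):
        # Models each instance feature as a Gaussian N(mu, sigma^2) and
        # classifies a reparameterized sample from it.
        def __init__(self, feat_dim, num_classes):
            super().__init__()
            self.mu = nn.Linear(feat_dim, feat_dim)
            self.log_var = nn.Linear(feat_dim, feat_dim)
            self.classifier = nn.Linear(feat_dim, num_classes)

        def forward(self, f):
            mu, log_var = self.mu(f), self.log_var(f)
            z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()
            # Gaussian differential entropy (up to a constant); maximizing it
            # keeps the feature distribution from collapsing to a point.
            gauss_entropy = 0.5 * log_var.sum(dim=1).mean()
            return self.classifier(z), gauss_entropy

    def robust_pseudo_label_loss(logits, pseudo_labels, gauss_entropy, lam=0.1):
        probs = F.softmax(logits, dim=1)
        pred_entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)
        # Entropy normalized to [0, 1]: confident predictions get weight
        # near 1, uncertain (likely mislabeled) ones near 0.
        weights = 1.0 - pred_entropy / torch.log(torch.tensor(float(logits.size(1))))
        ce = F.cross_entropy(logits, pseudo_labels, reduction="none")
        return (weights.detach() * ce).mean() - lam * gauss_entropy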

    Now You See Me: Deep Face Hallucination for Unviewed Sketches
