Learning to Generate Novel Domains for Domain Generalization
This paper focuses on domain generalization (DG), the task of learning from
multiple source domains a model that generalizes well to unseen domains. A main
challenge for DG is that the available source domains often exhibit limited
diversity, hampering the model's ability to learn to generalize. We therefore
employ a data generator to synthesize data from pseudo-novel domains to augment
the source domains. This explicitly increases the diversity of available
training domains and leads to a more generalizable model. To train the
generator, we model the distribution divergence between source and synthesized
pseudo-novel domains using optimal transport, and maximize the divergence. To
ensure that semantics are preserved in the synthesized data, we further impose
cycle-consistency and classification losses on the generator. Our method,
L2A-OT (Learning to Augment by Optimal Transport), outperforms current
state-of-the-art DG methods on four benchmark datasets.
Comment: To appear in ECCV'2
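The full L2A-OT generator training is beyond the scope of an abstract, but the OT-based divergence it maximizes can be sketched. Below is a minimal, numpy-only entropic (Sinkhorn) OT cost between two feature batches; the function name, the regularization value, and the toy data are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def sinkhorn_cost(x, y, eps=5.0, n_iters=200):
    """Entropy-regularized OT cost between two feature batches.

    x, y: (n, d) and (m, d) arrays, e.g. features from a source domain
    and a synthesized pseudo-novel domain. Returns <P, C>, the transport
    cost under the (approximate) optimal plan P.
    """
    n, m = len(x), len(y)
    # Pairwise squared-Euclidean cost matrix.
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    K = np.exp(-C / eps)                       # Gibbs kernel
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    u = np.ones(n)
    for _ in range(n_iters):                   # Sinkhorn iterations
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]            # transport plan
    return float((P * C).sum())

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(64, 8))
near = rng.normal(0.2, 1.0, size=(64, 8))      # small distribution shift
far = rng.normal(3.0, 1.0, size=(64, 8))       # large distribution shift
assert sinkhorn_cost(src, far) > sinkhorn_cost(src, near) > 0
```

A generator pursuing L2A-OT's objective would push its synthesized features in the direction that increases this cost relative to the source domains, while the cycle-consistency and classification losses keep the semantics intact.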
Learning Domain Invariant Prompt for Vision-Language Models
Prompt learning is one of the most effective and trending ways to adapt
powerful vision-language foundation models like CLIP to downstream datasets by
tuning learnable prompt vectors with very few samples. However, although prompt
learning achieves excellent performance over in-domain data, it still faces the
major challenge of generalizing to unseen classes and domains. Some existing
prompt learning methods tackle this issue by adaptively generating different
prompts for different tokens or domains, but they neglect the ability of learned
prompts to generalize to unseen domains. In this paper, we propose a novel
prompt learning paradigm, called MetaPrompt, that directly generates
\emph{domain-invariant} prompts that generalize to unseen domains. Specifically, a
dual-modality prompt tuning network is proposed to generate prompts for input
from both image and text modalities. With a novel asymmetric contrastive loss,
the representation from the original pre-trained vision-language model acts as
supervision to enhance the generalization ability of the learned prompt. More
importantly, we propose a meta-learning-based prompt tuning algorithm that
explicitly constrains the task-specific prompt tuned for one domain or class to
also achieve good performance in another domain or class. Extensive experiments
on 11 datasets for base-to-new generalization and 4 datasets for domain
generalization demonstrate that our method consistently and significantly
outperforms existing methods.
Comment: 12 pages, 6 figures, 5 table
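The exact asymmetric contrastive loss is defined in the paper; as a rough sketch of the stated idea (frozen pretrained features acting as fixed supervision targets for the prompt-tuned features), here is a one-directional InfoNCE loss in numpy. The function name, temperature, and toy data are illustrative assumptions, and the asymmetry (no gradient into the frozen branch) is only conceptual here since numpy has no autograd:

```python
import numpy as np

def asymmetric_contrastive_loss(prompt_feats, frozen_feats, tau=0.07):
    """One-directional InfoNCE: prompt-tuned features are pulled toward
    the frozen pretrained model's features of the same images, which act
    purely as fixed supervision targets."""
    # L2-normalize both views.
    p = prompt_feats / np.linalg.norm(prompt_feats, axis=1, keepdims=True)
    f = frozen_feats / np.linalg.norm(frozen_feats, axis=1, keepdims=True)
    logits = (p @ f.T) / tau                   # (n, n) similarity matrix
    # Row-wise softmax cross-entropy with the diagonal as positives.
    logits -= logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(1)
feats = rng.normal(size=(16, 32))
aligned = asymmetric_contrastive_loss(feats, feats)             # matched views
mismatched = asymmetric_contrastive_loss(feats, np.roll(feats, 1, axis=0))
assert 0 <= aligned < mismatched
```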
Quantitatively Measuring and Contrastively Exploring Heterogeneity for Domain Generalization
Domain generalization (DG) is a prevalent problem in real-world applications,
which aims to train well-generalized models for unseen target domains by
utilizing several source domains. Since domain labels, i.e., which domain each
data point is sampled from, naturally exist, most DG algorithms treat them as a
kind of supervision information to improve the generalization performance.
However, the original domain labels may not be the optimal supervision signal
due to the lack of domain heterogeneity, i.e., the diversity among domains. For
example, a sample in one domain may actually lie closer to another domain, so its
original label can act as noise that disturbs generalization learning. Although
some methods try to solve this by re-dividing the domains and applying the newly
generated dividing pattern, the pattern they choose may not be the most
heterogeneous, since no metric for heterogeneity exists to guide the choice. In this paper,
we point out that domain heterogeneity mainly lies in variant features under
the invariant learning framework. With contrastive learning, we propose a
learning potential-guided metric for domain heterogeneity by promoting learning
variant features. Then we note the difference between seeking variance-based
heterogeneity and training an invariance-based generalizable model. We thus
propose a novel method called Heterogeneity-based Two-stage Contrastive
Learning (HTCL) for the DG task. In the first stage, we generate the most
heterogeneous dividing pattern with our contrastive metric. In the second
stage, we employ invariance-aimed contrastive learning by re-building pairs
according to the stable relations indicated by domains and classes, which better
utilizes the generated domain labels for generalization learning. Extensive experiments show
HTCL better exploits heterogeneity and yields strong generalization performance.
Comment: This paper has been accepted by KDD 202
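How HTCL re-builds pairs exactly is specified in the paper's method section; below is a minimal sketch of the stated idea, pairing samples that share a class but differ in (generated) domain label so a contrastive objective rewards domain-invariant features. The helper name and toy labels are illustrative assumptions:

```python
import numpy as np

def build_invariance_pairs(class_labels, domain_labels):
    """Index pairs (i, j) with the same class but different (generated)
    domain labels: positives for invariance-aimed contrastive learning,
    pushing the model to ignore domain-specific variation."""
    pairs = []
    n = len(class_labels)
    for i in range(n):
        for j in range(i + 1, n):
            same_class = class_labels[i] == class_labels[j]
            diff_domain = domain_labels[i] != domain_labels[j]
            if same_class and diff_domain:
                pairs.append((i, j))
    return pairs

classes = np.array([0, 0, 1, 1])
domains = np.array([0, 1, 0, 0])
pairs = build_invariance_pairs(classes, domains)
assert pairs == [(0, 1)]   # same class 0, different domains 0 vs 1
```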
Unsupervised Domain Adaptation on Reading Comprehension
Reading comprehension (RC) has been studied in a variety of datasets with the
boosted performance brought by deep neural networks. However, the
generalization capability of these models across different domains remains
unclear. To alleviate this issue, we investigate unsupervised domain
adaptation on RC, wherein a model is trained on a labeled source domain
and then applied to a target domain with only unlabeled samples. We first
show that even with the powerful BERT contextual representation, the
performance is still unsatisfactory when the model trained on one dataset is
directly applied to another target dataset. To solve this, we provide a novel
conditional adversarial self-training method (CASe). Specifically, our approach
leverages a BERT model fine-tuned on the source dataset along with the
confidence filtering to generate reliable pseudo-labeled samples in the target
domain for self-training. On the other hand, it further reduces domain
distribution discrepancy through conditional adversarial learning across
domains. Extensive experiments show our approach achieves comparable accuracy
to supervised models on multiple large-scale benchmark datasets.
Comment: 8 pages, 6 figures, 5 tables, Accepted by AAAI 202
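The confidence-filtering step of CASe can be sketched as follows: keep only target samples whose maximum softmax probability clears a threshold, and use the argmax as the pseudo-label for self-training. The threshold value, function name, and toy probabilities are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def filter_pseudo_labels(probs, threshold=0.9):
    """Keep only target-domain samples whose maximum predicted class
    probability exceeds `threshold`; return (indices, pseudo_labels)."""
    confidence = probs.max(axis=1)
    keep = np.where(confidence >= threshold)[0]
    return keep, probs[keep].argmax(axis=1)

# Softmax outputs of a source-fine-tuned model on 4 target samples.
probs = np.array([
    [0.95, 0.03, 0.02],   # confident  -> kept, pseudo-label 0
    [0.40, 0.35, 0.25],   # ambiguous  -> dropped
    [0.05, 0.92, 0.03],   # confident  -> kept, pseudo-label 1
    [0.60, 0.30, 0.10],   # below threshold -> dropped
])
idx, labels = filter_pseudo_labels(probs)
assert idx.tolist() == [0, 2] and labels.tolist() == [0, 1]
```

The kept samples would then be mixed into self-training, while the conditional adversarial component separately reduces the distribution gap between domains.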
Domain Generalization via Balancing Training Difficulty and Model Capability
Domain generalization (DG) aims to learn domain-generalizable models from one
or multiple source domains that can perform well in unseen target domains.
Despite its recent progress, most existing work suffers from the misalignment
between the difficulty level of training samples and the capability of
contemporarily trained models, leading to over-fitting or under-fitting in the
trained generalization model. We design MoDify, a Momentum Difficulty framework
that tackles the misalignment by balancing the seesaw between the model's
capability and the samples' difficulty throughout the training process. MoDify
consists of two novel designs that collaborate to fight against the
misalignment while learning domain-generalizable models. The first is
MoDify-based Data Augmentation which exploits an RGB Shuffle technique to
generate difficulty-aware training samples on the fly. The second is
MoDify-based Network Optimization which dynamically schedules the training
samples for balanced and smooth learning with appropriate difficulty. Without
bells and whistles, a simple implementation of MoDify achieves superior
performance across multiple benchmarks. In addition, MoDify can complement
existing methods as a plug-in, and it is generic and can work for different
visual recognition tasks.
Comment: 11 pages, 6 figures, Accepted by ICCV 202
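The RGB Shuffle operation itself is simple to sketch: permute the color channels of an image while leaving spatial content untouched. The difficulty-aware scheduling that MoDify wraps around it is omitted here; the function name and toy data are illustrative assumptions:

```python
import numpy as np

def rgb_shuffle(image, rng):
    """Randomly permute the channels of an (H, W, 3) image to create a
    color-perturbed training sample, keeping shape and spatial content
    intact."""
    perm = rng.permutation(3)
    return image[..., perm], perm

rng = np.random.default_rng(42)
# Toy 2x2 image with constant channels R=10, G=20, B=30.
img = np.stack([np.full((2, 2), c) for c in (10, 20, 30)], axis=-1)
shuffled, perm = rgb_shuffle(img, rng)
# Each output channel is one of the original constant channels.
assert sorted(shuffled[0, 0].tolist()) == [10, 20, 30]
assert shuffled.shape == img.shape
```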
Adversarial Discriminative Domain Adaptation
Adversarial learning methods are a promising approach to training robust deep
networks, and can generate complex samples across diverse domains. They also
can improve recognition despite the presence of domain shift or dataset bias:
several adversarial approaches to unsupervised domain adaptation have recently
been introduced, which reduce the difference between the training and test
domain distributions and thus improve generalization performance. Prior
generative approaches show compelling visualizations, but are not optimal on
discriminative tasks and can be limited to smaller shifts. Prior discriminative
approaches could handle larger domain shifts, but imposed tied weights on the
model and did not exploit a GAN-based loss. We first outline a novel
generalized framework for adversarial adaptation, which subsumes recent
state-of-the-art approaches as special cases, and we use this generalized view
to better relate the prior approaches. We propose a previously unexplored
instance of our general framework which combines discriminative modeling,
untied weight sharing, and a GAN loss, which we call Adversarial Discriminative
Domain Adaptation (ADDA). We show that ADDA is more effective yet considerably
simpler than competing domain-adversarial methods, and demonstrate the promise
of our approach by exceeding state-of-the-art unsupervised adaptation results
on standard cross-domain digit classification tasks and a new more difficult
cross-modality object classification task.
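ADDA's adversarial stage can be sketched at the loss level: the discriminator learns to label source features 1 and target features 0, while the untied target encoder minimizes the inverted-label GAN loss, i.e. tries to make its features look "source". The numpy sketch below only evaluates these objectives on fixed logits; the names and values are illustrative assumptions, and the encoders and optimizers are omitted:

```python
import numpy as np

def bce(logits, labels):
    """Numerically stable binary cross-entropy from logits."""
    return float(np.mean(
        np.maximum(logits, 0) - logits * labels
        + np.log1p(np.exp(-np.abs(logits)))))

def adda_losses(d_src_logits, d_tgt_logits):
    """ADDA-style objectives: the discriminator wants source=1, target=0;
    the target encoder is trained with the inverted-label GAN loss, i.e.
    to make the discriminator output 'source' (1) on target features."""
    disc_loss = (bce(d_src_logits, np.ones_like(d_src_logits))
                 + bce(d_tgt_logits, np.zeros_like(d_tgt_logits)))
    map_loss = bce(d_tgt_logits, np.ones_like(d_tgt_logits))
    return disc_loss, map_loss

# A discriminator that separates domains well has a low discriminator
# loss but a high mapping loss; once the target encoder fools it, the
# relation flips.
well_sep = adda_losses(np.array([4.0, 5.0]), np.array([-4.0, -5.0]))
fooled = adda_losses(np.array([4.0, 5.0]), np.array([4.0, 5.0]))
assert well_sep[0] < fooled[0] and well_sep[1] > fooled[1]
```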
Multi-component Image Translation for Deep Domain Generalization
Domain adaptation (DA) and domain generalization (DG) are two closely related
methods which are both concerned with the task of assigning labels to an
unlabeled data set. The only difference between these approaches is that DA
can access the target data during the training phase, while the target data is
totally unseen during the training phase in DG. The task of DG is challenging
as we have no prior knowledge of the target samples. If DA methods are
applied directly to DG by simply excluding the target data from training,
poor performance will result. In this paper, we tackle the
domain generalization challenge in two ways. In our first approach, we propose
a novel deep domain generalization architecture utilizing synthetic data
generated by a Generative Adversarial Network (GAN). The discrepancy between
the source images and the generated synthetic images is minimized using existing
domain discrepancy metrics such as maximum mean discrepancy or correlation alignment.
In our second approach, we introduce a protocol for applying DA methods to a DG
scenario by excluding the target data from the training phase, splitting the
source data to training and validation parts, and treating the validation data
as target data for DA. We conduct extensive experiments on four cross-domain
benchmark datasets. Experimental results show that our proposed model outperforms
the current state-of-the-art methods for DG.
Comment: Accepted in WACV 201
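Maximum mean discrepancy, one of the two metrics the first approach minimizes, has a compact closed form. Below is a minimal numpy sketch with an RBF kernel; the bandwidth, names, and toy data are illustrative assumptions:

```python
import numpy as np

def mmd_rbf(x, y, gamma=1.0):
    """Squared maximum mean discrepancy with an RBF kernel: a standard
    domain-discrepancy metric between two feature batches x (n, d) and
    y (m, d)."""
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return float(k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean())

rng = np.random.default_rng(3)
a = rng.normal(0.0, 1.0, size=(100, 4))
b = rng.normal(0.0, 1.0, size=(100, 4))   # same distribution as a
c = rng.normal(2.0, 1.0, size=(100, 4))   # shifted distribution
assert mmd_rbf(a, c) > mmd_rbf(a, b)       # shift -> larger discrepancy
```

Minimizing such a quantity between generated and real images pulls the two distributions together; correlation alignment (CORAL) plays the same role by matching second-order feature statistics.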