Understanding Gradual Domain Adaptation: Improved Analysis, Optimal Path and Beyond
The vast majority of existing algorithms for unsupervised domain adaptation
(UDA) focus on adapting from a labeled source domain to an unlabeled target
domain directly in a one-off way. Gradual domain adaptation (GDA), on the other
hand, assumes a path of unlabeled intermediate domains bridging the
source and target, and aims to provide better generalization in the target
domain by leveraging the intermediate ones. Under certain assumptions, Kumar et
al. (2020) proposed a simple algorithm, Gradual Self-Training, along with a
generalization bound in the order of $e^{O(T)}\big(\varepsilon_0 + O\big(\sqrt{\log T/n}\big)\big)$ for the target domain
error, where $\varepsilon_0$ is the source domain error, $T$ is the number of
intermediate domains, and $n$ is the data size of each domain. Due to the
exponential factor, this upper bound becomes vacuous when $T$ is only moderately
large. In this work, we analyze gradual self-training under more general and
relaxed assumptions, and prove a significantly improved generalization bound of
$\widetilde{O}\big(\varepsilon_0 + T\Delta + T/\sqrt{n} + 1/\sqrt{nT}\big)$,
where $\Delta$ is the average distributional distance between consecutive
domains. Compared with the existing bound, which carries an exponential dependency on
$T$ as a multiplicative factor, our bound depends on $T$ only linearly and
additively. Perhaps more interestingly, our result implies the existence of an
optimal choice of $T$ that minimizes the generalization error, and it also
naturally suggests an optimal way to construct the path of intermediate domains
so as to minimize the accumulative path length $T\Delta$ between the source and
target. To corroborate the implications of our theory, we examine gradual
self-training on multiple semi-synthetic and real datasets, which confirms our
findings. We believe our insights provide a path forward toward the design of
future GDA algorithms.
Comment: The code will be released at
https://github.com/Haoxiang-Wang/gradual-domain-adaptatio
PAC-Bayesian Domain Adaptation Bounds for Multiclass Learners
Multiclass neural networks are a common tool in modern unsupervised domain
adaptation, yet an appropriate theoretical description for their non-uniform
sample complexity is lacking in the adaptation literature. To fill this gap, we
propose the first PAC-Bayesian adaptation bounds for multiclass learners. We
facilitate practical use of our bounds by also proposing the first
approximation techniques for the multiclass distribution divergences we
consider. For divergences dependent on a Gibbs predictor, we propose additional
PAC-Bayesian adaptation bounds which remove the need for inefficient
Monte-Carlo estimation. Empirically, we test the efficacy of our proposed
approximation techniques as well as some novel design-concepts which we include
in our bounds. Finally, we apply our bounds to analyze a common adaptation
algorithm that uses neural networks.
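The Gibbs predictor mentioned here is a stochastic classifier that draws a fresh hypothesis from the posterior $Q$ for each prediction, so its risk is an expectation over $Q$. The naive estimator the abstract calls inefficient is plain Monte-Carlo sampling of that expectation; a minimal sketch, where the `sample_hypothesis(rng)` interface is an assumption for illustration:

```python
import numpy as np

def gibbs_risk_mc(sample_hypothesis, X, y, n_draws=1000, seed=0):
    """Monte-Carlo estimate of the Gibbs risk: the expected 0-1 loss of a
    classifier h drawn from the posterior Q. `sample_hypothesis(rng)` must
    return a callable h mapping inputs to predicted labels."""
    rng = np.random.default_rng(seed)
    losses = [np.mean(sample_hypothesis(rng)(X) != y)
              for _ in range(n_draws)]
    return float(np.mean(losses))
```

Each extra digit of precision multiplies the number of posterior draws (and hence full passes over the data) by 100, which is why bounds that avoid this estimation step are attractive.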
A Survey on Negative Transfer
Transfer learning (TL) tries to utilize data or knowledge from one or more
source domains to facilitate the learning in a target domain. It is
particularly useful when the target domain has few or no labeled data, due to
annotation expense, privacy concerns, etc. Unfortunately, the effectiveness of
TL is not always guaranteed. Negative transfer (NT), i.e., the source domain
data/knowledge cause reduced learning performance in the target domain, has
been a long-standing and challenging problem in TL. Various approaches to
handle NT have been proposed in the literature. However, this field lacks a
systematic survey on the formalization of NT, its contributing factors, and the
algorithms that handle it. This paper aims to fill that gap. First, the
definition of negative transfer is considered and a taxonomy of its factors is
discussed. Then, nearly fifty representative approaches for handling NT are
categorized and reviewed from four perspectives: secure transfer, domain
similarity estimation, distant transfer, and negative transfer mitigation. NT
in related fields, e.g., multi-task learning, lifelong learning, and
adversarial attacks, is also discussed.