
    Understanding Gradual Domain Adaptation: Improved Analysis, Optimal Path and Beyond

    The vast majority of existing algorithms for unsupervised domain adaptation (UDA) focus on adapting from a labeled source domain to an unlabeled target domain directly, in a one-off way. Gradual domain adaptation (GDA), on the other hand, assumes a path of $(T-1)$ unlabeled intermediate domains bridging the source and target, and aims to provide better generalization in the target domain by leveraging the intermediate ones. Under certain assumptions, Kumar et al. (2020) proposed a simple algorithm, Gradual Self-Training, along with a generalization bound of order $e^{O(T)}\left(\varepsilon_0 + O\left(\sqrt{\log(T)/n}\right)\right)$ for the target domain error, where $\varepsilon_0$ is the source domain error and $n$ is the data size of each domain. Due to the exponential factor, this upper bound becomes vacuous when $T$ is only moderately large. In this work, we analyze gradual self-training under more general and relaxed assumptions, and prove a significantly improved generalization bound of $\widetilde{O}\left(\varepsilon_0 + T\Delta + T/\sqrt{n} + 1/\sqrt{nT}\right)$, where $\Delta$ is the average distributional distance between consecutive domains. Compared with the existing bound, which has an exponential dependency on $T$ as a multiplicative factor, our bound depends on $T$ only linearly and additively. Perhaps more interestingly, our result implies the existence of an optimal choice of $T$ that minimizes the generalization error, and it also naturally suggests an optimal way to construct the path of intermediate domains so as to minimize the accumulated path length $T\Delta$ between the source and target. To corroborate the implications of our theory, we examine gradual self-training on multiple semi-synthetic and real datasets, which confirms our findings. We believe our insights provide a path forward toward the design of future GDA algorithms.
    Comment: The code will be released at https://github.com/Haoxiang-Wang/gradual-domain-adaptatio
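
    Illustrative sketch (not part of the paper above): the gradual self-training procedure analyzed here starts from a source-trained classifier and, at each step along the path, pseudo-labels the next unlabeled intermediate domain with the current model and refits on those pseudo-labels. The sketch below assumes a simple scikit-learn logistic-regression learner and synthetic data; all names are illustrative and not taken from the paper's released code.

```python
# Minimal sketch of gradual self-training (Kumar et al., 2020), assuming a
# scikit-learn linear classifier; illustrative only, not the paper's code.
import numpy as np
from sklearn.linear_model import LogisticRegression

def gradual_self_train(source_X, source_y, intermediate_domains):
    """Fit on the labeled source, then adapt through each unlabeled
    intermediate domain by pseudo-labeling with the current model and
    refitting on the pseudo-labels."""
    model = LogisticRegression(max_iter=1000).fit(source_X, source_y)
    for X_t in intermediate_domains:       # unlabeled domains along the path
        pseudo_y = model.predict(X_t)      # pseudo-label with the current model
        model = LogisticRegression(max_iter=1000).fit(X_t, pseudo_y)
    return model

# Toy usage: a 2D problem whose feature distribution shifts gradually over T steps.
rng = np.random.default_rng(0)
source_X = rng.normal(size=(200, 2))
source_y = (source_X[:, 0] > 0).astype(int)
T = 5
path = [source_X + np.array([0.3 * t, 0.0]) for t in range(1, T + 1)]
adapted_model = gradual_self_train(source_X, source_y, path)
```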

    PAC-Bayesian Domain Adaptation Bounds for Multiclass Learners

    Multiclass neural networks are a common tool in modern unsupervised domain adaptation, yet an appropriate theoretical description of their non-uniform sample complexity is lacking in the adaptation literature. To fill this gap, we propose the first PAC-Bayesian adaptation bounds for multiclass learners. We facilitate practical use of our bounds by also proposing the first approximation techniques for the multiclass distribution divergences we consider. For divergences that depend on a Gibbs predictor, we propose additional PAC-Bayesian adaptation bounds which remove the need for inefficient Monte-Carlo estimation. Empirically, we test the efficacy of our proposed approximation techniques as well as some novel design concepts which we include in our bounds. Finally, we apply our bounds to analyze a common adaptation algorithm that uses neural networks.
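
    Illustrative sketch (not from the paper above): a Gibbs predictor draws a hypothesis from a posterior distribution Q for each prediction, so quantities that depend on it are commonly estimated by Monte-Carlo averaging over sampled hypotheses, which is the kind of estimation the abstract calls inefficient. The sketch below assumes a diagonal Gaussian posterior over linear-classifier weights; every name and modeling choice here is an assumption for illustration.

```python
# Hedged sketch: Monte-Carlo estimate of a Gibbs predictor's 0-1 risk under an
# assumed diagonal Gaussian posterior Q over linear-classifier weights.
import numpy as np

def gibbs_risk_mc(X, y, mean_w, std_w, n_samples=100, seed=0):
    """Average the empirical 0-1 error of hypotheses sampled from
    Q = N(mean_w, std_w^2 I); the mean is a Monte-Carlo Gibbs-risk estimate."""
    rng = np.random.default_rng(seed)
    risks = []
    for _ in range(n_samples):
        w = rng.normal(mean_w, std_w)        # sample one hypothesis from Q
        preds = (X @ w > 0).astype(int)      # its hard predictions
        risks.append(np.mean(preds != y))    # its empirical 0-1 error
    return float(np.mean(risks))

# Toy usage on synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = (X @ np.array([1.0, -0.5, 0.2]) > 0).astype(int)
print(gibbs_risk_mc(X, y, mean_w=np.array([1.0, -0.5, 0.2]), std_w=0.3))
```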

    A Survey on Negative Transfer

    Transfer learning (TL) tries to utilize data or knowledge from one or more source domains to facilitate learning in a target domain. It is particularly useful when the target domain has few or no labeled data, due to annotation expense, privacy concerns, etc. Unfortunately, the effectiveness of TL is not always guaranteed. Negative transfer (NT), i.e., the source domain data/knowledge causing reduced learning performance in the target domain, has been a long-standing and challenging problem in TL. Various approaches to handle NT have been proposed in the literature. However, this field lacks a systematic survey on the formalization of NT, its factors, and the algorithms that handle it. This paper proposes to fill this gap. First, the definition of negative transfer is considered and a taxonomy of its factors is discussed. Then, nearly fifty representative approaches for handling NT are categorized and reviewed from four perspectives: secure transfer, domain similarity estimation, distant transfer, and negative transfer mitigation. NT in related fields, e.g., multi-task learning, lifelong learning, and adversarial attacks, is also discussed.
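
    Illustrative sketch (not from the survey above): one generic form of domain similarity estimation, the second of the four perspectives listed, is to compute a kernel Maximum Mean Discrepancy (MMD) between source and target feature samples; a large value suggests dissimilar domains and hence a higher risk of negative transfer. The RBF kernel and bandwidth below are illustrative assumptions, not a method attributed to the survey.

```python
# Hedged sketch: RBF-kernel MMD^2 between source and target samples as a
# simple domain similarity estimate (generic illustration only).
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    """Biased estimate of squared MMD between samples X and Y with an RBF kernel."""
    def k(A, B):
        # Pairwise squared Euclidean distances, then RBF kernel values.
        d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

# Toy usage: a small mean shift between "source" and "target" features.
rng = np.random.default_rng(0)
src = rng.normal(size=(300, 5))
tgt = rng.normal(loc=0.5, size=(300, 5))
print(rbf_mmd2(src, tgt))   # larger values suggest more dissimilar domains
```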