
    Understanding Gradual Domain Adaptation: Improved Analysis, Optimal Path and Beyond

    The vast majority of existing algorithms for unsupervised domain adaptation (UDA) focus on adapting from a labeled source domain to an unlabeled target domain directly, in a one-off way. Gradual domain adaptation (GDA), on the other hand, assumes a path of $(T-1)$ unlabeled intermediate domains bridging the source and target, and aims to provide better generalization in the target domain by leveraging the intermediate ones. Under certain assumptions, Kumar et al. (2020) proposed a simple algorithm, Gradual Self-Training, along with a generalization bound of order $e^{O(T)}\left(\varepsilon_0 + O\left(\sqrt{\log(T)/n}\right)\right)$ for the target domain error, where $\varepsilon_0$ is the source domain error and $n$ is the data size of each domain. Due to the exponential factor, this upper bound becomes vacuous when $T$ is only moderately large. In this work, we analyze gradual self-training under more general and relaxed assumptions, and prove a significantly improved generalization bound of $\widetilde{O}\left(\varepsilon_0 + T\Delta + T/\sqrt{n} + 1/\sqrt{nT}\right)$, where $\Delta$ is the average distributional distance between consecutive domains. Compared with the existing bound, which has an exponential dependency on $T$ as a multiplicative factor, our bound depends on $T$ only linearly and additively. Perhaps more interestingly, our result implies the existence of an optimal choice of $T$ that minimizes the generalization error, and it also naturally suggests an optimal way to construct the path of intermediate domains so as to minimize the accumulated path length $T\Delta$ between the source and target. To corroborate the implications of our theory, we examine gradual self-training on multiple semi-synthetic and real datasets, which confirms our findings. We believe our insights provide a path forward toward the design of future GDA algorithms.
    Comment: The code will be released at https://github.com/Haoxiang-Wang/gradual-domain-adaptatio
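
    Illustrative sketch (not part of the paper above): the gradual self-training procedure analyzed here starts from a source-trained classifier and, at each step along the path, pseudo-labels the next unlabeled intermediate domain with the current model and refits on those pseudo-labels. The sketch below assumes a simple scikit-learn logistic-regression learner and synthetic data; all names are illustrative and not taken from the paper's released code.

```python
# Minimal sketch of gradual self-training (Kumar et al., 2020), assuming a
# scikit-learn linear classifier; illustrative only, not the paper's code.
import numpy as np
from sklearn.linear_model import LogisticRegression

def gradual_self_train(source_X, source_y, intermediate_domains):
    """Fit on the labeled source, then adapt through each unlabeled
    intermediate domain by pseudo-labeling with the current model and
    refitting on the pseudo-labels."""
    model = LogisticRegression(max_iter=1000).fit(source_X, source_y)
    for X_t in intermediate_domains:       # unlabeled domains along the path
        pseudo_y = model.predict(X_t)      # pseudo-label with the current model
        model = LogisticRegression(max_iter=1000).fit(X_t, pseudo_y)
    return model

# Toy usage: a 2D problem whose feature distribution shifts gradually over T steps.
rng = np.random.default_rng(0)
source_X = rng.normal(size=(200, 2))
source_y = (source_X[:, 0] > 0).astype(int)
T = 5
path = [source_X + np.array([0.3 * t, 0.0]) for t in range(1, T + 1)]
adapted_model = gradual_self_train(source_X, source_y, path)
```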

    PAC-Bayesian Domain Adaptation Bounds for Multiclass Learners

    Multiclass neural networks are a common tool in modern unsupervised domain adaptation, yet an appropriate theoretical description of their non-uniform sample complexity is lacking in the adaptation literature. To fill this gap, we propose the first PAC-Bayesian adaptation bounds for multiclass learners. We facilitate practical use of our bounds by also proposing the first approximation techniques for the multiclass distribution divergences we consider. For divergences that depend on a Gibbs predictor, we propose additional PAC-Bayesian adaptation bounds which remove the need for inefficient Monte-Carlo estimation. Empirically, we test the efficacy of our proposed approximation techniques as well as some novel design concepts which we include in our bounds. Finally, we apply our bounds to analyze a common adaptation algorithm that uses neural networks.
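
    Illustrative sketch (not from the paper above): a Gibbs predictor draws a hypothesis from a posterior distribution Q for each prediction, so quantities that depend on it are commonly estimated by Monte-Carlo averaging over sampled hypotheses, which is the kind of estimation the abstract calls inefficient. The sketch below assumes a diagonal Gaussian posterior over linear-classifier weights; every name and modeling choice here is an assumption for illustration.

```python
# Hedged sketch: Monte-Carlo estimate of a Gibbs predictor's 0-1 risk under an
# assumed diagonal Gaussian posterior Q over linear-classifier weights.
import numpy as np

def gibbs_risk_mc(X, y, mean_w, std_w, n_samples=100, seed=0):
    """Average the empirical 0-1 error of hypotheses sampled from
    Q = N(mean_w, std_w^2 I); the mean is a Monte-Carlo Gibbs-risk estimate."""
    rng = np.random.default_rng(seed)
    risks = []
    for _ in range(n_samples):
        w = rng.normal(mean_w, std_w)        # sample one hypothesis from Q
        preds = (X @ w > 0).astype(int)      # its hard predictions
        risks.append(np.mean(preds != y))    # its empirical 0-1 error
    return float(np.mean(risks))

# Toy usage on synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = (X @ np.array([1.0, -0.5, 0.2]) > 0).astype(int)
print(gibbs_risk_mc(X, y, mean_w=np.array([1.0, -0.5, 0.2]), std_w=0.3))
```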

    A Survey on Negative Transfer

    Transfer learning (TL) tries to utilize data or knowledge from one or more source domains to facilitate learning in a target domain. It is particularly useful when the target domain has few or no labeled data, due to annotation expense, privacy concerns, etc. Unfortunately, the effectiveness of TL is not always guaranteed. Negative transfer (NT), i.e., the source domain data/knowledge causing reduced learning performance in the target domain, has been a long-standing and challenging problem in TL. Various approaches to handle NT have been proposed in the literature. However, this field lacks a systematic survey on the formalization of NT, its factors, and the algorithms that handle it. This paper proposes to fill this gap. First, the definition of negative transfer is considered and a taxonomy of its factors is discussed. Then, nearly fifty representative approaches for handling NT are categorized and reviewed from four perspectives: secure transfer, domain similarity estimation, distant transfer, and negative transfer mitigation. NT in related fields, e.g., multi-task learning, lifelong learning, and adversarial attacks, is also discussed.
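
    Illustrative sketch (not from the survey above): one generic form of domain similarity estimation, the second of the four perspectives listed, is to compute a kernel Maximum Mean Discrepancy (MMD) between source and target feature samples; a large value suggests dissimilar domains and hence a higher risk of negative transfer. The RBF kernel and bandwidth below are illustrative assumptions, not a method attributed to the survey.

```python
# Hedged sketch: RBF-kernel MMD^2 between source and target samples as a
# simple domain similarity estimate (generic illustration only).
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    """Biased estimate of squared MMD between samples X and Y with an RBF kernel."""
    def k(A, B):
        # Pairwise squared Euclidean distances, then RBF kernel values.
        d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

# Toy usage: a small mean shift between "source" and "target" features.
rng = np.random.default_rng(0)
src = rng.normal(size=(300, 5))
tgt = rng.normal(loc=0.5, size=(300, 5))
print(rbf_mmd2(src, tgt))   # larger values suggest more dissimilar domains
```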