18 research outputs found
A Principled Approach for Learning Task Similarity in Multitask Learning
Multitask learning aims to solve a set of related tasks simultaneously by
exploiting shared knowledge to improve performance on individual tasks. An
important aspect of multitask learning is therefore understanding the
similarities within a set of tasks. Previous works have incorporated this
similarity information explicitly (e.g., a weighted loss per task) or
implicitly (e.g., an adversarial loss for feature adaptation) to achieve good
empirical performance. However, the theoretical motivation for adding task
similarity knowledge is often missing or incomplete. In this paper, we take a
theoretical perspective to understand this practice. We first provide an upper
bound on the generalization error of
multitask learning, showing the benefit of explicit and implicit task
similarity knowledge. We systematically derive the bounds based on two distinct
task similarity metrics: H-divergence and Wasserstein distance. From these
theoretical results, we revisit the Adversarial Multi-task Neural Network,
proposing a new training algorithm to learn the task relation coefficients and
neural network parameters iteratively. We evaluate the new algorithm
empirically on several benchmarks, showing not only that it discovers
interesting and robust task relations but also that it outperforms the
baselines, reaffirming the benefit of theoretical insight in algorithm design.
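As an illustration of the Wasserstein-based similarity idea described above, the following sketch turns pairwise 1-Wasserstein distances between per-task 1-D feature samples into normalized relation coefficients. All names here are hypothetical, and this is a simplified stand-in, not the paper's actual algorithm, which learns the coefficients jointly with the network.

```python
import numpy as np

def wasserstein_1d(u, v):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples.

    For equal-size samples, W1 reduces to the mean absolute difference
    of the sorted samples."""
    u, v = np.sort(np.asarray(u, float)), np.sort(np.asarray(v, float))
    assert u.shape == v.shape, "this closed form needs equal sample sizes"
    return float(np.mean(np.abs(u - v)))

def task_relation_coefficients(task_features, temperature=1.0):
    """Turn pairwise W1 distances between task feature samples into
    normalized similarity coefficients via a softmax over -distance."""
    T = len(task_features)
    coeff = np.zeros((T, T))
    for i in range(T):
        d = np.array([wasserstein_1d(task_features[i], task_features[j])
                      for j in range(T)])
        w = np.exp(-d / temperature)
        coeff[i] = w / w.sum()          # each row sums to 1
    return coeff

rng = np.random.default_rng(0)
feats = [rng.normal(0.0, 1.0, 500),    # task A
         rng.normal(0.1, 1.0, 500),    # task B: close to A
         rng.normal(3.0, 1.0, 500)]    # task C: far from both
C = task_relation_coefficients(feats)
# Tasks A and B assign each other higher coefficients than they assign C.
```

In the paper's iterative scheme, such coefficients would then reweight the per-task losses before the network parameters are updated again.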
Discovering Domain Disentanglement for Generalized Multi-source Domain Adaptation
A typical multi-source domain adaptation (MSDA) approach aims to transfer
knowledge learned from a set of labeled source domains, to an unlabeled target
domain. Nevertheless, prior works strictly assume that each source domain
shares an identical set of classes with the target domain, an assumption that
can hardly be guaranteed since the target label space is not observable. In
this paper, we consider a more versatile MSDA setting, namely Generalized
Multi-source Domain Adaptation, wherein the source domains are partially
overlapped and the target domain is allowed to contain novel categories not
present in any source domain. This new setting is more challenging than
existing domain adaptation protocols due to the coexistence of domain and
category shifts across the source and target domains. To address this
issue, we propose a variational domain disentanglement (VDD) framework, which
decomposes the domain representations and semantic features for each instance
by encouraging dimension-wise independence. To identify the target samples of
unknown classes, we leverage online pseudo labeling, which assigns the
pseudo-labels to unlabeled target data based on the confidence scores.
Quantitative and qualitative experiments conducted on two benchmark datasets
demonstrate the validity of the proposed framework.
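The confidence-based online pseudo-labeling step mentioned above can be sketched as follows. This is a minimal illustration with a hypothetical threshold, assuming softmax probabilities as input; unconfident target samples are marked unknown (-1) as candidates for the novel categories.

```python
import numpy as np

def pseudo_label(probs, threshold=0.9):
    """Assign a pseudo-label to each unlabeled target sample whose top
    softmax confidence exceeds `threshold`; mark the rest as unknown (-1)."""
    probs = np.asarray(probs, float)
    conf = probs.max(axis=1)            # top-class confidence per sample
    labels = probs.argmax(axis=1)       # candidate class per sample
    return np.where(conf >= threshold, labels, -1)

probs = np.array([[0.95, 0.03, 0.02],  # confident -> pseudo-label 0
                  [0.40, 0.35, 0.25]]) # uncertain -> unknown (-1)
labels = pseudo_label(probs)
```

In an online scheme, the thresholding would be reapplied every iteration as the model's predictions on the target data improve.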
Multi-Prompt Alignment for Multi-source Unsupervised Domain Adaptation
Most existing methods for multi-source unsupervised domain adaptation (UDA)
rely on a common feature encoder to extract domain-invariant features. However,
learning such an encoder involves updating the parameters of the entire
network, which makes the optimization computationally expensive, particularly
when coupled with min-max objectives. Inspired by recent advances in prompt
learning, which adapts high-capacity deep models to downstream tasks in a
computationally economical way, we introduce Multi-Prompt Alignment (MPA), a
simple yet efficient two-stage framework for multi-source UDA. Given a source
and target domain pair, MPA first trains an individual prompt to minimize the
domain gap through a contrastive loss, while tuning only a small set of
parameters. Then, MPA derives a low-dimensional latent space through an
auto-encoding process that maximizes the agreement of multiple learned prompts.
The resulting embedding further facilitates generalization to unseen domains.
Extensive experiments show that our method achieves state-of-the-art results on
popular benchmark datasets while requiring substantially fewer tunable
parameters. To the best of our knowledge, we are the first to apply prompt
learning to the multi-source UDA problem and our method achieves the highest
reported average accuracy of 54.1% on DomainNet, the most challenging UDA
dataset to date, with only 15.9M parameters trained. More importantly, we
demonstrate that the learned embedding space can be easily adapted to novel
unseen domains.
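The parameter-efficiency argument behind prompt learning can be made concrete with a toy sketch: only a few prompt vectors prepended to the input are tunable, while the backbone stays frozen. The random projection below is a hypothetical stand-in for a pretrained encoder (MPA builds on large vision-language models), and the dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "backbone": a fixed random projection standing in for a
# pretrained encoder. Its weights are never updated.
D_TOK, D_OUT, N_TOK, N_PROMPT = 64, 32, 16, 4
W_frozen = rng.normal(size=(D_TOK, D_OUT))

def encode(tokens, prompt):
    """Prepend learnable prompt vectors to the token sequence, then
    run the frozen backbone and mean-pool into one feature vector."""
    seq = np.concatenate([prompt, tokens], axis=0)  # (N_PROMPT+N_TOK, D_TOK)
    return (seq @ W_frozen).mean(axis=0)            # pooled feature, (D_OUT,)

prompt = rng.normal(size=(N_PROMPT, D_TOK)) * 0.01  # the ONLY tunable part
tokens = rng.normal(size=(N_TOK, D_TOK))            # one input sequence

feat = encode(tokens, prompt)
tunable, total = prompt.size, prompt.size + W_frozen.size
print(f"tunable {tunable} / total {total} ({100 * tunable / total:.1f}%)")
```

In MPA proper, one such prompt is trained per source-target pair with a contrastive loss, and the learned prompts are then aligned in a shared low-dimensional latent space.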
Multi-task Learning by Leveraging the Semantic Information
One crucial objective of multi-task learning is to align distributions across
tasks so that the information between them can be transferred and shared.
However, existing approaches focus only on matching the marginal feature
distributions while ignoring semantic information, which may hinder learning
performance. To address this issue, we propose to leverage the label
information in multi-task learning by exploring the semantic conditional
relations among tasks. We first theoretically analyze the generalization bound
of multi-task learning based on the notion of Jensen-Shannon divergence, which
provides new insights into the value of label information in multi-task
learning. Our analysis also leads to a concrete algorithm that jointly matches
the semantic distribution and controls label distribution divergence. To
confirm the effectiveness of the proposed method, we first compare it with
several baselines on standard benchmarks and then test it under label-space
shift. Empirical results demonstrate that the proposed method outperforms most
baselines and achieves state-of-the-art performance, with particularly clear
benefits under label shift.
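The Jensen-Shannon divergence underlying the bound above is easy to compute for discrete label distributions. The sketch below (hypothetical helper names, not the paper's code) shows the standard definition as the average KL divergence of each distribution to their mixture; it is symmetric and bounded by log 2.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL divergence KL(p || q) between two discrete distributions,
    with a small epsilon to avoid log(0)."""
    p = np.asarray(p, float) + eps
    q = np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL to the mixture m = (p+q)/2."""
    m = (np.asarray(p, float) + np.asarray(q, float)) / 2
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p = [0.7, 0.2, 0.1]  # label distribution of one task
q = [0.1, 0.2, 0.7]  # another task under label shift
v = js_divergence(p, q)
print(v)
```

A large value for a task pair signals label-space shift, exactly the regime in which the abstract reports the method's clearest gains.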