One crucial objective of multi-task learning is to align distributions across
tasks so that information can be transferred and shared between them.
However, existing approaches have focused only on matching the marginal feature
distributions while ignoring semantic information, which may hinder learning
performance. To address this issue, we propose to leverage the label
information in multi-task learning by exploring the semantic conditional
relations among tasks. We first theoretically analyze a generalization bound
for multi-task learning based on the Jensen-Shannon divergence, which provides
new insight into the value of label information in multi-task learning. Our
analysis also yields a concrete algorithm that jointly matches the semantic
conditional distributions and controls the divergence between label
distributions. To validate the proposed method, we first compare the algorithm
with several baselines on benchmark datasets and then evaluate it under
label-space shift. Empirical results demonstrate that the proposed method
outperforms most baselines and achieves state-of-the-art performance, with
particularly pronounced benefits under label shift.
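
As a rough illustration only, not the paper's actual implementation, the sketch
below shows how a training objective might combine per-class conditional
feature matching with a Jensen-Shannon penalty on the tasks' label
distributions; all names here (e.g. js_divergence, conditional_feature_gap) and
the weighting are hypothetical.

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete label distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log((a + eps) / (b + eps)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def conditional_feature_gap(feats_s, labels_s, feats_t, labels_t, num_classes):
    """Mean distance between class-conditional feature centroids of two tasks:
    a simple stand-in for matching semantic conditional distributions."""
    gaps = []
    for c in range(num_classes):
        fs, ft = feats_s[labels_s == c], feats_t[labels_t == c]
        if len(fs) and len(ft):
            gaps.append(np.linalg.norm(fs.mean(0) - ft.mean(0)))
    return float(np.mean(gaps)) if gaps else 0.0

# Toy usage: two tasks with 3 classes and 8-dimensional features.
rng = np.random.default_rng(0)
feats_s, feats_t = rng.normal(size=(100, 8)), rng.normal(size=(120, 8))
labels_s, labels_t = rng.integers(0, 3, 100), rng.integers(0, 3, 120)
p = np.bincount(labels_s, minlength=3) / len(labels_s)
q = np.bincount(labels_t, minlength=3) / len(labels_t)

# Hypothetical combined objective: semantic conditional matching plus a
# JS-divergence term that controls label distribution divergence.
loss = conditional_feature_gap(feats_s, labels_s, feats_t, labels_t, 3) \
       + 0.1 * js_divergence(p, q)
print(f"illustrative alignment loss: {loss:.4f}")
```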