A Principled Approach for Learning Task Similarity in Multitask Learning
Multitask learning aims to solve a set of related tasks simultaneously by
exploiting shared knowledge to improve performance on individual
tasks. Hence, an important aspect of multitask learning is to understand the
similarities within a set of tasks. Previous works have incorporated this
similarity information explicitly (e.g., weighted loss for each task) or
implicitly (e.g., adversarial loss for feature adaptation), for achieving good
empirical performances. However, the theoretical motivations for adding task
similarity knowledge are often missing or incomplete. In this paper, we offer
a theoretical perspective on this practice. We first provide an upper bound on
the generalization error of
multitask learning, showing the benefit of explicit and implicit task
similarity knowledge. We systematically derive bounds based on two distinct
task similarity metrics: the H-divergence and the Wasserstein distance. From these
theoretical results, we revisit the Adversarial Multi-task Neural Network,
proposing a new training algorithm to learn the task relation coefficients and
neural network parameters iteratively. We assess our new algorithm empirically
on several benchmarks, showing not only that it uncovers interesting and
robust task relations, but also that it outperforms the baselines,
reaffirming the benefits of theoretical insight in algorithm design.
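The iterative scheme described above can be sketched as an alternating optimization: with the task relation coefficients fixed, take gradient steps on the shared parameters under the coefficient-weighted loss, then re-estimate the coefficients from the current per-task losses. The sketch below is a minimal illustration with linear models, a squared loss, and a softmax coefficient update; the function name and the temperature parameter are assumptions for illustration, not the paper's exact algorithm.

```python
import numpy as np

def alternating_multitask(Xs, ys, steps=100, lr=0.1, temp=1.0, seed=0):
    # Hypothetical sketch: alternate between (1) a gradient step on shared
    # linear weights under a coefficient-weighted squared loss, and
    # (2) re-estimating task relation coefficients from per-task losses
    # via a softmax (lower-loss tasks receive more weight).
    rng = np.random.default_rng(seed)
    d = Xs[0].shape[1]
    w = rng.normal(scale=0.01, size=d)     # shared model parameters
    T = len(Xs)
    alpha = np.full(T, 1.0 / T)            # task relation coefficients
    for _ in range(steps):
        # (1) update shared parameters under the weighted loss
        grad = np.zeros(d)
        for a, X, y in zip(alpha, Xs, ys):
            grad += a * 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
        # (2) update coefficients from current per-task losses
        losses = np.array([np.mean((X @ w - y) ** 2)
                           for X, y in zip(Xs, ys)])
        alpha = np.exp(-losses / temp)
        alpha /= alpha.sum()
    return w, alpha
```

With two tasks sharing a ground-truth model and one unrelated task, the unrelated task's coefficient shrinks toward zero while the shared parameters fit the related pair.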
Multi-task Learning by Leveraging the Semantic Information
One crucial objective of multi-task learning is to align distributions across
tasks so that the information between them can be transferred and shared.
However, existing approaches have focused only on matching the marginal feature
distributions while ignoring semantic information, which can hinder
learning performance. To address this issue, we propose to leverage the label
information in multi-task learning by exploring the semantic conditional
relations among tasks. We first theoretically analyze the generalization bound
of multi-task learning based on the notion of Jensen-Shannon divergence, which
provides new insights into the value of label information in multi-task
learning. Our analysis also leads to a concrete algorithm that jointly matches
the semantic distribution and controls label distribution divergence. To
confirm the effectiveness of the proposed method, we first compare the
algorithm with several baselines on standard benchmarks and then evaluate it
under label space shift conditions. Empirical results demonstrate that the
proposed method outperforms most baselines and achieves state-of-the-art
performance, with particularly clear benefits under label shift.
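The Jensen-Shannon divergence underlying the analysis above has a direct form for discrete label distributions: the average of each distribution's KL divergence to their mixture. Below is a minimal sketch of that computation; the smoothing constant `eps` is an assumption added for numerical stability, not part of the paper's formulation.

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    # Jensen-Shannon divergence between two discrete distributions:
    # JS(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m = (p + q) / 2.
    # eps avoids log(0) on zero-probability entries (illustrative choice).
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Unlike the KL divergence, this quantity is symmetric and bounded above by log 2 (in nats), which is what makes it usable as a distribution-matching penalty between tasks' label distributions.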
Online Boosting Algorithms for Anytime Transfer and Multitask Learning
The related problems of transfer learning and multitask learning have attracted significant attention, generating a rich literature of models and algorithms. Yet most existing approaches are studied in an offline fashion, implicitly assuming that data from different domains are given as a batch. This assumption does not hold in many real-world applications, where data samples arrive sequentially and a good learner is needed even after only a few examples. The goal of our work is to provide sound extensions to existing transfer and multitask learning algorithms so that they can be used in an anytime setting. More specifically, we propose two novel online boosting algorithms, one for transfer learning and one for multitask learning, both designed to leverage the knowledge of instances in other domains. Experimental results show state-of-the-art empirical performance on standard benchmarks, and we present results of using our methods to effectively detect new seizures in patients with epilepsy from very few previous samples.
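A minimal sketch of the kind of online, multiplicative-weight update that online boosting for transfer builds on: fixed source-domain hypotheses vote on each incoming example, and hypotheses that err are down-weighted (a Hedge-style update). The function name, the {-1, +1} label encoding, and the discount factor `beta` are illustrative assumptions, not the paper's exact algorithms.

```python
import numpy as np

def online_transfer_vote(stream, source_hyps, beta=0.9):
    # Hypothetical sketch: maintain one multiplicative weight per fixed
    # source hypothesis. On each incoming (x, y) with y in {-1, +1},
    # predict by weighted vote, then shrink the weights of hypotheses
    # that predicted incorrectly (Hedge-style update).
    w = np.ones(len(source_hyps))
    mistakes = 0
    for x, y in stream:
        votes = np.array([h(x) for h in source_hyps])  # each in {-1, +1}
        pred = 1 if np.dot(w, votes) >= 0 else -1
        mistakes += int(pred != y)
        w *= np.where(votes != y, beta, 1.0)           # penalize wrong experts
    return w / w.sum(), mistakes
```

Because the update is per-example, the learner is usable at any point in the stream, which is the anytime property the abstract emphasizes: weight quickly concentrates on source hypotheses that agree with the new domain.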