Learning Sparse Sharing Architectures for Multiple Tasks
Most existing deep multi-task learning models are based on parameter sharing, such as hard sharing, hierarchical sharing, and soft sharing. Choosing a suitable sharing mechanism depends on the relations among the tasks, which is difficult because the underlying factors shared among these tasks are hard to identify. In this paper, we propose a novel parameter sharing mechanism, named \emph{Sparse Sharing}. Given multiple tasks, our approach automatically finds a sparse sharing structure. We start with an over-parameterized base network, from which each task extracts a subnetwork. The subnetworks of multiple tasks are partially overlapped and trained in parallel. We show that both hard sharing and hierarchical sharing can be formulated as particular instances of the sparse sharing framework. We conduct extensive experiments on three sequence labeling tasks. Compared with single-task models and three typical multi-task learning baselines, our proposed approach achieves consistent improvement while requiring fewer parameters.
Comment: Accepted by AAAI 2020
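The core idea of sparse sharing, as described in the abstract, can be illustrated with a small sketch: one over-parameterized base layer is shared by all tasks, and each task only uses (and updates) the weights selected by its own binary mask, so the task subnetworks partially overlap. The class and parameter names below are illustrative assumptions, and the masks are random placeholders; the paper's approach finds the sharing structure automatically.

```python
# Minimal sketch (not the authors' code): task-specific binary masks over a
# shared, over-parameterized linear layer. Overlapping mask entries are the
# parameters effectively shared between tasks.
import torch
import torch.nn as nn

class SparseSharedLinear(nn.Module):
    def __init__(self, in_dim, out_dim, num_tasks, keep_prob=0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_dim, in_dim) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_dim))
        # Random masks for illustration only; the paper learns the structure.
        masks = (torch.rand(num_tasks, out_dim, in_dim) < keep_prob).float()
        self.register_buffer("masks", masks)

    def forward(self, x, task_id):
        # Only weights selected by this task's mask contribute to the output
        # and receive gradients, so each task trains its own subnetwork.
        w = self.weight * self.masks[task_id]
        return x @ w.t() + self.bias

# Usage: batches from different tasks pass through the same base layer.
layer = SparseSharedLinear(in_dim=8, out_dim=4, num_tasks=3)
x = torch.randn(2, 8)
y_task0 = layer(x, task_id=0)
y_task1 = layer(x, task_id=1)
```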
Learning Multi-Task Communication with Message Passing for Sequence Learning
We present two architectures for multi-task learning with neural sequence models. Our approach allows the relationships between different tasks to be learned dynamically, rather than using an ad-hoc pre-defined structure as in previous work. We adopt the idea from message-passing graph neural networks, and propose a general graph multi-task learning framework in which different tasks can communicate with each other in an effective and interpretable way. We conduct extensive experiments in text classification and sequence labelling to evaluate our approach on multi-task learning and transfer learning. The empirical results show that our models not only outperform competitive baselines, but also learn interpretable and transferable patterns across tasks.
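To make the message-passing idea concrete, here is a minimal sketch in which each task keeps a hidden representation and, at every round, aggregates messages from the other tasks through learned edge weights. All names (TaskMessagePassing, edge_logits, num_rounds) are assumptions for illustration, not the paper's implementation; the learned edge weights play the role of the dynamically learned task relationships.

```python
# Minimal sketch (illustrative, not the paper's code): tasks as graph nodes
# exchanging messages through learned, softmax-normalized edge weights.
import torch
import torch.nn as nn

class TaskMessagePassing(nn.Module):
    def __init__(self, num_tasks, hidden_dim, num_rounds=2):
        super().__init__()
        self.num_rounds = num_rounds
        # Learned task-to-task edge weights; inspecting them gives an
        # interpretable picture of which tasks communicate.
        self.edge_logits = nn.Parameter(torch.zeros(num_tasks, num_tasks))
        self.msg = nn.Linear(hidden_dim, hidden_dim)
        self.update = nn.GRUCell(hidden_dim, hidden_dim)

    def forward(self, task_states):
        # task_states: (num_tasks, hidden_dim), one representation per task.
        h = task_states
        for _ in range(self.num_rounds):
            weights = torch.softmax(self.edge_logits, dim=-1)  # who listens to whom
            messages = weights @ self.msg(h)                   # aggregate incoming messages
            h = self.update(messages, h)                       # update each task state
        return h

# Usage: refine three task representations of dimension 16.
mp = TaskMessagePassing(num_tasks=3, hidden_dim=16)
states = torch.randn(3, 16)
refined = mp(states)  # shape (3, 16)
```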