Self-Paced Multitask Learning with Shared Knowledge
This paper introduces self-paced task selection to multitask learning: instances
from more closely related tasks are selected in a progression from easier to
harder tasks, emulating an effective human education strategy in the multitask
machine learning setting. We develop the mathematical foundation for the
approach, which iteratively selects the most appropriate task, learns its
parameters, and updates the shared knowledge, optimizing a new bi-convex loss
function. The proposed method applies quite generally, including to multitask
feature learning and multitask learning with alternating structure optimization.
Results show that, in each of these formulations, self-paced (easier-to-harder)
task selection outperforms the baseline version of the method in all experiments.
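As an illustration of the easier-to-harder idea, the sketch below implements a
toy self-paced selection loop in Python: at each round the currently easiest
remaining task (lowest loss under the shared representation) is selected, its
parameters are fit, and the shared knowledge is updated. The synthetic data, the
SVD-based update, and all names are assumptions made for this example only, not
the paper's algorithm or its bi-convex objective.

```python
# Toy sketch of self-paced (easier-to-harder) task selection for multitask
# learning. All quantities (the tasks, the shared subspace U, the SVD-based
# update) are illustrative assumptions, not the paper's formulation.
import numpy as np

rng = np.random.default_rng(0)

# T linear-regression tasks whose weight vectors share a k-dim subspace;
# later tasks get more noise, so they are "harder".
T, n, d, k = 5, 40, 10, 3
U_true = rng.normal(size=(d, k))
tasks = []
for t in range(T):
    X = rng.normal(size=(n, d))
    w = U_true @ rng.normal(size=k)
    y = X @ w + 0.1 * (t + 1) * rng.normal(size=n)
    tasks.append((X, y))

def task_loss(t, U):
    """Squared loss of task t with weights restricted to the shared subspace U."""
    X, y = tasks[t]
    a = np.linalg.lstsq(X @ U, y, rcond=None)[0]   # per-task coefficients
    return float(np.mean((X @ U @ a - y) ** 2))

U = np.linalg.qr(rng.normal(size=(d, k)))[0]       # shared knowledge (subspace)
selected, remaining = [], list(range(T))

for _ in range(T):
    # Self-paced step: pick the currently easiest remaining task.
    t_star = min(remaining, key=lambda t: task_loss(t, U))
    remaining.remove(t_star)
    selected.append(t_star)

    # Learn the selected tasks' parameters and refresh the shared subspace
    # from their stacked weight vectors (a simple SVD heuristic).
    W = np.column_stack([
        np.linalg.lstsq(tasks[t][0], tasks[t][1], rcond=None)[0] for t in selected
    ])
    U = np.linalg.svd(W, full_matrices=False)[0][:, :k]

print("selection order (expected roughly easier -> harder):", selected)
```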
The Benefit of Multitask Representation Learning
We discuss a general method to learn data representations from multiple
tasks. We provide a justification for this method in both settings of multitask
learning and learning-to-learn. The method is illustrated in detail in the
special case of linear feature learning. Conditions on the theoretical
advantage offered by multitask representation learning over independent task
learning are established. In particular, focusing on the important example of
half-space learning, we derive the regime in which multitask representation
learning is beneficial over independent task learning, as a function of the
sample size, the number of tasks and the intrinsic data dimensionality. Other
potential applications of our results include multitask feature learning in
reproducing kernel Hilbert spaces and multilayer, deep networks.
Comment: To appear in Journal of Machine Learning Research (JMLR). 31 pages.
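The linear feature learning case can be sketched concretely: all tasks share a
low-dimensional linear feature map, and each task fits its own coefficients on
top of it. The toy experiment below, with many tasks, few samples per task, and
low intrinsic dimensionality, only illustrates the kind of regime discussed
above; the alternating least-squares updates and all names are assumptions for
this example, not the paper's analysis.

```python
# Toy sketch of linear multitask feature learning versus independent task
# learning (many tasks, few samples per task, low intrinsic dimension).
# Illustrative assumptions only, not the paper's method or guarantees.
import numpy as np

rng = np.random.default_rng(1)
T, n, d, k = 20, 15, 50, 2        # tasks, samples/task, ambient dim, intrinsic dim

B_true = np.linalg.qr(rng.normal(size=(d, k)))[0]
data, w_true = [], []
for _ in range(T):
    X = rng.normal(size=(n, d))
    w = B_true @ rng.normal(size=k)            # task weights share a k-dim subspace
    data.append((X, X @ w + 0.1 * rng.normal(size=n)))
    w_true.append(w)

# Alternating minimization of sum_t ||X_t B a_t - y_t||^2 over the shared
# feature map B (d x k) and the per-task coefficients a_t.
B = np.linalg.qr(rng.normal(size=(d, k)))[0]
for _ in range(50):
    A = [np.linalg.lstsq(X @ B, y, rcond=None)[0] for X, y in data]
    M = np.vstack([np.kron(A[t], data[t][0]) for t in range(T)])  # rows: a_t (x) x_i
    rhs = np.concatenate([y for _, y in data])
    B = np.linalg.lstsq(M, rhs, rcond=None)[0].reshape(k, d).T

# Held-out comparison: shared representation vs. independent least squares.
X_test = rng.normal(size=(500, d))
shared_err = indep_err = 0.0
for t, (X, y) in enumerate(data):
    y_test = X_test @ w_true[t]
    a = np.linalg.lstsq(X @ B, y, rcond=None)[0]
    shared_err += np.mean((X_test @ B @ a - y_test) ** 2) / T
    w_ind = np.linalg.lstsq(X, y, rcond=None)[0]   # n < d: min-norm solution
    indep_err += np.mean((X_test @ w_ind - y_test) ** 2) / T
print(f"shared-feature MSE: {shared_err:.3f}   independent MSE: {indep_err:.3f}")
```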
Latent Multi-task Architecture Learning
Multi-task learning (MTL) allows deep neural networks to learn from related
tasks by sharing parameters with other networks. In practice, however, MTL
involves searching an enormous space of possible parameter sharing
architectures to find (a) the layers or subspaces that benefit from sharing,
(b) the appropriate amount of sharing, and (c) the appropriate relative weights
of the different task losses. Recent work has addressed each of the above
problems in isolation. In this work we present an approach that learns a latent
multi-task architecture that jointly addresses (a)--(c). We present experiments
on synthetic data and data from OntoNotes 5.0, including four different tasks
and seven different domains. Our extension consistently outperforms previous
approaches to learning latent architectures for multi-task problems and
achieves up to 15% average error reductions over common approaches to MTL.
Comment: To appear in Proceedings of AAAI 201
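To make the idea of learning the sharing itself concrete, here is a minimal
sketch (assuming PyTorch) in which two task networks learn how much their hidden
layers mix and how their losses are weighted. The sigmoid mixing units and the
uncertainty-style loss weighting are illustrative stand-ins chosen for this
example, not the paper's actual architecture or weighting scheme.

```python
# Minimal sketch of jointly learning (a) which hidden states two task networks
# share, (b) how much they share, and (c) relative task-loss weights.
# Illustrative stand-in, not the paper's latent architecture learning method.
import torch
import torch.nn as nn

class SoftSharedMTL(nn.Module):
    def __init__(self, d_in, d_hidden, n_classes_a, n_classes_b):
        super().__init__()
        self.layer_a = nn.Linear(d_in, d_hidden)     # task A's own layer
        self.layer_b = nn.Linear(d_in, d_hidden)     # task B's own layer
        self.alpha = nn.Parameter(torch.zeros(2))    # learned amount of sharing
        self.head_a = nn.Linear(d_hidden, n_classes_a)
        self.head_b = nn.Linear(d_hidden, n_classes_b)
        self.log_var = nn.Parameter(torch.zeros(2))  # learned relative loss weights

    def forward(self, x):
        h_a = torch.relu(self.layer_a(x))
        h_b = torch.relu(self.layer_b(x))
        m = torch.sigmoid(self.alpha)                # mixing coefficients in (0, 1)
        out_a = self.head_a((1 - m[0]) * h_a + m[0] * h_b)
        out_b = self.head_b((1 - m[1]) * h_b + m[1] * h_a)
        return out_a, out_b

    def loss(self, out_a, y_a, out_b, y_b):
        # Uncertainty-style weighting: the log-variance terms keep the learned
        # weights from collapsing onto the easier task.
        l_a = nn.functional.cross_entropy(out_a, y_a)
        l_b = nn.functional.cross_entropy(out_b, y_b)
        s = self.log_var
        return torch.exp(-s[0]) * l_a + s[0] + torch.exp(-s[1]) * l_b + s[1]

# Toy usage on random data for two classification tasks over the same input.
model = SoftSharedMTL(d_in=16, d_hidden=32, n_classes_a=3, n_classes_b=5)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
x = torch.randn(64, 16)
y_a, y_b = torch.randint(0, 3, (64,)), torch.randint(0, 5, (64,))
for _ in range(200):
    out_a, out_b = model(x)
    loss = model.loss(out_a, y_a, out_b, y_b)
    opt.zero_grad()
    loss.backward()
    opt.step()
print("learned sharing coefficients:", torch.sigmoid(model.alpha).detach().tolist())
```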