Cross-stitch Networks for Multi-task Learning
Multi-task learning in Convolutional Networks has displayed remarkable
success in the field of recognition. This success can be largely attributed to
learning shared representations from multiple supervisory tasks. However,
existing multi-task approaches rely on enumerating multiple network
architectures specific to the tasks at hand, which do not generalize. In this
paper, we propose a principled approach to learn shared representations in
ConvNets using multi-task learning. Specifically, we propose a new sharing
unit: the "cross-stitch" unit. These units combine the activations from multiple
networks and can be trained end-to-end. A network with cross-stitch units can
learn an optimal combination of shared and task-specific representations. Our
proposed method generalizes across multiple tasks and shows dramatically
improved performance over baseline methods for categories with few training
examples.
Comment: To appear in CVPR 2016 (Spotlight)
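The abstract above says cross-stitch units "combine the activations from multiple networks". A minimal sketch of one such unit follows, assuming the combination is a learned linear mix with one weight per pair of tasks. The function name `cross_stitch` and the matrix name `alpha` are illustrative choices, not taken from the listing, and the weights are passed in as fixed numbers here rather than trained.

```python
from typing import List, Sequence


def cross_stitch(activations: Sequence[Sequence[float]],
                 alpha: Sequence[Sequence[float]]) -> List[List[float]]:
    """Linearly mix activations from several task networks.

    activations[t][i] is feature i produced by the network for task t.
    alpha[s][t] is how much of task t's activation flows into task s's output
    (assumed to be a trainable parameter in a real model).
    """
    n_tasks = len(activations)
    n_feats = len(activations[0])
    if any(len(a) != n_feats for a in activations):
        raise ValueError("all task networks must emit the same number of features")
    if len(alpha) != n_tasks or any(len(row) != n_tasks for row in alpha):
        raise ValueError("alpha must be an n_tasks x n_tasks matrix")

    # Output for task s at feature i is the alpha-weighted sum over all tasks.
    return [
        [sum(alpha[s][t] * activations[t][i] for t in range(n_tasks))
         for i in range(n_feats)]
        for s in range(n_tasks)
    ]


# Two task networks, two features each.
x = [[1.0, 3.0], [3.0, 1.0]]
task_specific = cross_stitch(x, [[1.0, 0.0], [0.0, 1.0]])  # no sharing
fully_shared = cross_stitch(x, [[0.5, 0.5], [0.5, 0.5]])   # equal mixing
```

Under this assumption, an identity `alpha` keeps the two networks fully task-specific, while equal weights make both tasks see the same averaged representation. Values in between correspond to the "combination of shared and task-specific representations" the abstract describes.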
Latent Multi-task Architecture Learning
Multi-task learning (MTL) allows deep neural networks to learn from related
tasks by sharing parameters with other networks. In practice, however, MTL
involves searching an enormous space of possible parameter sharing
architectures to find (a) the layers or subspaces that benefit from sharing,
(b) the appropriate amount of sharing, and (c) the appropriate relative weights
of the different task losses. Recent work has addressed each of the above
problems in isolation. In this work we present an approach that learns a latent
multi-task architecture that jointly addresses (a)--(c). We present experiments
on synthetic data and data from OntoNotes 5.0, including four different tasks
and seven different domains. Our extension consistently outperforms previous
approaches to learning latent architectures for multi-task problems and
achieves up to 15% average error reductions over common approaches to MTL.
Comment: To appear in Proceedings of AAAI 201
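Point (c) in the abstract above concerns the relative weights of the task losses. As a rough illustration only, the sketch below combines per-task losses with fixed scalar weights. The function name `weighted_multitask_loss` and the task names are hypothetical, and the weights are supplied by hand here, whereas the listed work learns them jointly with the architecture.

```python
from typing import Dict


def weighted_multitask_loss(task_losses: Dict[str, float],
                            loss_weights: Dict[str, float]) -> float:
    """Return the weighted sum of per-task losses.

    Both dictionaries are keyed by task name (hypothetical names below).
    """
    if task_losses.keys() != loss_weights.keys():
        raise ValueError("every task needs exactly one loss and one weight")
    return sum(loss_weights[name] * loss for name, loss in task_losses.items())


# Hypothetical example: the auxiliary task counts half as much as the main task.
total = weighted_multitask_loss(
    {"task_a": 2.0, "task_b": 4.0},
    {"task_a": 1.0, "task_b": 0.5},
)
```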
Many Task Learning with Task Routing
Typical multi-task learning (MTL) methods rely on architectural adjustments
and a large trainable parameter set to jointly optimize over several tasks.
However, when the number of tasks increases so do the complexity of the
architectural adjustments and resource requirements. In this paper, we
introduce a method which applies a conditional feature-wise transformation over
the convolutional activations that enables a model to successfully perform a
large number of tasks. To distinguish from regular MTL, we introduce Many Task
Learning (MaTL) as a special case of MTL where more than 20 tasks are performed
by a single model. Our method, dubbed Task Routing (TR), is encapsulated in a
layer we call the Task Routing Layer (TRL), which, applied in an MaTL scenario,
successfully fits hundreds of classification tasks in one model. We evaluate
our method on 5 datasets against strong baselines and state-of-the-art
approaches.
Comment: 8 Pages, 5 Figures, 2 Tables
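The abstract above describes "a conditional feature-wise transformation over the convolutional activations". One way to picture this is sketched below, assuming the transformation is a fixed binary mask over channels that is selected by the active task. The masking scheme, the helper names `make_task_masks` and `route`, and the `keep_ratio` parameter are assumptions made for illustration. Each channel is reduced to a single number to keep the example short.

```python
import random
from typing import List


def make_task_masks(n_tasks: int, n_channels: int,
                    keep_ratio: float, seed: int = 0) -> List[List[float]]:
    """Draw one fixed binary channel mask per task (assumed scheme)."""
    rng = random.Random(seed)
    n_keep = max(1, round(keep_ratio * n_channels))
    masks = []
    for _ in range(n_tasks):
        kept = set(rng.sample(range(n_channels), n_keep))
        masks.append([1.0 if c in kept else 0.0 for c in range(n_channels)])
    return masks


def route(channel_activations: List[float],
          masks: List[List[float]], task_id: int) -> List[float]:
    """Apply the active task's mask feature-wise to the channel activations."""
    mask = masks[task_id]
    if len(mask) != len(channel_activations):
        raise ValueError("mask and activations must have the same channel count")
    return [m * a for m, a in zip(mask, channel_activations)]


# Three tasks share one layer of eight channels; each task uses half of them.
masks = make_task_masks(n_tasks=3, n_channels=8, keep_ratio=0.5)
activations = [float(c + 1) for c in range(8)]
routed_for_task_0 = route(activations, masks, task_id=0)
```

In this sketch the layer weights are shared by all tasks and only the mask changes per task, so adding a task adds one mask rather than a new branch of the network.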