Multi-task learning (MTL) in deep neural networks for NLP has recently
received increasing interest due to some compelling benefits, including its
potential to efficiently regularize models and to reduce the need for labeled
data. While it has brought significant improvements in a number of NLP tasks,
mixed results have been reported, and little is known about the conditions
under which MTL leads to gains in NLP. This paper sheds light on the specific
task relations that can lead to gains from MTL models over single-task setups.Comment: Accepted for publication at EACL 201