Multi-task learning (MTL) has been widely applied in online advertising and
recommender systems. To address the negative transfer issue, recent studies
have proposed optimization methods that focus on aligning the directions or
magnitudes of task gradients. However, since prior work has shown that both
general and task-specific knowledge coexist within the limited shared capacity,
overemphasizing gradient alignment may crowd out task-specific knowledge, and
vice versa. In this paper, we propose CoGrad, a transference-driven approach
that adaptively maximizes knowledge transference via Coordinated Gradient
modification. We explicitly quantify transference as the loss reduction that an
update on one task induces on another, and derive an auxiliary gradient by
optimizing this quantity. We then incorporate this auxiliary gradient into the
original task gradients, so that the model automatically maximizes inter-task
transfer while minimizing individual task losses.
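As a rough illustration of this coordinated gradient modification, the sketch below is our own toy example in PyTorch, not the paper's implementation: it measures the transference of one task onto another as the loss reduction after a lookahead step, then folds the resulting auxiliary gradient into the task gradients. The quadratic losses and all names (theta, loss_task, eta) are illustrative assumptions.

```python
import torch

# Toy two-task setup: a shared parameter vector and simple quadratic
# stand-ins for the task losses. All names and values here are
# illustrative assumptions, not taken from the paper.
torch.manual_seed(0)
theta = torch.randn(10, requires_grad=True)
A1, b1 = torch.randn(10, 10), torch.randn(10)
A2, b2 = torch.randn(10, 10), torch.randn(10)

def loss_task(p, A, b):
    return 0.5 * ((A @ p - b) ** 2).sum()

lr, eta = 0.01, 0.01  # update step size and lookahead step size

L1 = loss_task(theta, A1, b1)
L2 = loss_task(theta, A2, b2)
g1 = torch.autograd.grad(L1, theta, create_graph=True)[0]

# Transference of task 1 onto task 2: how much a lookahead step along
# task 1's gradient reduces task 2's loss.
transference = L2 - loss_task(theta - eta * g1, A2, b2)

# Auxiliary gradient from maximizing the transference. Differentiating
# through the lookahead step involves a Hessian-vector product, which
# autograd computes exactly here; the paper replaces it with a cheap
# approximation.
g_aux = torch.autograd.grad(transference, theta, retain_graph=True)[0]

# Coordinated update: descend both task losses while ascending transference.
g2 = torch.autograd.grad(L2, theta)[0]
with torch.no_grad():
    theta -= lr * (g1 + g2 - g_aux)
```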
In this way, CoGrad harmonizes general and specific knowledge to boost overall
performance. In addition, we introduce an efficient approximation of the
Hessian matrix, making CoGrad computationally efficient and simple to
implement. Both offline and online experiments verify that CoGrad
significantly outperforms previous methods.
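The Hessian-vector product is where the second-order cost concentrates. One standard way to avoid second-order autograd, shown below as a generic finite-difference trick and not necessarily the approximation used in the paper, is to difference two gradient evaluations:

```python
import torch

def hvp_finite_diff(loss_fn, theta, v, eps=1e-2):
    # Approximate H(theta) @ v by (grad(theta + eps*v) - grad(theta)) / eps,
    # using two first-order gradient calls and never forming the Hessian.
    def grad_at(p):
        p = p.detach().clone().requires_grad_(True)
        return torch.autograd.grad(loss_fn(p), p)[0]
    return (grad_at(theta + eps * v) - grad_at(theta)) / eps

# Sanity check on a quadratic loss, whose exact Hessian is A.T @ A.
A = torch.randn(10, 10)
loss_fn = lambda p: 0.5 * ((A @ p) ** 2).sum()
theta, v = torch.randn(10), torch.randn(10)
print((hvp_finite_diff(loss_fn, theta, v) - (A.T @ A) @ v).abs().max())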