Efficient and Scalable Multi-task Regression on Massive Number of Tasks
Many real-world large-scale regression problems can be formulated as
Multi-task Learning (MTL) problems with a massive number of tasks, as in retail
and transportation domains. However, existing MTL methods fail to offer both
the generalization performance and the scalability required for such problems.
Scaling up MTL methods to problems with a tremendous number of tasks is a big
challenge. Here, we propose a novel algorithm, named Convex Clustering
Multi-Task regression Learning (CCMTL), which integrates convex clustering
on the k-nearest neighbor graph of the prediction models. Further, CCMTL
efficiently solves the underlying convex problem with a newly proposed
optimization method. CCMTL is accurate, efficient to train, and empirically
scales linearly in the number of tasks. On both synthetic and real-world
datasets, the proposed CCMTL outperforms seven state-of-the-art (SoA)
multi-task learning methods in terms of prediction accuracy as well as
computational efficiency. On a real-world retail dataset with 23,812 tasks,
CCMTL requires only around 30 seconds to train on a single thread, while the
SoA methods need hours or even days.
Comment: Accepted at AAAI 201
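The abstract describes the core idea of CCMTL: regularize per-task linear models with a convex clustering penalty over a k-nearest-neighbor graph built from the models themselves. Below is a minimal, hypothetical sketch of that objective on synthetic data; the variable names, data, and plain subgradient-descent solver are illustrative assumptions, not the paper's actual (much faster) optimization method.

```python
# Sketch of convex-clustering-regularized multi-task regression, in the
# spirit of CCMTL. Objective (illustrative, not the paper's code):
#   sum_t 0.5 * ||X_t w_t - y_t||^2  +  lam * sum_{(i,j) in E} ||w_i - w_j||_2
# where E is the k-nearest-neighbor graph over initial per-task models.
import numpy as np

rng = np.random.default_rng(0)
T, n, d, k, lam = 8, 20, 3, 2, 0.5  # tasks, samples/task, features, kNN, penalty

# Synthetic tasks: two clusters of related linear models (hypothetical data).
true = np.vstack([np.tile(rng.normal(size=d), (T // 2, 1)),
                  np.tile(rng.normal(size=d), (T - T // 2, 1))])
Xs = [rng.normal(size=(n, d)) for _ in range(T)]
ys = [X @ w + 0.1 * rng.normal(size=n) for X, w in zip(Xs, true)]

# Initialize with independent least squares, then build the kNN edge set
# over the initial prediction models, as the abstract describes.
W = np.vstack([np.linalg.lstsq(X, y, rcond=None)[0] for X, y in zip(Xs, ys)])
D = np.linalg.norm(W[:, None] - W[None, :], axis=2)
np.fill_diagonal(D, np.inf)
edges = {tuple(sorted((i, j))) for i in range(T) for j in np.argsort(D[i])[:k]}

def objective(W):
    fit = sum(0.5 * np.sum((X @ w - y) ** 2) for X, y, w in zip(Xs, ys, W))
    pen = lam * sum(np.linalg.norm(W[i] - W[j]) for i, j in edges)
    return fit + pen

obj0 = objective(W)

# Plain subgradient descent; CCMTL uses a dedicated solver that scales
# linearly in the number of tasks, which this sketch does not reproduce.
lr = 1e-3
for _ in range(200):
    G = np.vstack([X.T @ (X @ w - y) for X, y, w in zip(Xs, ys, W)])
    for i, j in edges:
        diff = W[i] - W[j]
        nrm = np.linalg.norm(diff)
        if nrm > 1e-12:  # subgradient of the L2 fusion penalty
            G[i] += lam * diff / nrm
            G[j] -= lam * diff / nrm
    W = W - lr * G

print(round(objective(W), 3))  # penalized objective after optimization
```

The fusion penalty pulls neighboring task models toward each other, so tasks in the same underlying cluster end up sharing nearly identical weights, which is the source of the generalization gain the abstract claims.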