13 research outputs found

Online Multi-task Learning with Hard Constraints

Author: Lugosi Gabor
Papaspiliopoulos Omiros
Stoltz Gilles
Publication date: 01/01/2009

We discuss multi-task online learning when a decision maker has to deal simultaneously with M tasks. The tasks are related, which is modeled by imposing that the M-tuple of actions taken by the decision maker needs to satisfy certain constraints. We give natural examples of such restrictions and then discuss a general class of tractable constraints, for which we introduce computationally efficient ways of selecting actions, essentially by reducing to an on-line shortest path problem. We briefly discuss "tracking" and "bandit" versions of the problem and extend the model in various ways, including non-additive global losses and uncountably infinite sets of tasks.

arXiv.org e-Print Archive

CiteSeerX

Multitask Online Mirror Descent

Author: Cesa-Bianchi Nicolò
Laforgue Pierre
Paudice Andrea
Pontil Massimiliano
Publication date: 22/10/2021

We introduce and analyze MT-OMD, a multitask generalization of Online Mirror Descent (OMD) which operates by sharing updates between tasks. We prove that the regret of MT-OMD is of order $\sqrt{1 + \sigma^2(N-1)}\sqrt{T}$, where $\sigma^2$ is the task variance according to the geometry induced by the regularizer, $N$ is the number of tasks, and $T$ is the time horizon. Whenever tasks are similar, that is $\sigma^2 \le 1$, our method improves upon the $\sqrt{NT}$ bound obtained by running independent OMDs on each task. We further provide a matching lower bound, and show that our multitask extensions of Online Gradient Descent and Exponentiated Gradient, two major instances of OMD, enjoy closed-form updates, making them easy to use in practice. Finally, we present experiments on both synthetic and real-world datasets supporting our findings.
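To make the comparison between the two regret rates concrete, here is a minimal Python sketch that evaluates both expressions with all constants omitted. The function names `mt_omd_rate` and `independent_rate` and the sample values of N and T are illustrative assumptions, not taken from the paper; only the two formulas come from the abstract.

```python
import math


def mt_omd_rate(sigma2: float, n_tasks: int, horizon: int) -> float:
    """Order of the MT-OMD regret stated in the abstract (constants omitted)."""
    return math.sqrt(1 + sigma2 * (n_tasks - 1)) * math.sqrt(horizon)


def independent_rate(n_tasks: int, horizon: int) -> float:
    """Order of the regret of independent OMDs, sqrt(N * T) (constants omitted)."""
    return math.sqrt(n_tasks * horizon)


if __name__ == "__main__":
    n_tasks, horizon = 16, 100  # hypothetical values chosen for illustration
    for sigma2 in (0.0, 0.25, 1.0):
        print(sigma2, mt_omd_rate(sigma2, n_tasks, horizon),
              independent_rate(n_tasks, horizon))
```

At $\sigma^2 = 1$ the two expressions coincide, and at $\sigma^2 = 0$ the multitask rate reduces to $\sqrt{T}$, which is consistent with the abstract's claim that the gain appears when $\sigma^2 \le 1$.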

arXiv.org e-Print Archive

On the Sample Complexity of Representation Learning in Multi-task Bandits with Global and Local structure

Author: Proutiere Alexandre
Russo Alessio
Publication date: 28/11/2022

We investigate the sample complexity of learning the optimal arm for multi-task bandit problems. Arms consist of two components: one that is shared across tasks (that we call representation) and one that is task-specific (that we call predictor). The objective is to learn the optimal (representation, predictor)-pair for each task, under the assumption that the optimal representation is common to all tasks. Within this framework, efficient learning algorithms should transfer knowledge across tasks. We consider the best-arm identification problem for a fixed confidence, where, in each round, the learner actively selects both a task, and an arm, and observes the corresponding reward. We derive instance-specific sample complexity lower bounds satisfied by any $(\delta_G,\delta_H)$-PAC algorithm (such an algorithm identifies the best representation with probability at least $1-\delta_G$, and the best predictor for a task with probability at least $1-\delta_H$). We devise an algorithm OSRL-SC whose sample complexity approaches the lower bound, and scales at most as $H(G\log(1/\delta_G)+ X\log(1/\delta_H))$, with $X,G,H$ being, respectively, the number of tasks, representations and predictors. By comparison, this scaling is significantly better than the classical best-arm identification algorithm that scales as $HGX\log(1/\delta)$. Comment: Accepted at the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI23).
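As a rough illustration of the gap between the two scalings quoted in the abstract, the Python sketch below evaluates both expressions with constants omitted. The function names, the problem sizes, and the choice of a single confidence level for $\delta_G$, $\delta_H$ and $\delta$ are assumptions made only for this example.

```python
import math


def osrl_sc_scaling(X: int, G: int, H: int, delta_G: float, delta_H: float) -> float:
    """H * (G * log(1/delta_G) + X * log(1/delta_H)), constants omitted."""
    return H * (G * math.log(1 / delta_G) + X * math.log(1 / delta_H))


def classical_scaling(X: int, G: int, H: int, delta: float) -> float:
    """H * G * X * log(1/delta), constants omitted."""
    return H * G * X * math.log(1 / delta)


if __name__ == "__main__":
    # Hypothetical sizes: X tasks, G representations, H predictors.
    X, G, H, delta = 10, 5, 3, 0.1
    shared = osrl_sc_scaling(X, G, H, delta, delta)
    naive = classical_scaling(X, G, H, delta)
    print(shared, naive, naive / shared)
```

With equal confidence levels the ratio of the two expressions is $GX/(G+X)$, so the benefit of sharing a representation grows as both the number of tasks and the number of representations increase.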

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Online Multitask Learning with Long-Term Memory

Author: Herbster Mark
Pasteris Stephen
Tse Lisa
Publication date: 15/02/2020

We introduce a novel online multitask setting. In this setting each task is partitioned into a sequence of segments that is unknown to the learner. Associated with each segment is a hypothesis from some hypothesis class. We give algorithms that are designed to exploit the scenario where there are many such segments but significantly fewer associated hypotheses. We prove regret bounds that hold for any segmentation of the tasks and any association of hypotheses to the segments. In the single-task setting this is equivalent to switching with long-term memory in the sense of [Bousquet and Warmuth; 2003]. We provide an algorithm that predicts on each trial in time linear in the number of hypotheses when the hypothesis class is finite. We also consider infinite hypothesis classes from reproducing kernel Hilbert spaces, for which we give an algorithm whose per-trial time complexity is cubic in the number of cumulative trials. In the single-task special case this is the first example of an efficient regret-bounded switching algorithm with long-term memory for a non-parametric hypothesis class.

arXiv.org e-Print Archive

UCL Discovery

Online Transfer Learning

Author: HOI Steven C. H.
LI Bin
WANG Jialei
ZHAO Peilin
Publication venue: 'Elsevier BV'
Publication date: 01/11/2014

Crossref

Institutional Knowledge at Singapore Management University