Online Multi-task Learning with Hard Constraints
We discuss multi-task online learning when a decision maker has to deal
simultaneously with M tasks. The tasks are related, which is modeled by
imposing that the M-tuple of actions taken by the decision maker needs to
satisfy certain constraints. We give natural examples of such restrictions and
then discuss a general class of tractable constraints, for which we introduce
computationally efficient ways of selecting actions, essentially by reducing to
an on-line shortest path problem. We briefly discuss "tracking" and "bandit"
versions of the problem and extend the model in various ways, including
non-additive global losses and uncountably infinite sets of tasks.
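As a concrete (hypothetical) instance of a tractable constraint, suppose each of the M tasks has a binary action and the hard constraint is that at most k tasks may play action 1. Selecting the best constrained M-tuple for a given loss assignment then reduces to a shortest-path computation in a layered graph, in the spirit of the reduction the abstract describes; a minimal sketch, with the constraint and function name chosen for illustration:

```python
def best_actions_at_most_k(losses, k):
    """Pick one action per task (0 or 1) minimizing total loss, subject to
    the hard constraint that at most k tasks play action 1.  The search is
    a shortest path in a layered DAG whose states are (task index, number
    of 1s used so far).

    losses: list of (loss_of_action_0, loss_of_action_1) per task
    """
    INF = float("inf")
    # dist[c] = cheapest total loss over the tasks seen so far using c ones
    dist = [0.0] + [INF] * k
    choice = []  # back-pointers, one layer per task, for path recovery
    for l0, l1 in losses:
        new = [INF] * (k + 1)
        back = [None] * (k + 1)
        for c in range(k + 1):
            if dist[c] == INF:
                continue
            if dist[c] + l0 < new[c]:                     # edge: play action 0
                new[c], back[c] = dist[c] + l0, (c, 0)
            if c + 1 <= k and dist[c] + l1 < new[c + 1]:  # edge: play action 1
                new[c + 1], back[c + 1] = dist[c] + l1, (c, 1)
        dist, choice = new, choice + [back]
    # recover the optimal M-tuple by walking the back-pointers
    c = min(range(k + 1), key=lambda c: dist[c])
    actions = []
    for back in reversed(choice):
        c, a = back[c]
        actions.append(a)
    return list(reversed(actions))
```

The graph has M layers and at most k+1 nodes per layer, so the computation is linear in M for fixed k, which is what makes this class of constraints tractable.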
Multitask Online Mirror Descent
We introduce and analyze MT-OMD, a multitask generalization of Online Mirror
Descent (OMD) which operates by sharing updates between tasks. We prove that
the regret of MT-OMD is of order √(1 + σ²(N−1)) √T, where σ² is the task
variance according to the geometry induced by the regularizer, N is the number
of tasks, and T is the time horizon. Whenever tasks are similar, that is
σ² ≤ 1, our method improves upon the √(NT) bound obtained by running N
independent OMDs on each task. We further
provide a matching lower bound, and show that our multitask extensions of
Online Gradient Descent and Exponentiated Gradient, two major instances of OMD,
enjoy closed-form updates, making them easy to use in practice. Finally, we
present experiments on both synthetic and real-world datasets supporting our
findings.
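To make "sharing updates between tasks" concrete, here is a minimal illustrative sketch of a multitask Online Gradient Descent round in which, after the usual gradient step on the active task, all task vectors are shrunk toward their mean. The function name, the mean-shrinkage form, and the parameter sigma are simplifications for illustration, not the paper's exact MT-OMD update:

```python
import numpy as np

def mt_ogd_step(W, task, grad, eta=0.1, sigma=0.5):
    """One illustrative multitask OGD round (a simplification, not the
    paper's MT-OMD update).

    W:     (N, d) array, one weight vector per task
    task:  index of the task active at this round
    grad:  gradient of the loss at W[task]
    sigma: sharing strength; sigma = 0 recovers N independent OGDs
    """
    W = W.copy()
    W[task] -= eta * grad                # standard OGD step on the active task
    mean = W.mean(axis=0)                # share the update across tasks by
    W = (1 - sigma) * W + sigma * mean   # shrinking every task toward the mean
    return W
```

With sigma = 0 each task learns independently; with sigma > 0 an update to one task also moves the others, which is the mechanism that pays off when the task variance is small.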
On the Sample Complexity of Representation Learning in Multi-task Bandits with Global and Local structure
We investigate the sample complexity of learning the optimal arm for
multi-task bandit problems. Arms consist of two components: one that is shared
across tasks (that we call representation) and one that is task-specific (that
we call predictor). The objective is to learn the optimal (representation,
predictor)-pair for each task, under the assumption that the optimal
representation is common to all tasks. Within this framework, efficient
learning algorithms should transfer knowledge across tasks. We consider the
best-arm identification problem for a fixed confidence, where, in each round,
the learner actively selects both a task, and an arm, and observes the
corresponding reward. We derive instance-specific sample complexity lower
bounds satisfied by any (δ_G, δ_H)-PAC algorithm (such an algorithm
identifies the best representation with probability at least 1 − δ_G, and
the best predictor for a task with probability at least 1 − δ_H). We
devise an algorithm OSRL-SC whose sample complexity approaches the lower bound,
and scales at most as , with
being, respectively, the number of tasks, representations and predictors. By
comparison, this scaling is significantly better than the classical best-arm
identification algorithm that scales as .Comment: Accepted at the Thirty-Seventh AAAI Conference on Artificial
Intelligence (AAAI23
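As a rough numerical illustration of the gap between the two scalings, assuming a reading of the abstract in which OSRL-SC scales as H(G log(1/δ_G) + X log(1/δ_H)) while the classical approach scales as GHX log(1/δ), with X, G, H the numbers of tasks, representations, and predictors (the instance sizes below are made up):

```python
import math

# Hypothetical instance sizes; the scalings are an assumed reading of the
# abstract, not exact sample-complexity formulas.
X, G, H = 50, 20, 10   # tasks, representations, predictors
delta = 0.05           # confidence level (taking delta_G = delta_H = delta)

osrl_sc = H * (G * math.log(1 / delta) + X * math.log(1 / delta))
classic = G * H * X * math.log(1 / delta)
print(osrl_sc < classic)  # prints True: the gap grows with G and X
```

The classical bound multiplies the three cardinalities, whereas exploiting the shared representation makes the G and X terms additive inside the bracket.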
Online Multitask Learning with Long-Term Memory
We introduce a novel online multitask setting. In this setting each task is
partitioned into a sequence of segments that is unknown to the learner.
Associated with each segment is a hypothesis from some hypothesis class. We
give algorithms that are designed to exploit the scenario where there are many
such segments but significantly fewer associated hypotheses. We prove regret
bounds that hold for any segmentation of the tasks and any association of
hypotheses to the segments. In the single-task setting this is equivalent to
switching with long-term memory in the sense of [Bousquet and Warmuth, 2003].
We provide an algorithm that predicts on each trial in time linear in the
number of hypotheses when the hypothesis class is finite. We also consider
infinite hypothesis classes from reproducing kernel Hilbert spaces for which we
give an algorithm whose per trial time complexity is cubic in the number of
cumulative trials. In the single-task special case this is the first example of
an efficient regret-bounded switching algorithm with long-term memory for a
non-parametric hypothesis class.
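For the finite-class, single-task special case, a long-term-memory switching update in the spirit of [Bousquet and Warmuth, 2003] can be sketched as exponential weights plus mixing back the average of past posteriors, so that hypotheses that were good long ago can be recovered quickly after a switch. The function name and parameter values below are illustrative, not the paper's algorithm:

```python
import numpy as np

def mixing_past_posteriors(losses, eta=1.0, alpha=0.05):
    """Illustrative long-term-memory switching update over a finite class.

    losses: (T, n) array of per-trial losses for n hypotheses.
    Returns the (T, n) sequence of weight vectors used for prediction.
    """
    T, n = losses.shape
    w = np.full(n, 1.0 / n)
    past = [w.copy()]   # long-term memory: all past posteriors
    weights = []
    for t in range(T):
        weights.append(w.copy())
        v = w * np.exp(-eta * losses[t])   # exponential-weights loss update
        v /= v.sum()
        mean_past = np.mean(past, axis=0)  # average of past posteriors
        w = (1 - alpha) * v + alpha * mean_past
        past.append(w.copy())
    return np.array(weights)
```

Each trial costs time linear in the number of hypotheses, matching the per-trial complexity the abstract claims for the finite-class algorithm.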