Combining Expert Advice Efficiently

Author: de Rooij Steven
Koolen Wouter
Publication date: 01/01/2008

We show how models for prediction with expert advice can be defined concisely and clearly using hidden Markov models (HMMs); standard HMM algorithms can then be used to efficiently calculate, among other things, how the expert predictions should be weighted according to the model. We cast many existing models as HMMs and recover the best known running times in each case. We also describe two new models: the switch distribution, which was recently developed to improve Bayesian/Minimum Description Length model selection, and a new generalisation of the fixed share algorithm based on run-length coding. We give loss bounds for all models and shed new light on their relationships. Comment: 50 pages
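As a rough illustration of the idea that expert weights can be computed by a standard HMM forward pass, the sketch below implements a textbook form of fixed share under log loss. It is not taken from the paper: the function name `fixed_share_log_loss`, the input layout, and the uniform-jump transition (switch to a uniformly chosen expert with probability `alpha`) are assumptions made for this example.

```python
import math

def fixed_share_log_loss(expert_probs, alpha):
    """Forward pass of an HMM whose hidden state is the currently 'good' expert.

    expert_probs[t][i] is the probability that expert i assigned to the
    outcome that actually occurred in round t (assumed input layout).
    alpha is the per-round switching probability (hypothetical parameter name).
    Returns the cumulative log loss of the mixture and the final weights.
    """
    n = len(expert_probs[0])
    weights = [1.0 / n] * n          # uniform prior over the initial expert
    total_loss = 0.0
    for probs in expert_probs:
        # Mixture probability of the observed outcome under current weights.
        mix = sum(w * p for w, p in zip(weights, probs))
        total_loss += -math.log(mix)
        # Emission step: Bayes update of the weights on the observed outcome.
        posterior = [w * p / mix for w, p in zip(weights, probs)]
        # Transition step: keep the expert with probability 1 - alpha,
        # otherwise jump to an expert chosen uniformly at random.
        weights = [(1.0 - alpha) * q + alpha / n for q in posterior]
    return total_loss, weights
```

With `alpha = 0` the transition step does nothing and the procedure reduces to the plain Bayesian mixture over experts; a positive `alpha` lets the weights recover when the best expert changes over time. Each round costs time linear in the number of experts.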

arXiv.org e-Print Archive

CiteSeerX

UvA-DARE

Combining expert advice efficiently

Author: de Rooij S.
Koolen W.M.
Publication venue: Omnipress
Publication date: 01/01/2008

The Computational Power of Optimization in Online Learning

Author: Agarwal A.
Agarwal A.
Dani V.
Dudík M.
Gofer E.
Hazan E.
Kakade S.
McMahan H. B.
Shalev-Shwartz S.
Zinkevich M.
Publication date: 27/01/2016

We consider the fundamental problem of prediction with expert advice where the experts are "optimizable": there is a black-box optimization oracle that can be used to compute, in constant time, the leading expert in retrospect at any point in time. In this setting, we give a novel online algorithm that attains vanishing regret with respect to $N$ experts in total $\widetilde{O}(\sqrt{N})$ computation time. We also give a lower bound showing that this running time cannot be improved (up to log factors) in the oracle model, thereby exhibiting a quadratic speedup as compared to the standard, oracle-free setting where the required time for vanishing regret is $\widetilde{\Theta}(N)$. These results demonstrate an exponential gap between the power of optimization in online learning and its power in statistical learning: in the latter, an optimization oracle (i.e., an efficient empirical risk minimizer) allows one to learn a finite hypothesis class of size $N$ in time $O(\log{N})$. We also study the implications of our results for learning in repeated zero-sum games, in a setting where the players have access to oracles that compute, in constant time, their best response to any mixed strategy of their opponent. We show that the runtime required for approximating the minimax value of the game in this setting is $\widetilde{\Theta}(\sqrt{N})$, yielding again a quadratic improvement upon the oracle-free setting, where $\widetilde{\Theta}(N)$ is known to be tight.
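For context on the oracle-free setting mentioned in the abstract, the sketch below shows the classical Hedge (multiplicative weights) forecaster, which reads and updates all $N$ expert weights in every round. It is a standard baseline, not the algorithm proposed in the paper; the function name `hedge_regret` and the input layout are assumptions for illustration.

```python
import math

def hedge_regret(loss_rounds, eta):
    """Run Hedge on a loss sequence and return its regret.

    loss_rounds[t][i] is the loss in [0, 1] of expert i in round t
    (assumed input layout). eta is the learning rate.
    Every round reads and updates all N weights, so the per-round cost
    is linear in N.
    """
    n = len(loss_rounds[0])
    log_w = [0.0] * n                # log-weights, for numerical stability
    learner_loss = 0.0
    for losses in loss_rounds:
        # Normalise the weights into a probability distribution.
        m = max(log_w)
        unnorm = [math.exp(lw - m) for lw in log_w]
        z = sum(unnorm)
        probs = [u / z for u in unnorm]
        # Expected loss of the randomised forecaster in this round.
        learner_loss += sum(p * l for p, l in zip(probs, losses))
        # Multiplicative update: penalise each expert by its loss.
        log_w = [lw - eta * l for lw, l in zip(log_w, losses)]
    best = min(sum(r[i] for r in loss_rounds) for i in range(n))
    return learner_loss - best
```

With losses in $[0, 1]$ and the learning rate $\eta = \sqrt{8 \ln N / T}$, the standard analysis bounds the regret by $\sqrt{(T/2) \ln N}$, which grows sublinearly in the number of rounds $T$.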

arXiv.org e-Print Archive