Exploiting easy data in online optimization

Author: Lazaric Alessandro
Neu Gergely
Sani Amir
Publication venue: HAL CCSD
Publication date: 08/12/2014

We consider the problem of online optimization, where a learner chooses a decision from a given decision set and suffers some loss associated with the decision and the state of the environment. The learner's objective is to minimize its cumulative regret against the best fixed decision in hindsight. Over the past few decades numerous variants have been considered, with many algorithms designed to achieve sub-linear regret in the worst case. However, this level of robustness comes at a cost. Proposed algorithms are often over-conservative, failing to adapt to the actual complexity of the loss sequence, which is often far from the worst case. In this paper we introduce a general algorithm that, provided with a "safe" learning algorithm and an opportunistic "benchmark", can effectively combine good worst-case guarantees with much improved performance on "easy" data. We derive general theoretical bounds on the regret of the proposed algorithm and discuss its implementation in a wide range of applications, notably in the problem of learning with shifting experts (a recent COLT open problem). Finally, we provide numerical simulations in the setting of prediction with expert advice with comparisons to the state of the art.

HAL - Lille 3

INRIA a CCSD electronic archive server
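
For readers unfamiliar with the term, the "cumulative regret against the best fixed decision in hindsight" mentioned in the abstract above is usually written as follows. The notation is an assumption made here for illustration (decision set $\mathcal{S}$, learner's decision $x_t$, loss function $\ell_t$ in round $t$) and is not taken from the listing itself.

```latex
R_T \;=\; \sum_{t=1}^{T} \ell_t(x_t) \;-\; \min_{x \in \mathcal{S}} \sum_{t=1}^{T} \ell_t(x)
```

Under this definition, "sub-linear regret" means $R_T / T \to 0$ as $T \to \infty$.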

First-order regret bounds for combinatorial semi-bandits

Author: Neu Gergely
Publication date: 10/06/2015

We consider the problem of online combinatorial optimization under semi-bandit feedback, where a learner has to repeatedly pick actions from a combinatorial decision set in order to minimize the total losses associated with its decisions. After making each decision, the learner observes the losses associated with its action, but not other losses. For this problem, there are several learning algorithms that guarantee that the learner's expected regret grows as $\widetilde{O}(\sqrt{T})$ with the number of rounds $T$. In this paper, we propose an algorithm that improves this scaling to $\widetilde{O}(\sqrt{L_T^*})$, where $L_T^*$ is the total loss of the best action. Our algorithm is among the first to achieve such guarantees in a partial-feedback scheme, and the first one to do so in a combinatorial setting. Comment: To appear at COLT 201

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Online Learning with Low Rank Experts

Author: Hazan Elad
Koren Tomer
Livni Roi
Mansour Yishay
Publication date: 01/01/2016

We consider the problem of prediction with expert advice when the losses of the experts have low-dimensional structure: they are restricted to an unknown $d$-dimensional subspace. We devise algorithms with regret bounds that are independent of the number of experts and depend only on the rank $d$. For the stochastic model we show a tight bound of $\Theta(\sqrt{dT})$, and extend it to a setting of an approximate $d$ subspace. For the adversarial model we show an upper bound of $O(d\sqrt{T})$ and a lower bound of $\Omega(\sqrt{dT})$.

arXiv.org e-Print Archive

Princeton University Open Access Repository

Following the Leader and Fast Rates in Linear Prediction: Curved Constraint Sets and Other Regularities

Author: Gyorgy A
Huang R
Lattimore T
Szepesvari C
Publication venue: Neural Information Processing Systems Foundation, Inc.
Publication date: 12/08/2016

The follow the leader (FTL) algorithm, perhaps the simplest of all online learning algorithms, is known to perform well when the loss functions it is used on are positively curved. In this paper we ask whether there are other “lucky” settings when FTL achieves sublinear, “small” regret. In particular, we study the fundamental problem of linear prediction over a non-empty convex, compact domain. Amongst other results, we prove that the curvature of the boundary of the domain can act as if the losses were curved: In this case, we prove that as long as the mean of the loss vectors have positive lengths bounded away from zero, FTL enjoys a logarithmic growth rate of regret, while, e.g., for polyhedral domains and stochastic data it enjoys finite expected regret. Building on a previously known meta-algorithm, we also get an algorithm that simultaneously enjoys the worst-case guarantees and the bound available for FTL.

Spiral - Imperial College Digital Repository
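
To make the FTL abstract above more concrete, here is a minimal sketch of follow the leader for linear losses over the Euclidean unit ball, a domain with a curved boundary. Everything in it is an assumption chosen for illustration and is not taken from the paper: the function names `ftl_unit_ball` and `sample_losses`, the two-dimensional setting, the loss distribution (mean (1, 0), so bounded away from zero), and the convention of playing the origin in the first round.

```python
import math
import random


def ftl_unit_ball(loss_vectors):
    """Run follow the leader on linear losses <w, f_t> over the unit ball.

    Returns the regret against the best fixed point in hindsight.
    """
    d = len(loss_vectors[0])
    cum = [0.0] * d          # running sum of past loss vectors
    total_loss = 0.0
    for f in loss_vectors:
        norm = math.sqrt(sum(c * c for c in cum))
        # The leader minimizes <w, cum> over the unit ball: w = -cum / ||cum||.
        # With no history (cum == 0) every point is a leader; play the origin.
        w = [-c / norm for c in cum] if norm > 0 else [0.0] * d
        total_loss += sum(wi * fi for wi, fi in zip(w, f))
        cum = [c + fi for c, fi in zip(cum, f)]
    # Best fixed decision in hindsight attains loss -||sum of all loss vectors||.
    best_fixed_loss = -math.sqrt(sum(c * c for c in cum))
    return total_loss - best_fixed_loss


def sample_losses(T, seed=0):
    """Hypothetical i.i.d. loss vectors in R^2 with mean (1, 0)."""
    rng = random.Random(seed)
    return [[1.0 + rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)]
            for _ in range(T)]


if __name__ == "__main__":
    for T in (100, 1000, 10000):
        print(T, round(ftl_unit_ball(sample_losses(T)), 3))
```

In this toy setting the printed regret should grow very slowly with T, which is the qualitative behavior the abstract describes for curved domains; it is not a reproduction of the paper's results.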