Robust approachability and regret minimization in games with partial monitoring

Authors: Mannor Shie, Perchet Vianney, Stoltz Gilles
Publication date: 01/01/2011

Approachability has become a standard tool in analyzing learning algorithms in the adversarial online learning setup. We develop a variant of approachability for games where there is ambiguity in the obtained reward, which belongs to a set rather than being a single vector. Using this variant, we tackle the problem of approachability in games with partial monitoring and develop simple and efficient algorithms (i.e., with constant per-step complexity) for this setup. Finally, we consider external regret and internal regret in repeated games with partial monitoring and derive regret-minimizing strategies based on approachability theory.
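As a rough illustration of how approachability yields a regret-minimizing strategy, the sketch below implements classical regret matching (Hart and Mas-Colell) under full monitoring. This is not the partial-monitoring algorithm of the paper above; it is the textbook special case in which the player steers the vector of cumulative regrets toward the nonpositive orthant. The payoff matrix, the opponent sequence, and the function name `regret_matching` are assumptions made for the example.

```python
import random

def regret_matching(payoff, opponent_actions, seed=0):
    """Play regret matching and return the average external regret.

    payoff[i][j] is the (hypothetical) reward of our action i against
    opponent action j; opponent_actions is the sequence the opponent plays.
    """
    rng = random.Random(seed)
    n = len(payoff)
    cum_regret = [0.0] * n  # cumulative regret against each fixed action
    for j in opponent_actions:
        # Mix in proportion to the positive parts of the cumulative regrets;
        # fall back to the uniform distribution when no regret is positive.
        pos = [max(r, 0.0) for r in cum_regret]
        total = sum(pos)
        probs = [p / total for p in pos] if total > 0 else [1.0 / n] * n
        i = rng.choices(range(n), weights=probs)[0]
        reward = payoff[i][j]
        # Update the regret of having played i instead of each fixed action k.
        for k in range(n):
            cum_regret[k] += payoff[k][j] - reward
    return max(cum_regret) / len(opponent_actions)

if __name__ == "__main__":
    matching_game = [[1.0, 0.0], [0.0, 1.0]]
    alternating = [t % 2 for t in range(10000)]
    print(regret_matching(matching_game, alternating))
```

The average regret printed at the end should shrink as the number of rounds grows, which is the sense in which the regret vector approaches the target set.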

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

Hal-Diderot

Calibration and Internal no-Regret with Partial Monitoring

Author: Perchet Vianney
Publication date: 01/01/2010

Calibrated strategies can be obtained by performing strategies that have no internal regret in some auxiliary game. Such strategies can be constructed explicitly with the use of Blackwell's approachability theorem, in another auxiliary game. We establish the converse: a strategy that approaches a convex B-set can be derived from the construction of a calibrated strategy. We develop these tools in the framework of a game with partial monitoring, where players do not observe the actions of their opponents but receive random signals, to define a notion of internal regret and construct strategies that have no such regret.

arXiv.org e-Print Archive

CiteSeerX

Hal-Diderot

Approachability in unknown games: Online learning meets multi-objective optimization

Authors: Mannor Shie, Perchet Vianney, Stoltz Gilles
Publication date: 16/06/2016

In the standard setting of approachability there are two players and a target set. The players repeatedly play a known vector-valued game in which the first player wants the average vector-valued payoff to converge to the target set, while the other player tries to exclude it from this set. We revisit this setting in the spirit of online learning and do not assume that the first player knows the game structure: she receives an arbitrary vector-valued reward at every round. She wishes to approach the smallest ("best") possible set given the observed average payoffs in hindsight. This extension of the standard setting has implications even when the original target set is not approachable and when it is not obvious which expansion of it should be approached instead. We show that it is impossible, in general, to approach the best target set in hindsight and propose achievable though ambitious alternative goals. We further propose a concrete strategy to approach these goals. Our method does not require projection onto a target set and amounts to switching between scalar regret minimization algorithms that are performed in episodes. Applications to global cost minimization and to approachability under sample path constraints are considered.

arXiv.org e-Print Archive

HAL-Polytechnique

Introduction to Learning in Games: A Symposium in Honor of David Blackwell

Authors: David K Levine, Dean Foster, Rakesh Vohra

Research Papers in Economics

A Geometric Proof of Calibration

Authors: Mannor Shie, Stoltz Gilles
Publication date: 01/01/2009

We provide yet another proof of the existence of calibrated forecasters; it has two merits. First, it is valid for an arbitrary finite number of outcomes. Second, it is short and simple, and it follows from a direct application of Blackwell's approachability theorem to a carefully chosen vector-valued payoff function and convex target set. Our proof captures the essence of existing proofs based on approachability (e.g., the proof by Foster, 1999, in the case of binary outcomes) and highlights the intrinsic connection between approachability and calibration.
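To make the notion of calibration concrete, the sketch below measures how far a sequence of forecasts is from being calibrated, in the binary-outcome case only. It does not construct a calibrated forecaster and is not taken from the paper; the l1-type score, the assumption that forecasts come from a finite grid, and the name `calibration_error` are choices made for this illustration.

```python
from collections import defaultdict

def calibration_error(forecasts, outcomes):
    """Return an l1 calibration score for binary outcomes.

    forecasts: probabilities of outcome 1, drawn from a finite grid (assumed).
    outcomes:  observed 0/1 outcomes, one per forecast.
    For each forecast value p, the empirical frequency of outcome 1 on the
    rounds where p was announced is compared with p itself, and the gaps are
    weighted by how often p was used.
    """
    counts = defaultdict(int)  # rounds on which each value p was forecast
    hits = defaultdict(int)    # rounds among those where the outcome was 1
    for p, y in zip(forecasts, outcomes):
        counts[p] += 1
        hits[p] += y
    total = len(forecasts)
    return sum(counts[p] / total * abs(hits[p] / counts[p] - p) for p in counts)

if __name__ == "__main__":
    outcomes = [1, 0, 1, 0]
    print(calibration_error([0.5] * 4, outcomes))  # forecasts match the frequency
    print(calibration_error([1.0] * 4, outcomes))  # forecasts are overconfident
```

A forecaster is calibrated, in this simplified sense, when the score tends to zero as the number of rounds grows, whatever the sequence of outcomes.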

arXiv.org e-Print Archive

CiteSeerX