
    Calibration and Internal No-Regret with Partial Monitoring

    Calibrated strategies can be obtained by playing strategies that have no internal regret in some auxiliary game. Such strategies can be constructed explicitly, using Blackwell's approachability theorem, in another auxiliary game. We establish the converse: a strategy that approaches a convex $B$-set can be derived from the construction of a calibrated strategy. We develop these tools in the framework of a game with partial monitoring, where players do not observe the actions of their opponents but receive random signals, to define a notion of internal regret and construct strategies with no such regret.
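
    To make the regret notion concrete, here is a minimal sketch of the internal (swap) regret of a play history; it assumes full monitoring with observed losses, whereas the paper defines the analogous quantity from random signals only.

```python
import numpy as np

def internal_regret(actions, losses):
    """Internal (swap) regret under full monitoring: for each ordered pair
    (i, j), the cumulative loss saved by replacing every round where i was
    played with j.  (The paper's partial-monitoring version is built from
    signals only; this toy computation assumes losses are observed.)"""
    d = losses.shape[1]
    R = np.zeros((d, d))
    for t, i in enumerate(actions):
        R[i, :] += losses[t, i] - losses[t, :]   # gain from swapping i -> j
    return R.max()                               # worst pair (i, j)

rng = np.random.default_rng(0)
T, d = 1000, 3
losses = rng.random((T, d))
actions = rng.integers(d, size=T)                # a (bad) uniform strategy
print("internal regret per round:", internal_regret(actions, losses) / T)
```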

    Approachability of Convex Sets in Games with Partial Monitoring

    We provide a necessary and sufficient condition under which a convex set is approachable in a game with partial monitoring, i.e., where players do not observe their opponents' moves but receive random signals. This condition is an extension of Blackwell's criterion in the full monitoring framework, where players observe at least their payoffs. When our condition is fulfilled, we explicitly construct an approachability strategy, derived from a strategy satisfying some internal consistency property in an auxiliary game. We also provide an example of a convex set that is neither (weakly) approachable nor (weakly) excludable, a situation that cannot occur in the full monitoring case. Finally, we apply our result to describe an $\epsilon$-optimal strategy of the uninformed player in a zero-sum repeated game with incomplete information on one side.
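
    In the full monitoring case, Blackwell's criterion comes with an explicit approachability strategy; a classical instance is regret matching, which approaches the nonpositive orthant of regret vectors. Below is a toy sketch of that full-monitoring instance (the game is illustrative, not the paper's partial-monitoring construction).

```python
import numpy as np

# Matching pennies payoffs for player 1 (hypothetical running example).
u = np.array([[ 1.0, -1.0],
              [-1.0,  1.0]])

# Vector payoff: regret vector R(i, j)[k] = u[k, j] - u[i, j].
# By Blackwell's theorem the nonpositive orthant C is approachable, and an
# approaching strategy is regret matching: play k with probability
# proportional to the positive part of the average regret.
rng = np.random.default_rng(0)
avg_regret = np.zeros(2)
for t in range(1, 5001):
    pos = np.maximum(avg_regret, 0.0)
    p = pos / pos.sum() if pos.sum() > 0 else np.full(2, 0.5)
    i = rng.choice(2, p=p)
    j = rng.integers(2)                        # arbitrary opponent moves
    regret = u[:, j] - u[i, j]                 # vector payoff this round
    avg_regret += (regret - avg_regret) / t    # running average
# Distance to C is the norm of the positive part; it shrinks like 1/sqrt(t).
print("dist to nonpositive orthant:", np.linalg.norm(np.maximum(avg_regret, 0)))
```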

    Highly-Smooth Zero-th Order Online Optimization
    Vianney Perchet

    The minimization of convex functions which are only available through partial and noisy information is a key methodological problem in many disciplines. In this paper we consider convex optimization with noisy zero-th order information, that is, noisy function evaluations at any desired point. We focus on problems with high degrees of smoothness, such as logistic regression. We show that, as opposed to gradient-based algorithms, high-order smoothness may be used to improve estimation rates, with a precise dependence of our upper bounds on the degree of smoothness. In particular, we show that for infinitely differentiable functions we recover the same dependence on sample size as gradient-based algorithms, with an extra dimension-dependent factor. This is done for both convex and strongly-convex functions, with finite-horizon and anytime algorithms. Finally, we also recover similar results in the online optimization setting. Comment: Conference on Learning Theory (COLT), Jun 2016, New York, United States.
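
    A minimal sketch of the underlying zero-th order scheme: a two-point gradient estimator driving stochastic gradient descent. The test function, noise level, and step/perturbation schedules below are made-up illustrations; the paper's estimators additionally use higher-order smoothing kernels to exploit extra smoothness, which this sketch omits.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
x_star = rng.normal(size=d)
# Smooth, strongly convex test function (logistic term plus a quadratic).
f = lambda x: np.log(1 + np.exp(-x @ x_star)) + 0.5 * np.sum((x - x_star) ** 2)

x = np.zeros(d)
for t in range(1, 5001):
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)                   # uniform direction on the sphere
    delta, eta = t ** -0.25, 0.5 / t         # hypothetical schedules
    noise = rng.normal(scale=1e-3, size=2)   # noisy function evaluations
    # Two-point estimator: an (approximately) unbiased estimate of grad f(x).
    g = d / (2 * delta) * ((f(x + delta * u) + noise[0])
                           - (f(x - delta * u) + noise[1])) * u
    x -= eta * g                             # stochastic gradient step
print("f(0) =", f(np.zeros(d)), " f(x_T) =", f(x))
```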

    On a unified framework for approachability in games with or without signals

    We unify the standard frameworks for approachability in both full and partial monitoring by defining a new abstract game, called the "purely informative game", where the outcome at each stage is the maximal information players can obtain, represented as some probability measure. Objectives of players can be rewritten as the convergence (to some given set) of sequences of averages of these probability measures. We obtain new results extending the approachability theory developed by Blackwell; moreover, this new abstract framework enables us to characterize approachable sets with, as usual, a remarkably simple and clear reformulation for convex sets. Translated into the original games, these results give the first necessary and sufficient condition under which an arbitrary set is approachable, and they cover and extend previously known results for convex sets. We also investigate a specific class of games where, thanks to some unusual definition of averages and convexity, we again obtain a complete characterization of approachable sets along with rates of convergence.
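
    A toy illustration of the "purely informative game" viewpoint: the stage outcome is a probability measure (here, a signal distribution), and the objective is convergence of the running average of these measures to a target set. The signal structure and target set below are invented for the demo, not the paper's construction.

```python
import numpy as np

# signal[i, j] = law of the signal (a probability vector over 2 signals)
# when the players choose actions (i, j); a hypothetical structure.
signal = np.array([[[0.7, 0.3], [0.2, 0.8]],
                   [[0.5, 0.5], [0.9, 0.1]]])

target = lambda m: np.clip(m, 0.4, 0.6)        # convex target set of measures

rng = np.random.default_rng(0)
avg = np.zeros(2)
for t in range(1, 2001):
    i, j = rng.integers(2), rng.integers(2)    # arbitrary play for the demo
    avg += (signal[i, j] - avg) / t            # average of outcome measures
print("avg measure:", avg, "dist to target:", np.linalg.norm(avg - target(avg)))
```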

    Gains and Losses are Fundamentally Different in Regret Minimization: The Sparse Case

    We demonstrate that, in the classical non-stochastic regret minimization problem with $d$ decisions, gains and losses to be respectively maximized or minimized are fundamentally different. Indeed, under the additional sparsity assumption that at each stage at most $s$ decisions incur a nonzero outcome, we derive optimal regret bounds of different orders. Specifically, with gains, we obtain an optimal regret guarantee after $T$ stages of order $\sqrt{T\log s}$, so the classical dependency on the dimension is replaced by the sparsity size. With losses, we provide matching upper and lower bounds of order $\sqrt{Ts\log(d)/d}$, which is decreasing in $d$. Finally, we also study the bandit setting and obtain an upper bound of order $\sqrt{Ts\log(d/s)}$ when outcomes are losses. This bound is proven to be optimal up to the logarithmic factor $\sqrt{\log(d/s)}$.
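
    The separation between the three regimes is already visible by evaluating the orders of the stated bounds numerically (constants ignored; the values of $T$, $d$, and $s$ below are arbitrary):

```python
import numpy as np

# Orders of the optimal regret bounds from the abstract, for sample values.
T, d, s = 10_000, 1_000, 10
gains_bound  = np.sqrt(T * np.log(s))            # sparse gains:  sqrt(T log s)
losses_bound = np.sqrt(T * s * np.log(d) / d)    # sparse losses: sqrt(Ts log(d)/d)
bandit_bound = np.sqrt(T * s * np.log(d / s))    # bandit losses: sqrt(Ts log(d/s))
print(f"gains  ~ {gains_bound:7.1f}")            # ~ 151.7
print(f"losses ~ {losses_bound:7.1f}")           # ~  26.3
print(f"bandit ~ {bandit_bound:7.1f}")           # ~ 678.6
```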

    Approachability in unknown games: Online learning meets multi-objective optimization

    In the standard setting of approachability there are two players and a target set. The players repeatedly play a known vector-valued game, where the first player wants the average vector-valued payoff to converge to the target set while the other player tries to exclude it from this set. We revisit this setting in the spirit of online learning and do not assume that the first player knows the game structure: she receives an arbitrary vector-valued reward at every round. She wishes to approach the smallest ("best") possible set given the observed average payoffs in hindsight. This extension of the standard setting has implications even when the original target set is not approachable and it is not obvious which expansion of it should be approached instead. We show that it is impossible, in general, to approach the best target set in hindsight, and we propose achievable though ambitious alternative goals. We further propose a concrete strategy to approach these goals. Our method does not require projection onto a target set and amounts to switching between scalar regret minimization algorithms that are performed in episodes. Applications to global cost minimization and to approachability under sample-path constraints are considered.
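
    A toy sketch of the episode-switching idea: when the running average payoff drifts away from the target set, fix the violated direction and run a scalar regret minimizer (here, exponential weights) on that direction for one episode. The environment, episode length, and learning rate are invented; the paper's strategy for unknown games and hindsight-optimal expansions is substantially more careful.

```python
import numpy as np

rng = np.random.default_rng(0)
K, dim, C_lo, C_hi = 3, 2, -0.2, 0.2          # actions, payoff dim, box target
proj = lambda x: np.clip(x, C_lo, C_hi)        # projection onto the box C

w = np.ones(K)
xbar, t = np.zeros(dim), 0
for episode in range(50):
    d = xbar - proj(xbar)                      # direction to push against
    w[:] = 1.0                                 # restart the scalar minimizer
    for _ in range(100):
        t += 1
        p = w / w.sum()
        a = rng.choice(K, p=p)
        # Unknown game: an arbitrary reward vector per action each round.
        r = rng.normal(size=(K, dim)) * 0.5 + np.array([-0.5, 0.0, 0.5])[:, None]
        xbar += (r[a] - xbar) / t              # running average payoff
        w *= np.exp(-0.1 * (r @ d))            # exp. weights on losses <d, r>
print("avg payoff:", xbar, "dist to C:", np.linalg.norm(xbar - proj(xbar)))
```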