75 research outputs found
Approachability of Convex Sets in Games with Partial Monitoring
We provide a necessary and sufficient condition under which a convex set is
approachable in a game with partial monitoring, i.e., where players do not
observe their opponents' moves but receive random signals. This condition is an
extension of Blackwell's Criterion in the full monitoring framework, where
players observe at least their payoffs. When our condition is fulfilled, we
construct explicitly an approachability strategy, derived from a strategy
satisfying some internal consistency property in an auxiliary game. We also
provide an example of a convex set that is neither (weakly)-approachable nor
(weakly)-excludable, a situation that cannot occur in the full monitoring case.
We finally apply our result to describe an ε-optimal strategy of the
uninformed player in a zero-sum repeated game with incomplete information on
one side.
On an unified framework for approachability in games with or without signals
We unify standard frameworks for approachability both in full or partial
monitoring by defining a new abstract game, called the "purely informative
game", where the outcome at each stage is the maximal information players can
obtain, represented as some probability measure. Objectives of players can be
rewritten as the convergence (to some given set) of sequences of averages of
these probability measures. We obtain new results extending the approachability
theory developed by Blackwell. Moreover, this new abstract framework enables us
to characterize approachable sets with, as usual, a remarkably simple and clear
reformulation for convex sets. Translated into the original games, those
results become the first necessary and sufficient condition under which an
arbitrary set is approachable, and they cover and extend previously known results
for convex sets. We also investigate a specific class of games where, thanks to
some unusual definition of averages and convexity, we again obtain a complete
characterization of approachable sets along with rates of convergence.
Robust approachability and regret minimization in games with partial monitoring
Approachability has become a standard tool in analyzing learning algorithms in
the adversarial online learning setup. We develop a variant of approachability
for games where there is ambiguity in the obtained reward that belongs to a
set, rather than being a single vector. Using this variant we tackle the
problem of approachability in games with partial monitoring and develop simple
and efficient algorithms (i.e., with constant per-step complexity) for this
setup. We finally consider external regret and internal regret in repeated
games with partial monitoring and derive regret-minimizing strategies based on
approachability theory.
Calibration and Internal no-Regret with Partial Monitoring
Calibrated strategies can be obtained by performing strategies that have no
internal regret in some auxiliary game. Such strategies can be constructed
explicitly with the use of Blackwell's approachability theorem, in another
auxiliary game. We establish the converse: a strategy that approaches a convex
B-set can be derived from the construction of a calibrated strategy. We
develop these tools in the framework of a game with partial monitoring, where
players do not observe the actions of their opponents but receive random
signals, to define a notion of internal regret and construct strategies that
have no such regret.
Approachability in unknown games: Online learning meets multi-objective optimization
In the standard setting of approachability there are two players and a target
set. The players play repeatedly a known vector-valued game where the first
player wants to have the average vector-valued payoff converge to the target
set while the other player tries to exclude it from this set. We revisit this
setting in the spirit of online learning and do not assume that the first
player knows the game structure: she receives an arbitrary vector-valued reward
at every round. She wishes to approach the smallest ("best") possible
set given the observed average payoffs in hindsight. This extension of the
standard setting has implications even when the original target set is not
approachable and when it is not obvious which expansion of it should be
approached instead. We show that it is impossible, in general, to approach the
best target set in hindsight and propose achievable though ambitious
alternative goals. We further propose a concrete strategy to approach these
goals. Our method does not require projection onto a target set and amounts to
switching between scalar regret minimization algorithms that are performed in
episodes. Applications to global cost minimization and to approachability under
sample path constraints are considered.
Minimizing Regret: The General Case
In repeated games with differential information on one side, the labelling "general case" refers to games in which the action of the informed player is not known to the uninformed, who can only observe a signal which is the random outcome of his and his opponent's action. Here we consider the problem of minimizing regret (in the sense first formulated by Hannan [8]) when the information available is of this type. We give a simple condition describing the approachable set.
Keywords: Minimize regret; differential information; approachability
- …