Strategies for prediction under imperfect monitoring
We propose simple randomized strategies for sequential prediction under
imperfect monitoring, that is, when the forecaster does not have access to the
past outcomes but rather to a feedback signal. The proposed strategies are
consistent in the sense that they achieve, asymptotically, the best possible
average reward. It was Rustichini (1999) who first proved the existence of such
consistent predictors. The forecasters presented here offer the first
constructive proof of consistency. Moreover, the proposed algorithms are
computationally efficient. We also establish upper bounds for the rates of
convergence. In the case of deterministic feedback, these rates are optimal up
to logarithmic terms. Comment: Journal version of a COLT conference paper.
No Internal Regret via Neighborhood Watch
We present an algorithm which attains O(\sqrt{T}) internal (and thus
external) regret for finite games with partial monitoring under the local
observability condition. Recently, this condition has been shown by Bartok,
Pal, and Szepesvari (2011) to imply the O(\sqrt{T}) rate for partial monitoring
games against an i.i.d. opponent, and the authors conjectured that the same
holds for non-stochastic adversaries. Our result is in the affirmative, and it
completes the characterization of possible rates for finite partial-monitoring
games, an open question stated by Cesa-Bianchi, Lugosi, and Stoltz (2006). Our
regret guarantees also hold for the more general model of partial monitoring
with random signals.
Calibration and Internal no-Regret with Partial Monitoring
Calibrated strategies can be obtained by performing strategies that have no
internal regret in some auxiliary game. Such strategies can be constructed
explicitly with the use of Blackwell's approachability theorem, in another
auxiliary game. We establish the converse: a strategy that approaches a convex
B-set can be derived from the construction of a calibrated strategy. We
develop these tools in the framework of a game with partial monitoring, where
players do not observe the actions of their opponents but receive random
signals, to define a notion of internal regret and construct strategies that
have no such regret.
Robust approachability and regret minimization in games with partial monitoring
Approachability has become a standard tool in analyzing learning algorithms in
the adversarial online learning setup. We develop a variant of approachability
for games where there is ambiguity in the obtained reward that belongs to a
set, rather than being a single vector. Using this variant we tackle the
problem of approachability in games with partial monitoring and develop simple
and efficient algorithms (i.e., with constant per-step complexity) for this
setup. We finally consider external regret and internal regret in repeated
games with partial monitoring and derive regret-minimizing strategies based on
approachability theory.