26 research outputs found

    A Geometric Proof of Calibration

    Get PDF
    We provide yet another proof of the existence of calibrated forecasters; it has two merits. First, it is valid for an arbitrary finite number of outcomes. Second, it is short and simple and it follows from a direct application of Blackwell's approachability theorem to carefully chosen vector-valued payoff function and convex target set. Our proof captures the essence of existing proofs based on approachability (e.g., the proof by Foster, 1999 in case of binary outcomes) and highlights the intrinsic connection between approachability and calibration

    Approachability in unknown games: Online learning meets multi-objective optimization

    Full text link
    In the standard setting of approachability there are two players and a target set. The players play repeatedly a known vector-valued game where the first player wants to have the average vector-valued payoff converge to the target set which the other player tries to exclude it from this set. We revisit this setting in the spirit of online learning and do not assume that the first player knows the game structure: she receives an arbitrary vector-valued reward vector at every round. She wishes to approach the smallest ("best") possible set given the observed average payoffs in hindsight. This extension of the standard setting has implications even when the original target set is not approachable and when it is not obvious which expansion of it should be approached instead. We show that it is impossible, in general, to approach the best target set in hindsight and propose achievable though ambitious alternative goals. We further propose a concrete strategy to approach these goals. Our method does not require projection onto a target set and amounts to switching between scalar regret minimization algorithms that are performed in episodes. Applications to global cost minimization and to approachability under sample path constraints are considered

    Learning Aided Optimization for Energy Harvesting Devices with Outdated State Information

    Full text link
    This paper considers utility optimal power control for energy harvesting wireless devices with a finite capacity battery. The distribution information of the underlying wireless environment and harvestable energy is unknown and only outdated system state information is known at the device controller. This scenario shares similarity with Lyapunov opportunistic optimization and online learning but is different from both. By a novel combination of Zinkevich's online gradient learning technique and the drift-plus-penalty technique from Lyapunov opportunistic optimization, this paper proposes a learning-aided algorithm that achieves utility within O()O(\epsilon) of the optimal, for any desired >0\epsilon>0, by using a battery with an O(1/)O(1/\epsilon) capacity. The proposed algorithm has low complexity and makes power investment decisions based on system history, without requiring knowledge of the system state or its probability distribution.Comment: This version extends v1 (our INFOCOM 2018 paper): (1) add a new section (Section V) to study the case where utility functions are non-i.i.d. arbitrarily varying (2) add more simulation experiments. The current version is published in IEEE/ACM Transactions on Networkin
    corecore