Deterministic Calibration and Nash Equilibrium
We provide a natural learning process in which the joint frequency of empirical play converges into the set of convex combinations of Nash equilibria. In this process, all players rationally choose their actions using a public prediction made by a deterministic, weakly calibrated algorithm. Furthermore, the public predictions used in any given round of play are frequently close to some Nash equilibrium of the game.
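The dynamic described above can be illustrated with a toy sketch. Here both players best-respond to a shared public forecast of the joint play; plain frequency averaging stands in for the paper's deterministic weakly calibrated forecaster (an assumption for illustration, not the paper's construction), and in a simple coordination game the empirical play settles on a pure Nash equilibrium:

```python
import numpy as np

# Coordination game: both players get payoff 1 if they match, 0 otherwise.
payoff = np.array([[1, 0], [0, 1]])  # row player's payoffs; game is symmetric

forecast = np.full((2, 2), 0.25)  # public forecast over joint actions
counts = np.zeros((2, 2))

T = 2000
for t in range(1, T + 1):
    # Each player best-responds to the forecast marginal of the opponent.
    p_col = forecast.sum(axis=0)            # forecast of column player's action
    p_row = forecast.sum(axis=1)            # forecast of row player's action
    a_row = int(np.argmax(payoff @ p_col))
    a_col = int(np.argmax(payoff.T @ p_row))
    counts[a_row, a_col] += 1
    # Update the forecast toward empirical frequencies
    # (a crude stand-in for a weakly calibrated algorithm).
    forecast = counts / t

freq = counts / T
# Empirical joint play concentrates on a pure Nash cell of the diagonal.
assert freq[0, 0] + freq[1, 1] > 0.9
```

This sketch only shows the best-response-to-a-public-forecast mechanism; the convergence guarantee in the paper relies on weak calibration, which frequency averaging does not provide in general.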
Continuous and randomized defensive forecasting: unified view
Defensive forecasting is a method of transforming laws of probability (stated
in game-theoretic terms as strategies for Sceptic) into forecasting algorithms.
There are two known varieties of defensive forecasting: "continuous", in which
Sceptic's moves are assumed to depend on the forecasts in a (semi)continuous
manner and which produces deterministic forecasts, and "randomized", in which
the dependence of Sceptic's moves on the forecasts is arbitrary and
Forecaster's moves are allowed to be randomized. This note shows that the
randomized variety can be obtained from the continuous variety by smearing
Sceptic's moves to make them continuous.

Comment: 10 pages. The new version: (1) relaxes the assumption that the
outcome space is finite; it is now only assumed to be compact; (2) shows
that in the case where the outcome space is finite of cardinality C, the
randomized forecasts can be chosen concentrated on a finite set of
cardinality at most
A Geometric Proof of Calibration
We provide yet another proof of the existence of calibrated forecasters; it
has two merits. First, it is valid for an arbitrary finite number of outcomes.
Second, it is short and simple: it follows from a direct application of
Blackwell's approachability theorem to a carefully chosen vector-valued payoff
function and convex target set. Our proof captures the essence of existing
proofs based on approachability (e.g., the proof by Foster, 1999, in the case
of binary outcomes) and highlights the intrinsic connection between
approachability and calibration.
On impossibility of sequential algorithmic forecasting
The problem of predicting a future event given an individual sequence of
past events is considered. Predictions are given in the form of real numbers,
computed by some algorithm from initial fragments of an individual binary
sequence, and can be interpreted as probabilities of the event given the
fragment. Following Dawid's prequential framework, we consider partial
forecasting algorithms that are defined on all initial fragments of a given
sequence of outcomes and may be undefined outside it. We show that even for
this large class of forecasting algorithms, by combining the outcomes of
coin tossing with a transducer algorithm, it is possible to efficiently
generate, with probability close to one, sequences on which any partial
forecasting algorithm fails the verification method called calibration.
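The verification method in question can be made concrete with a small sketch. A binned calibration score (one common way to operationalize calibration; the function name and binning scheme are illustrative choices, not taken from the paper) compares, within each forecast bin, the mean forecast against the empirical frequency of ones:

```python
import numpy as np

def calibration_error(forecasts, outcomes, bins=10):
    """Weighted average, over forecast bins, of the gap between the mean
    forecast in a bin and the empirical frequency of 1s in that bin."""
    forecasts = np.asarray(forecasts, float)
    outcomes = np.asarray(outcomes, float)
    idx = np.minimum((forecasts * bins).astype(int), bins - 1)
    err = 0.0
    for b in range(bins):
        mask = idx == b
        if mask.any():
            err += mask.mean() * abs(forecasts[mask].mean() - outcomes[mask].mean())
    return err

rng = np.random.default_rng(2)
coin = rng.integers(0, 2, 10000)
# A constant 1/2 forecast is well calibrated on a fair-coin sequence...
assert calibration_error(np.full(10000, 0.5), coin) < 0.05
# ...but badly miscalibrated on the all-ones sequence.
assert calibration_error(np.full(10000, 0.5), np.ones(10000)) > 0.4
```

The impossibility result says that sequences can be generated on which every partial forecasting algorithm scores poorly under such a test, no matter how the forecaster is chosen.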
Stochastic uncoupled dynamics and Nash equilibrium
In this paper we consider dynamic processes, in repeated games, that are subject to the natural informational restriction of uncoupledness. We study almost sure convergence to Nash equilibria and present a number of possibility and impossibility results. Basically, we show that if, in addition to random moves, some recall is introduced, then successful search procedures that are uncoupled can be devised. In particular, to get almost sure convergence to pure Nash equilibria when these exist, it suffices to recall the last two periods of play.

Keywords: uncoupled, Nash equilibrium, stochastic dynamics, bounded recall
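A simplified one-period-recall variant in the spirit of these procedures (an illustration, not the paper's exact two-period-recall construction) already converges in games where every such "stay if best-replying, otherwise randomize" state with mutual best replies is absorbing:

```python
import numpy as np

# Uncoupled dynamics: each player observes realized actions and knows only
# its OWN payoff matrix. Rule: if yesterday's own action was a best reply to
# the opponent's yesterday action, repeat it; otherwise randomize uniformly.
rng = np.random.default_rng(3)
U1 = np.array([[2, 0], [0, 1]])   # row player's payoffs (coordination game)
U2 = np.array([[2, 0], [0, 1]])   # column player's payoffs

a, b = int(rng.integers(2)), int(rng.integers(2))
for _ in range(200):
    a_next = a if U1[a, b] == U1[:, b].max() else int(rng.integers(2))
    b_next = b if U2[a, b] == U2[a, :].max() else int(rng.integers(2))
    a, b = a_next, b_next

# A pure Nash equilibrium, once reached, is absorbing under this rule.
assert U1[a, b] == U1[:, b].max() and U2[a, b] == U2[a, :].max()
```

The rule is uncoupled because neither player's update depends on the other's payoff function; random exploration eventually lands on a pure equilibrium, which both players then repeat forever.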
Channel Selection for Network-assisted D2D Communication via No-Regret Bandit Learning with Calibrated Forecasting
We consider the distributed channel selection problem in the context of
device-to-device (D2D) communication as an underlay to a cellular network.
Underlaid D2D users communicate directly by utilizing the cellular spectrum but
their decisions are not governed by any centralized controller. Selfish D2D
users that compete for access to the resources construct a distributed system,
where the transmission performance depends on channel availability and quality.
This information, however, is difficult to acquire. Moreover, the adverse
effects of D2D users on cellular transmissions should be minimized. In order to
overcome these limitations, we propose a network-assisted distributed channel
selection approach in which D2D users are only allowed to use vacant cellular
channels. This scenario is modeled as a multi-player multi-armed bandit game
with side information, for which a distributed algorithmic solution is
proposed. The solution is a combination of no-regret learning and calibrated
forecasting, and can be applied to a broad class of multi-player stochastic
learning problems, in addition to the formulated channel selection problem.
Analytically, it is established that this approach not only yields vanishing
regret (in comparison to the global optimal solution), but also guarantees that
the empirical joint frequencies of the game converge to the set of correlated
equilibria.

Comment: 31 pages (one column), 9 figures
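As a single-user sketch of the no-regret component, EXP3 (a representative no-regret bandit learner; the channel success rates below are hypothetical, and this does not reproduce the paper's coupling with calibrated forecasting in the multi-player game) can select among cellular channels of unknown quality:

```python
import numpy as np

# One D2D user picks among 3 cellular channels; reward = 1 if the chosen
# channel is vacant and the transmission succeeds (Bernoulli rates below).
rng = np.random.default_rng(4)
rates = np.array([0.2, 0.5, 0.8])   # hypothetical per-channel success rates
K, T, gamma = 3, 20000, 0.05
weights = np.ones(K)

rewards = 0.0
for t in range(T):
    probs = (1 - gamma) * weights / weights.sum() + gamma / K
    ch = rng.choice(K, p=probs)
    r = float(rng.random() < rates[ch])
    rewards += r
    weights[ch] *= np.exp(gamma * r / (probs[ch] * K))  # EXP3 update

# Average reward approaches the best single channel's success rate.
assert rewards / T > 0.7
```

The importance-weighted update keeps the estimated reward of each channel unbiased even though only the chosen channel's outcome is observed, which is what yields the vanishing-regret guarantee.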
On-line regression competitive with reproducing kernel Hilbert spaces
We consider the problem of on-line prediction of real-valued labels, assumed
bounded in absolute value by a known constant, of new objects from known
labeled objects. The prediction algorithm's performance is measured by the
squared deviation of the predictions from the actual labels. No stochastic
assumptions are made about the way the labels and objects are generated.
Instead, we are given a benchmark class of prediction rules some of which are
hoped to produce good predictions. We show that for a wide range of
infinite-dimensional benchmark classes one can construct a prediction algorithm
whose cumulative loss over the first N examples does not exceed the cumulative
loss of any prediction rule in the class plus O(sqrt(N)); the main differences
from the known results are that we do not impose any upper bound on the norm of
the considered prediction rules and that we achieve an optimal leading term in
the excess loss of our algorithm. If the benchmark class is "universal" (dense
in the class of continuous functions on each compact set), this provides an
on-line non-stochastic analogue of universally consistent prediction in
non-parametric statistics. We use two proof techniques: one is based on the
Aggregating Algorithm and the other on the recently developed method of
defensive forecasting.

Comment: 37 pages, 1 figure
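The setting can be illustrated with a clipped online kernel ridge regression sketch: at each step the learner refits on all past examples, predicts, and clips the prediction to the known label bound. This is a follow-the-regularized-leader-style illustration, not the paper's Aggregating Algorithm or defensive-forecasting construction, and the kernel width and ridge parameter are illustrative choices:

```python
import numpy as np

def rbf(A, B, s=0.3):
    """Gaussian (RBF) kernel matrix between rows of A and rows of B."""
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d / (2 * s * s))

rng = np.random.default_rng(5)
T, Y_BOUND, ridge = 300, 1.0, 0.1
X = rng.uniform(-1, 1, (T, 1))
y = np.clip(np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(T), -Y_BOUND, Y_BOUND)

loss = 0.0
for t in range(T):
    if t == 0:
        pred = 0.0  # no data yet: predict the midpoint of the label range
    else:
        K = rbf(X[:t], X[:t])
        k = rbf(X[t:t + 1], X[:t])[0]
        w = np.linalg.solve(K + ridge * np.eye(t), y[:t])  # kernel ridge fit
        pred = float(np.clip(k @ w, -Y_BOUND, Y_BOUND))
    loss += (pred - y[t]) ** 2

# Cumulative square loss stays far below that of the trivial zero predictor.
assert loss / T < 0.2
```

Clipping to the known bound on the labels is what keeps the per-step loss controlled before enough data has arrived; the paper's algorithms additionally achieve the stated O(sqrt(N)) excess-loss guarantee against every rule in the benchmark RKHS.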