    Stochastic Game Theory: Adjustment to Equilibrium Under Noisy Directional Learning

    This paper presents a dynamic model in which agents adjust their decisions in the direction of higher payoffs, subject to random error. This process produces a probability distribution of players' decisions whose evolution over time is determined by the Fokker-Planck equation. The dynamic process is stable for all potential games, a class of payoff structures that includes several widely studied games. In equilibrium, the distributions that determine expected payoffs correspond to the distributions that arise from the logit function applied to those expected payoffs. This "logit equilibrium" forms a stochastic generalization of the Nash equilibrium and provides a possible explanation of anomalous laboratory data. Keywords: bounded rationality, noisy directional learning, Fokker-Planck equation, potential games, logit equilibrium, stochastic potential.
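
The fixed-point property described in this abstract (choice distributions equal the logit function applied to the expected payoffs they generate) can be illustrated with a small sketch. It is not the paper's continuous-decision model: it assumes a finite, symmetric two-player game, and the payoff matrix `A`, the noise parameter `mu`, and the damped iteration are all hypothetical choices made for illustration.

```python
import numpy as np

def logit_response(expected_payoffs, mu):
    """Logit choice probabilities for noise parameter mu > 0."""
    # Subtract the maximum before exponentiating for numerical stability.
    z = (expected_payoffs - expected_payoffs.max()) / mu
    w = np.exp(z)
    return w / w.sum()

def logit_equilibrium(A, mu, steps=5000, damping=0.1):
    """Symmetric logit equilibrium of a symmetric two-player game.

    A[i, j] is the payoff to action i against action j. A damped
    fixed-point iteration is used; this is an illustrative solver,
    not the adjustment process analyzed in the paper.
    """
    n = A.shape[0]
    p = np.full(n, 1.0 / n)
    for _ in range(steps):
        p = (1 - damping) * p + damping * logit_response(A @ p, mu)
    return p

# Hypothetical 2x2 coordination game (a potential game).
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])
mu = 1.0
p_star = logit_equilibrium(A, mu)

# At a logit equilibrium, p equals the logit response to A @ p.
residual = np.abs(p_star - logit_response(A @ p_star, mu)).max()
print(p_star, residual)
```

As `mu` shrinks, the logit response approaches a best response, which is the sense in which the logit equilibrium generalizes the Nash equilibrium.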

    Riemannian game dynamics

    We study a class of evolutionary game dynamics defined by balancing a gain determined by the game's payoffs against a cost of motion that captures the difficulty with which the population moves between states. Costs of motion are represented by a Riemannian metric, i.e., a state-dependent inner product on the set of population states. The replicator dynamics and the (Euclidean) projection dynamics are the archetypal examples of the class we study. Like these representative dynamics, all Riemannian game dynamics satisfy certain basic desiderata, including positive correlation and global convergence in potential games. Moreover, when the underlying Riemannian metric satisfies a Hessian integrability condition, the resulting dynamics preserve many further properties of the replicator and projection dynamics. We examine the close connections between Hessian game dynamics and reinforcement learning in normal form games, extending and elucidating a well-known link between the replicator dynamics and exponential reinforcement learning. Comment: 47 pages, 12 figures; added figures and further simplified the derivation of the dynamic
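
As a rough illustration of the convergence property mentioned above, the sketch below integrates the replicator dynamics with explicit Euler steps in a single-population potential game and records the potential along the trajectory. The payoff matrix `A`, the initial state, and the step size `dt` are assumptions chosen for the example; the general Riemannian dynamics of the paper are not implemented here.

```python
import numpy as np

def replicator_step(x, A, dt):
    """One explicit Euler step of the replicator dynamics
    x_i' = x_i * ((A x)_i - x . A x)."""
    payoffs = A @ x
    average = x @ payoffs
    return x + dt * x * (payoffs - average)

def potential(x, A):
    """Potential f(x) = 0.5 * x^T A x, valid when A is symmetric."""
    return 0.5 * x @ A @ x

# Hypothetical symmetric payoff matrix, so F(x) = A x is a potential game.
A = np.diag([1.0, 2.0, 3.0])
x = np.array([0.5, 0.3, 0.2])

potentials = [potential(x, A)]
for _ in range(2000):
    x = replicator_step(x, A, dt=0.01)
    potentials.append(potential(x, A))

print(x, potentials[0], potentials[-1])
```

With a small step size the recorded potential is non-decreasing, which mirrors the continuous-time statement that the potential acts as a Lyapunov function.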

    The Master Equation for Large Population Equilibriums

    We use a simple N-player stochastic game with idiosyncratic and common noises to introduce the concept of Master Equation originally proposed by Lions in his lectures at the Collège de France. Controlling the limit as N tends to infinity of the explicit solution of the N-player game, we highlight the stochastic nature of the limit distributions of the states of the players due to the fact that the random environment does not average out in the limit, and we recast the Mean Field Game (MFG) paradigm in a set of coupled Stochastic Partial Differential Equations (SPDEs). The first one is a forward stochastic Kolmogorov equation giving the evolution of the conditional distributions of the states of the players given the common noise. The second is a form of stochastic Hamilton-Jacobi-Bellman (HJB) equation providing the solution of the optimization problem when the flow of conditional distributions is given. Being highly coupled, the system reads as an infinite-dimensional Forward Backward Stochastic Differential Equation (FBSDE). Uniqueness of a solution and its Markov property lead to the representation of the solution of the backward equation (i.e. the value function of the stochastic HJB equation) as a deterministic function of the solution of the forward Kolmogorov equation, a function usually called the decoupling field of the FBSDE. The (infinite-dimensional) PDE satisfied by this decoupling field is identified with the master equation. We also show that this equation can be derived for other large population equilibriums, like those given by the optimal control of McKean-Vlasov stochastic differential equations. The paper is written more in the style of a review than a technical paper, and we spend more time and energy motivating and explaining the probabilistic interpretation of the Master Equation than identifying the most general set of assumptions under which our claims are true.

    Compressed Sensing over $\ell_p$-balls: Minimax Mean Square Error

    We consider the compressed sensing problem, where the object $x_0 \in \mathbb{R}^N$ is to be recovered from incomplete measurements $y = Ax_0 + z$; here the sensing matrix $A$ is an $n \times N$ random matrix with iid Gaussian entries and $n < N$. A popular method of sparsity-promoting reconstruction is $\ell^1$-penalized least-squares reconstruction (aka LASSO, Basis Pursuit). It is currently popular to consider the strict sparsity model, where the object $x_0$ is nonzero in only a small fraction of entries. In this paper, we instead consider the much more broadly applicable $\ell_p$-sparsity model, where $x_0$ is sparse in the sense of having $\ell_p$ norm bounded by $\xi \cdot N^{1/p}$ for some fixed $0 < p \le 1$ and $\xi > 0$. We study an asymptotic regime in which $n$ and $N$ both tend to infinity with limiting ratio $n/N = \delta \in (0,1)$, both in the noisy ($z \neq 0$) and noiseless ($z = 0$) cases. Under weak assumptions on $x_0$, we are able to precisely evaluate the worst-case asymptotic minimax mean-squared reconstruction error (AMSE) for $\ell^1$-penalized least-squares: min over penalization parameters, max over $\ell_p$-sparse objects $x_0$. We exhibit the asymptotically least-favorable object (hardest sparse signal to recover) and the maximin penalization. Our explicit formulas unexpectedly involve quantities appearing classically in statistical decision theory. Occurring in the present setting, they reflect a deeper connection between penalized $\ell^1$ minimization and scalar soft thresholding. This connection, which follows from earlier work of the authors and collaborators on the AMP iterative thresholding algorithm, is carefully explained. Our approach also gives precise results under weak-$\ell_p$ ball coefficient constraints, as we show here. Comment: 41 pages, 11 pdf figure
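
The link between $\ell^1$-penalized least squares and scalar soft thresholding can be made concrete with a short sketch. The solver below is plain ISTA (iterative soft thresholding), substituted here for simplicity; it is not the AMP algorithm referred to in the abstract. The problem sizes, the noise level, and the penalty `lam` are assumptions chosen only for illustration.

```python
import numpy as np

def soft_threshold(v, tau):
    """Scalar soft thresholding applied entrywise: shrink toward zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_objective(A, y, x, lam):
    """0.5 * ||y - A x||_2^2 + lam * ||x||_1."""
    return 0.5 * np.sum((y - A @ x) ** 2) + lam * np.sum(np.abs(x))

def lasso_ista(A, y, lam, iters=500):
    """Minimize the LASSO objective by ISTA with step size 1/L."""
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the smooth part's gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Hypothetical instance: n < N, iid Gaussian sensing matrix, sparse x0.
rng = np.random.default_rng(0)
n, N, k = 100, 200, 10
A = rng.standard_normal((n, N)) / np.sqrt(n)
x0 = np.zeros(N)
x0[:k] = 1.0
y = A @ x0 + 0.01 * rng.standard_normal(n)

lam = 0.05
x_hat = lasso_ista(A, y, lam)
mse = np.mean((x_hat - x0) ** 2)
print(mse)
```

Each iteration is a gradient step on the quadratic term followed by entrywise soft thresholding, which is where the scalar denoiser enters the reconstruction.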

    From Black-Scholes to Online Learning: Dynamic Hedging under Adversarial Environments

    We consider a non-stochastic online learning approach to pricing financial options by modeling the market dynamic as a repeated game between nature (the adversary) and the investor. We demonstrate that such a framework yields a structure analogous to that of the Black-Scholes model, the widely popular option pricing model in stochastic finance, for both European and American options with convex payoffs. In the case of non-convex options, we construct approximate pricing algorithms, and demonstrate that their efficiency can be analyzed through the introduction of an artificial probability measure, in parallel to the so-called risk-neutral measure in the finance literature, even though our framework is completely adversarial. Continuous-time convergence results and extensions to incorporate price jumps are also presented.
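
To give a feel for how an adversarial game can reproduce Black-Scholes-like prices, here is a minimal sketch under strong assumptions that are not taken from the paper: zero interest rate, a European call, and an adversary whose per-round return is bounded in absolute value by `c`. For a convex payoff the worst case of each round sits at the extreme returns `+c` and `-c`, so the minimax hedging cost reduces to a binomial-style backward recursion with weight 1/2 on each extreme. The function names and parameter values are hypothetical.

```python
import math
import numpy as np

def minimax_call_price(s0, strike, c, rounds):
    """Worst-case hedging cost of a call when each round's return lies in [-c, c].

    Assumes zero interest rate and a convex payoff, so only the extreme
    returns matter and the recursion averages the two successor values.
    """
    j = np.arange(rounds + 1)  # number of up-moves at the terminal round
    terminal = s0 * (1 + c) ** j * (1 - c) ** (rounds - j)
    values = np.maximum(terminal - strike, 0.0)
    for _ in range(rounds):
        values = 0.5 * (values[1:] + values[:-1])
    return float(values[0])

def black_scholes_call(s0, strike, sigma, T):
    """Black-Scholes call price with zero interest rate."""
    d1 = (math.log(s0 / strike) + 0.5 * sigma ** 2 * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    Phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return s0 * Phi(d1) - strike * Phi(d2)

# Hypothetical parameters: the per-round bound shrinks like sigma * sqrt(T / rounds).
s0, strike, sigma, T, rounds = 100.0, 100.0, 0.2, 1.0, 400
c = sigma * math.sqrt(T / rounds)

adversarial = minimax_call_price(s0, strike, c, rounds)
stochastic = black_scholes_call(s0, strike, sigma, T)
print(adversarial, stochastic)
```

The weight 1/2 in the recursion plays the role of an artificial probability measure even though no randomness is assumed, which parallels the risk-neutral measure mentioned in the abstract.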


