
## MS&E 336 Lecture 14: Approachability and regret minimization

### Abstract

In this lecture we use Blackwell’s approachability theorem to formulate both external and internal regret minimizing algorithms. Our study is based primarily on the algorithms presented by Hart and Mas-Colell [6, 7]; see also [3] for a summary.

Throughout the lecture we consider a finite two-player game, where each player $i$ has a finite pure action set $A_i$; let $A = \prod_i A_i$, and let $A_{-i} = \prod_{j \neq i} A_j$. We let $a_i$ denote a pure action for player $i$, and let $s_i \in \Delta(A_i)$ denote a mixed action for player $i$. We will typically view $s_i$ as a vector in $\mathbb{R}^{A_i}$, with $s_i(a_i)$ equal to the probability that player $i$ places on $a_i$. We let $\Pi_i(a)$ denote the payoff to player $i$ when the composite pure action vector is $a$, and by an abuse of notation also let $\Pi_i(s)$ denote the expected payoff to player $i$ when the composite mixed action vector is $s$.

The game is played repeatedly by the players. We let $h^T = (a^0, \ldots, a^{T-1})$ denote the history up to time $T$. The external regret of player $i$ against action $s_i$ after history $h^T$ is:

$$ER_i(h^T; s_i) = \sum_{t=0}^{T-1} \left[ \Pi_i(s_i, a_{-i}^t) - \Pi_i(a_i^t, a_{-i}^t) \right].$$

The internal regret of player $i$ of action $a_i$ against action $a_i'$ after history $h^T$ is:

$$IR_i(h^T; a_i, a_i') = \sum_{t=0}^{T-1} \mathbb{I}\{a_i^t = a_i\} \left( \Pi_i(a_i', a_{-i}^t) - \Pi_i(a_i, a_{-i}^t) \right).$$

We let $p_i^T \in \Delta(A_i)$ denote the marginal empirical distribution of player $i$’s play up to time $T$: $p_i^T(a_i) =$
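As an illustration of the two regret definitions above, the following minimal Python sketch evaluates external and internal regret for one player on a short history. It is not taken from the lecture: the function names, the payoff-matrix layout (`payoff[a_i][a_minus_i]`), and the matching-pennies example history are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: payoff[a_i][a_minus_i] is player i's payoff Pi_i(a_i, a_{-i}).
Payoff = Sequence[Sequence[float]]
# A history is a list of (a_i^t, a_{-i}^t) pairs for t = 0, ..., T-1.
History = List[Tuple[int, int]]


def external_regret(payoff: Payoff, history: History, s_i: Sequence[float]) -> float:
    """ER_i(h^T; s_i): sum over t of Pi_i(s_i, a_{-i}^t) - Pi_i(a_i^t, a_{-i}^t)."""
    total = 0.0
    for a_i, a_opp in history:
        # Expected payoff of the fixed mixed action s_i against the opponent's realized action.
        mixed = sum(prob * payoff[b][a_opp] for b, prob in enumerate(s_i))
        total += mixed - payoff[a_i][a_opp]
    return total


def internal_regret(payoff: Payoff, history: History, a_i: int, a_alt: int) -> float:
    """IR_i(h^T; a_i, a_i'): gain from swapping a_i for a_alt in the periods where a_i was played."""
    total = 0.0
    for played, a_opp in history:
        if played == a_i:  # indicator I{a_i^t = a_i}
            total += payoff[a_alt][a_opp] - payoff[a_i][a_opp]
    return total


# Assumed example: matching pennies from the row player's point of view.
payoff = [[1.0, -1.0], [-1.0, 1.0]]
history = [(0, 0), (0, 1), (1, 1), (0, 1)]

er_pure_1 = external_regret(payoff, history, [0.0, 1.0])   # against pure action 1
er_uniform = external_regret(payoff, history, [0.5, 0.5])  # against the uniform mixed action
ir_0_to_1 = internal_regret(payoff, history, 0, 1)
ir_1_to_0 = internal_regret(payoff, history, 1, 0)
```

In this sketch, the external regret against a pure action equals the sum of the internal regrets of every action against it, which follows directly from the two definitions: external regret compares the whole history to one fixed action, while internal regret restricts the comparison to the periods in which a particular action was played.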

Year: 2007
OAI identifier: oai:CiteSeerX.psu:10.1.1.352.6068
Provided by: CiteSeerX