696 research outputs found

Online Convex Optimization for Sequential Decision Processes and Extensive-Form Games

Authors: Farina Gabriele; Kroer Christian; Sandholm Tuomas
Publication date: 09/09/2018

Regret minimization is a powerful tool for solving large-scale extensive-form games. State-of-the-art methods rely on minimizing regret locally at each decision point. In this work we derive a new framework for regret minimization on sequential decision problems and extensive-form games with general compact convex sets at each decision point and general convex losses, as opposed to prior work which has been for simplex decision points and linear losses. We call our framework laminar regret decomposition. It generalizes the CFR algorithm to this more general setting. Furthermore, our framework enables a new proof of CFR even in the known setting, which is derived from a perspective of decomposing polytope regret, thereby leading to an arguably simpler interpretation of the algorithm. Our generalization to convex compact sets and convex losses allows us to develop new algorithms for several problems: regularized sequential decision making, regularized Nash equilibria in extensive-form games, and computing approximate extensive-form perfect equilibria. Our generalization also leads to the first regret-minimization algorithm for computing reduced-normal-form quantal response equilibria based on minimizing local regrets. Experiments show that our framework leads to algorithms that scale at a rate comparable to the fastest variants of counterfactual regret minimization for computing Nash equilibrium, and therefore our approach leads to the first algorithm for computing quantal response equilibria in extremely large games. Finally, we show that our framework enables a new kind of scalable opponent exploitation approach.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications
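
The abstract above refers to minimizing regret locally at each decision point. As a rough illustration only, the following Python sketch implements regret matching, a standard local regret minimizer for a single simplex decision point with linear losses (the classical setting the abstract contrasts with). The class name `RegretMatcher` and its method names are hypothetical, chosen for this sketch; it is not the laminar regret decomposition framework itself.

```python
class RegretMatcher:
    """Regret matching for one simplex decision point (illustrative sketch)."""

    def __init__(self, n_actions):
        # Cumulative regret of each action relative to the strategies played.
        self.cum_regret = [0.0] * n_actions
        self.last_strategy = None

    def next_strategy(self):
        # Play proportionally to positive cumulative regret,
        # falling back to the uniform strategy when none is positive.
        positive = [max(r, 0.0) for r in self.cum_regret]
        total = sum(positive)
        n = len(positive)
        if total > 0.0:
            self.last_strategy = [p / total for p in positive]
        else:
            self.last_strategy = [1.0 / n] * n
        return self.last_strategy

    def observe_loss(self, loss):
        # Linear loss: expected loss of the last strategy is its dot product
        # with the loss vector. Regret of action a is (expected - loss[a]).
        expected = sum(p * l for p, l in zip(self.last_strategy, loss))
        for a, l in enumerate(loss):
            self.cum_regret[a] += expected - l


matcher = RegretMatcher(3)
first = matcher.next_strategy()          # uniform over three actions
matcher.observe_loss([0.0, 1.0, 1.0])    # action 0 is the only loss-free action
second = matcher.next_strategy()         # all mass moves to action 0
```

In CFR-style methods one such minimizer would be kept per decision point; how the losses fed to each one are defined is exactly what a regret decomposition specifies, and that part is not shown here.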

Scalable First-Order Methods for Robust MDPs

Authors: Grand-Clément Julien; Kroer Christian
Publication date: 14/01/2021

Robust Markov Decision Processes (MDPs) are a powerful framework for modeling sequential decision-making problems with model uncertainty. This paper proposes the first first-order framework for solving robust MDPs. Our algorithm interleaves primal-dual first-order updates with approximate Value Iteration updates. By carefully controlling the tradeoff between the accuracy and cost of Value Iteration updates, we achieve an ergodic convergence rate of $O\left( A^{2} S^{3}\log(S)\log(\epsilon^{-1}) \epsilon^{-1} \right)$ for the best choice of parameters on ellipsoidal and Kullback-Leibler $s$-rectangular uncertainty sets, where $S$ and $A$ are the numbers of states and actions, respectively. Our dependence on the number of states and actions is significantly better (by a factor of $O(A^{1.5}S^{1.5})$) than that of pure Value Iteration algorithms. In numerical experiments on ellipsoidal uncertainty sets we show that our algorithm is significantly more scalable than state-of-the-art approaches. Our framework is also the first one to solve robust MDPs with $s$-rectangular KL uncertainty sets.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications
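
For readers unfamiliar with the Value Iteration baseline that the abstract compares against, here is a minimal Python sketch of robust value iteration under a deliberately simplified assumption: each state-action pair has a small finite list of candidate transition distributions, and the adversary picks the worst one independently per pair. This is not the paper's first-order method, and it does not handle the ellipsoidal or KL $s$-rectangular sets named above; the function name and data layout are hypothetical.

```python
def robust_value_iteration(rewards, kernels, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Worst-case value iteration over finite candidate transition sets.

    rewards[s][a]  : immediate reward for action a in state s
    kernels[s][a]  : list of candidate next-state distributions (assumed finite)
    """
    n_states = len(rewards)
    values = [0.0] * n_states
    for _ in range(max_iter):
        new_values = []
        for s in range(n_states):
            best = float("-inf")
            for a in range(len(rewards[s])):
                # Adversary: choose the candidate distribution with the
                # lowest expected continuation value.
                worst = min(
                    sum(p * v for p, v in zip(dist, values))
                    for dist in kernels[s][a]
                )
                best = max(best, rewards[s][a] + gamma * worst)
            new_values.append(best)
        gap = max(abs(x - y) for x, y in zip(new_values, values))
        values = new_values
        if gap < tol:
            break
    return values


# Two states, one action each. From state 0 the adversary may keep the agent
# in state 0 (reward 0) or move it to state 1 (reward 1, absorbing).
example_rewards = [[0.0], [1.0]]
example_kernels = [
    [[[1.0, 0.0], [0.0, 1.0]]],
    [[[0.0, 1.0]]],
]
example_values = robust_value_iteration(example_rewards, example_kernels)
```

In the example the adversary keeps the agent in state 0, so its robust value stays at zero while state 1 approaches 1 / (1 - gamma).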

Projection methods in conic optimization

Authors: A Lewis; A. Caprara; BT Polyak; CA Micchelli; D Henrion; F Alizadeh; F Deutsch; F Lin; H Qi; I Dukanovic; J Bolte; J Douglas; J Malick; J Nie; J Nocedal; J Povh; J Sun; J-B Hiriart-Urruty; J-B Lasserre; JCh Gilbert; K Toh; L Lovász; LQ Qi; M Goemans; M Kočvara; M Nouralishahi; MF Anjos; N Higham; N Schwertman; R Borsdorf; R Correa; RH Tutuncu; RL Dykstra; RT Rockafellar; S Al-Homidan; S Burer; Y Gao; Y Nesterov
Publication date: 01/01/2011

There exist efficient algorithms to project a point onto the intersection of a convex cone and an affine subspace. Those conic projections are in turn the workhorse of a range of algorithms in conic optimization, having a variety of applications in science, finance and engineering. This chapter reviews some of these algorithms, emphasizing the so-called regularization algorithms for linear conic optimization, and applications in polynomial optimization. This is a presentation of the material of several recent research articles; we aim here at clarifying the ideas, presenting them in a general framework, and pointing out important techniques.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Hal - Université Grenoble Alpes

Scientific Publications of the University of Toulouse II Le Mirail

INRIA a CCSD electronic archive server

HAL-INSA Toulouse
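
To make the projection problem in the abstract above concrete, the following Python sketch projects a point onto the intersection of a very simple convex cone (the nonnegative orthant) and an affine subspace (a single hyperplane) using Dykstra's alternating projection scheme. The choice of cone, the hyperplane, and all function names are assumptions made for illustration; the chapter's regularization algorithms are not reproduced here.

```python
def project_orthant(v):
    # Projection onto the cone {x : x >= 0} clips negative entries.
    return [max(x, 0.0) for x in v]


def project_hyperplane(v, a, b):
    # Projection onto the affine set {x : a . x = b}.
    scale = (sum(ai * vi for ai, vi in zip(a, v)) - b) / sum(ai * ai for ai in a)
    return [vi - scale * ai for ai, vi in zip(a, v)]


def dykstra_cone_affine(point, a, b, n_iter=200):
    """Approximate projection of `point` onto {x >= 0} intersected with {a . x = b}."""
    x = list(point)
    p = [0.0] * len(x)  # correction term for the cone projection
    q = [0.0] * len(x)  # correction term for the affine projection
    y = x
    for _ in range(n_iter):
        shifted = [xi + pi for xi, pi in zip(x, p)]
        y = project_orthant(shifted)
        p = [si - yi for si, yi in zip(shifted, y)]
        shifted = [yi + qi for yi, qi in zip(y, q)]
        x = project_hyperplane(shifted, a, b)
        q = [si - xi for si, xi in zip(shifted, x)]
    return y


# With a = (1, 1, 1) and b = 1 the intersection is the probability simplex.
projected = dykstra_cone_affine([2.0, 0.0, 0.0], [1.0, 1.0, 1.0], 1.0)
```

Plain alternating projections would only find some point in the intersection; the correction terms are what make the limit the nearest point, which is why Dykstra's variant is used in this sketch.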

Data-driven satisficing measure and ranking

Author: Huang Wenjie
Publication date: 01/07/2018

We propose a computational framework for real-time risk assessment and prioritization of random outcomes without prior information on probability distributions. The basic model is built on the satisficing measure (SM), which yields a single index for risk comparison. Since SM is a dual representation for a family of risk measures, we consider problems constrained by general convex risk measures and specifically by conditional value-at-risk. Starting from offline optimization, we apply the sample average approximation technique and argue the convergence rate and validation of optimal solutions. In the online stochastic optimization case, we develop primal-dual stochastic approximation algorithms for general risk-constrained problems and derive their regret bounds. For both offline and online cases, we illustrate the relationship between risk ranking accuracy and sample size (or iterations).

Comment: 26 pages, 6 figures

arXiv.org e-Print Archive

ScholarBank@NUS
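
The abstract above combines conditional value-at-risk with sample average approximation. As a small illustration of that combination, the Python sketch below estimates CVaR from a list of sampled losses by minimizing the Rockafellar-Uryasev objective t + E[(X - t)^+] / (1 - alpha) over the sample points. The function name `cvar_saa` is hypothetical, and this is only the basic estimator, not the satisficing-measure framework or the primal-dual algorithms of the paper.

```python
def cvar_saa(losses, alpha):
    """Sample-average estimate of CVaR at level alpha for a list of losses."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    n = len(losses)

    def objective(t):
        # Empirical version of t + E[(X - t)^+] / (1 - alpha).
        excess = sum(max(x - t, 0.0) for x in losses) / n
        return t + excess / (1.0 - alpha)

    # The objective is convex and piecewise linear in t with breakpoints at
    # the samples, so a minimizer can be found among the sample values.
    return min(objective(t) for t in losses)


# Worst half of the losses 1, 2, 3, 4 is {3, 4}, whose mean is 3.5.
estimate = cvar_saa([1.0, 2.0, 3.0, 4.0], 0.5)
```

Scanning every sample as a candidate threshold costs quadratic time in the sample size; sorting the losses first would be the obvious refinement for larger samples.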

International Conference on Continuous Optimization (ICCOPT) 2019 Conference Book

Authors: Arndt Rafael; Hintermüller Michael; Huber Olivier; Löbhard Caroline; Stengl Steven-Marian
Publication date: 01/01/2019

The Sixth International Conference on Continuous Optimization took place on the campus of the Technical University of Berlin, August 3-8, 2019. The ICCOPT is a flagship conference of the Mathematical Optimization Society (MOS), organized every three years. ICCOPT 2019 was hosted by the Weierstrass Institute for Applied Analysis and Stochastics (WIAS) Berlin. It included a Summer School and a Conference with a series of plenary and semi-plenary talks, organized and contributed sessions, and poster sessions. This book comprises the full conference program. It contains, in particular, the scientific program in survey style as well as with all details, and information on the social program, the venue, special meetings, and more.

Publications Server of the Weierstrass Institute for Applied Analysis and Stochastics

Convex-Concave Min-Max Stackelberg Games

Authors: Goktas Denizalp; Greenwald Amy
Publication date: 10/11/2021

Min-max optimization problems (i.e., min-max games) have been attracting a great deal of attention because of their applicability to a wide range of machine learning problems. Although significant progress has been made recently, the literature to date has focused on games with independent strategy sets; little is known about solving games with dependent strategy sets, which can be characterized as min-max Stackelberg games. We introduce two first-order methods that solve a large class of convex-concave min-max Stackelberg games, and show that our methods converge in polynomial time. Min-max Stackelberg games were first studied by Wald, under the posthumous name of Wald's maximin model, a variant of which is the main paradigm used in robust optimization, which means that our methods can likewise solve many convex robust optimization problems. We observe that the computation of competitive equilibria in Fisher markets also comprises a min-max Stackelberg game. Further, we demonstrate the efficacy and efficiency of our algorithms in practice by computing competitive equilibria in Fisher markets with varying utility structures. Our experiments suggest potential ways to extend our theoretical results, by demonstrating how different smoothness properties can affect the convergence rate of our algorithms.

Comment: 25 pages, 4 tables, 1 figure, Forthcoming in NeurIPS 202

arXiv.org e-Print Archive
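
The distinction the abstract above draws between independent and dependent strategy sets can be written down compactly. The following is one common way to state a min-max problem with dependent strategy sets; the symbols $X$, $Y$, $f$ and $g$ are notation assumed here for illustration and do not appear in the listing itself.

```latex
% Outer player picks x first; the inner player's feasible set depends on x
% through the coupling constraint g(x, y) >= 0.
\min_{x \in X} \; \max_{y \in Y \,:\, g(x, y) \ge 0} f(x, y)
```

When $g$ does not depend on $x$, the inner feasible set is the same for every $x$ and the problem reduces to the familiar case of independent strategy sets.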

Optimization and Applications

Publication venue: Oberwolfach-Walke : Mathematisches Forschungsinstitut Oberwolfach
Publication date: 01/01/2002

[no abstract available]

Repositorium für Naturwissenschaften und Technik