696 research outputs found
Online Convex Optimization for Sequential Decision Processes and Extensive-Form Games
Regret minimization is a powerful tool for solving large-scale extensive-form
games. State-of-the-art methods rely on minimizing regret locally at each
decision point. In this work we derive a new framework for regret minimization
on sequential decision problems and extensive-form games with general compact
convex sets at each decision point and general convex losses, as opposed to
prior work which has been for simplex decision points and linear losses. We
call our framework laminar regret decomposition. It generalizes the CFR
algorithm to this more general setting. Furthermore, our framework enables a
new proof of CFR even in the known setting, which is derived from a perspective
of decomposing polytope regret, thereby leading to an arguably simpler
interpretation of the algorithm. Our generalization to convex compact sets and
convex losses allows us to develop new algorithms for several problems:
regularized sequential decision making, regularized Nash equilibria in
extensive-form games, and computing approximate extensive-form perfect
equilibria. Our generalization also leads to the first regret-minimization
algorithm for computing reduced-normal-form quantal response equilibria based
on minimizing local regrets. Experiments show that our framework leads to
algorithms that scale at a rate comparable to the fastest variants of
counterfactual regret minimization for computing Nash equilibrium, and
therefore our approach leads to the first algorithm for computing quantal
response equilibria in extremely large games. Finally we show that our
framework enables a new kind of scalable opponent exploitation approach
Scalable First-Order Methods for Robust MDPs
Robust Markov Decision Processes (MDPs) are a powerful framework for modeling
sequential decision-making problems with model uncertainty. This paper proposes
the first first-order framework for solving robust MDPs. Our algorithm
interleaves primal-dual first-order updates with approximate Value Iteration
updates. By carefully controlling the tradeoff between the accuracy and cost of
Value Iteration updates, we achieve an ergodic convergence rate of for the best
choice of parameters on ellipsoidal and Kullback-Leibler -rectangular
uncertainty sets, where and is the number of states and actions,
respectively. Our dependence on the number of states and actions is
significantly better (by a factor of ) than that of pure
Value Iteration algorithms. In numerical experiments on ellipsoidal uncertainty
sets we show that our algorithm is significantly more scalable than
state-of-the-art approaches. Our framework is also the first one to solve
robust MDPs with -rectangular KL uncertainty sets
Projection methods in conic optimization
There exist efficient algorithms to project a point onto the intersection of
a convex cone and an affine subspace. Those conic projections are in turn the
work-horse of a range of algorithms in conic optimization, having a variety of
applications in science, finance and engineering. This chapter reviews some of
these algorithms, emphasizing the so-called regularization algorithms for
linear conic optimization, and applications in polynomial optimization. This is
a presentation of the material of several recent research articles; we aim here
at clarifying the ideas, presenting them in a general framework, and pointing
out important techniques
Data-driven satisficing measure and ranking
We propose an computational framework for real-time risk assessment and
prioritizing for random outcomes without prior information on probability
distributions. The basic model is built based on satisficing measure (SM) which
yields a single index for risk comparison. Since SM is a dual representation
for a family of risk measures, we consider problems constrained by general
convex risk measures and specifically by Conditional value-at-risk. Starting
from offline optimization, we apply sample average approximation technique and
argue the convergence rate and validation of optimal solutions. In online
stochastic optimization case, we develop primal-dual stochastic approximation
algorithms respectively for general risk constrained problems, and derive their
regret bounds. For both offline and online cases, we illustrate the
relationship between risk ranking accuracy with sample size (or iterations).Comment: 26 Pages, 6 Figure
International Conference on Continuous Optimization (ICCOPT) 2019 Conference Book
The Sixth International Conference on Continuous Optimization took place on the campus of the Technical University of Berlin, August 3-8, 2019. The ICCOPT is a flagship conference of the Mathematical Optimization Society (MOS), organized every three years. ICCOPT 2019 was hosted by the Weierstrass Institute for Applied Analysis and Stochastics (WIAS) Berlin. It included a Summer School and a Conference with a series of plenary and semi-plenary talks, organized and contributed sessions, and poster sessions.
This book comprises the full conference program. It contains, in particular, the scientific program in survey style as well as with all details, and information on the social program, the venue, special meetings, and more
Convex-Concave Min-Max Stackelberg Games
Min-max optimization problems (i.e., min-max games) have been attracting a
great deal of attention because of their applicability to a wide range of
machine learning problems. Although significant progress has been made
recently, the literature to date has focused on games with independent strategy
sets; little is known about solving games with dependent strategy sets, which
can be characterized as min-max Stackelberg games. We introduce two first-order
methods that solve a large class of convex-concave min-max Stackelberg games,
and show that our methods converge in polynomial time. Min-max Stackelberg
games were first studied by Wald, under the posthumous name of Wald's maximin
model, a variant of which is the main paradigm used in robust optimization,
which means that our methods can likewise solve many convex robust optimization
problems. We observe that the computation of competitive equilibria in Fisher
markets also comprises a min-max Stackelberg game. Further, we demonstrate the
efficacy and efficiency of our algorithms in practice by computing competitive
equilibria in Fisher markets with varying utility structures. Our experiments
suggest potential ways to extend our theoretical results, by demonstrating how
different smoothness properties can affect the convergence rate of our
algorithms.Comment: 25 pages, 4 tables, 1 figure, Forthcoming in NeurIPS 202
- …