189 research outputs found

Computing the smallest fixed point of order-preserving nonexpansive mappings arising in positive stochastic games and static analysis of programs

Author: Adjé
Akian
Aliprantis
Assalé Adjé
Costan
Cousot
Eric Goubault
Esparza
Filar
Gaubert
Gawlitza
Gunawardena
Leroux
Maitra
Mallet-Paret
Neyman
Nussbaum
Olsder
Ovchinnikov
Quincampoix
Rockafellar
Rosenberg
Sorin
Stéphane Gaubert
Vigeral
Publication venue: 'Elsevier BV'
Publication date: 06/08/2013

The problem of computing the smallest fixed point of an order-preserving map arises in the study of zero-sum positive stochastic games. It also arises in static analysis of programs by abstract interpretation. In this context, the discount rate may be negative. We characterize the minimality of a fixed point in terms of the nonlinear spectral radius of a certain semidifferential. We apply this characterization to design a policy iteration algorithm, which applies to the case of finite state and action spaces. The algorithm returns a locally minimal fixed point, which turns out to be globally minimal when the discount rate is nonnegative. Comment: 26 pages, 3 figures. We add new results, improvements and two examples of positive stochastic games. Note that an initial version of the paper has appeared in the proceedings of the Eighteenth International Symposium on Mathematical Theory of Networks and Systems (MTNS2008), Blacksburg, Virginia, July 200

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

HAL-CEA

HAL-Polytechnique

The Lions-Mercier splitting algorithm and the alternating direction method are instances of the proximal point algorithm

Author
Publication venue: Massachusetts Institute of Technology, Operations Research Center, Laboratory for Information and Decision Systems, Intelligent Control Systems
Publication date: 01/01/1988

Cover title. Includes bibliographical references. Supported by the Army Research Office. DAAL03-86-K-0171 by Johnathan Eckstein

DSpace@MIT

Accelerating Value Iteration with Anchoring

Author: Lee Jongmin
Ryu Ernest K.
Publication date: 28/10/2023

Value Iteration (VI) is foundational to the theory and practice of modern reinforcement learning, and it is known to converge at a $\mathcal{O}(\gamma^k)$-rate, where $\gamma$ is the discount factor. Surprisingly, however, the optimal rate for the VI setup was not known, and finding a general acceleration mechanism has been an open problem. In this paper, we present the first accelerated VI for both the Bellman consistency and optimality operators. Our method, called Anc-VI, is based on an \emph{anchoring} mechanism (distinct from Nesterov's acceleration), and it reduces the Bellman error faster than standard VI. In particular, Anc-VI exhibits a $\mathcal{O}(1/k)$-rate for $\gamma\approx 1$ or even $\gamma=1$, while standard VI has rate $\mathcal{O}(1)$ for $\gamma\ge 1-1/k$, where $k$ is the iteration count. We also provide a complexity lower bound matching the upper bound up to a constant factor of $4$, thereby establishing optimality of the accelerated rate of Anc-VI. Finally, we show that the anchoring mechanism provides the same benefit in the approximate VI and Gauss--Seidel VI setups as well.

arXiv.org e-Print Archive
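
The anchoring idea described in the Anc-VI abstract above can be sketched on a toy problem. This is a minimal illustration, not the authors' implementation. The coefficient schedule beta_k = 1/(k+1) is the classical Halpern choice and is an assumption here, since the abstract does not state the schedule Anc-VI uses. The two-state chain, the rewards, and all function names are hypothetical. With gamma = 1 on a periodic chain, plain VI cycles and its Bellman error stays at 1, while the error of the anchored iterate shrinks roughly like 1/k.

```python
def bellman(v, r, P, gamma):
    """Bellman consistency operator T(v) = r + gamma * P v for a fixed policy."""
    n = len(v)
    return [r[i] + gamma * sum(P[i][j] * v[j] for j in range(n)) for i in range(n)]


def bellman_error(v, r, P, gamma):
    """Sup-norm residual ||T(v) - v||."""
    Tv = bellman(v, r, P, gamma)
    return max(abs(a - b) for a, b in zip(Tv, v))


def standard_vi(v0, r, P, gamma, iters):
    """Plain value iteration: v_k = T(v_{k-1})."""
    v = list(v0)
    for _ in range(iters):
        v = bellman(v, r, P, gamma)
    return v


def anchored_vi(v0, r, P, gamma, iters):
    """Anchored iteration: v_k = beta_k * v_0 + (1 - beta_k) * T(v_{k-1}).

    beta_k = 1 / (k + 1) is an assumed (Halpern-style) schedule,
    not necessarily the one used by Anc-VI.
    """
    v = list(v0)
    for k in range(1, iters + 1):
        beta = 1.0 / (k + 1)
        Tv = bellman(v, r, P, gamma)
        v = [beta * a + (1.0 - beta) * t for a, t in zip(v0, Tv)]
    return v


# Hypothetical two-state periodic chain with undiscounted rewards.
P = [[0.0, 1.0], [1.0, 0.0]]
r = [1.0, -1.0]
gamma = 1.0
v0 = [0.0, 0.0]
K = 100

err_std = bellman_error(standard_vi(v0, r, P, gamma, K), r, P, gamma)
err_anc = bellman_error(anchored_vi(v0, r, P, gamma, K), r, P, gamma)
print("standard VI Bellman error:", err_std)
print("anchored VI Bellman error:", err_anc)
```

The periodic chain is chosen so that the plain iteration never settles. On a contractive problem with gamma well below 1, plain VI already converges geometrically and this simple schedule would not be expected to help.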

Fitted Value Function Iteration With Probability One Contractions

Author: Jenö Pál
John Stachurski

This paper studies a value function iteration algorithm that can be applied to almost all stationary dynamic programming problems. Using nonexpansive function approximation and Monte Carlo integration, we develop a randomized fitted Bellman operator and a corresponding algorithm that is globally convergent with probability one. When additional restrictions are imposed, an $O_P(n^{-1/2})$ rate of convergence for Monte Carlo error is obtained.

Research Papers in Economics

Tropical polyhedra are equivalent to mean payoff games

Author: Akian M.
ALEXANDER GUTERMAN
Allamigeon X.
Einsiedler M.
Filar J. A.
Gondran M.
Itenberg I.
Joswig M.
Mallet-Paret J.
MARIANNE AKIAN
STÉPHANE GAUBERT
Vincent J. M.
Zimmermann K.
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 09/06/2011

We show that several decision problems originating from max-plus or tropical convexity are equivalent to zero-sum two player game problems. In particular, we set up an equivalence between the external representation of tropical convex sets and zero-sum stochastic games, in which tropical polyhedra correspond to deterministic games with finite action spaces. Then, we show that the winning initial positions can be determined from the associated tropical polyhedron. We obtain as a corollary a game theoretical proof of the fact that the tropical rank of a matrix, defined as the maximal size of a submatrix for which the optimal assignment problem has a unique solution, coincides with the maximal number of rows (or columns) of the matrix which are linearly independent in the tropical sense. Our proofs rely on techniques from non-linear Perron-Frobenius theory. Comment: 28 pages, 5 figures; v2: updated references, added background materials and illustrations; v3: minor improvements, references update

arXiv.org e-Print Archive

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Institute of Mathematics AS CR, v. v. i.

HAL-Polytechnique

Proximal point algorithm in mathematical programming

Author: Spingarn Jonathan E.
Publication venue: Georgia Institute of Technology
Publication date: 01/01/1985

Issued as Progress report, and Final report, Project no. G-37-61

Scholarly Materials And Research @ Georgia Tech

The Operator Approach to Entropy Games

Author: Akian Marianne
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 34th Symposium on Theoretical Aspects of Computer Science (STACS 2017)
Publication date: 01/01/2017

Entropy games and matrix multiplication games have been recently introduced by Asarin et al. They model the situation in which one player (Despot) wishes to minimize the growth rate of a matrix product, whereas the other player (Tribune) wishes to maximize it. We develop an operator approach to entropy games. This allows us to show that entropy games can be cast as stochastic mean payoff games in which some action spaces are simplices and payments are given by a relative entropy (Kullback-Leibler divergence). In this way, we show that entropy games with a fixed number of states belonging to Despot can be solved in polynomial time. This approach also allows us to solve these games by a policy iteration algorithm, which we compare with the spectral simplex algorithm developed by Protasov

Dagstuhl Research Online Publication Server

Convergence Analysis and Improvements for Projection Algorithms and Splitting Methods

Author: Fält Mattias
Publication venue: Department of Automatic Control, Lund University
Publication date: 02/02/2021

Non-smooth convex optimization problems occur in all fields of engineering. A common approach to solving this class of problems is proximal algorithms, or splitting methods. These first-order optimization algorithms are often simple, well suited to solve large-scale problems and have a low computational cost per iteration. Essentially, they encode the solution to an optimization problem as a fixed point of some operator, and iterating this operator eventually results in convergence to an optimal point. However, as for other first order methods, the convergence rate is heavily dependent on the conditioning of the problem. Even though the per-iteration cost is usually low, the number of iterations can become prohibitively large for ill-conditioned problems, especially if a high accuracy solution is sought. In this thesis, a few methods for alleviating this slow convergence are studied, which can be divided into two main approaches. The first are heuristic methods that can be applied to a range of fixed-point algorithms. They are based on understanding typical behavior of these algorithms. While these methods are shown to converge, they come with no guarantees on improved convergence rates. The other approach studies the theoretical rates of a class of projection methods that are used to solve convex feasibility problems. These are problems where the goal is to find a point in the intersection of two, or possibly more, convex sets. A study of how the parameters in the algorithm affect the theoretical convergence rate is presented, as well as how they can be chosen to optimize this rate

Lund University Publications
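
The convex feasibility setting mentioned in the thesis abstract above can be illustrated with plain alternating projections. This is a minimal sketch under stated assumptions, not the thesis's algorithm: it uses unrelaxed projections (no tunable parameters), and the two sets, the starting point, and the function names are all hypothetical. The sets are the x-axis and the diagonal line y = x in the plane, whose intersection is the origin. For this particular pair, each full sweep halves the distance to the origin.

```python
def project_x_axis(p):
    """Euclidean projection onto the set A = {(x, y) : y = 0}."""
    x, _ = p
    return (x, 0.0)


def project_diagonal(p):
    """Euclidean projection onto the set B = {(x, y) : x = y}."""
    x, y = p
    m = (x + y) / 2.0
    return (m, m)


def alternating_projections(p0, proj_a, proj_b, sweeps):
    """Iterate p <- proj_b(proj_a(p)): the fixed-point form of the feasibility problem."""
    p = p0
    for _ in range(sweeps):
        p = proj_b(proj_a(p))
    return p


# Hypothetical starting point; the only point in both sets is (0, 0).
result = alternating_projections((1.0, 1.0), project_x_axis, project_diagonal, 10)
print(result)
```

The iteration converges only linearly here, and a smaller angle between the two lines would slow it further, which is the kind of conditioning dependence the abstract describes.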