Action-Constrained Markov Decision Processes With Kullback-Leibler Cost
This paper concerns computation of optimal policies in which the one-step
reward function contains a cost term that models Kullback-Leibler divergence
with respect to nominal dynamics. This technique was introduced by Todorov in
2007, where it was shown under general conditions that the solution to the
average-reward optimality equations reduces to a simple eigenvector problem.
Since then many authors have sought to apply this technique to control problems
and models of bounded rationality in economics.
A crucial assumption is that the input process is essentially unconstrained.
For example, if the nominal dynamics include randomness from nature (e.g., the
impact of wind on a moving vehicle), then the optimal control solution does not
respect the exogenous nature of this disturbance.
This paper introduces a technique to solve a more general class of
action-constrained MDPs. The main idea is to solve an entire parameterized
family of MDPs, in which the parameter is a scalar weighting the one-step
reward function. The approach is new and practical even in the original
unconstrained formulation.
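Todorov's reduction can be sketched numerically: in the linearly solvable formulation, the average-reward optimality equation becomes a principal-eigenvector problem for a reward-twisted version of the nominal kernel. Below is a minimal illustration on a made-up 4-state chain; the problem data (`P`, `r`) and the power-iteration details are assumptions for illustration, not taken from the paper.

```python
import numpy as np

# Toy problem: 4 states, nominal dynamics P (row-stochastic), state reward r.
rng = np.random.default_rng(0)
P = rng.random((4, 4))
P /= P.sum(axis=1, keepdims=True)   # nominal transition kernel
r = rng.random(4)                    # one-step state reward

# Twisted kernel: P_hat[x, y] = exp(r[x]) * P[x, y]
P_hat = np.exp(r)[:, None] * P

# Power iteration for the principal (Perron-Frobenius) eigenpair.
z = np.ones(4)
for _ in range(500):
    z = P_hat @ z
    z /= np.linalg.norm(z)
lam = z @ (P_hat @ z) / (z @ z)      # principal eigenvalue; log(lam) gives the optimal average reward

# Optimal controlled kernel: tilt the nominal dynamics by the eigenvector.
P_star = P * z[None, :]
P_star /= P_star.sum(axis=1, keepdims=True)
```

Note that `P_star` is again row-stochastic, which is exactly the structure the unconstrained formulation exploits; the paper's contribution is handling the case where some of that freedom is not available to the controller.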
From Infinite to Finite Programs: Explicit Error Bounds with Applications to Approximate Dynamic Programming
We consider linear programming (LP) problems in infinite-dimensional spaces
that are in general computationally intractable. Under suitable assumptions, we
develop an approximation bridge from the infinite-dimensional LP to tractable
finite convex programs in which the performance of the approximation is
quantified explicitly. To this end, we draw on recent developments in two
areas, randomized optimization and first-order methods, leading to a priori
as well as a posteriori performance guarantees. We illustrate the generality and
implications of our theoretical results in the special case of the long-run
average cost and discounted cost optimal control problems for Markov decision
processes on Borel spaces. The applicability of the theoretical results is
demonstrated through a constrained linear quadratic optimal control problem and
a fisheries management problem.
Comment: 30 pages, 5 figures
Rate-cost tradeoffs in control
Consider a distributed control problem with a communication channel connecting the observer of a linear stochastic system to the controller. The goal of the controller is to minimize a quadratic cost function. The most basic special case of that cost function is the mean-square deviation of the system state from the desired state. We study the fundamental tradeoff between the communication rate r bits/sec and the limsup of the expected cost b, and show a lower bound on the rate necessary to attain b. The bound applies as long as the system noise has a probability density function. If the target cost b is not too large, that bound can be closely approached by a simple lattice quantization scheme that only quantizes the innovation, that is, the difference between the controller's belief about the current state and the true state.
Rate-Cost Tradeoffs in Control
Consider a control problem with a communication channel connecting the observer of a linear stochastic system to the controller. The goal of the controller is to minimize a quadratic cost function in the state variables and control signal, known as the linear quadratic regulator (LQR). We study the fundamental tradeoff between the communication rate r bits/sec and the expected cost b. We obtain a lower bound on a certain rate-cost function, which quantifies the minimum directed mutual information between the channel input and output that is compatible with a target LQR cost. The rate-cost function has operational significance in multiple scenarios of interest: among others, it allows us to lower-bound the minimum communication rate for fixed- and variable-length quantization, and for control over noisy channels. We derive an explicit lower bound to the rate-cost function, which applies to vector, non-Gaussian, and partially observed systems, thereby extending and generalizing an earlier explicit expression for the scalar Gaussian system, due to Tatikonda et al. [2]. The bound applies as long as the differential entropy of the system noise is not −∞. It can be closely approached by a simple lattice quantization scheme that only quantizes the innovation, that is, the difference between the controller's belief about the current state and the true state. Via a separation principle between control and communication, similar results hold for causal lossy compression of additive noise Markov sources. Apart from standard dynamic programming arguments, our technical approach leverages the Shannon lower bound, develops new estimates for data compression with coding memory, and uses some recent results on high-resolution variable-length vector quantization to prove that the new converse bounds are tight.
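The innovation-quantization idea described in the abstract can be sketched in a few lines: the observer sends only a quantized version of the innovation (the gap between the controller's belief and the true state), and the controller applies certainty-equivalent control to its updated belief. The scalar system, noise level, quantizer step, and gain below are all assumed toy parameters, not values from the paper.

```python
import numpy as np

# Assumed toy setup: scalar system x_{t+1} = a*x_t + u_t + w_t,
# observer transmits a uniformly quantized innovation each step.
a, step, T = 1.2, 0.5, 1000
rng = np.random.default_rng(1)

def quantize(v, step):
    """Uniform (1-D lattice) quantizer with cell width `step`."""
    return step * np.round(v / step)

x, x_hat = 0.0, 0.0                   # true state and controller's belief
cost = 0.0
for _ in range(T):
    innovation = x - x_hat            # the part the controller does not know
    q = quantize(innovation, step)    # finite-rate message over the channel
    x_hat += q                        # controller refines its belief
    u = -a * x_hat                    # certainty-equivalent control
    w = rng.normal(scale=0.1)
    x = a * x + u + w                 # true state evolves
    x_hat = a * x_hat + u             # belief propagates through known dynamics
mse = sum_cost = None  # placeholder removed below
```

A cleaner accounting of the cost: accumulate `x * x` inside the loop and divide by `T` to estimate the mean-square deviation, the basic quadratic cost named above. Because the post-update estimation error is confined to a quantizer cell of width `step`, the closed-loop state stays bounded even though `a > 1` makes the open-loop system unstable.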