10 research outputs found

    Action-Constrained Markov Decision Processes With Kullback-Leibler Cost

    This paper concerns computation of optimal policies in which the one-step reward function contains a cost term that models Kullback-Leibler divergence with respect to nominal dynamics. This technique was introduced by Todorov in 2007, where it was shown under general conditions that the solution to the average-reward optimality equations reduces to a simple eigenvector problem. Since then, many authors have sought to apply this technique to control problems and to models of bounded rationality in economics. A crucial assumption is that the input process is essentially unconstrained. For example, if the nominal dynamics include randomness from nature (e.g., the impact of wind on a moving vehicle), then the optimal control solution does not respect the exogenous nature of this disturbance. This paper introduces a technique to solve a more general class of action-constrained MDPs. The main idea is to solve an entire parameterized family of MDPs, in which the parameter is a scalar weighting the one-step reward function. The approach is new and practical even in the original unconstrained formulation.
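The eigenvector reduction mentioned in the abstract above can be sketched as follows. This is a minimal illustration and not the paper's method: it assumes a finite state space, an unconstrained input, and a one-step reward of the form w(x) minus the KL divergence between the chosen transition row and the nominal row P(x, ·). Under those assumptions, the exponentiated relative value function z is the principal eigenvector of diag(exp(w)) P, the logarithm of the eigenvalue is the optimal average reward, and the optimal transition law is the nominal one reweighted by z. The names `solve_kl_mdp`, `P`, and `w` are hypothetical.

```python
import math

def solve_kl_mdp(P, w, iters=5000, tol=1e-12):
    """Power iteration on M = diag(exp(w)) P (assumed unconstrained KL-cost MDP).

    P: nominal row-stochastic transition matrix (list of lists), assumed
       irreducible and aperiodic so that power iteration converges.
    w: per-state reward.
    Returns (average_reward, z, optimal_transition_matrix).
    """
    n = len(P)
    z = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        Pz = [sum(P[i][j] * z[j] for j in range(n)) for i in range(n)]
        Mz = [math.exp(w[i]) * Pz[i] for i in range(n)]
        lam = max(Mz)                      # eigenvalue estimate, since max(z) == 1
        z_new = [v / lam for v in Mz]      # renormalize so that max(z_new) == 1
        converged = max(abs(a - b) for a, b in zip(z_new, z)) < tol
        z = z_new
        if converged:
            break
    # Optimal transition law: nominal dynamics twisted by the eigenvector z.
    Pz = [sum(P[i][j] * z[j] for j in range(n)) for i in range(n)]
    policy = [[P[i][j] * z[j] / Pz[i] for j in range(n)] for i in range(n)]
    return math.log(lam), z, policy

# Illustrative three-state chain (made-up numbers, not from the paper).
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
w = [0.0, 1.0, -1.0]
eta, z, policy = solve_kl_mdp(P, w)
```

With a zero reward the sketch returns the nominal dynamics unchanged, since deviating from them only incurs KL cost. The action-constrained case that the paper addresses is not covered by this sketch.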

    From Infinite to Finite Programs: Explicit Error Bounds with Applications to Approximate Dynamic Programming

    We consider linear programming (LP) problems in infinite-dimensional spaces that are in general computationally intractable. Under suitable assumptions, we develop an approximation bridge from the infinite-dimensional LP to tractable finite convex programs in which the performance of the approximation is quantified explicitly. To this end, we adopt recent developments in two areas, randomized optimization and first-order methods, leading to a priori as well as a posteriori performance guarantees. We illustrate the generality and implications of our theoretical results in the special case of the long-run average cost and discounted cost optimal control problems for Markov decision processes on Borel spaces. The applicability of the theoretical results is demonstrated through a constrained linear quadratic optimal control problem and a fisheries management problem. Comment: 30 pages, 5 figures

    Rate-cost tradeoffs in control

    Consider a distributed control problem with a communication channel connecting the observer of a linear stochastic system to the controller. The goal of the controller is to minimize a quadratic cost function. The most basic special case of that cost function is the mean-square deviation of the system state from the desired state. We study the fundamental tradeoff between the communication rate r bits/sec and the limsup of the expected cost b, and show a lower bound on the rate necessary to attain b. The bound applies as long as the system noise has a probability density function. If the target cost b is not too large, that bound can be closely approached by a simple lattice quantization scheme that only quantizes the innovation, that is, the difference between the controller's belief about the current state and the true state.

    Rate-Cost Tradeoffs in Control

    Consider a control problem with a communication channel connecting the observer of a linear stochastic system to the controller. The goal of the controller is to minimize a quadratic cost function in the state variables and control signal, known as the linear quadratic regulator (LQR). We study the fundamental tradeoff between the communication rate r bits/sec and the expected cost b. We obtain a lower bound on a certain rate-cost function, which quantifies the minimum directed mutual information between the channel input and output that is compatible with a target LQR cost. The rate-cost function has operational significance in multiple scenarios of interest: among others, it allows us to lower-bound the minimum communication rate for fixed- and variable-length quantization, and for control over noisy channels. We derive an explicit lower bound to the rate-cost function, which applies to vector, non-Gaussian, and partially observed systems, thereby extending and generalizing an earlier explicit expression for the scalar Gaussian system, due to Tatikonda et al. [2]. The bound applies as long as the differential entropy of the system noise is not −∞. It can be closely approached by a simple lattice quantization scheme that only quantizes the innovation, that is, the difference between the controller's belief about the current state and the true state. Via a separation principle between control and communication, similar results hold for causal lossy compression of additive noise Markov sources. Apart from standard dynamic programming arguments, our technical approach leverages the Shannon lower bound, develops new estimates for data compression with coding memory, and uses some recent results on high-resolution variable-length vector quantization to prove that the new converse bounds are tight.