55,343 research outputs found

    Discrete-Time Control with Non-Constant Discount Factor

    This paper deals with discrete-time Markov decision processes (MDPs) with Borel state and action spaces, under the total expected discounted cost optimality criterion. We assume that the discount factor is not constant: it may depend on the state and action; moreover, it can even take the extreme values zero or one. We propose sufficient conditions on the data of the model ensuring the existence of optimal control policies and allowing the characterization of the optimal value function as a solution to the dynamic programming equation. As a particular case of these MDPs with varying discount factor, we study MDPs with stopping, as well as the corresponding optimal stopping times and contact set. We show applications to switching MDP models and, in particular, we study a pollution accumulation problem.
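A minimal sketch of the idea in this abstract, restricted to a finite state and action space rather than the Borel setting the paper treats. It iterates the dynamic programming equation V(x) = min_a [ c(x,a) + alpha(x,a) * sum_y P(y | x,a) V(y) ] with a discount factor that depends on the state and action. The function name `value_iteration`, the array layout, and the toy numbers are all assumptions made for illustration; none of them come from the paper. A "stop" action is modelled here by setting its discount factor to zero, which is one way to read the stopping case mentioned above.

```python
import numpy as np


def value_iteration(cost, alpha, P, tol=1e-10, max_iter=10_000):
    """Value iteration with a state-action-dependent discount factor.

    cost  : (S, A) array, one-step cost c(x, a)
    alpha : (S, A) array, discount factor alpha(x, a) in [0, 1]
    P     : (S, A, S) array, transition probabilities P(y | x, a)
    Convergence is not guaranteed in general when alpha can equal 1;
    this toy example is chosen so that the iteration converges.
    """
    V = np.zeros(cost.shape[0])
    for _ in range(max_iter):
        # P @ V contracts the next-state axis, giving an (S, A) array.
        V_new = (cost + alpha * (P @ V)).min(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Greedy policy with respect to the final value function.
    policy = (cost + alpha * (P @ V)).argmin(axis=1)
    return V, policy


# Toy model (hypothetical numbers): action 0 = continue, action 1 = stop.
cost = np.array([[1.0, 5.0],
                 [1.0, 20.0]])
alpha = np.array([[0.9, 0.0],    # stopping discounts the future to zero
                  [0.9, 0.0]])
P = np.zeros((2, 2, 2))
P[0, :, 0] = 1.0                 # from state 0, every action stays in 0
P[1, :, 1] = 1.0                 # from state 1, every action stays in 1

V, policy = value_iteration(cost, alpha, P)
print(V, policy)
```

In this toy model, continuing forever costs 1 / (1 - 0.9) = 10 in either state, so stopping is preferable in state 0 (stop cost 5) and continuing is preferable in state 1 (stop cost 20).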

    Power-efficient dynamic quantization for multisensor HMM state estimation over fading channels

    In this paper, we address the problem of designing power-efficient quantizers for state estimation of hidden Markov models using multiple sensors communicating to a fusion centre via error-prone, randomly time-varying flat fading channels modelled by finite-state Markov chains. Our objective is to minimize a tradeoff between the long-term average of the mean square estimation error and the expected total power consumption. We formulate the problem as a stochastic control problem using Markov decision processes. Under a mild assumption on the measurement noise at the sensors, the version of the optimization problem with a discretized action space (quantization thresholds and transmission power levels) forms a unichain Markov decision process for stationary policies. The solution to the discretized problem provides optimal quantization thresholds and power levels to be communicated back to the sensors via a feedback channel. Moreover, in order to improve the performance of the quantization system, we employ a gradient-free stochastic optimization technique to determine the optimal set of quantization thresholds from which optimal quantization levels are determined. The performance results for the estimation error/total transmission power tradeoff are studied under various channel conditions and sensor measurement qualities.

    Control of Time-Varying Epidemic-Like Stochastic Processes and Their Mean-Field Limits

    The optimal control of epidemic-like stochastic processes is important both historically and for emerging applications today, where it can be especially important to include time-varying parameters that impact viral epidemic-like propagation. We connect the control of such stochastic processes with time-varying behavior to the stochastic shortest path problem and obtain solutions for various cost functions. Then, under a mean-field scaling, this general class of stochastic processes is shown to converge to a corresponding dynamical system. We analogously establish that the optimal control of this class of processes converges to the optimal control of the limiting dynamical system. Consequently, we study the optimal control of the dynamical system where the comparison of both controlled systems renders various important mathematical properties of interest. Comment: arXiv admin note: substantial text overlap with arXiv:1709.0798