Discrete-Time Control with Non-Constant Discount Factor
This paper deals with discrete-time Markov decision processes (MDPs) with Borel state and action spaces, under the total expected discounted cost optimality criterion. We assume that the discount factor is not constant: it may depend on the state and action; moreover, it can even take the extreme values zero or one. We propose sufficient conditions on the data of the model ensuring the existence of optimal control policies and allowing the characterization of the optimal value function as a solution to the dynamic programming equation. As a particular case of these MDPs with varying discount factor, we study MDPs with stopping, as well as the corresponding optimal stopping times and contact set. We show applications to switching MDP models and, in particular, we study a pollution accumulation problem.
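The dynamic programming equation with a state- and action-dependent discount factor can be sketched by value iteration on a small finite MDP. The two-state model below (costs, transitions, and the discount matrix `alpha`, including an extreme value of zero) is invented for illustration; the paper itself works in Borel spaces.

```python
import numpy as np

# Illustrative sketch (not the paper's model): value iteration for a finite
# MDP whose discount factor alpha(x, a) depends on state and action.
n_states, n_actions = 2, 2
cost = np.array([[1.0, 2.0],
                 [0.5, 3.0]])                # cost[x, a]
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.7, 0.3]]])    # P[x, a, x'] transition kernel
alpha = np.array([[0.9, 0.5],
                  [0.7, 0.0]])              # alpha[x, a] in [0, 1]; one entry is 0

V = np.zeros(n_states)
for _ in range(1000):
    # Dynamic programming equation: V(x) = min_a { c(x,a) + alpha(x,a) E[V(x')] }
    Q = cost + alpha * (P @ V)              # Q[x, a]; P @ V averages V over x'
    V_new = Q.min(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:   # stop once the fixed point is reached
        break
    V = V_new

policy = Q.argmin(axis=1)                   # greedy (optimal) action per state
```

Where `alpha(x, a) = 0` the continuation value is ignored entirely, which is the mechanism that later specializes these MDPs to stopping problems.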
Power-efficient dynamic quantization for multisensor HMM state estimation over fading channels
In this paper, we address the problem of designing power-efficient quantizers for state estimation of hidden Markov models using multiple sensors communicating to a fusion centre via error-prone, randomly time-varying flat fading channels modelled by finite state Markov chains. Our objective is to minimize a tradeoff between the long-term average of the mean square estimation error and the expected total power consumption. We formulate the problem as a stochastic control problem using Markov decision processes. Under some mild assumptions on the measurement noise at the sensors, the version of the optimization problem with a discretized action space (quantization thresholds and transmission power levels) forms a unichain Markov decision process under stationary policies. The solution to the discretized problem provides optimal quantization thresholds and power levels to be communicated back to the sensors via a feedback channel. Moreover, in order to improve the performance of the quantization system, we employ a gradient-free stochastic optimization technique to determine the optimal set of quantization thresholds, from which the optimal quantization levels are then derived. The performance results for the estimation error/total transmission power tradeoff are studied under various channel conditions and sensor measurement qualities.
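A gradient-free stochastic search over quantizer thresholds can be illustrated with an SPSA-style simultaneous-perturbation scheme. Everything below is an assumption for illustration, not the paper's algorithm: the choice of SPSA, the Gaussian signal model, the centroid decoder, and all step-size constants.

```python
import numpy as np

rng = np.random.default_rng(0)

def mse(thresholds, samples):
    """Empirical MSE of a scalar quantizer that maps each sample to the
    centroid (mean) of its cell, for the given thresholds."""
    t = np.sort(thresholds)
    idx = np.searchsorted(t, samples)          # cell index per sample
    err = 0.0
    for k in range(len(t) + 1):
        cell = samples[idx == k]
        if cell.size:
            err += np.sum((cell - cell.mean()) ** 2)
    return err / samples.size

samples = rng.normal(size=20000)               # assumed Gaussian observations
theta = np.array([-1.0, 0.0, 1.0])             # three thresholds -> 4 levels
for i in range(200):
    a = 0.1 / (1 + i) ** 0.6                   # decaying step size
    c = 0.1 / (1 + i) ** 0.1                   # decaying perturbation size
    delta = rng.choice([-1.0, 1.0], size=theta.shape)  # simultaneous perturbation
    # Two-sided finite-difference estimate of the gradient (SPSA)
    g = (mse(theta + c * delta, samples) - mse(theta - c * delta, samples)) / (2 * c) * delta
    theta -= a * g

final_mse = mse(theta, samples)
```

SPSA needs only two noisy evaluations of the cost per iteration regardless of dimension, which is why gradient-free schemes of this kind suit settings where the estimation-error cost is only available by simulation.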
Control of Time-Varying Epidemic-Like Stochastic Processes and Their Mean-Field Limits
The optimal control of epidemic-like stochastic processes is important both
historically and for emerging applications today, where it can be especially
important to include time-varying parameters that impact viral epidemic-like
propagation. We connect the control of such stochastic processes with
time-varying behavior to the stochastic shortest path problem and obtain
solutions for various cost functions. Then, under a mean-field scaling, this
general class of stochastic processes is shown to converge to a corresponding
dynamical system. We analogously establish that the optimal control of this
class of processes converges to the optimal control of the limiting dynamical
system. Consequently, we study the optimal control of the dynamical system,
where comparing the two controlled systems yields several mathematical
properties of interest.

Comment: arXiv admin note: substantial text overlap with arXiv:1709.0798
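The mean-field scaling can be sketched with a simple SIS epidemic model; this specific model, its rates, and the tau-leaping simulation below are assumptions for illustration, since the abstract treats a general class of epidemic-like processes. As the population N grows, the scaled infected count tracks the limiting ODE di/dt = beta*i*(1-i) - gamma*i.

```python
import numpy as np

rng = np.random.default_rng(1)

beta, gamma, T, dt = 1.5, 1.0, 10.0, 0.01    # assumed SIS rates and horizon
steps = int(T / dt)

def simulate(N):
    """Crude tau-leaping simulation of the scaled SIS chain I_t / N."""
    i = 0.2                                   # initial infected fraction
    for _ in range(steps):
        # Infection and recovery events in one step, as binomial draws whose
        # means match the ODE drift terms.
        up = rng.binomial(N, min(1.0, beta * i * (1 - i) * dt)) / N
        down = rng.binomial(N, min(1.0, gamma * i * dt)) / N
        i = min(1.0, max(0.0, i + up - down))
    return i

def ode():
    """Euler integration of the mean-field limit di/dt = beta*i*(1-i) - gamma*i."""
    i = 0.2
    for _ in range(steps):
        i += (beta * i * (1 - i) - gamma * i) * dt
    return i

# For large N the sample path ends near the ODE trajectory's endpoint,
# illustrating the mean-field convergence.
gap = abs(simulate(100_000) - ode())
```

With these rates the ODE settles near the endemic equilibrium 1 - gamma/beta = 1/3, and the fluctuations of the scaled chain around it shrink like 1/sqrt(N).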