10 research outputs found

    Iteration Algorithms in Markov Decision Processes with State-Action-Dependent Discount Factors and Unbounded Costs

    This chapter concerns discrete-time Markov decision processes under a discounted optimality criterion with state-action-dependent discount factors, possibly unbounded costs, and noncompact admissible action sets. Under mild conditions, we show the existence of stationary optimal policies and introduce the value iteration and policy iteration algorithms to approximate the value function.
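
    Below is a minimal finite-model sketch of the value iteration step for a state-action-dependent discount factor $\alpha(x,a)$: the Bellman operator becomes $TV(x)=\min_a [c(x,a)+\alpha(x,a)\sum_y P(y|x,a)V(y)]$, and $\sup\alpha<1$ keeps it a contraction. All model data (P, c, alpha) are randomly generated stand-ins; the paper's unbounded costs and noncompact action sets are beyond this toy example.

```python
import numpy as np

# Value iteration with a state-action-dependent discount factor alpha(x, a).
# Toy model: 3 states, 2 actions; all data below are illustrative stand-ins.
rng = np.random.default_rng(0)
nS, nA = 3, 2
P = rng.random((nS, nA, nS)); P /= P.sum(axis=2, keepdims=True)  # transition kernel
c = rng.random((nS, nA))                                          # one-stage cost
alpha = 0.4 + 0.5 * rng.random((nS, nA))                          # discount in (0.4, 0.9)

V = np.zeros(nS)
for _ in range(1000):
    # Bellman operator: T V(x) = min_a [ c(x,a) + alpha(x,a) * E[V(x')] ]
    Q = c + alpha * (P @ V)          # shape (nS, nA)
    V_new = Q.min(axis=1)
    if np.abs(V_new - V).max() < 1e-10:
        break
    V = V_new

policy = Q.argmin(axis=1)            # stationary optimal policy (toy case)
print("value function:", V, "policy:", policy)
```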

    Empirical approximation in Markov games under unbounded payoff: discounted and average criteria

    This work deals with a class of discrete-time zero-sum Markov games whose state process $\{x_t\}$ evolves according to the equation $x_{t+1}=F(x_t,a_t,b_t,\xi_t)$, where $a_t$ and $b_t$ represent the actions of players 1 and 2, respectively, and $\{\xi_t\}$ is a sequence of independent and identically distributed random variables with unknown distribution $\theta$. Assuming possibly unbounded payoff, and using the empirical distribution to estimate $\theta$, we introduce approximation schemes for the value of the game as well as for optimal strategies, considering both discounted and average criteria.
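
    As a hedged illustration of the empirical-approximation idea, the sketch below fixes one state, replaces the unknown law $\theta$ by the empirical distribution of observed disturbances, and solves the resulting zero-sum matrix game by the standard linear program. The payoff function r and the exponential sampling law are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)

def r(a, b, xi):                      # one-stage payoff (player 1 maximizes); toy choice
    return np.sin(a + 1.0) * xi - 0.3 * b * xi**2

def game_value(R):
    """Value of the zero-sum matrix game R via the standard LP."""
    nA, nB = R.shape
    # Variables: (p_1..p_nA, v); maximize v  <=>  minimize -v.
    obj = np.zeros(nA + 1); obj[-1] = -1.0
    A_ub = np.hstack([-R.T, np.ones((nB, 1))])   # v - p^T R[:, b] <= 0 for all b
    A_eq = np.zeros((1, nA + 1)); A_eq[0, :nA] = 1.0
    res = linprog(obj, A_ub=A_ub, b_ub=np.zeros(nB), A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * nA + [(None, None)])
    assert res.success
    return res.x[-1]

actions1, actions2 = [0, 1, 2], [0, 1]
xi_samples = rng.exponential(1.0, size=500)      # i.i.d. draws from the unknown theta

# Empirical expected payoff matrix: average of r over the observed samples.
R_hat = np.array([[r(a, b, xi_samples).mean() for b in actions2]
                  for a in actions1])
print("empirical game value:", game_value(R_hat))
```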

    Partially observable queueing systems with controlled service rates under a discounted optimality criterion

    We are concerned with a class of $GI/GI/1$ queueing systems with controlled service rates, in which the waiting times are observed only when they take the value zero. Applying a suitable filtering process, we show the existence of optimal control policies under a discounted optimality criterion.
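
    A minimal sketch of the underlying waiting-time dynamics: under a controlled service rate, the GI/GI/1 waiting times follow Lindley's recursion $W_{t+1}=\max(0,\,W_t+S_t/\mu_t-A_t)$, and the controller only learns whether $W_t=0$. The distributions and the rate-selection rule below are illustrative placeholders, not the paper's filtering construction.

```python
import numpy as np

rng = np.random.default_rng(2)

T = 20
W = 0.0
mu = 1.0                                  # current service rate (the control)
for t in range(T):
    observed_idle = (W == 0.0)            # the only observation available
    # Illustrative rule: speed up after busy observations, relax when idle.
    mu = 0.8 if observed_idle else 1.5
    S = rng.exponential(1.0)              # nominal service requirement
    A = rng.exponential(1.2)              # interarrival time
    W = max(0.0, W + S / mu - A)          # Lindley recursion
    print(f"t={t:2d} observed_idle={observed_idle} mu={mu:.1f} W={W:.3f}")
```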

    Partially observable Markov decision processes with partially observable random discount factors

    This paper deals with a class of partially observable discounted Markov decision processes defined on Borel state and action spaces, under unbounded one-stage costs. The discount rate is a stochastic process evolving according to a difference equation and is also assumed to be partially observable. Introducing a suitable control model and filtering processes, we prove the existence of optimal control policies. In addition, we illustrate our results in a class of $GI/GI/1$ queueing systems, for which we explicitly obtain the corresponding optimality equation and the filtering process.
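
    To make the filtering idea concrete, here is a sketch of one standard way such a partially observable discount process could be tracked: discretize the hidden rate onto a grid, propagate the belief through a Markov kernel (standing in for $\alpha_{t+1}=G(\alpha_t,\eta_t)$), and apply a Bayes correction from a noisy signal. The grid, kernel, and Gaussian observation model are assumptions made for illustration, not the paper's filter.

```python
import numpy as np

alphas = np.array([0.80, 0.90, 0.95])        # grid of discount values (assumed)
K = np.array([[0.8, 0.2, 0.0],               # Markov kernel on the grid, a
              [0.1, 0.8, 0.1],               # discretized alpha_{t+1} = G(alpha_t, eta_t)
              [0.0, 0.2, 0.8]])

def likelihood(y, sigma=0.05):
    """Gaussian observation density y ~ N(alpha, sigma^2) (assumption)."""
    return np.exp(-0.5 * ((y - alphas) / sigma) ** 2)

belief = np.full(3, 1.0 / 3.0)               # prior over the grid
for y in [0.88, 0.91, 0.94]:                 # stream of noisy observations
    belief = belief @ K                      # prediction step
    belief *= likelihood(y)                  # Bayes correction
    belief /= belief.sum()
    print("posterior:", np.round(belief, 3),
          "E[alpha]:", round(float(belief @ alphas), 4))
```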

    Zero-sum discrete-time Markov games with unknown disturbance distribution: discounted and average criteria

    This SpringerBrief deals with a class of discrete-time zero-sum Markov games with Borel state and action spaces and possibly unbounded payoffs, under discounted and average criteria, whose state process evolves according to a stochastic difference equation. The corresponding disturbance process is an observable sequence of independent and identically distributed random variables whose distribution is unknown to both players. Unlike the standard case, the game is played over an infinite horizon and evolves as follows. At each stage, once the players have observed the state of the game and before choosing their actions, players 1 and 2 implement a statistical estimation process to obtain estimates of the unknown distribution. Then, independently, the players adapt their decisions to these estimators to select their actions and construct their strategies. The book presents a systematic analysis of recent developments in this kind of game. Specifically, it introduces the theoretical foundations of the procedures that combine statistical estimation and control techniques for the construction of the players' strategies, with illustrative examples. In this sense, the book is an essential reference for theoretical and applied researchers in the fields of stochastic control and game theory and their applications.
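
    The following loop sketches the stage-by-stage protocol just described, under purely illustrative dynamics: both players enlarge their sample of observed disturbances, form an empirical estimate, and adapt their actions to it. The system function F, the certainty-equivalent strategies, and the normal sampling law are hypothetical placeholders, not the book's construction.

```python
import numpy as np

rng = np.random.default_rng(3)

def F(x, a, b, xi):                   # illustrative system function (assumption)
    return 0.5 * x + a - b + xi

def strategy(player, x, samples):
    # Placeholder: certainty-equivalent action built from the empirical mean.
    m = samples.mean() if samples.size else 0.0
    return -m if player == 1 else 0.5 * m

x, observed = 1.0, np.array([])
for t in range(5):
    a = strategy(1, x, observed)      # players adapt to the current estimate
    b = strategy(2, x, observed)
    xi = rng.normal(0.0, 1.0)         # disturbance with (pretend-)unknown law
    observed = np.append(observed, xi)  # enlarge the empirical sample
    x = F(x, a, b, xi)
    print(f"t={t} a={a:+.3f} b={b:+.3f} x={x:+.3f} n={observed.size}")
```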

    Approximation and estimation in Markov control processes under a discounted criterion

    We consider a class of discrete-time Markov control processes with Borel state and action spaces, and $\Re^k$-valued i.i.d. disturbances with unknown density $\rho$. Supposing possibly unbounded costs, we combine suitable density estimation methods for $\rho$ with approximation procedures for the optimal cost function to show the existence of a sequence $\{\hat{f}_t\}$ of minimizers converging to an optimal stationary policy $f_\infty$.
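
    A minimal sketch of how density estimation and cost approximation could be combined in this spirit: estimate $\rho$ with a kernel density estimator, then take one value-iteration step under the estimated density and record its minimizer. The dynamics F, cost c, value iterate V, discount factor, and action grid are toy assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(4)

data = rng.normal(0.0, 1.0, size=400)        # i.i.d. draws from the unknown rho
rho_hat = gaussian_kde(data)                 # kernel density estimate of rho

def F(x, a, xi):                             # illustrative dynamics
    return 0.7 * x + a + xi

def c(x, a):                                 # one-stage cost (toy)
    return x**2 + 0.1 * a**2

V = lambda y: y**2                           # current value-function iterate (toy)

grid = np.linspace(-6.0, 6.0, 400)           # quadrature grid for xi
dx = grid[1] - grid[0]
w = rho_hat(grid)
w /= w.sum() * dx                            # renormalize after truncation

def bellman(x, a, beta=0.9):
    # c(x,a) + beta * integral of V(F(x,a,xi)) against the estimated density
    return c(x, a) + beta * np.sum(V(F(x, a, grid)) * w) * dx

actions = np.linspace(-1.0, 1.0, 21)
x0 = 1.0
f_hat = actions[np.argmin([bellman(x0, a) for a in actions])]
print("estimated minimizer at x0:", f_hat)
```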

    Approximation, estimation and control of stochastic systems under a randomized discounted cost criterion

    The paper deals with a class of discrete-time stochastic control processes under a discounted optimality criterion with random discount rate and possibly unbounded costs. The state process $\{x_t\}$ and the discount process $\{\alpha_t\}$ evolve according to the coupled difference equations $x_{t+1}=F(x_t,\alpha_t,a_t,\xi_t)$ and $\alpha_{t+1}=G(\alpha_t,\eta_t)$, where the state and discount disturbance processes $\{\xi_t\}$ and $\{\eta_t\}$ are sequences of i.i.d. random variables with densities $\rho^\xi$ and $\rho^\eta$, respectively. The main objective is to introduce approximation algorithms for the optimal cost function that lead to the construction of optimal or nearly optimal policies in the cases where the densities $\rho^\xi$ and $\rho^\eta$ are either known or unknown. In the latter case, we combine suitable estimation methods with control procedures to construct an asymptotically discounted optimal policy.
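
    As a rough illustration of the randomized discounting, the Monte Carlo sketch below simulates the coupled recursions and accumulates costs weighted by the running product $\alpha_0\cdots\alpha_{t-1}$. F, G, c, the stationary policy, and the disturbance laws are illustrative stand-ins; clipping keeps the discount bounded away from 1 so the total cost converges.

```python
import numpy as np

rng = np.random.default_rng(5)

def F(x, alpha, a, xi): return 0.6 * x + a + xi          # toy state dynamics
def G(alpha, eta):      return np.clip(0.9 * alpha + 0.05 * eta, 0.5, 0.95)
def c(x, a):            return x**2 + a**2               # one-stage cost
def policy(x):          return -0.4 * x                  # fixed stationary policy

def discounted_cost(T=200):
    x, alpha, disc, total = 1.0, 0.9, 1.0, 0.0
    for _ in range(T):
        a = policy(x)
        total += disc * c(x, a)
        disc *= alpha                                    # accumulate alpha_0..alpha_t
        xi, eta = rng.normal(0.0, 0.1), rng.normal(0.0, 1.0)
        x, alpha = F(x, alpha, a, xi), G(alpha, eta)     # coupled recursions
    return total

est = np.mean([discounted_cost() for _ in range(500)])
print("estimated randomized discounted cost:", round(est, 4))
```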