
    Risk-sensitive average optimality in Markov decision processes

    In this note, attention is focused on finding policies that optimize risk-sensitive optimality criteria in Markov decision chains. To this end we assume that the total reward generated by the Markov process is evaluated by an exponential utility function with a given risk-sensitivity coefficient. The ratio of the first two moments depends on the value of this coefficient; if the risk-sensitivity coefficient equals zero, we speak of risk-neutral models. Observe that the first moment of the generated reward corresponds to the expectation of the total reward, and the second central moment to its variance. For communicating Markov processes, and for some specific classes of unichain processes, the long-run risk-sensitive average reward is independent of the starting state. In this note we present a necessary and sufficient condition for the existence of optimal policies independent of the starting state in unichain models, and we characterize the class of risk-sensitive average optimal policies.
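    For orientation, the exponential utility evaluation and the role of the first two moments can be written out explicitly. This is the standard textbook formulation, with γ the risk-sensitivity coefficient and X the total reward (notation assumed here, not taken from the abstract):

        \[
          U_\gamma(x) = \operatorname{sign}(\gamma)\, e^{\gamma x},
          \qquad
          J_\gamma = \frac{1}{\gamma} \log \mathbb{E}\!\left[ e^{\gamma X} \right].
        \]
        % Expanding the certainty equivalent J_gamma around gamma = 0:
        \[
          J_\gamma = \mathbb{E}[X] + \frac{\gamma}{2}\,\operatorname{Var}(X) + O(\gamma^{2}),
        \]
        % so the criterion trades mean against variance with weight gamma/2,
        % and gamma -> 0 recovers the risk-neutral criterion E[X].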

    Continuous-time Markov decision processes under the risk-sensitive average cost criterion

    This paper studies continuous-time Markov decision processes under the risk-sensitive average cost criterion. The state space is a finite set, the action space is a Borel space, the cost and transition rates are bounded, and the risk-sensitivity coefficient can take an arbitrary positive real value. Under mild conditions, we develop a new approach to establish the existence of a solution to the risk-sensitive average cost optimality equation and obtain the existence of an optimal deterministic stationary policy.
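    For a fixed stationary policy on a finite state space, the risk-sensitive average cost can be evaluated through a principal eigenvalue; this is a standard Feynman–Kac fact, not the paper's new approach, and the function and variable names below are hypothetical:

        import numpy as np

        def rs_average_cost(Q, c, gamma):
            """Risk-sensitive average cost of one fixed stationary policy.

            Q     : (n, n) generator of the chain under the policy
                    (off-diagonals >= 0, rows sum to zero, irreducible assumed).
            c     : (n,) bounded cost-rate vector under the policy.
            gamma : positive risk-sensitivity coefficient.

            By Feynman-Kac, E_i[exp(gamma * integral_0^T c(X_s) ds)] grows like
            exp(lam * T), where lam is the eigenvalue of Q + gamma*diag(c) with
            the largest real part, so the criterion evaluates to lam / gamma.
            """
            A = Q + gamma * np.diag(c)
            eig = np.linalg.eigvals(A)
            lam = eig[np.argmax(eig.real)].real  # principal (Perron) eigenvalue
            return lam / gamma

        # Two-state illustration with made-up rates and costs:
        Q = np.array([[-1.0,  1.0],
                      [ 2.0, -2.0]])
        c = np.array([1.0, 3.0])
        print(rs_average_cost(Q, c, gamma=0.5))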

    A characterization of the optimal risk-sensitive average cost in finite controlled Markov chains

    This work concerns controlled Markov chains with finite state and action spaces. The transition law satisfies the simultaneous Doeblin condition, and the performance of a control policy is measured by the (long-run) risk-sensitive average cost criterion associated with a positive, but otherwise arbitrary, risk-sensitivity coefficient. Within this context, the optimal risk-sensitive average cost is characterized via a minimization problem in a finite-dimensional Euclidean space. (Published at http://dx.doi.org/10.1214/105051604000000585 in the Annals of Applied Probability by the Institute of Mathematical Statistics.)
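    A variational formula of this flavor makes the finite-dimensional minimization concrete: for any vector h in R^n, the span of the risk-sensitive dynamic-programming operator bounds the optimal average cost from above, and minimizing that bound over h recovers it. The sketch below is a plausible numerical rendering under those assumptions, not necessarily the exact characterization proved in the paper, and all data are made up:

        import numpy as np
        from scipy.optimize import minimize

        gamma = 0.5
        # P[a, i, j]: transition matrices, c[i, a]: one-stage costs (toy data).
        P = np.array([[[0.9, 0.1],
                       [0.3, 0.7]],
                      [[0.5, 0.5],
                       [0.6, 0.4]]])
        c = np.array([[1.0, 2.0],
                      [0.5, 3.0]])

        def upper_bound(h):
            # T(h)(i) = min_a [ c(i,a) + (1/gamma) log sum_j P[a](i,j) e^{gamma h(j)} ]
            log_term = np.log(P @ np.exp(gamma * h)) / gamma  # shape (actions, states)
            T = (c.T + log_term).min(axis=0)                  # minimize over actions
            return np.max(T - h)  # for every h this bounds the optimal cost above

        # Minimizing the bound over h in R^n (a finite-dimensional problem);
        # the objective is nonsmooth, so a derivative-free method is used.
        res = minimize(upper_bound, x0=np.zeros(2), method="Nelder-Mead")
        print("approximate optimal risk-sensitive average cost:", res.fun)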

    On gradual-impulse control of continuous-time Markov decision processes with multiplicative cost

    In this paper, we consider the gradual-impulse control problem for continuous-time Markov decision processes, where the system performance is measured by the expectation of the exponential utility of the total cost. We prove, under very general conditions on the system primitives, the existence of a deterministic stationary optimal policy within a more general class of policies. The policies we consider allow multiple simultaneous impulses, randomized selection of impulses with random effects, relaxed gradual controls, and accumulation of jumps. After characterizing the value function via the optimality equation, we reduce the continuous-time gradual-impulse control problem to an equivalent simple discrete-time Markov decision process whose action space is the union of the sets of gradual and impulsive actions.
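    The reduced action space described in the last sentence can be pictured as a tagged union. The sketch below is purely schematic; the field names and the admissible-action rule are invented for illustration and are not the paper's model:

        from dataclasses import dataclass
        from typing import Union

        @dataclass(frozen=True)
        class Gradual:
            control: float   # e.g., a control level held until the next jump

        @dataclass(frozen=True)
        class Impulse:
            new_state: int   # e.g., an instantaneous, deliberate state change

        # One action of the reduced discrete-time MDP is gradual or impulsive:
        Action = Union[Gradual, Impulse]

        def admissible_actions(state: int) -> list[Action]:
            """Union of the gradual and impulsive action sets at `state`."""
            gradual = [Gradual(u) for u in (0.0, 1.0)]
            impulses = [Impulse(s) for s in range(3) if s != state]
            return gradual + impulses

        print(admissible_actions(0))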