
    An alternative derivation of Birkhoff’s formula for the contraction coefficient of a positive matrix

    This note concerns the projective contraction coefficient τ(H) of a rectangular matrix H with positive entries. A simple proof of an explicit formula for τ(H), originally established by Birkhoff [Trans. Am. Math. Soc. 85 (1957) 219], is given. The motivation for this work comes from the area of Markov decision processes, and the argument is based on elementary differential calculus.
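
    For reference, the explicit formula in question, in the form usually attributed to Birkhoff (the notation below is ours, not necessarily the paper's): for a matrix H with positive entries,

        \tau(H) = \frac{1 - \sqrt{\phi(H)}}{1 + \sqrt{\phi(H)}}, \qquad
        \phi(H) = \min_{i,j,k,l} \frac{H_{ik} H_{jl}}{H_{jk} H_{il}},

    so that \tau(H) < 1 precisely because \phi(H) > 0 when all entries of H are positive.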

    A characterization of the optimal risk-sensitive average cost in finite controlled Markov chains

    This work concerns controlled Markov chains with finite state and action spaces. The transition law satisfies the simultaneous Doeblin condition, and the performance of a control policy is measured by the (long-run) risk-sensitive average cost criterion associated with a positive, but otherwise arbitrary, risk-sensitivity coefficient. Within this context, the optimal risk-sensitive average cost is characterized via a minimization problem in a finite-dimensional Euclidean space. Comment: Published at http://dx.doi.org/10.1214/105051604000000585 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org).
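
    For orientation, the risk-sensitive average cost criterion referred to here typically takes the following form (notation is ours): for a risk-sensitivity coefficient \lambda > 0, cost function C, policy \pi, and initial state x,

        J(\lambda, \pi, x) = \limsup_{n \to \infty} \frac{1}{\lambda n} \log E_x^{\pi}\!\left[ e^{\lambda \sum_{t=0}^{n-1} C(X_t, A_t)} \right],

    and the optimal cost J^*(\lambda, x) = \inf_\pi J(\lambda, \pi, x) is the quantity the paper characterizes via a finite-dimensional minimization problem.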

    An optimality system for finite average Markov decision chains under risk-aversion

    This work concerns controlled Markov chains with finite state space and compact action sets. The decision maker is risk-averse with constant risk-sensitivity, and the performance of a control policy is measured by the long-run average cost criterion. Under standard continuity-compactness conditions, it is shown that the (possibly non-constant) optimal value function is characterized by a system of optimality equations from which an optimal stationary policy can be obtained. It is also shown that the optimal superior and inferior limit average cost functions coincide.
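
    A common way to write such an optimality equation, shown here under the simplifying assumption of a constant optimal average cost g (the paper treats the general, possibly non-constant case) and with risk-sensitivity coefficient \lambda > 0:

        e^{\lambda (g + h(x))} = \min_{a \in A(x)} \left[ e^{\lambda C(x,a)} \sum_{y} p(y \mid x, a)\, e^{\lambda h(y)} \right], \qquad x \in S,

    where an action attaining the minimum at each state defines an optimal stationary policy.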

    Risk-sensitive Markov stopping games with an absorbing state

    This work is concerned with discrete-time Markov stopping games with two players. At each decision time, player II can stop the game by paying a terminal reward to player I, or can let the system continue its evolution. In the latter case, player I applies an action affecting the transitions and entitling him to receive a running reward from player II. It is supposed that player I has a nonzero and constant risk-sensitivity coefficient, and that player II tries to minimize the utility of player I. The performance of a pair of decision strategies is measured by the risk-sensitive (expected) total reward of player I and, besides mild continuity-compactness conditions, the main structural assumption on the model is the existence of an absorbing state which is accessible from any starting point. In this context, it is shown that the value function of the game is characterized by an equilibrium equation, and the existence of a Nash equilibrium is established.
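
    In certainty-equivalent form, the equilibrium equation for such a game can be sketched as follows (our notation, not necessarily the paper's; \lambda \neq 0 is the risk-sensitivity coefficient, R the terminal reward, r the running reward):

        V(x) = \min\left\{ R(x),\; \max_{a \in A(x)} \frac{1}{\lambda} \log\left( e^{\lambda r(x,a)} \sum_{y} p(y \mid x, a)\, e^{\lambda V(y)} \right) \right\},

    with the outer minimum reflecting player II's option to stop and the inner maximum player I's choice of action.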

    Markov stopping games with an absorbing state and total reward criterion

    This work is concerned with discrete-time zero-sum games with Markov transitions on a denumerable space. At each decision time, player II can stop the system by paying a terminal reward to player I, or can let the system continue its evolution. If the system is not halted, player I selects an action which affects the transitions and receives a running reward from player II. Assuming the existence of an absorbing state which is accessible from any other state, the performance of a pair of decision strategies is measured by the total expected reward criterion. In this context, it is shown that the value function of the game is characterized by an equilibrium equation, and the existence of a Nash equilibrium is established.
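
    A sketch of the equilibrium equation in this risk-neutral total reward setting (our notation; \bar{x} denotes the absorbing state, at which rewards are assumed to vanish):

        V(\bar{x}) = 0, \qquad V(x) = \min\left\{ R(x),\; \sup_{a \in A(x)} \left[ r(x,a) + \sum_{y} p(y \mid x, a)\, V(y) \right] \right\} \quad \text{for } x \neq \bar{x},

    where R is the terminal reward paid by player II upon stopping and r the running reward received by player I.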

    The risk-sensitive Poisson equation for a communicating Markov chain on a denumerable state space

    This work concerns a discrete-time Markov chain with time-invariant transition mechanism and denumerable state space, which is endowed with a nonnegative cost function with finite support. The performance of the chain is measured by the (long-run) risk-sensitive average cost and, assuming that the state space is communicating, the existence of a solution to the risk-sensitive Poisson equation is established, a result that holds even for transient chains. Also, a sufficient criterion ensuring that the functional part of a solution is uniquely determined up to an additive constant is provided, and an example is given to show that the uniqueness result may fail when that criterion is not satisfied.
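
    The risk-sensitive Poisson equation referred to here is usually written in multiplicative form (notation ours): for risk-sensitivity coefficient \lambda > 0, cost function C, and transition probabilities p_{xy},

        e^{\lambda (g + h(x))} = e^{\lambda C(x)} \sum_{y} p_{xy}\, e^{\lambda h(y)}, \qquad x \in S,

    where g plays the role of the risk-sensitive average cost and h is the functional part whose uniqueness, up to an additive constant, the paper's criterion addresses.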

    Undiscounted Value Iteration in Stable Markov Decision Chains with Bounded Rewards

    This work considers denumerable state Markov decision chains endowed with a long-run expected average reward criterion and bounded rewards. Apart from standard continuity-compactness restrictions, it is supposed that the Lyapunov function condition for bounded rewards holds true; this assumption guarantees the existence of a (possibly) unbounded solution of the optimality equation yielding optimal stationary policies. In this context, it is shown that the relative value functions and differential rewards produced by the value iteration method converge pointwise to the solution of the optimality equation, and that it is possible to obtain a sequence of stationary policies whose limit points are optimal. These results extend those in [17], where it was assumed that the `first error function' is bounded, and in [6], where weaker convergence results were obtained assuming that under the action of an arbitrary stationary policy the state space is a communicating class.
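
    To illustrate the scheme being analyzed, here is a minimal sketch of undiscounted value iteration with relative value functions for a finite model (the paper's setting is denumerable; all names below are illustrative assumptions, not the paper's notation):

        import numpy as np

        def relative_value_iteration(P, r, ref=0, tol=1e-9, max_iter=100_000):
            """P: (n, m, n) transition probabilities, r: (n, m) bounded rewards,
            ref: reference state used to re-center the iterates."""
            n, m, _ = P.shape
            h = np.zeros(n)                      # relative value function h_k
            for _ in range(max_iter):
                Q = r + P @ h                    # Q(x, a) = r(x, a) + sum_y P(x, a, y) h(y)
                V = Q.max(axis=1)                # one dynamic-programming step
                g = V[ref]                       # gain estimate, since h(ref) = 0
                h_new = V - g                    # re-center at the reference state
                if np.max(np.abs(h_new - h)) < tol:
                    h = h_new
                    break
                h = h_new
            policy = (r + P @ h).argmax(axis=1)  # greedy stationary policy
            return g, h, policy

    Re-centering at a fixed reference state keeps the iterates bounded even though the undiscounted value functions themselves grow linearly in the horizon.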