
    An alternative derivation of Birkhoff’s formula for the contraction coefficient of a positive matrix

    This note concerns the projective contraction coefficient τ(H) of a rectangular matrix H with positive entries. A simple proof of an explicit formula for τ(H), originally established by Birkhoff [Trans. Am. Math. Soc. 85 (1957) 219], is given. The motivation for this work comes from the area of Markov decision processes, and the argument is based on elementary differential calculus.
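
    For reference, the explicit formula in question, in the form usually attributed to Birkhoff (the notation below is ours, not necessarily the paper's): for a matrix H with positive entries,

        \tau(H) = \frac{1 - \sqrt{\phi(H)}}{1 + \sqrt{\phi(H)}}, \qquad
        \phi(H) = \min_{i,j,k,l} \frac{H_{ik} H_{jl}}{H_{jk} H_{il}},

    so that \tau(H) < 1 precisely because \phi(H) > 0 when all entries of H are positive.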

    A characterization of the optimal risk-sensitive average cost in finite controlled Markov chains

    This work concerns controlled Markov chains with finite state and action spaces. The transition law satisfies the simultaneous Doeblin condition, and the performance of a control policy is measured by the (long-run) risk-sensitive average cost criterion associated with a positive, but otherwise arbitrary, risk-sensitivity coefficient. Within this context, the optimal risk-sensitive average cost is characterized via a minimization problem in a finite-dimensional Euclidean space. Comment: Published at http://dx.doi.org/10.1214/105051604000000585 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org).
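
    For orientation, the risk-sensitive average cost criterion referred to here typically takes the following form (notation is ours): for a risk-sensitivity coefficient \lambda > 0, cost function C, policy \pi, and initial state x,

        J(\lambda, \pi, x) = \limsup_{n \to \infty} \frac{1}{\lambda n} \log E_x^{\pi}\!\left[ e^{\lambda \sum_{t=0}^{n-1} C(X_t, A_t)} \right],

    and the optimal cost J^*(\lambda, x) = \inf_\pi J(\lambda, \pi, x) is the quantity the paper characterizes via a finite-dimensional minimization problem.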

    An optimality system for finite average Markov decision chains under risk-aversion

    This work concerns controlled Markov chains with finite state space and compact action sets. The decision maker is risk-averse with constant risk-sensitivity, and the performance of a control policy is measured by the long-run average cost criterion. Under standard continuity-compactness conditions, it is shown that the (possibly non-constant) optimal value function is characterized by a system of optimality equations from which an optimal stationary policy can be obtained. It is also shown that the optimal superior and inferior limit average cost functions coincide.
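
    A common way to write such an optimality equation, shown here under the simplifying assumption of a constant optimal average cost g (the paper treats the general, possibly non-constant case) and with risk-sensitivity coefficient \lambda > 0:

        e^{\lambda (g + h(x))} = \min_{a \in A(x)} \left[ e^{\lambda C(x,a)} \sum_{y} p(y \mid x, a)\, e^{\lambda h(y)} \right], \qquad x \in S,

    where an action attaining the minimum at each state defines an optimal stationary policy.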

    Risk-sensitive Markov stopping games with an absorbing state

    This work is concerned with discrete-time Markov stopping games with two players. At each decision time, player II can stop the game by paying a terminal reward to player I, or can let the system continue its evolution. In the latter case, player I applies an action affecting the transitions and entitling him to receive a running reward from player II. It is supposed that player I has a nonzero and constant risk-sensitivity coefficient, and that player II tries to minimize the utility of player I. The performance of a pair of decision strategies is measured by the risk-sensitive (expected) total reward of player I and, besides mild continuity-compactness conditions, the main structural assumption on the model is the existence of an absorbing state which is accessible from any starting point. In this context, it is shown that the value function of the game is characterized by an equilibrium equation, and the existence of a Nash equilibrium is established.
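
    In certainty-equivalent form, the equilibrium equation for such a game can be sketched as follows (our notation, not necessarily the paper's; \lambda \neq 0 is the risk-sensitivity coefficient, R the terminal reward, r the running reward):

        V(x) = \min\left\{ R(x),\; \max_{a \in A(x)} \frac{1}{\lambda} \log\left( e^{\lambda r(x,a)} \sum_{y} p(y \mid x, a)\, e^{\lambda V(y)} \right) \right\},

    with the outer minimum reflecting player II's option to stop and the inner maximum player I's choice of action.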

    Markov stopping games with an absorbing state and total reward criterion

    This work is concerned with discrete-time zero-sum games with Markov transitions on a denumerable space. At each decision time, player II can stop the system by paying a terminal reward to player I, or can let the system continue its evolution. If the system is not halted, player I selects an action which affects the transitions and receives a running reward from player II. Assuming the existence of an absorbing state which is accessible from any other state, the performance of a pair of decision strategies is measured by the total expected reward criterion. In this context, it is shown that the value function of the game is characterized by an equilibrium equation, and the existence of a Nash equilibrium is established.
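
    A sketch of the equilibrium equation in this risk-neutral total reward setting (our notation; \bar{x} denotes the absorbing state, at which rewards are assumed to vanish):

        V(\bar{x}) = 0, \qquad V(x) = \min\left\{ R(x),\; \sup_{a \in A(x)} \left[ r(x,a) + \sum_{y} p(y \mid x, a)\, V(y) \right] \right\} \quad \text{for } x \neq \bar{x},

    where R is the terminal reward paid by player II upon stopping and r the running reward received by player I.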

    The risk-sensitive Poisson equation for a communicating Markov chain on a denumerable state space

    This work concerns a discrete-time Markov chain with time-invariant transition mechanism and denumerable state space, which is endowed with a nonnegative cost function with finite support. The performance of the chain is measured by the (long-run) risk-sensitive average cost and, assuming that the state space is communicating, the existence of a solution to the risk-sensitive Poisson equation is established, a result that holds even for transient chains. Also, a sufficient criterion ensuring that the functional part of a solution is uniquely determined up to an additive constant is provided, and an example is given to show that the uniqueness result may fail when that criterion is not satisfied.
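
    The risk-sensitive Poisson equation referred to here is usually written in multiplicative form (notation ours): for risk-sensitivity coefficient \lambda > 0, cost function C, and transition probabilities p_{xy},

        e^{\lambda (g + h(x))} = e^{\lambda C(x)} \sum_{y} p_{xy}\, e^{\lambda h(y)}, \qquad x \in S,

    where g plays the role of the risk-sensitive average cost and h is the functional part whose uniqueness, up to an additive constant, the paper's criterion addresses.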

    Undiscounted Value Iteration in Stable Markov Decision Chains with Bounded Rewards

    This work considers denumerable state Markov decision chains endowed with a long-run expected average reward criterion and bounded rewards. Apart from standard continuity-compactness restrictions, it is supposed that the Lyapunov function condition for bounded rewards holds true; this assumption guarantees the existence of a (possibly) unbounded solution of the optimality equation yielding optimal stationary policies. In this context, it is shown that the relative value functions and differential rewards produced by the value iteration method converge pointwise to the solution of the optimality equation, and that it is possible to obtain a sequence of stationary policies whose limit points are optimal. These results extend those in [17], where it was assumed that the `first error function' is bounded, and in [6], where weaker convergence results were obtained assuming that under the action of an arbitrary stationary policy the state space is a communicating class.
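
    To illustrate the scheme being analyzed, here is a minimal sketch of undiscounted value iteration with relative value functions for a finite model (the paper's setting is denumerable; all names below are illustrative assumptions, not the paper's notation):

        import numpy as np

        def relative_value_iteration(P, r, ref=0, tol=1e-9, max_iter=100_000):
            """P: (n, m, n) transition probabilities, r: (n, m) bounded rewards,
            ref: reference state used to re-center the iterates."""
            n, m, _ = P.shape
            h = np.zeros(n)                      # relative value function h_k
            for _ in range(max_iter):
                Q = r + P @ h                    # Q(x, a) = r(x, a) + sum_y P(x, a, y) h(y)
                V = Q.max(axis=1)                # one dynamic-programming step
                g = V[ref]                       # gain estimate, since h(ref) = 0
                h_new = V - g                    # re-center at the reference state
                if np.max(np.abs(h_new - h)) < tol:
                    h = h_new
                    break
                h = h_new
            policy = (r + P @ h).argmax(axis=1)  # greedy stationary policy
            return g, h, policy

    Re-centering at a fixed reference state keeps the iterates bounded even though the undiscounted value functions themselves grow linearly in the horizon.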