Asymptotic Expansions for Stationary Distributions of Perturbed Semi-Markov Processes
New algorithms for computing asymptotic expansions for stationary
distributions of nonlinearly perturbed semi-Markov processes are presented. The
algorithms are based on special techniques of sequential phase space reduction,
which can be applied to processes with asymptotically coupled and uncoupled
finite phase spaces.
Comment: 83 page
Algorithms for CVaR Optimization in MDPs
In many sequential decision-making problems we may want to manage risk by
minimizing some measure of variability in costs in addition to minimizing a
standard criterion. Conditional value-at-risk (CVaR) is a relatively new risk
measure that addresses some of the shortcomings of the well-known
variance-related risk measures, and because of its computational efficiencies
has gained popularity in finance and operations research. In this paper, we
consider the mean-CVaR optimization problem in MDPs. We first derive a formula
for computing the gradient of this risk-sensitive objective function. We then
devise policy gradient and actor-critic algorithms that each use a specific
method to estimate this gradient and update the policy parameters in the
descent direction. We establish the convergence of our algorithms to locally
risk-sensitive optimal policies. Finally, we demonstrate the usefulness of our
algorithms in an optimal stopping problem.
Comment: Submitted to NIPS 1
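To make the risk measure in this abstract concrete, here is a minimal sketch of estimating CVaR from sampled costs. It is not the paper's policy gradient or actor-critic method. It only evaluates the standard Rockafellar-Uryasev form, CVaR_alpha(X) = min over nu of nu + E[(X - nu)^+] / (1 - alpha), at the empirical alpha-quantile. The function name `empirical_cvar` and the sample costs are hypothetical, chosen for illustration.

```python
import math


def empirical_cvar(costs, alpha):
    """Return (VaR, CVaR) of sampled costs at confidence level alpha in (0, 1).

    Assumption: costs are i.i.d. samples of a scalar cost (higher is worse).
    """
    if not costs:
        raise ValueError("costs must be non-empty")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")

    xs = sorted(costs)
    n = len(xs)

    # Empirical VaR: smallest sample whose empirical CDF value is >= alpha.
    k = max(math.ceil(alpha * n) - 1, 0)
    var = xs[k]

    # Rockafellar-Uryasev expression evaluated at nu = VaR.
    mean_excess = sum(max(x - var, 0.0) for x in xs) / n
    cvar = var + mean_excess / (1.0 - alpha)
    return var, cvar


if __name__ == "__main__":
    sample_costs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # hypothetical data
    var, cvar = empirical_cvar(sample_costs, alpha=0.5)
    print(f"VaR = {var}, CVaR = {cvar}")
```

With these ten equally weighted samples and alpha = 0.5, the CVaR estimate equals the average of the worst half of the costs. That is the tail quantity a mean-CVaR objective penalizes alongside the expected cost.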