Rat Races and Glass Ceilings: Career Paths in Organizations.

Authors: Bardsley P., Sherstyuk K.

In an ongoing organization, such as a large law partnership firm, employees are motivated not only by current rewards but also by the prospect of promotion, and the opportunity to influence policy and make the rules in the future. This leads to a dynamic programming problem in contract design. We model career design in such a firm as a recursive mechanism design problem in an overlapping generations environment.

Keywords: CONTRACTS; GENERATIONS; COSTS

Research Papers in Economics

Discounted continuous-time constrained Markov decision processes in Polish spaces

Authors: Guo Xianping, Song Xinyuan
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 30/12/2011

This paper is devoted to studying constrained continuous-time Markov decision processes (MDPs) in the class of randomized policies depending on state histories. The transition rates may be unbounded, the reward and costs are admitted to be unbounded from above and from below, and the state and action spaces are Polish spaces. The optimality criterion to be maximized is the expected discounted rewards, and the constraints can be imposed on the expected discounted costs. First, we give conditions for the nonexplosion of underlying processes and the finiteness of the expected discounted rewards/costs. Second, using a technique of occupation measures, we prove that the constrained optimality of continuous-time MDPs can be transformed to an equivalent (optimality) problem over a class of probability measures. Based on the equivalent problem and a so-called $\bar{w}$-weak convergence of probability measures developed in this paper, we show the existence of a constrained optimal policy. Third, by providing a linear programming formulation of the equivalent problem, we show the solvability of constrained optimal policies. Finally, we use two computable examples to illustrate our main results.

Comment: Published at http://dx.doi.org/10.1214/10-AAP749 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

Crossref

Smart Choices and the Selection Monad

Authors: Abadi Martin, Plotkin Gordon
Publication date: 10/12/2020

Describing systems in terms of choices and their resulting costs and rewards offers the promise of freeing algorithm designers and programmers from specifying how those choices should be made; in implementations, the choices can be realized by optimization techniques and, increasingly, by machine learning methods. We study this approach from a programming-language perspective. We define two small languages that support decision-making abstractions: one with choices and rewards, and the other additionally with probabilities. We give both operational and denotational semantics. In the case of the second language we consider three denotational semantics, with varying degrees of correlation between possible program values and expected rewards. The operational semantics combine the usual semantics of standard constructs with optimization over spaces of possible execution strategies. The denotational semantics, which are compositional and can also be viewed as an implementation by translation to a simpler language, rely on the selection monad, to handle choice, combined with an auxiliary monad, to handle other effects such as rewards or probability. We establish adequacy theorems that the two semantics coincide in all cases. We also prove full abstraction at ground types, with varying notions of observation in the probabilistic case corresponding to the various degrees of correlation. We present axioms for choice combined with rewards and probability, establishing completeness at ground types for the case of rewards without probability
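As a rough illustration of the idea in this abstract, the following sketch models a selection function as a Python function that takes a reward function and returns a chosen value, and sequences two choices with a monadic bind. It covers only the plain selection monad over a finite option list; the auxiliary monad for rewards or probability is not modeled, and the names `argmax_over`, `unit`, `bind`, `plan`, and `reward` are hypothetical, not taken from the paper.

```python
from itertools import product


def argmax_over(options):
    """Selection function: given a reward function, pick the best option."""
    def select(payoff):
        return max(options, key=payoff)
    return select


def unit(value):
    """Selection function that ignores the reward and returns a fixed value."""
    return lambda payoff: value


def bind(selection, continuation):
    """Sequence a choice with the rest of the program.

    Each candidate is judged by the reward of the final outcome that the
    continuation would produce from it, which amounts to backward induction.
    """
    def select(payoff):
        def outcome(candidate):
            return continuation(candidate)(payoff)
        best = selection(lambda candidate: payoff(outcome(candidate)))
        return outcome(best)
    return select


# Hypothetical program: choose x, then y, and return the pair.
XS = [0, 1, 2]
YS = [0, 1, 2]
plan = bind(argmax_over(XS),
            lambda x: bind(argmax_over(YS),
                           lambda y: unit((x, y))))


def reward(pair):
    """Illustrative reward function, assumed for this example only."""
    x, y = pair
    return x * y - (x - 1) ** 2 - (y - 2) ** 2


chosen = plan(reward)
brute_force = max(product(XS, YS), key=reward)
print(chosen, brute_force)
```

The program `plan` only states which choices exist; how they are resolved is decided when a reward function is supplied, here by exhaustive maximization.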

arXiv.org e-Print Archive

Episciences.org

Mean-Variance Optimization in Markov Decision Processes

Authors: Mannor Shie, Tsitsiklis John
Publication date: 29/04/2011

We consider finite horizon Markov decision processes under performance measures that involve both the mean and the variance of the cumulative reward. We show that either randomized or history-based policies can improve performance. We prove that the complexity of computing a policy that maximizes the mean reward under a variance constraint is NP-hard for some cases, and strongly NP-hard for others. We finally offer pseudopolynomial exact and approximation algorithms.

Comment: A full version of an ICML 2011 paper
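The claim that randomization can help under a variance constraint can be checked on a toy one-step problem. The sketch below is an assumed example, not one from the paper: a safe action pays 0 for certain, a risky action pays 0 or 4 with equal probability, and the variance cap of 2 is chosen for illustration. The function names are hypothetical.

```python
def mean_and_variance(q):
    """Mean and variance of the reward when the risky action is played
    with probability q and the safe action (reward 0) otherwise."""
    p_high = q * 0.5              # probability of receiving reward 4
    mean = 4.0 * p_high
    second_moment = 16.0 * p_high
    return mean, second_moment - mean ** 2


def best_mixture(variance_cap, steps=1000):
    """Grid search over q for the highest mean that respects the cap."""
    best_q, best_mean = 0.0, 0.0
    for i in range(steps + 1):
        q = i / steps
        mean, variance = mean_and_variance(q)
        if variance <= variance_cap and mean > best_mean:
            best_q, best_mean = q, mean
    return best_q, best_mean


VARIANCE_CAP = 2.0  # assumed constraint for this illustration
best_q, best_mean = best_mixture(VARIANCE_CAP)
print(best_q, best_mean)
```

In this example the only feasible deterministic choice is the safe action, with mean 0, because the risky action alone has variance 4. Mixing in the risky action with a probability a little under 0.3 keeps the variance within the cap and gives a strictly positive mean.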

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT