Rat Races and Glass Ceilings: Career Paths in Organizations.
In an ongoing organization, such as a large law partnership firm, employees are motivated not only by current rewards but also by the prospect of promotion and the opportunity to influence policy and make the rules in the future. This leads to a dynamic programming problem in contract design. We model career design in such a firm as a recursive mechanism design problem in an overlapping generations environment.
Keywords: CONTRACTS; GENERATIONS; COSTS
Discounted continuous-time constrained Markov decision processes in Polish spaces
This paper is devoted to studying constrained continuous-time Markov decision
processes (MDPs) in the class of randomized policies depending on state
histories. The transition rates may be unbounded, the reward and costs are
admitted to be unbounded from above and from below, and the state and action
spaces are Polish spaces. The optimality criterion to be maximized is the
expected discounted rewards, and the constraints can be imposed on the expected
discounted costs. First, we give conditions for the nonexplosion of underlying
processes and the finiteness of the expected discounted rewards/costs. Second,
using a technique of occupation measures, we prove that the constrained
optimality of continuous-time MDPs can be transformed to an equivalent
(optimality) problem over a class of probability measures. Based on the
equivalent problem and a so-called $\bar{w}$-weak convergence of probability
measures developed in this paper, we show the existence of a constrained
optimal policy. Third, by providing a linear programming formulation of the
equivalent problem, we show the solvability of constrained optimal policies.
Finally, we use two computable examples to illustrate our main results.
Comment: Published at http://dx.doi.org/10.1214/10-AAP749 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org
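The occupation-measure and linear-programming idea in the abstract above can be illustrated on a deliberately degenerate case. The sketch below is not the paper's construction: it assumes a single state, two actions, constant reward and cost rates, and a discount rate alpha, so the discounted occupation measure of any policy is a nonnegative vector over actions with total mass 1/alpha. The function name, the action labels, and all numbers are hypothetical.

```python
def solve_single_state_cmdp(alpha, r, c, budget):
    """Toy constrained discounted problem with one state and two actions.

    eta[i] is the expected discounted time spent on action i, so
    eta[0] + eta[1] = 1 / alpha. We maximize r . eta subject to
    c . eta <= budget. With two actions this is a one-dimensional
    linear program in t = eta[0]. Returns None if infeasible.
    """
    total = 1.0 / alpha
    lo, hi = 0.0, total
    # Cost constraint rewritten as: (c[0] - c[1]) * t <= budget - c[1] * total
    slope = c[0] - c[1]
    rhs = budget - c[1] * total
    if slope > 0:
        hi = min(hi, rhs / slope)
    elif slope < 0:
        lo = max(lo, rhs / slope)
    elif rhs < 0:
        return None
    if lo > hi:
        return None
    # The objective is linear in t, so an endpoint of [lo, hi] is optimal.
    t = hi if r[0] >= r[1] else lo
    eta = (t, total - t)
    value = r[0] * eta[0] + r[1] * eta[1]
    # Normalizing the occupation measure gives a stationary randomized policy.
    policy = (eta[0] / total, eta[1] / total)
    return eta, policy, value


# Action 0 ("fast"): reward rate 3, cost rate 2. Action 1 ("slow"): reward rate 1, cost rate 0.
eta, policy, value = solve_single_state_cmdp(alpha=0.5, r=(3.0, 1.0), c=(2.0, 0.0), budget=1.0)
print(eta, policy, value)
```

In this toy instance neither pure action is optimal: always choosing "fast" violates the cost budget, always choosing "slow" is feasible but earns less, and the linear program returns a randomized mixture.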
Smart Choices and the Selection Monad
Describing systems in terms of choices and their resulting costs and rewards
offers the promise of freeing algorithm designers and programmers from
specifying how those choices should be made; in implementations, the choices
can be realized by optimization techniques and, increasingly, by machine
learning methods. We study this approach from a programming-language
perspective. We define two small languages that support decision-making
abstractions: one with choices and rewards, and the other additionally with
probabilities. We give both operational and denotational semantics.
In the case of the second language we consider three denotational semantics,
with varying degrees of correlation between possible program values and
expected rewards. The operational semantics combine the usual semantics of
standard constructs with optimization over spaces of possible execution
strategies.
The denotational semantics, which are compositional and can also be viewed as
an implementation by translation to a simpler language, rely on the selection
monad, to handle choice, combined with an auxiliary monad, to handle other
effects such as rewards or probability.
We establish adequacy theorems showing that the operational and denotational
semantics coincide in all cases.
We also prove full abstraction at ground types, with varying notions of
observation in the probabilistic case corresponding to the various degrees of
correlation. We present axioms for choice combined with rewards and
probability, establishing completeness at ground types for the case of rewards
without probability.
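For readers unfamiliar with the selection monad mentioned in the abstract above, here is a minimal Python sketch, assuming finite option lists and a deterministic reward function. A selection function takes a reward function and returns a chosen value; `bind` sequences choices so that an earlier choice is evaluated through the later ones. The names `argmax_over`, `unit`, `bind`, and the example reward are illustrative assumptions and are not taken from the paper's languages.

```python
def argmax_over(options):
    """Selection function that picks the option with the highest reward."""
    def select(reward):
        return max(options, key=reward)
    return select


def unit(x):
    """Selection function that ignores the reward and returns x."""
    return lambda reward: x


def bind(sel, f):
    """Sequence a selection with a continuation f: value -> selection."""
    def select(reward):
        # Score each first choice x by the reward of what the
        # continuation f(x) would go on to select.
        def through(x):
            return reward(f(x)(reward))
        x = sel(through)
        return f(x)(reward)
    return select


def reward(xy):
    x, y = xy
    # Hypothetical reward: prefers x near 2 and y close to x.
    return -(x - 2) ** 2 - 3 * (y - x) ** 2


# Choose x from {0, 1, 2}, then y from {0, 1}; no search strategy is written out.
plan = bind(argmax_over([0, 1, 2]),
            lambda x: bind(argmax_over([0, 1]),
                           lambda y: unit((x, y))))

best = plan(reward)
print(best, reward(best))
```

The program only states the choices and the reward. Taken alone, x = 2 looks best, but because y is limited to {0, 1} the composed selection settles on x = 1, which is the kind of optimization over execution strategies the abstract describes.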
Mean-Variance Optimization in Markov Decision Processes
We consider finite horizon Markov decision processes under performance
measures that involve both the mean and the variance of the cumulative reward.
We show that either randomized or history-based policies can improve
performance. We prove that computing a policy that maximizes the mean reward
under a variance constraint is NP-hard for some cases, and strongly NP-hard
for others. Finally, we offer pseudopolynomial exact and approximation
algorithms.
Comment: A full version of an ICML 2011 paper
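The claim that randomized policies can improve performance can be checked on a toy case. The sketch below is a hypothetical one-step example, not one from the paper: a safe action pays 0 for certain, a risky action pays 0 or 4 with equal probability, and the variance of the reward is capped at 1. The grid search over mixing probabilities is only an illustration, not one of the paper's algorithms.

```python
def mean_and_variance(outcomes):
    """outcomes is a list of (probability, reward) pairs."""
    mean = sum(p * r for p, r in outcomes)
    second_moment = sum(p * r * r for p, r in outcomes)
    return mean, second_moment - mean * mean


SAFE = [(1.0, 0.0)]               # reward 0 with certainty
RISKY = [(0.5, 0.0), (0.5, 4.0)]  # mean 2, variance 4


def mixed_policy(p_risky):
    """Reward distribution when the risky action is chosen with probability p_risky."""
    return ([((1 - p_risky) * p, r) for p, r in SAFE]
            + [(p_risky * p, r) for p, r in RISKY])


def best_feasible(probabilities, variance_cap):
    """Return (mean, p_risky) with the highest mean among policies meeting the cap."""
    best = None
    for p in probabilities:
        mean, var = mean_and_variance(mixed_policy(p))
        if var <= variance_cap and (best is None or mean > best[0]):
            best = (mean, p)
    return best


CAP = 1.0
deterministic = best_feasible([0.0, 1.0], CAP)
randomized = best_feasible([k / 1000 for k in range(1001)], CAP)
print(deterministic, randomized)
```

Under the cap, the only feasible deterministic policy is the safe action with mean 0, while a policy that picks the risky action with a small probability stays within the variance cap and earns a strictly positive mean.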