Randomized and Relaxed Strategies in Continuous-Time Markov Decision Processes.
One of the goals of this article is to describe a wide class of control strategies, which includes the traditional relaxed strategies as well as the so-called randomized strategies, which previously appeared only in the framework of semi-Markov decision processes. If the objective is the total expected cost up to the accumulation of jumps, then without loss of generality one can consider only Markov relaxed strategies. Under a simple condition, the Markov randomized strategies are also sufficient. An example shows that this condition is important. Finally, without any conditions, the class of so-called Poisson-related strategies is also sufficient in the optimization problems. All the results are applicable to the discounted model; they may also be useful for the case of long-run average cost. Read More: https://epubs.siam.org/doi/10.1137/15M101401
On gradual-impulse control of continuous-time Markov decision processes with multiplicative cost
In this paper, we consider the gradual-impulse control problem of
continuous-time Markov decision processes, where the system performance is
measured by the expectation of the exponential utility of the total cost. We
prove, under very general conditions on the system primitives, the existence of
a deterministic stationary optimal policy out of a more general class of
policies. Policies that we consider allow multiple simultaneous impulses,
randomized selection of impulses with random effects, relaxed gradual controls,
and accumulation of jumps. After characterizing the value function using the
optimality equation, we reduce the continuous-time gradual-impulse control
problem to an equivalent simple discrete-time Markov decision process, whose
action space is the union of the sets of gradual and impulsive actions.
Ergodic Control and Polyhedral approaches to PageRank Optimization
We study a general class of PageRank optimization problems which consist in
finding an optimal outlink strategy for a web site subject to design
constraints. We consider both a continuous problem, in which one can choose the
intensity of a link, and a discrete one, in which in each page, there are
obligatory links, facultative links and forbidden links. We show that the
continuous problem, as well as its discrete variant when there are no
constraints coupling different pages, can both be modeled by constrained Markov
decision processes with ergodic reward, in which the webmaster determines the
transition probabilities of websurfers. Although the number of actions turns
out to be exponential, we show that an associated polytope of transition
measures has a concise representation, from which we deduce that the continuous
problem is solvable in polynomial time, and that the same is true for the
discrete problem when there are no coupling constraints. We also provide
efficient algorithms, adapted to very large networks. Then, we investigate the
qualitative features of optimal outlink strategies, and identify in particular
assumptions under which there exists a "master" page to which all controlled
pages should point. We report numerical results on fragments of the real web
graph.
Comment: 39 pages
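To make the effect of outlink choices concrete, here is a minimal sketch that compares two link configurations by brute force using plain power iteration. It is not the constrained Markov decision process or polytope method of the paper. The three-page graph, the page names, the damping factor 0.85, and the function name `pagerank` are all assumptions chosen for illustration.

```python
def pagerank(out_links, damping=0.85, iters=200):
    """Power iteration for PageRank on a small graph given as {page: [targets]}."""
    pages = sorted(out_links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Teleportation term: every page receives (1 - damping) / n.
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = out_links[p]
            if targets:
                # A page splits its damped rank evenly among its outlinks.
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new[q] += share
            else:
                # Dangling page: spread its damped rank uniformly.
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank


# Hypothetical site {a, b} plus one external page d that links back to a.
# The link a -> b is treated as obligatory, the link a -> d as facultative.
without_optional = {"a": ["b"], "b": ["a"], "d": ["a"]}
with_optional = {"a": ["b", "d"], "b": ["a"], "d": ["a"]}

r_without = pagerank(without_optional)
r_with = pagerank(with_optional)

# Total PageRank of the controlled site under each outlink choice.
site_without = r_without["a"] + r_without["b"]
site_with = r_with["a"] + r_with["b"]
```

In this toy graph, adding the facultative link to the external page lowers the total rank of the site. Enumerating configurations this way scales exponentially with the number of facultative links, which is the difficulty the abstract's polynomial-time result addresses.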
Dynamic Mechanism Design: Incentive Compatibility, Profit Maximization and Information Disclosure
This paper examines the problem of how to design incentive-compatible mechanisms in environments in which the agents' private information evolves stochastically over time and in which decisions have to be made in each period. The environments we consider are fairly general in that the agents' types are allowed to evolve in a non-Markov way, decisions are allowed to affect the type distributions and payoffs are not restricted to be separable over time. Our first result is the characterization of a dynamic payoff formula that describes the evolution of the agents' equilibrium payoffs in an incentive-compatible mechanism. The formula summarizes all local first-order conditions taking into account how current information affects the dynamics of expected payoffs. The formula generalizes the familiar envelope condition from static mechanism design: the key difference is that a variation in the current types now impacts payoffs in all subsequent periods both directly and through the effect on the distributions of future types. First, we identify assumptions on the primitive environment that guarantee that our dynamic payoff formula is a necessary condition for incentive compatibility. Next, we specialize this formula to quasi-linear environments and show how it permits one to establish a dynamic "revenue-equivalence" result and to construct a formula for dynamic virtual surplus which is instrumental for the design of optimal mechanisms. We then turn to the characterization of sufficient conditions for incentive compatibility. Lastly, we show how our results can be put to work in a variety of applications that include the design of profit-maximizing dynamic auctions with AR(k) values and the provision of experience goods.
Keywords: dynamic mechanisms, asymmetric information, stochastic processes, incentives
Beauty Contests and "Irrational Exuberance": A Neoclassical Approach
The arrival of new, unfamiliar investment opportunities is often associated with "exuberant" movements in asset prices and real economic activity. During these episodes of high uncertainty, financial markets look at the real sector for signals about the profitability of the new investment opportunities, and vice versa. In this paper, we study how such information spillovers impact the incentives that agents face when making their real economic decisions. On the positive front, we find that the sensitivity of equilibrium outcomes to noise and to higher-order uncertainty is amplified, exacerbating the disconnect from fundamentals. On the normative front, we find that these effects are symptoms of constrained inefficiency; we then identify policies that can improve welfare without requiring the government to have any informational advantage vis-a-vis the market. At the heart of these results is a distortion that induces a conventional neoclassical economy to behave as a Keynesian "beauty contest" and to exhibit fluctuations that may look like "irrational exuberance" to an outside observer.
Keywords: incomplete information, beauty contests, exuberance, informational frictions, endogenous complementarities
JEL Classification Numbers: C72, D62, D82
On reducing a constrained gradual-impulsive control problem for a jump Markov model to a model with gradual control only
In this paper we consider a gradual-impulsive control problem for continuous-time Markov decision processes (CTMDPs) with total cost criteria and constraints. We develop a simple and useful method, which reduces the concerned problem to a standard CTMDP problem with gradual control only. This allows us to derive straightforwardly and under a minimal set of conditions the optimality results (sufficient classes of control policies, as well as the existence of stationary optimal policies) for the original constrained gradual-impulsive control problem.
A Linear Programming Approach to Sequential Hypothesis Testing
Under some mild Markov assumptions it is shown that the problem of designing
optimal sequential tests for two simple hypotheses can be formulated as a
linear program. The result is derived by investigating the Lagrangian dual of
the sequential testing problem, which is an unconstrained optimal stopping
problem, depending on two unknown Lagrangian multipliers. It is shown that the
derivative of the optimal cost function with respect to these multipliers
coincides with the error probabilities of the corresponding sequential test.
This property is used to formulate an optimization problem that is jointly
linear in the cost function and the Lagrangian multipliers and can be solved for
both with off-the-shelf algorithms. To illustrate the procedure, optimal
sequential tests for Gaussian random sequences with different dependency
structures are derived, including the Gaussian AR(1) process.
Comment: 25 pages, 4 figures, accepted for publication in Sequential Analysis
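For readers unfamiliar with sequential tests of two simple hypotheses, the following is a minimal sketch of Wald's classical sequential probability ratio test for i.i.d. Gaussian samples. It is a swapped-in textbook technique, not the linear-programming design described in the abstract, and it does not cover dependent sequences such as AR(1). The function name `sprt`, the default means, variance, and error levels are all assumptions for illustration, and the thresholds use Wald's standard approximations.

```python
import math


def sprt(samples, mu0=0.0, mu1=1.0, sigma=1.0, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: mean = mu0 versus H1: mean = mu1 (i.i.d. Gaussian).

    Returns (decision, number_of_samples_used).
    """
    # Wald's approximate thresholds on the log-likelihood ratio.
    upper = math.log((1.0 - beta) / alpha)
    lower = math.log(beta / (1.0 - alpha))
    llr = 0.0
    for n, x in enumerate(samples, start=1):
        # Log-likelihood ratio increment for N(mu1, sigma^2) vs N(mu0, sigma^2).
        llr += (mu1 - mu0) * (x - 0.5 * (mu0 + mu1)) / sigma**2
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    # Data exhausted before either threshold was crossed.
    return "undecided", len(samples)


decision_high = sprt([1.0] * 20)
decision_low = sprt([0.0] * 20)
decision_mid = sprt([0.5] * 5)
```

The test stops as soon as the accumulated evidence crosses either threshold, so the sample size is random. The abstract's approach instead designs the stopping and decision rules by solving an optimization problem over the cost function and the Lagrangian multipliers.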