Trading Safety Versus Performance: Rapid Deployment of Robotic Swarms with Robust Performance Constraints
In this paper we consider a stochastic deployment problem, where a robotic
swarm is tasked with the objective of positioning at least one robot at each of
a set of pre-assigned targets while meeting a temporal deadline. Travel times
and failure rates are stochastic but related, inasmuch as failure rates
increase with speed. To maximize chances of success while meeting the deadline,
a control strategy therefore has to balance safety and performance. Our
approach is to cast the problem within the theory of constrained Markov
Decision Processes, whereby we seek to compute policies that maximize the
probability of successful deployment while ensuring that the expected duration
of the task is bounded by a given deadline. To account for uncertainties in the
problem parameters, we consider a robust formulation and we propose efficient
solution algorithms, which are of independent interest. Numerical experiments
confirming our theoretical results are presented and discussed.
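The safety-versus-performance tradeoff can be illustrated with a minimal sketch: one robot, one target, a slow-but-safe action and a fast-but-risky one, and a randomized mixture chosen so the expected duration meets the deadline. All numbers are hypothetical and this is not the paper's algorithm, only the basic tension it formalizes:

```python
def optimal_mixture(p_slow, t_slow, p_fast, t_fast, deadline):
    """Best randomization between a slow-but-safe action (success prob
    p_slow, travel time t_slow) and a fast-but-risky one, so that the
    expected duration meets the deadline while success is maximized."""
    if t_slow <= deadline:
        q = 0.0                     # the safe action alone meets the deadline
    elif t_fast > deadline:
        raise ValueError("deadline infeasible even at full speed")
    else:
        # success is decreasing in q (since p_fast < p_slow), so pick the
        # smallest probability of the fast action that meets the deadline
        q = (t_slow - deadline) / (t_slow - t_fast)
    success = q * p_fast + (1 - q) * p_slow
    exp_time = q * t_fast + (1 - q) * t_slow
    return q, success, exp_time

q, succ, t = optimal_mixture(p_slow=0.99, t_slow=10.0,
                             p_fast=0.90, t_fast=4.0, deadline=7.0)
# q = 0.5: go fast half the time; success drops from 0.99 to 0.945
```

Tightening the deadline forces more weight on the fast action, lowering the success probability, which is exactly the balance the constrained-MDP formulation trades off.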
Model uncertainty and monetary policy
Model uncertainty has the potential to substantially change how monetary policy should be conducted, making it an issue that central banks cannot ignore. In this paper, I use a standard new Keynesian business cycle model to analyze the behavior of a central bank that conducts policy with discretion while fearing that its model is misspecified. I begin by showing how to solve linear-quadratic robust Markov-perfect Stackelberg problems where the leader fears that private agents form expectations using the misspecified model. Next, I exploit the connection between robust control and uncertainty aversion to present and interpret my results in terms of the distorted beliefs held by the central bank, households, and firms. My main results are as follows. First, the central bank's pessimism leads it to forecast future outcomes using an expectations operator that, relative to rational expectations, assigns greater probability to extreme inflation and consumption outcomes. Second, the central bank's skepticism about its model causes it to move forcefully to stabilize inflation following shocks. Finally, even in the absence of misspecification, policy loss can be reduced if the central bank implements a robust policy.
Joint Design and Separation Principle for Opportunistic Spectrum Access in the Presence of Sensing Errors
We address the design of opportunistic spectrum access (OSA) strategies that
allow secondary users to independently search for and exploit instantaneous
spectrum availability. Integrated in the joint design are three basic
components: a spectrum sensor that identifies spectrum opportunities, a sensing
strategy that determines which channels in the spectrum to sense, and an access
strategy that decides whether to access based on imperfect sensing outcomes.
We formulate the joint PHY-MAC design of OSA as a constrained partially
observable Markov decision process (POMDP). Constrained POMDPs generally
require randomized policies to achieve optimality, which are often intractable.
By exploiting the rich structure of the underlying problem, we establish a
separation principle for the joint design of OSA. This separation principle
reveals the optimality of myopic policies for the design of the spectrum sensor
and the access strategy, leading to closed-form optimal solutions. Furthermore,
decoupling the design of the sensing strategy from that of the spectrum sensor
and the access strategy, the separation principle reduces the constrained POMDP
to an unconstrained one, which admits deterministic optimal policies. Numerical
examples are provided to study the design tradeoffs, the interaction between
the spectrum sensor and the sensing and access strategies, and the robustness
of the ensuing design to model mismatch.
Comment: 43 pages, 10 figures, submitted to IEEE Transactions on Information Theory in Feb. 200
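The flavor of the myopic sensing component can be conveyed with a toy simulation: two independent Gilbert-Elliott channels, a noisy sensor, and a rule that senses the channel currently believed most likely to be idle. This is a simplified sketch, not the paper's full PHY-MAC design; the channel parameters and the symmetric sensor-error model are assumptions, and collisions (accessing a busy channel on a false sensing outcome) are simply not counted as throughput:

```python
import random

def bayes_update(b, observed_idle, eps):
    """Posterior probability the sensed channel is idle, given a sensor
    with symmetric error probability eps."""
    like_idle = (1 - eps) if observed_idle else eps
    like_busy = eps if observed_idle else (1 - eps)
    return b * like_idle / (b * like_idle + (1 - b) * like_busy)

def run(T=5000, p11=0.8, p01=0.2, eps=0.1, seed=0):
    """Each slot: myopically sense the channel with the highest belief of
    being idle, access only when the (noisy) sensor reports idle."""
    rng = random.Random(seed)
    state = [True, False]        # True = idle (a spectrum opportunity)
    belief = [0.5, 0.5]          # probability each channel is idle
    throughput = 0
    for _ in range(T):
        # each channel evolves as a two-state Markov chain
        state = [rng.random() < (p11 if s else p01) for s in state]
        k = 0 if belief[0] >= belief[1] else 1    # myopic sensing choice
        observed_idle = state[k] if rng.random() >= eps else not state[k]
        if observed_idle and state[k]:
            throughput += 1                       # successful access
        belief[k] = bayes_update(belief[k], observed_idle, eps)
        # one-step prediction for both channels
        belief = [b * p11 + (1 - b) * p01 for b in belief]
    return throughput / T

rate = run()   # fraction of slots with a successful transmission
```

With positively correlated channels (p11 > p01), tracking beliefs and sensing the most promising channel beats sensing at random, which is the intuition behind the optimality of the myopic sensing strategy.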
Risk Aversion in Finite Markov Decision Processes Using Total Cost Criteria and Average Value at Risk
In this paper we present an algorithm to compute risk averse policies in
Markov Decision Processes (MDP) when the total cost criterion is used together
with the average value at risk (AVaR) metric. Risk averse policies are needed
when large deviations from the expected behavior may have detrimental effects,
and conventional MDP algorithms usually ignore this aspect. We provide
conditions for the structure of the underlying MDP ensuring that approximations
for the exact problem can be derived and solved efficiently. Our findings are
novel inasmuch as average value at risk has not previously been considered in
association with the total cost criterion. Our method is demonstrated in a
rapid deployment scenario, whereby a robot is tasked with the objective of
reaching a target location within a temporal deadline where increased speed is
associated with increased probability of failure. We demonstrate that the
proposed algorithm not only produces a risk averse policy reducing the
probability of exceeding the expected temporal deadline, but also provides the
statistical distribution of costs, thus offering a valuable analysis tool.
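For intuition, the average value at risk (also known as CVaR) at level alpha is the mean of the worst (1 - alpha) fraction of outcomes, so it penalizes heavy tails that the plain expectation hides. A minimal empirical estimator, as an illustrative sketch rather than the paper's MDP algorithm (the sample costs are hypothetical):

```python
import math

def average_value_at_risk(costs, alpha):
    """Empirical AVaR/CVaR at level alpha: the mean of the worst
    (1 - alpha) fraction of sampled costs."""
    xs = sorted(costs)
    n = len(xs)
    k = math.ceil(alpha * n)          # index of the alpha-quantile (VaR)
    tail = xs[k:] if k < n else [xs[-1]]
    return sum(tail) / len(tail)

costs = [1, 2, 2, 3, 3, 4, 8, 15, 20, 40]
# mean cost is 9.8, but the worst 10% of outcomes average 40:
avar = average_value_at_risk(costs, alpha=0.9)
```

A risk-averse policy optimizing AVaR would trade a slightly worse mean for a much lighter tail, which is the point of pairing it with the total cost criterion.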
Discounted continuous-time constrained Markov decision processes in Polish spaces
This paper is devoted to studying constrained continuous-time Markov decision
processes (MDPs) in the class of randomized policies depending on state
histories. The transition rates may be unbounded, the reward and costs are
admitted to be unbounded from above and from below, and the state and action
spaces are Polish spaces. The optimality criterion to be maximized is the
expected discounted rewards, and the constraints can be imposed on the expected
discounted costs. First, we give conditions for the nonexplosion of underlying
processes and the finiteness of the expected discounted rewards/costs. Second,
using a technique of occupation measures, we prove that the constrained
optimality of continuous-time MDPs can be transformed to an equivalent
(optimality) problem over a class of probability measures. Based on the
equivalent problem and a so-called -weak convergence of probability
measures developed in this paper, we show the existence of a constrained
optimal policy. Third, by providing a linear programming formulation of the
equivalent problem, we show the solvability of constrained optimal policies.
Finally, we use two computable examples to illustrate our main results.
Comment: Published at http://dx.doi.org/10.1214/10-AAP749 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org)
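The occupation-measure/linear-programming reduction can be sketched in a finite-state, discrete-time analogue (not the paper's continuous-time Polish-space setting; the two-state example, transition structure, and cost budget are all hypothetical). The decision variables are the discounted occupation measures x(s, a), the equality constraints are flow-balance equations, and the cost constraint becomes a single linear inequality:

```python
import numpy as np
from scipy.optimize import linprog

# Two states (0 = start, 1 = absorbing target), two actions
# (0 = wait, 1 = move, which reaches the target but incurs cost 1).
gamma = 0.9
# variables x = [x(0,0), x(0,1), x(1,0), x(1,1)]  (occupation measures)
r = np.array([0.0, 0.0, 1.0, 1.0])   # reward 1 per step at the target
c = np.array([0.0, 1.0, 0.0, 0.0])   # only the "move" action is costly
# balance: sum_a x(s',a) - gamma * expected inflow to s' = initial mass
A_eq = np.array([
    [1 - gamma, 1.0,    0.0,       0.0      ],   # flow at state 0
    [0.0,      -gamma,  1 - gamma, 1 - gamma],   # flow at state 1
])
b_eq = np.array([1.0, 0.0])          # the process starts in state 0
res = linprog(-r,                    # linprog minimizes, so negate reward
              A_ub=[c], b_ub=[0.5],  # expected discounted cost <= 0.5
              A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 4)
x = res.x                            # optimal occupation measure
value = float(r @ x)                 # constrained optimal discounted reward
# the cost constraint binds (x[1] = 0.5), and the optimal stationary policy
# is randomized: in state 0, move with probability x[1] / (x[0] + x[1])
```

That the optimizer lands on a randomized policy when the constraint binds mirrors why constrained problems are posed over measures rather than deterministic policies.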