
    Trading Safety Versus Performance: Rapid Deployment of Robotic Swarms with Robust Performance Constraints

    In this paper we consider a stochastic deployment problem, where a robotic swarm is tasked with the objective of positioning at least one robot at each of a set of pre-assigned targets while meeting a temporal deadline. Travel times and failure rates are stochastic but related, inasmuch as failure rates increase with speed. To maximize the chances of success while meeting the deadline, a control strategy must therefore balance safety and performance. Our approach is to cast the problem within the theory of constrained Markov Decision Processes, whereby we seek to compute policies that maximize the probability of successful deployment while ensuring that the expected duration of the task is bounded by a given deadline. To account for uncertainties in the problem parameters, we consider a robust formulation and we propose efficient solution algorithms, which are of independent interest. Numerical experiments confirming our theoretical results are presented and discussed.
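
    The core trade-off is easy to see in miniature. Below is a minimal Python sketch, not the paper's algorithm: a single robot crosses N segments, choosing a slow or a fast traversal for each, where faster travel shortens the expected duration but raises the failure probability. All segment counts, times, and failure rates are invented for illustration; the paper replaces this brute-force enumeration with efficient constrained-MDP solution methods.

        # Toy safety-vs-performance trade-off: maximize the probability of
        # surviving all segments subject to an expected-duration deadline.
        from itertools import product

        N = 6                   # number of segments (assumed)
        DEADLINE = 9.0          # bound on expected duration (assumed)
        ACTIONS = {             # action -> (expected time, failure probability)
            "slow": (2.0, 0.01),
            "fast": (1.0, 0.05),
        }

        best_policy, best_success = None, -1.0
        for policy in product(ACTIONS, repeat=N):
            exp_time = sum(ACTIONS[a][0] for a in policy)
            if exp_time > DEADLINE:          # temporal constraint violated
                continue
            success = 1.0
            for a in policy:                 # survive each segment independently
                success *= 1.0 - ACTIONS[a][1]
            if success > best_success:
                best_policy, best_success = policy, success

        print(best_policy, round(best_success, 4))

    The optimizer goes fast on exactly as many segments as the deadline forces and no more, since every unnecessary fast traversal only adds risk.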

    Model uncertainty and monetary policy

    Model uncertainty has the potential to substantially change how monetary policy should be conducted, making it an issue that central banks cannot ignore. In this paper, I use a standard new Keynesian business cycle model to analyze the behavior of a central bank that conducts policy with discretion while fearing that its model is misspecified. I begin by showing how to solve linear-quadratic robust Markov-perfect Stackelberg problems where the leader fears that private agents form expectations using the misspecified model. Next, I exploit the connection between robust control and uncertainty aversion to present and interpret my results in terms of the distorted beliefs held by the central bank, households, and firms. My main results are as follows. First, the central bank's pessimism leads it to forecast future outcomes using an expectations operator that, relative to rational expectations, assigns greater probability to extreme inflation and consumption outcomes. Second, the central bank's skepticism about its model causes it to move forcefully to stabilize inflation following shocks. Finally, even in the absence of misspecification, policy loss can be reduced if the central bank implements a robust policy.
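
    For readers unfamiliar with the setup, robust linear-quadratic problems of this kind are typically posed in the Hansen-Sargent multiplier form sketched below. The notation is generic, not taken from the paper: the policymaker chooses controls $u_t$ while a fictitious adversary chooses distortions $w_{t+1}$ to the model's shocks,

        \min_{\{u_t\}} \; \max_{\{w_{t+1}\}} \;
        \mathbb{E}_0 \sum_{t=0}^{\infty} \beta^t
        \left( x_t' Q x_t + u_t' R u_t - \beta \theta\, w_{t+1}' w_{t+1} \right)
        \quad \text{s.t.} \quad
        x_{t+1} = A x_t + B u_t + C \left( \varepsilon_{t+1} + w_{t+1} \right),

    where $\theta > 0$ indexes the policymaker's fear of misspecification and $\theta \to \infty$ recovers the standard rational-expectations problem. The distorted expectations operator mentioned in the abstract is the one induced by the adversary's worst-case choice of $w_{t+1}$.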

    Joint Design and Separation Principle for Opportunistic Spectrum Access in the Presence of Sensing Errors

    We address the design of opportunistic spectrum access (OSA) strategies that allow secondary users to independently search for and exploit instantaneous spectrum availability. Integrated in the joint design are three basic components: a spectrum sensor that identifies spectrum opportunities, a sensing strategy that determines which channels in the spectrum to sense, and an access strategy that decides whether to access based on imperfect sensing outcomes. We formulate the joint PHY-MAC design of OSA as a constrained partially observable Markov decision process (POMDP). Constrained POMDPs generally require randomized policies to achieve optimality, which are often intractable. By exploiting the rich structure of the underlying problem, we establish a separation principle for the joint design of OSA. This separation principle reveals the optimality of myopic policies for the design of the spectrum sensor and the access strategy, leading to closed-form optimal solutions. Furthermore, decoupling the design of the sensing strategy from that of the spectrum sensor and the access strategy, the separation principle reduces the constrained POMDP to an unconstrained one, which admits deterministic optimal policies. Numerical examples are provided to study the design tradeoffs, the interaction between the spectrum sensor and the sensing and access strategies, and the robustness of the ensuing design to model mismatch.

    Comment: 43 pages, 10 figures, submitted to IEEE Transactions on Information Theory in Feb. 200
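
    In this literature, a common concrete instantiation models each channel as an independent two-state Markov chain (busy/idle) whose occupancy the secondary user tracks through a belief vector. The Python sketch below simulates that setting under invented parameters; it illustrates the myopic sense-then-access loop of the kind the separation principle justifies, not the paper's exact design.

        # Single-user opportunistic spectrum access over independent two-state
        # (busy/idle) Markov channels with an imperfect spectrum sensor.
        import random

        P11, P01 = 0.8, 0.2        # P(idle -> idle), P(busy -> idle)  (assumed)
        PFA, PMD = 0.1, 0.1        # false-alarm and miss-detection rates (assumed)
        beliefs = [0.5, 0.5, 0.5]  # P(channel is idle), one entry per channel
        states = [random.random() < b for b in beliefs]

        for slot in range(20):
            # Markov prediction step for every channel, sensed or not.
            beliefs = [b * P11 + (1 - b) * P01 for b in beliefs]
            states = [random.random() < (P11 if s else P01) for s in states]

            # Myopic sensing strategy: probe the channel most likely idle.
            k = max(range(len(beliefs)), key=lambda i: beliefs[i])
            observed_idle = (random.random() > PFA) if states[k] \
                            else (random.random() < PMD)

            # Myopic access strategy: transmit only on an "idle" observation.
            if observed_idle:
                print(f"slot {slot}: access channel {k},",
                      "success" if states[k] else "collision")

            # Bayesian correction of the sensed channel's belief.
            if observed_idle:
                num = beliefs[k] * (1 - PFA)
                den = num + (1 - beliefs[k]) * PMD
            else:
                num = beliefs[k] * PFA
                den = num + (1 - beliefs[k]) * (1 - PMD)
            beliefs[k] = num / den

    Collisions arise only through sensing errors, which is why sensor operating point, sensing strategy, and access strategy have to be designed jointly rather than in isolation.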

    Risk Aversion in Finite Markov Decision Processes Using Total Cost Criteria and Average Value at Risk

    In this paper we present an algorithm to compute risk-averse policies in Markov Decision Processes (MDP) when the total cost criterion is used together with the average value at risk (AVaR) metric. Risk-averse policies are needed when large deviations from the expected behavior may have detrimental effects, and conventional MDP algorithms usually ignore this aspect. We provide conditions on the structure of the underlying MDP ensuring that approximations for the exact problem can be derived and solved efficiently. Our findings are novel inasmuch as average value at risk has not previously been considered in association with the total cost criterion. Our method is demonstrated in a rapid deployment scenario, whereby a robot is tasked with the objective of reaching a target location within a temporal deadline where increased speed is associated with increased probability of failure. We demonstrate that the proposed algorithm not only produces a risk-averse policy reducing the probability of exceeding the temporal deadline, but also provides the statistical distribution of costs, thus offering a valuable analysis tool.
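
    Since the average value at risk is the quantity driving the algorithm, a small sketch of the metric itself may help: AVaR at level alpha is the expected cost over the worst (1 - alpha) fraction of outcomes. The Python snippet below computes it from a sampled cost distribution; the sample costs are made up for illustration, whereas in the paper the distribution arises from the total-cost MDP.

        # Average value at risk (AVaR / CVaR): mean of the worst (1 - alpha)
        # fraction of sampled total costs.
        def avar(costs, alpha):
            tail = sorted(costs, reverse=True)           # largest costs first
            k = max(1, round((1 - alpha) * len(costs)))  # size of the risk tail
            return sum(tail[:k]) / k

        costs = [3, 4, 4, 5, 5, 5, 6, 8, 12, 20]         # illustrative samples
        print("mean      :", sum(costs) / len(costs))    # risk-neutral criterion
        print("AVaR(0.9) :", avar(costs, 0.9))           # worst 10% of outcomes
        print("AVaR(0.7) :", avar(costs, 0.7))           # worst 30% of outcomes

    A risk-averse policy is one chosen to keep AVaR small, trading some average cost for a lighter tail, which is the effect demonstrated in the paper's deployment scenario.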

    Discounted continuous-time constrained Markov decision processes in Polish spaces

    This paper is devoted to studying constrained continuous-time Markov decision processes (MDPs) in the class of randomized policies depending on state histories. The transition rates may be unbounded, the rewards and costs may be unbounded from above and from below, and the state and action spaces are Polish spaces. The optimality criterion to be maximized is the expected discounted reward, and the constraints are imposed on the expected discounted costs. First, we give conditions for the nonexplosion of the underlying processes and the finiteness of the expected discounted rewards/costs. Second, using a technique of occupation measures, we prove that the constrained optimality problem for continuous-time MDPs can be transformed into an equivalent (optimality) problem over a class of probability measures. Based on the equivalent problem and a so-called $\bar{w}$-weak convergence of probability measures developed in this paper, we show the existence of a constrained optimal policy. Third, by providing a linear programming formulation of the equivalent problem, we show the solvability of constrained optimal policies. Finally, we use two computable examples to illustrate our main results.

    Comment: Published at http://dx.doi.org/10.1214/10-AAP749 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org)
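
    To convey the flavor of the occupation-measure approach, here is the finite, discrete-time analogue of the linear program; the paper's contribution is establishing the corresponding theory in continuous time on Polish spaces, so this is only a caricature. With discount factor $\beta$, initial distribution $\nu$, reward $r$, cost $c$, and cost budget $d$, one optimizes over discounted state-action occupation measures $\mu$:

        \begin{aligned}
        \max_{\mu \ge 0} \quad & \sum_{x,a} r(x,a)\, \mu(x,a) \\
        \text{s.t.} \quad & \sum_{a} \mu(y,a) - \beta \sum_{x,a} P(y \mid x,a)\, \mu(x,a) = \nu(y) \quad \text{for all } y, \\
        & \sum_{x,a} c(x,a)\, \mu(x,a) \le d,
        \end{aligned}

    after which an optimal randomized policy can be recovered as $\pi(a \mid x) = \mu(x,a) / \sum_{a'} \mu(x,a')$.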