181 research outputs found
Metareasoning for Planning Under Uncertainty
The conventional model for online planning under uncertainty assumes that an
agent can stop and plan without incurring costs for the time spent planning.
However, planning time is not free in most real-world settings. For example, an
autonomous drone is subject to nature's forces, like gravity, even while it
thinks, and must either pay a price for counteracting these forces to stay in
place, or grapple with the state change caused by acquiescing to them. Policy
optimization in these settings requires metareasoning---a process that trades
off the cost of planning and the potential policy improvement that can be
achieved. We formalize and analyze the metareasoning problem for Markov
Decision Processes (MDPs). Our work subsumes previously studied special cases
of metareasoning and shows that in the general case, metareasoning is at most
polynomially harder than solving MDPs with any given algorithm that disregards
the cost of thinking. For reasons we discuss, optimal general metareasoning
turns out to be impractical, motivating approximations. We present approximate
metareasoning procedures which rely on special properties of the BRTDP planning
algorithm and explore the effectiveness of our methods on a variety of
problems.Comment: Extended version of IJCAI 2015 pape
Definition and Complexity of Some Basic Metareasoning Problems
In most real-world settings, due to limited time or other resources, an agent
cannot perform all potentially useful deliberation and information gathering
actions. This leads to the metareasoning problem of selecting such actions.
Decision-theoretic methods for metareasoning have been studied in AI, but there
are few theoretical results on the complexity of metareasoning.
We derive hardness results for three settings which most real metareasoning
systems would have to encompass as special cases. In the first, the agent has
to decide how to allocate its deliberation time across anytime algorithms
running on different problem instances. We show this to be
-complete. In the second, the agent has to (dynamically) allocate
its deliberation or information gathering resources across multiple actions
that it has to choose among. We show this to be -hard even when
evaluating each individual action is extremely simple. In the third, the agent
has to (dynamically) choose a limited number of deliberation or information
gathering actions to disambiguate the state of the world. We show that this is
-hard under a natural restriction, and -hard in
general
Rational Deployment of CSP Heuristics
Heuristics are crucial tools in decreasing search effort in varied fields of
AI. In order to be effective, a heuristic must be efficient to compute, as well
as provide useful information to the search algorithm. However, some well-known
heuristics which do well in reducing backtracking are so heavy that the gain of
deploying them in a search algorithm might be outweighed by their overhead.
We propose a rational metareasoning approach to decide when to deploy
heuristics, using CSP backtracking search as a case study. In particular, a
value of information approach is taken to adaptive deployment of solution-count
estimation heuristics for value ordering. Empirical results show that indeed
the proposed mechanism successfully balances the tradeoff between decreasing
backtracking and heuristic computational overhead, resulting in a significant
overall search time reduction.Comment: 7 pages, 2 figures, to appear in IJCAI-2011, http://www.ijcai.org
Metareasoning for Heuristic Search Using Uncertainty
Heuristic search methods are widely used in many real-world autonomous systems. Yet, people always want to solve search problems that are larger than time allows. To address these challenging problems, even suboptimally, a planning agent should be smart enough to intelligently allocate its computational resources, to think carefully about where in the state space it should spend time searching. For finding optimal solutions, we must examine every node that is not provably too expensive. In contrast, to find suboptimal solutions when under time pressure, we need to be very selective about which nodes to examine. In this dissertation, we will demonstrate that estimates of uncertainty, represented as belief distributions, can be used to drive search effectively. This type of algorithmic approach is known as metareasoning, which refers to reasoning about which reasoning to do. We will provide examples of improved algorithms for real-time search, bounded-cost search, and situated planning
Metareasoning for Heuristic Search Using Uncertainty
Heuristic search methods are widely used in many real-world autonomous systems. Yet, people always want to solve search problems that are larger than time allows. To address these challenging problems, even suboptimally, a planning agent should be smart enough to intelligently allocate its computational resources, to think carefully about where in the state space it should spend time searching. For finding optimal solutions, we must examine every node that is not provably too expensive. In contrast, to find suboptimal solutions when under time pressure, we need to be very selective about which nodes to examine. In this dissertation, we will demonstrate that estimates of uncertainty, represented as belief distributions, can be used to drive search effectively. This type of algorithmic approach is known as metareasoning, which refers to reasoning about which reasoning to do. We will provide examples of improved algorithms for real-time search, bounded-cost search, and situated planning
Recommended from our members
Metareasoning for Planning and Execution in Autonomous Systems
Metareasoning is the process by which an autonomous system optimizes, specifically monitors and controls, its own planning and execution processes in order to operate more effectively in its environment. As autonomous systems rapidly grow in sophistication and autonomy, the need for metareasoning has become critical for efficient and reliable operation in noisy, stochastic, unstructured domains for long periods of time. This is due to the uncertainty over the limitations of their reasoning capabilities and the range of their potential circumstances. However, despite considerable progress in metareasoning as a whole over the last thirty years, work on metareasoning for planning relies on several assumptions that diminish its accuracy and practical utility in autonomous systems that operate in the real world while work on metareasoning for execution has not seen much attention yet. This dissertation therefore proposes more effective metareasoning for planning while expanding the scope of metareasoning to execution to improve the efficiency of planning and the reliability of execution in autonomous systems.
In particular, we offer a two-pronged framework that introduces metareasoning for efficient planning and reliable execution in autonomous systems. We begin by proposing two forms of metareasoning for efficient planning: (1) a method that determines when to interrupt an anytime algorithm and act on the current solution by using online performance prediction and (2) a method that tunes the hyperparameters of the anytime algorithm at runtime by using deep reinforcement learning. We then propose two forms of metareasoning for reliable execution: (3) a method that recovers from exceptions that can be encountered during operation by using belief space planning and (4) a method that maintains and restores safety during operation by using probabilistic planning
- …