    Trial-based Heuristic Tree Search for MDPs with Factored Action Spaces

    MDPs with factored action spaces, i.e. where actions are described as assignments to a set of action variables, allow reasoning over action variables instead of action states, yet most algorithms only consider a grounded action representation. This includes algorithms that are instantiations of the trial-based heuristic tree search (THTS) framework, such as AO* or UCT. To be able to reason over factored action spaces, we propose a generalisation of THTS where nodes that branch over all applicable actions are replaced with subtrees that consist of nodes that represent the decision for a single action variable. We show that many THTS algorithms retain their theoretical properties under the generalised framework, and show how to approximate any state-action heuristic to a heuristic for partial action assignments. This allows us to guide a UCT variant that is able to create exponentially fewer nodes than the same algorithm that considers ground actions. An empirical evaluation on the benchmark set of the probabilistic track of the latest International Planning Competition validates the benefits of the approach.
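The contrast between a ground decision node and a subtree of single-variable decisions can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes binary action variables, assumes every assignment is applicable, and uses a hypothetical `choose` callback as a stand-in for whatever action-selection rule a THTS algorithm would apply.

```python
from itertools import product


def ground_children(variables):
    """Enumerate one child per ground action, i.e. per full assignment.

    Assumption: every variable is binary and every assignment is applicable.
    """
    return [
        dict(zip(variables, values))
        for values in product([False, True], repeat=len(variables))
    ]


def factored_trial(variables, choose):
    """Walk one path through a factored subtree.

    Each node decides a single action variable. `choose` is a hypothetical
    selection rule that receives the partial assignment built so far and
    the variable to decide, and returns True or False.
    """
    partial = {}
    nodes_visited = 0
    for var in variables:
        partial[var] = choose(partial, var)
        nodes_visited += 1
    return partial, nodes_visited


variables = ["a", "b", "c"]
children = ground_children(variables)
assignment, visited = factored_trial(variables, lambda partial, var: True)
```

In this toy setting a ground node over n variables has 2^n children, whereas a single trial through the factored subtree touches only n nodes, each with two branches.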

    Qualitative and Quantitative Solution Diversity in Heuristic-Search and Case-Based Planning

    Planning is a branch of Artificial Intelligence (AI) concerned with projecting courses of actions for executing tasks and reaching goals. AI Planning helps increase the autonomy of artificially intelligent agents and decrease the cognitive load burdening human planners working in challenging domains, such as the Mars exploration projects. Approaches to AI planning include first-principles heuristic search planning and case-based planning. The former conducts a heuristic-guided search in the solution space, while the latter generates new solutions by adapting solutions to previously-solved problems. The ability to generate not just one solution, but a set of meaningfully diverse solutions to each planning problem helps cater to a wider variety of user preferences and needs (which it may be difficult or even infeasible to acquire and/or represent in their entirety), produce viable alternative courses of action to fall back on in case of failure, counter varied threats in intrusion detection, render computer games more compelling, and provide representative samples of the vast search spaces of planning problems. This work describes a general framework for generating diverse sets of solutions (i.e. courses of action) to planning problems. The general diversity-aware planning algorithm consists of iteratively generating solutions using a composite candidate-solution evaluation criterion taking into account both how promising the candidate solutions appear in their own right and how likely they are to increase the overall diversity of the final set of solutions. This estimate of diversity is based on distance metrics, i.e. measures of the dissimilarity between two solutions. Distance metrics can be quantitative or qualitative. Quantitative distance measures are domain-independent. They require minimum knowledge engineering, but may not reflect dissimilarities that are truly meaningful. 
Qualitative distance metrics are domain-specific and reflect, based on the domain knowledge encoded within them, the kind of meaningful dissimilarities that might be identified by a person familiar with the domain. Based on the general framework for diversity-aware planning, three domain-independent planning algorithms have been implemented and are described and evaluated herein. DivFF is a diverse heuristic search planner for deterministic planning domains (i.e. domains for which the assumption is made that any action can only have one possible outcome). DivCBP is a diverse case-based planner, also for deterministic planning domains. DivNDP is a heuristic search planner for nondeterministic planning domains (i.e. domains the descriptions of which include actions with multiple possible outcomes). The experimental evaluation of the three algorithms is conducted on a computer game domain, chosen for its challenging characteristics, which include nondeterminism and dynamism. The generated courses of action are run in the game in order to ascertain whether they affect the game environment in diverse ways. This constitutes the test of their genuine diversity, which cannot be evaluated accurately based solely on their low-level structure. It is shown that all proposed planning systems successfully generate sets of diverse solutions using varied criteria for assessing solution dissimilarity. Qualitatively-diverse solution sets are demonstrated to consistently produce more diverse effects in the game environment than quantitatively-diverse solution sets. A comparison between the two planning systems for deterministic domains, DivCBP and DivFF, reveals the former to be more successful at consistently generating diverse sets of solutions. The reasons for this are investigated, thus contributing to the literature of comparative studies of first-principles and case-based planning approaches. 
Finally, an application of diversity in planning is showcased: simulating personality-trait variation in computer game characters. Sets of diverse solutions to both deterministic and nondeterministic planning problems are shown to successfully create diverse character behavior in the evaluation environment.
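The composite evaluation criterion described in this abstract can be sketched as a greedy selection loop. Everything concrete here is an assumption made for illustration and is not taken from DivFF, DivCBP, or DivNDP: the weighted-sum form with a hypothetical `alpha` parameter, the use of Jaccard distance over action sets as an example of a quantitative metric, and the function names.

```python
def jaccard_distance(plan_a, plan_b):
    """Quantitative distance: share of actions not common to both plans."""
    a, b = set(plan_a), set(plan_b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def relative_diversity(plan, selected, distance):
    """Average distance from `plan` to the plans already selected."""
    if not selected:
        return 0.0
    return sum(distance(plan, other) for other in selected) / len(selected)


def select_diverse(candidates, quality, k, alpha=0.5, distance=jaccard_distance):
    """Greedily pick k plans, trading off quality against added diversity.

    Assumed criterion: (1 - alpha) * quality + alpha * relative diversity.
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda p: (1 - alpha) * quality(p)
            + alpha * relative_diversity(p, selected, distance),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical candidate plans and quality scores.
scores = {("a", "b"): 1.0, ("a", "b", "c"): 0.9, ("x", "y"): 0.5}
plans = list(scores)
diverse = select_diverse(plans, scores.get, k=2, alpha=0.5)
greedy = select_diverse(plans, scores.get, k=2, alpha=0.0)
```

With `alpha=0.5` the second pick is the dissimilar plan `("x", "y")`, while `alpha=0.0` ignores diversity and picks the near-duplicate `("a", "b", "c")`. A qualitative metric would replace `jaccard_distance` with a domain-specific function.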

    Surrogate Search As a Way to Combat Harmful Effects of Ill-behaved Evaluation Functions

    Recently, several researchers have found that cost-based satisficing search with A* often runs into problems. Although some "workarounds" have been proposed to ameliorate the problem, there has been little concerted effort to pinpoint its origin. In this paper, we argue that the origins of this problem can be traced back to the fact that most planners that try to optimize cost also use cost-based evaluation functions (i.e., f(n) is a cost estimate). We show that cost-based evaluation functions become ill-behaved whenever there is a wide variance in action costs, something that is all too common in planning domains. The general solution to this malady is what we call a surrogate search, where a surrogate evaluation function that doesn't directly track the cost objective, and is resistant to cost-variance, is used. We will discuss some compelling choices for surrogate evaluation functions that are based on size rather than cost. Of particular practical interest is a cost-sensitive version of size-based evaluation function, where the heuristic estimates the size of cheap paths, as it provides attractive quality vs. speed tradeoffs. Comment: arXiv admin note: substantial text overlap with arXiv:1103.368
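The difference between a cost-based and a size-based evaluation function can be seen on a toy problem. The sketch below is an illustration under stated assumptions, not the paper's planner: the graph (one expensive action to the goal next to a long dead-end chain of cheap actions), the constant `CHAIN`, and all function names are hypothetical, and no heuristic is used, so f(n) is just the accumulated cost or the accumulated number of steps.

```python
import heapq

CHAIN = 50  # length of the hypothetical dead-end chain of cheap actions


def successors(node):
    """Toy graph: S reaches G in one expensive step or enters a cheap chain."""
    if node == "S":
        return [("G", 100), (0, 1)]
    if isinstance(node, int) and node + 1 < CHAIN:
        return [(node + 1, 1)]
    return []


def best_first(start, goal, evaluate):
    """Best-first search ordered by evaluate(cost, size).

    Returns (path_cost, path_size, expansions), or None if no path exists.
    """
    frontier = [(evaluate(0, 0), 0, start, 0, 0)]
    counter = 1  # unique tie-breaker, so nodes themselves are never compared
    seen = set()
    expansions = 0
    while frontier:
        _, _, node, cost, size = heapq.heappop(frontier)
        if node == goal:
            return cost, size, expansions
        if node in seen:
            continue
        seen.add(node)
        expansions += 1
        for nxt, step_cost in successors(node):
            if nxt not in seen:
                new_cost, new_size = cost + step_cost, size + 1
                entry = (evaluate(new_cost, new_size), counter, nxt, new_cost, new_size)
                heapq.heappush(frontier, entry)
                counter += 1
    return None


cost_based = best_first("S", "G", lambda cost, size: cost)
size_based = best_first("S", "G", lambda cost, size: size)
```

Both searches return the same one-step solution of cost 100, but the cost-based ordering first expands the entire cheap chain, whereas the size-based ordering reaches the goal after expanding only the start node.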

    Contingent task and motion planning under uncertainty for human–robot interactions

    Manipulation planning under incomplete information is a highly challenging task for mobile manipulators. Uncertainty can be resolved by robot perception modules or using human knowledge in the execution process. Human operators can also collaborate with robots for the execution of some difficult actions or as helpers in sharing the task knowledge. In this scope, a contingent-based task and motion planning approach is proposed that takes into account robot uncertainty and human–robot interactions, resulting in a tree-shaped set of geometrically feasible plans. Different sorts of geometric reasoning processes are embedded inside the planner to cope with task constraints such as detecting occluding objects when a robot needs to grasp an object. The proposal has been evaluated with different challenging scenarios in simulation and a real environment. Postprint (published version)

    Learning relational dynamics of stochastic domains for planning

    Probabilistic planners are very flexible tools that can provide good solutions for difficult tasks. However, they rely on a model of the domain, which may be costly to either hand code or automatically learn for complex tasks. We propose a new learning approach that (a) requires only a set of state transitions to learn the model; (b) can cope with uncertainty in the effects; (c) uses a relational representation to generalize over different objects; and (d) in addition to action effects, can also learn exogenous effects that are not related to any action, e.g., moving objects, endogenous growth and natural development. The proposed learning approach combines a multi-valued variant of inductive logic programming for the generation of candidate models with an optimization method to select the best set of planning operators to model a problem. Finally, experimental validation is provided that shows improvements over previous work. Peer Reviewed. Postprint (author's final draft)