Probabilistic contingent planning based on HTN for high-quality plans
Deterministic planning assumes that plan execution evolves along a fully
predictable path, and it therefore loses practical value in most real-world
settings. A more realistic view is that planning ought to take partial
observability into consideration beforehand and aim for a more flexible and
robust solution. More importantly, plan quality inevitably varies
dramatically in a partially observable environment. In this paper
we propose a probabilistic contingent Hierarchical Task Network (HTN) planner,
named High-Quality Contingent Planner (HQCP), to generate high-quality plans in
the partially observable environment. The formalisms in HTN planning are
extended into partial observability and are evaluated regarding the cost. Next,
we explore a novel heuristic for high-quality plans and develop the integrated
planning algorithm. Finally, an empirical study verifies the effectiveness and
efficiency of the planner both in probabilistic contingent planning and for
obtaining high-quality plans.
Comment: 10 pages, 1 figur
Sensor Synthesis for POMDPs with Reachability Objectives
Partially observable Markov decision processes (POMDPs) are widely used in
probabilistic planning problems in which an agent interacts with an environment
using noisy and imprecise sensors. We study a setting in which the sensors are
only partially defined and the goal is to synthesize "weakest" additional
sensors, such that in the resulting POMDP, there is a small-memory policy for
the agent that almost-surely (with probability 1) satisfies a reachability
objective. We show that the problem is NP-complete, and present a symbolic
algorithm by encoding the problem into SAT instances. We illustrate trade-offs
between the amount of memory of the policy and the number of additional sensors
on a simple example. We have implemented our approach and consider three
classical POMDP examples from the literature, and show that in all the examples
the number of sensors can be significantly decreased (as compared to the
existing solutions in the literature) without increasing the complexity of the
policies.
Comment: arXiv admin note: text overlap with arXiv:1511.0845
Contingent task and motion planning under uncertainty for human–robot interactions
Manipulation planning under incomplete information is a highly challenging task for mobile manipulators. Uncertainty can be resolved by robot perception modules or by using human knowledge during execution. Human operators can also collaborate with robots to execute some difficult actions, or act as helpers by sharing task knowledge. In this scope, a contingent-based task and motion planning approach is proposed that takes into account robot uncertainty and human–robot interactions, resulting in a tree-shaped set of geometrically feasible plans. Different sorts of geometric reasoning processes are embedded inside the planner to cope with task constraints, such as detecting occluding objects when a robot needs to grasp an object. The proposal has been evaluated with different challenging scenarios in simulation and in a real environment.
Postprint (published version
IST Austria Technical Report
POMDPs are standard models for probabilistic planning problems, where an agent interacts with an uncertain environment. We study the problem of almost-sure reachability: given a set of target states, decide whether there is a policy that ensures the target set is reached with probability 1 (almost-surely). While in general the problem is EXPTIME-complete, in many practical cases policies with a small amount of memory suffice. Moreover, the existing solution to the problem is explicit: it first requires explicitly constructing an exponential reduction to a belief-support MDP. In this work, we first study the existence of observation-stationary strategies, which is NP-complete, and then small-memory strategies. We present a symbolic algorithm based on an efficient encoding to SAT and the use of a SAT solver. We report experimental results demonstrating the scalability of our symbolic (SAT-based) approach.
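To make the belief-support MDP mentioned in this abstract concrete, the following is a minimal sketch of one belief-support update step. It is not the report's SAT encoding. It assumes, for illustration only, that each state emits a single deterministic observation and that only the possible successors (not their probabilities) matter, which is the information almost-sure reachability depends on. The names `support_successors`, `T`, and `O` are hypothetical.

```python
from typing import Dict, FrozenSet, Set, Tuple

State = str
Action = str
Obs = str


def support_successors(
    support: FrozenSet[State],
    action: Action,
    transitions: Dict[Tuple[State, Action], Set[State]],
    observation: Dict[State, Obs],
) -> Dict[Obs, FrozenSet[State]]:
    """Return, for each observation, the next belief support.

    `transitions[(s, a)]` is the set of states reachable from s under a
    with positive probability (assumed input format).
    """
    by_obs: Dict[Obs, Set[State]] = {}
    for s in support:
        for s_next in transitions.get((s, action), set()):
            # States that emit the same observation stay indistinguishable.
            by_obs.setdefault(observation[s_next], set()).add(s_next)
    return {o: frozenset(states) for o, states in by_obs.items()}


# Toy model (hypothetical): from s0, action "a" leads to s1 or s2.
T = {("s0", "a"): {"s1", "s2"}, ("s1", "a"): {"goal"}, ("s2", "a"): {"s2"}}
O = {"s0": "start", "s1": "x", "s2": "y", "goal": "done"}

split = support_successors(frozenset({"s0"}), "a", T, O)
```

Because the number of distinct supports can be exponential in the number of states, enumerating them explicitly is the cost that a symbolic encoding aims to avoid.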
Optimal Planning with State Constraints
In the classical planning model, state variables are assigned
values in the initial state and remain unchanged unless
explicitly affected by action effects. However, some properties
of states are more naturally modelled not as direct effects of
actions but instead as derived, in each state, from the primary
variables via a set of rules. We refer to those rules as state
constraints. The two types of state constraints that will be
discussed here are numeric state constraints and logical rules
that we will refer to as axioms.
When using state constraints we make a distinction between
primary variables, whose values are directly affected by action
effects, and secondary variables, whose values are determined by
state constraints. While primary variables have finite and
discrete domains, as in classical planning, there is no such
requirement for secondary variables. For example, using numeric
state constraints allows us to have secondary variables whose
values are real numbers. We show that state constraints are a
construct that lets us combine classical planning methods with
specialised solvers developed for other types of problems. For
example, introducing numeric state constraints enables us to
apply planning techniques in domains involving interconnected
physical systems, such as power networks.
To solve these types of problems optimally, we adapt commonly
used methods from optimal classical planning, namely state-space
search guided by admissible heuristics. In heuristics based on
monotonic relaxation, the idea is that in a relaxed state each
variable assumes a set of values instead of just a single value.
With state constraints, the challenge becomes to evaluate the
conditions, such as goals and action preconditions, that involve
secondary variables. We employ consistency checking tools to
evaluate whether these conditions are satisfied in the relaxed
state. In our work with numerical constraints we use linear
programming, while with axioms we use answer set programming and
three-valued semantics. This allows us to build a relaxed planning
graph and compute constraint-aware versions of heuristics based on
monotonic relaxation.
We also adapt pattern database heuristics. We notice that an
abstract state can be thought of as a state in the monotonic
relaxation in which the variables in the pattern hold only one
value, while the variables not in the pattern simultaneously hold
all the values in their domains. This means that we can apply the
same technique for evaluating conditions on secondary variables
as we did for the monotonic relaxation and build pattern
databases in the same way as in classical planning.
To make better use of our heuristics, we modify the A* algorithm
by combining two techniques that were previously used
independently – partial expansion and preferred operators. Our
modified algorithm, which we call PrefPEA, is most beneficial in
cases where the heuristic is expensive to compute but accurate, and
states have many successors.
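The two relaxations described in this abstract can be illustrated with a small sketch. It covers primary variables only; secondary variables and the consistency checks (linear programming, answer set programming) are omitted. The function names and the toy domain are hypothetical, not taken from the thesis.

```python
from typing import Dict, Set

RelaxedState = Dict[str, Set[str]]


def relax_apply(relaxed: RelaxedState,
                precond: Dict[str, str],
                effect: Dict[str, str]) -> RelaxedState:
    """Apply an action under monotonic relaxation.

    A precondition holds if its value is among the values a variable
    currently holds; effects add values and never remove any.
    """
    new = {var: set(vals) for var, vals in relaxed.items()}
    if all(val in relaxed[var] for var, val in precond.items()):
        for var, val in effect.items():
            new[var].add(val)
    return new


def abstract_state(state: Dict[str, str],
                   pattern: Set[str],
                   domains: Dict[str, Set[str]]) -> RelaxedState:
    """View a pattern-database abstract state as a relaxed state.

    Pattern variables hold exactly one value; all other variables hold
    every value in their domain simultaneously.
    """
    return {var: ({state[var]} if var in pattern else set(domains[var]))
            for var in domains}


# Toy domain (hypothetical): a robot behind a closed door.
domains = {"loc": {"A", "B"}, "door": {"open", "closed"}}
state = {"loc": "A", "door": "closed"}

r0 = {var: {val} for var, val in state.items()}
r1 = relax_apply(r0, {"door": "closed"}, {"door": "open"})
r2 = relax_apply(r1, {"loc": "A", "door": "open"}, {"loc": "B"})
abstract = abstract_state(state, {"loc"}, domains)
```

Representing both cases with the same set-valued state is what would let one condition-evaluation routine serve both kinds of heuristic, as the abstract suggests.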
A Review of Symbolic, Subsymbolic and Hybrid Methods for Sequential Decision Making
The field of Sequential Decision Making (SDM) provides tools for solving
Sequential Decision Processes (SDPs), where an agent must make a series of
decisions in order to complete a task or achieve a goal. Historically, two
competing SDM paradigms have vied for supremacy. Automated Planning (AP)
proposes to solve SDPs by performing a reasoning process over a model of the
world, often represented symbolically. Conversely, Reinforcement Learning (RL)
proposes to learn the solution of the SDP from data, without a world model, and
represent the learned knowledge subsymbolically. In the spirit of
reconciliation, we provide a review of symbolic, subsymbolic and hybrid methods
for SDM. We cover both methods for solving SDPs (e.g., AP, RL and techniques
that learn to plan) and for learning aspects of their structure (e.g., world
models, state invariants and landmarks). To the best of our knowledge, no other
review in the field provides the same scope. As an additional contribution, we
discuss what properties an ideal method for SDM should exhibit and argue that
neurosymbolic AI is the current approach which most closely resembles this
ideal method. Finally, we outline several proposals to advance the field of SDM
via the integration of symbolic and subsymbolic AI
Algorithms and Conditional Lower Bounds for Planning Problems
We consider planning problems for graphs, Markov decision processes (MDPs),
and games on graphs. While graphs represent the most basic planning model, MDPs
represent interaction with nature and games on graphs represent interaction
with an adversarial environment. We consider two planning problems where there
are k different target sets, and the problems are as follows: (a) the coverage
problem asks whether there is a plan for each individual target set, and (b)
the sequential target reachability problem asks whether the targets can be
reached in sequence. For the coverage problem, we present a linear-time
algorithm for graphs and quadratic conditional lower bound for MDPs and games
on graphs. For the sequential target problem, we present a linear-time
algorithm for graphs, a sub-quadratic algorithm for MDPs, and a quadratic
conditional lower bound for games on graphs. Our results with conditional lower
bounds establish (i) model-separation results showing that for the coverage
problem MDPs and games on graphs are harder than graphs and for the sequential
reachability problem games on graphs are harder than MDPs and graphs; (ii)
objective-separation results showing that for MDPs the coverage problem is
harder than the sequential target problem.
Comment: Accepted at ICAPS'1
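The two problems can be stated concretely for plain graphs. The sketch below assumes a single start vertex and an adjacency-list graph. `coverage` uses one breadth-first search, while `sequential` is a simple baseline that runs one search per target set, so it is not the linear-time algorithm the abstract refers to. All names are illustrative.

```python
from collections import deque
from typing import Dict, Iterable, List, Set

Graph = Dict[str, List[str]]


def reachable(graph: Graph, sources: Iterable[str]) -> Set[str]:
    """Multi-source breadth-first search; returns all reachable vertices."""
    seen = set(sources)
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        for v in graph.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def coverage(graph: Graph, start: str, targets: List[Set[str]]) -> bool:
    """Is each target set individually reachable from `start`?"""
    reach = reachable(graph, [start])
    return all(reach & set(t) for t in targets)


def sequential(graph: Graph, start: str, targets: List[Set[str]]) -> bool:
    """Can the target sets be visited in the given order? (Baseline.)"""
    frontier = {start}
    for t in targets:
        # Vertices of t reachable after visiting all earlier targets.
        frontier = reachable(graph, frontier) & set(t)
        if not frontier:
            return False
    return True


# Toy graph (hypothetical): b is a dead end, c lies behind a.
g = {"s": ["a", "b"], "a": ["c"], "b": [], "c": []}
```

On this toy graph the target sets `{"b"}` and `{"c"}` are each reachable from `s`, but they cannot be visited in that order because `b` has no outgoing edges, which shows that the two problems differ.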
Inference and Learning with Planning Models
Inference and learning are the acts of reasoning about some collected evidence in order to reach a logical conclusion regarding the process that originated it. In the context of a state-space model, inference and learning are usually concerned with explaining an agent's past behaviour, predicting its future actions or identifying its model. In this thesis, we present a framework for inference and learning in the state-space model underlying the classical planning model, and formulate a palette of inference and learning problems under this unifying umbrella. We also develop effective planning-based approaches to solve these problems using off-the-shelf, state-of-the-art planning algorithms. We will show that several core inference and learning problems that previous research has treated as disconnected can be formulated in a cohesive way and solved following homogeneous procedures using the proposed framework. Further, our work opens the way for new applications of planning technology as it highlights the features that make the state-space model of classical planning different from other models.
The work developed in this doctoral thesis has been possible thanks to the FPU16/03184 fellowship that I have enjoyed for the duration of my PhD studies. I have also been supported by my advisors' grants TIN2017-88476-C2-1-R, TIN2014-55637-C2-2-R-AR, and RYC-2015-18009.
Aineto García, D. (2022). Inference and Learning with Planning Models [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/18535
Egocentric Planning for Scalable Embodied Task Achievement
Embodied agents face significant challenges when tasked with performing
actions in diverse environments, particularly in generalizing across object
types and executing suitable actions to accomplish tasks. Furthermore, agents
should exhibit robustness, minimizing the execution of illegal actions. In this
work, we present Egocentric Planning, an innovative approach that combines
symbolic planning and Object-oriented POMDPs to solve tasks in complex
environments, harnessing existing models for visual perception and natural
language processing. We evaluated our approach in ALFRED, a simulated
environment designed for domestic tasks, and demonstrated its high scalability,
achieving an impressive 36.07% unseen success rate in the ALFRED benchmark and
winning the ALFRED challenge at CVPR Embodied AI workshop. Our method requires
reliable perception and the specification or learning of a symbolic description
of the preconditions and effects of the agent's actions, as well as what object
types reveal information about others. It is capable of naturally scaling to
solve new tasks beyond ALFRED, as long as they can be solved using the
available skills. This work offers a solid baseline for studying end-to-end and
hybrid methods that aim to generalize to new tasks, including recent approaches
that rely on LLMs but often struggle to scale to long sequences of actions or
produce robust plans for novel tasks.