65 research outputs found
Generalized planning: Non-deterministic abstractions and trajectory constraints
We study the characterization and computation of general policies for families of problems that share a structure characterized by a common reduction into a single abstract problem. Policies μ that solve the abstract problem P have been shown to solve all problems Q that reduce to P, provided that μ terminates in Q. In this work, we shed light on why this termination condition is needed and how it can be removed. The key observation is that the abstract problem P captures the common structure among the concrete problems Q that is local (Markovian), but misses common structure that is global. We show how such global structure can be captured by means of trajectory constraints that in many cases can be expressed as LTL formulas, thus reducing generalized planning to LTL synthesis. Moreover, for a broad class of problems that involve integer variables that can be increased or decreased, trajectory constraints can be compiled away, reducing generalized planning to fully observable non-deterministic planning.
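The core idea, that one abstract policy over an integer variable can solve every concrete instance of a family of problems, can be sketched as follows. This is a minimal toy illustration under our own assumptions, not the paper's formalism; all names are ours.

```python
# Hedged sketch: a single abstract policy over an integer feature x
# (which can only be decreased) that solves every concrete instance
# of a problem family; only the trajectory length varies per instance.

def abstract_policy(x):
    """Map the observed feature value to an abstract action."""
    return "dec" if x > 0 else "stop"

def run_instance(x0):
    """Execute the same policy on any concrete instance (initial x0)."""
    x, trace = x0, []
    while True:
        action = abstract_policy(x)
        trace.append(action)
        if action == "stop":
            return trace
        x -= 1  # concrete effect of the abstract 'dec' action

print(run_instance(3))  # the same policy works for any initial value
print(run_instance(0))
```

Termination here is immediate because x strictly decreases and is bounded below, which is exactly the kind of global (non-Markovian) property a trajectory constraint would make explicit.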
A Correctness Result for Synthesizing Plans With Loops in Stochastic Domains
Finite-state controllers (FSCs), such as plans with loops, are powerful and compact representations of action selection widely used in robotics, video games and logistics. There has been steady progress on synthesizing FSCs in deterministic environments, but the algorithmic machinery needed for lifting such techniques to stochastic environments is not yet fully understood. While the derivation of FSCs has received some attention in the context of discounted expected reward measures, they are often solved approximately and/or without correctness guarantees. In essence, that makes it difficult to analyze fundamental concerns such as: do all paths terminate, and do the majority of paths reach a goal state?

In this paper, we present new theoretical results on a generic technique for synthesizing FSCs in stochastic environments, allowing for highly granular specifications on termination and goal satisfaction.
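What it means for an FSC to act in a stochastic environment can be illustrated with a small sketch. The two-state controller and the noisy corridor domain below are our own toy example, not the paper's construction.

```python
import random

# Hedged sketch: a two-state finite-state controller (FSC) executed in
# a stochastic environment where the 'move' action can fail. The FSC
# maps (controller state, observation) to (action, next controller state).

FSC = {
    ("q0", "not_at_goal"): ("move", "q0"),
    ("q0", "at_goal"): ("stop", "q1"),
}

def run(pos0, goal, p_fail=0.3, seed=0, max_steps=1000):
    rng = random.Random(seed)
    q, pos = "q0", pos0
    for _ in range(max_steps):
        obs = "at_goal" if pos == goal else "not_at_goal"
        action, q = FSC[(q, obs)]
        if action == "stop":
            return True  # goal reached and controller halted
        if rng.random() > p_fail:  # 'move' succeeds with prob 1 - p_fail
            pos += 1
    return False  # step bound exhausted

print(run(0, 5))
```

Under such a controller, questions like "do all paths terminate?" and "do most paths reach the goal?" become properties of the Markov chain induced by the FSC and the environment, which is what the paper's correctness results address.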
Learning Features and Abstract Actions for Computing Generalized Plans
Generalized planning is concerned with the computation of plans that solve not one but multiple instances of a planning domain. Recently, it has been shown that generalized plans can be expressed as mappings of feature values into actions, and that they can often be computed with fully observable non-deterministic (FOND) planners. The actions in such plans, however, are not the actions in the instances themselves, which are not necessarily common to other instances, but abstract actions that are defined on a set of common features. The formulation assumes that the features and the abstract actions are given. In this work, we address this limitation by showing how to learn them automatically. The resulting account of generalized planning combines learning and planning in a novel way: a learner, based on a Max SAT formulation, yields the features and abstract actions from sampled state transitions, and a FOND planner uses this information, suitably transformed, to produce the general plans. Correctness guarantees are given and experimental results on several domains are reported.

Comment: Preprint of paper accepted at the AAAI'19 conference.
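The shape of such a generalized plan, feature values mapped to abstract actions whose preconditions and effects are stated over the features rather than over instance-level actions, can be sketched as follows. The domain, features, and actions are our own toy illustration.

```python
# Hedged toy example: a generalized plan over two features for a
# "clear a stack of blocks" style domain. 'n' counts blocks still on
# the stack; 'holding' says whether the gripper holds a block.

ABSTRACT_ACTIONS = {
    # name: (precondition over features, effect on features)
    "pick": (lambda f: f["n"] > 0 and not f["holding"],
             lambda f: {"n": f["n"] - 1, "holding": True}),
    "drop": (lambda f: f["holding"],
             lambda f: {"n": f["n"], "holding": False}),
}

def generalized_plan(features):
    """Select an abstract action from the feature values alone."""
    if features["n"] > 0 and not features["holding"]:
        return "pick"
    if features["holding"]:
        return "drop"
    return None  # goal: stack cleared, hand empty

def solve(n_blocks):
    f, plan = {"n": n_blocks, "holding": False}, []
    while (a := generalized_plan(f)) is not None:
        pre, eff = ABSTRACT_ACTIONS[a]
        assert pre(f)
        f, plan = eff(f), plan + [a]
    return plan

print(solve(2))  # the same plan structure works for any instance size
```

In the paper's pipeline, both the features (here `n`, `holding`) and the abstract actions (here `pick`, `drop`) are outputs of the Max SAT learner rather than hand-written as above.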
Learning Finite State Controllers from Simulation
We propose a methodology to automatically generate agent controllers, represented as state machines, to act in partially observable environments. We define a multi-step process in which increasingly accurate models, generally too complex to be used for planning, are employed to generate possible traces of execution by simulation. Those traces are then used to induce a state machine that represents all reasonable behaviors, given the approximate models and planners previously used. The state machine will have multiple possible choices in some of its states. Those states are choice points, and we defer the learning of those choices to the deployment of the agent in the real environment. The controller obtained can therefore adapt to the actual environment, limiting the search space in a sensible way.
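The trace-to-machine step with deferred choice points can be sketched minimally as below. This is a prefix-tree-like induction under our own simplifying assumptions (no state merging); the traces are toy data.

```python
from collections import defaultdict

# Hedged sketch: induce a state machine from simulated execution
# traces, keeping states with multiple observed actions open as
# "choice points" to be resolved at deployment time.

traces = [
    ["look", "approach", "grasp"],
    ["look", "approach", "push", "grasp"],
]

def induce(traces):
    """Merge traces into a machine: state = history prefix (a tuple)."""
    machine = defaultdict(set)  # state -> set of possible next actions
    for trace in traces:
        for i, action in enumerate(trace):
            machine[tuple(trace[:i])].add(action)
    return machine

machine = induce(traces)
choice_points = [s for s, acts in machine.items() if len(acts) > 1]
print(choice_points)  # states where the choice is deferred to deployment
```

A full implementation would additionally merge equivalent prefixes to obtain loops; the sketch only shows where multiple reasonable behaviors leave a choice open.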
Learning STRIPS Action Models with Classical Planning
This paper presents a novel approach for learning STRIPS action models from examples that compiles this inductive learning task into a classical planning task. Interestingly, the compilation approach is flexible to different amounts of available input knowledge; the learning examples can range from a set of plans (with their corresponding initial and final states) to just a pair of initial and final states (no intermediate action or state is given). Moreover, the compilation accepts partially specified action models, and it can be used to validate whether the observation of a plan execution follows a given STRIPS action model, even if this model is not fully specified.

Comment: 8+1 pages, 4 figures, 6 tables.
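What a STRIPS action model is, and what it means to validate an observed plan execution against one, can be sketched as follows. The one-action "move" model is our own illustration, not the paper's compilation.

```python
from typing import NamedTuple

# Hedged sketch: a minimal STRIPS action model (preconditions,
# add effects, delete effects) and a validator that checks whether
# an observed (init, plan, final) triple is consistent with it.

class Action(NamedTuple):
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset

MOVE_A_B = Action("move-a-b",
                  pre=frozenset({"at-a"}),
                  add=frozenset({"at-b"}),
                  delete=frozenset({"at-a"}))

def apply(state, action):
    if not action.pre <= state:
        return None  # precondition violated
    return (state - action.delete) | action.add

def validates(model, init, plan, final):
    """Does the model explain the observed execution?"""
    state = init
    for name in plan:
        state = apply(state, model[name])
        if state is None:
            return False
    return state == final

model = {"move-a-b": MOVE_A_B}
print(validates(model, {"at-a"}, ["move-a-b"], {"at-b"}))
```

The paper's contribution is to pose the inverse problem, filling in unknown `pre`/`add`/`delete` sets from observations, as a classical planning task; the sketch above only shows the forward validation direction.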
IST Austria Technical Report
POMDPs are standard models for probabilistic planning problems, where an agent interacts with an uncertain environment. We study the problem of almost-sure reachability: given a set of target states, decide whether there is a policy ensuring that the target set is reached with probability 1 (almost surely). While the problem is EXPTIME-complete in general, in many practical cases policies with a small amount of memory suffice. Moreover, the existing solution to the problem is explicit, first requiring the explicit construction of an exponential reduction to a belief-support MDP. In this work, we first study the existence of observation-stationary strategies, which is NP-complete, and then small-memory strategies. We present a symbolic algorithm based on an efficient encoding of the problem into SAT, using a SAT solver to decide it. We report experimental results demonstrating the scalability of our symbolic (SAT-based) approach.
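The belief-support view underlying almost-sure reachability tracks only which states are currently possible, not their probabilities. A minimal sketch of this update, on a tiny POMDP of our own invention, looks as follows.

```python
# Hedged sketch: the belief-support update behind almost-sure
# reachability analysis in POMDPs. Supports are sets of states;
# probabilities are irrelevant for probability-1 questions.

# transitions[state][action] = set of possible successor states
transitions = {
    "s0": {"a": {"s0", "s1"}},
    "s1": {"a": {"goal"}},
    "goal": {"a": {"goal"}},
}
# observation received in each state (deterministic, for simplicity)
observation = {"s0": "o", "s1": "o", "goal": "done"}

def support_update(support, action, obs):
    """Successor belief support after taking `action` and seeing `obs`."""
    succ = set()
    for s in support:
        succ |= transitions[s][action]
    return frozenset(t for t in succ if observation[t] == obs)

b0 = frozenset({"s0"})
b1 = support_update(b0, "a", "o")     # could be in s0 or s1
b2 = support_update(b1, "a", "done")  # 'done' reveals the goal state
print(sorted(b1), sorted(b2))
```

The explicit approach builds the full MDP over all such supports, which is exponential in the number of states; the paper's SAT encoding avoids that explicit construction.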
Generalized Potential Heuristics for Classical Planning
Generalized planning aims at computing solutions that work for all instances of the same domain. In this paper, we show that several interesting planning domains possess compact generalized heuristics that can guide a greedy search in guaranteed polynomial time to the goal, and which work for any instance of the domain. These heuristics are weighted sums of state features that capture the number of objects satisfying a certain first-order logic property in any given state. These features have a meaningful interpretation and generalize naturally to the whole domain. Additionally, we present an approach based on mixed integer linear programming to compute such heuristics automatically from the observation of small training instances. We develop two variations of the approach that progressively refine the heuristic as new states are encountered. We illustrate the approach empirically on a number of standard domains, where we show that the generated heuristics correctly generalize to all possible instances.
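A potential heuristic of this kind, a weighted sum of numerical state features driving a greedy search, can be sketched minimally as below. The domain (clear a stack of blocks) and the hand-picked weights are our own toy illustration; in the paper the weights are found by mixed integer linear programming.

```python
# Hedged sketch: a generalized potential heuristic h(s) = sum_i w_i * f_i(s)
# that guides greedy search to the goal on any instance of the domain.

def features(state):
    # f1: blocks still on the stack; f2: 1 if holding a block
    return {"on_stack": len(state["stack"]), "holding": int(state["holding"])}

WEIGHTS = {"on_stack": 2, "holding": 1}  # h reaches 0 exactly at the goal

def h(state):
    f = features(state)
    return sum(WEIGHTS[name] * f[name] for name in WEIGHTS)

def successors(state):
    if state["holding"]:
        yield {"stack": state["stack"], "holding": False}     # put away
    elif state["stack"]:
        yield {"stack": state["stack"][1:], "holding": True}  # pick top

def greedy_search(state):
    steps = 0
    while h(state) > 0:
        state = min(successors(state), key=h)  # h strictly decreases
        steps += 1
    return steps

print(greedy_search({"stack": ["b1", "b2", "b3"], "holding": False}))
```

Because every applicable action strictly decreases h and h is bounded below by 0, the greedy search terminates in a number of steps linear in the instance size, which is the polynomial-guidance property the abstract describes.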