34 research outputs found

    Temporally Extended Goal Recognition in Fully Observable Non-Deterministic Domain Models

    arXiv admin note: substantial text overlap with arXiv:2103.11692. Preprint.

    Towards a Unified View of AI Planning and Reactive Synthesis

    Automated planning and reactive synthesis are well-established techniques for sequential decision making. In this paper we examine a collection of AI planning problems with temporally extended goals, specified in Linear Temporal Logic (LTL). We characterize these so-called LTL planning problems as two-player games and thereby establish their correspondence to reactive synthesis problems. This unifying view furthers our understanding of the relationship between plan and program synthesis, establishing complexity results for LTL planning tasks. Building on this correspondence, we identify restricted fragments of LTL for which plan synthesis can be realized more efficiently.
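
    For intuition, a small example of our own (not from the paper): a temporally extended goal constrains the whole execution rather than just the final state, e.g. the LTL formula

        \varphi \;=\; \Box(\mathit{request} \rightarrow \Diamond\,\mathit{response})

    read "every request is eventually followed by a response". Under the two-player game reading, the agent must choose its actions so that \varphi holds however the environment resolves its moves.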

    Temporally Extended Goal Recognition in Fully Observable Non-Deterministic Domain Models

    This work has been partially supported by the ERC-ADG WhiteMech (No. 834228), the EU ICT-48 2020 project TAILOR (No. 952215), the PRIN project RIPER (No. 20203FFYLK), and the PNRR MUR project FAIR (No. PE0000013). Peer reviewed. Publisher PDF.

    Stochastic Fairness and Language-Theoretic Fairness in Planning in Nondeterministic Domains

    We address two central notions of fairness in the literature on nondeterministic fully observable domains. The first, which we call stochastic fairness, is classical, and assumes an environment which operates probabilistically using possibly unknown probabilities. The second, which is language-theoretic, assumes that if an action is taken from a given state infinitely often then all its possible outcomes should appear infinitely often; we call this state-action fairness. While the two notions coincide for standard reachability goals, they differ for temporally extended goals. This important difference has been overlooked in the planning literature and has led a number of published algorithms to rely on a product-based reduction: these algorithms were stated for state-action fairness, for which they are incorrect, although they are correct for stochastic fairness. We remedy this and provide a correct optimal algorithm for solving state-action fair planning for LTL/LTLf goals, as well as a correct proof of the lower bound on the goal-complexity. Our proof is general enough that it also provides, for the no-fairness and stochastic-fairness cases, multiple missing lower bounds and new proofs of known lower bounds. Overall, we show that stochastic fairness is better behaved than state-action fairness.
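
    For reference, the state-action fairness condition described above can be written formally as follows (our sketch of the standard formulation; \delta is the nondeterministic transition function):

        \forall s, a:\;\; (s,a) \text{ occurs infinitely often} \;\Rightarrow\; \forall s' \in \delta(s,a):\; (s,a,s') \text{ occurs infinitely often}

    Under stochastic fairness, each outcome in \delta(s,a) instead has a fixed positive probability, in which case the condition above holds with probability 1; for temporally extended goals the two readings can nonetheless call for different plans.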

    Pure-Past Linear Temporal and Dynamic Logic on Finite Traces

    LTLf and LDLf are well-known logics on finite traces. We review PLTLf and PLDLf, their pure-past versions, which are interpreted backward, from the end of the trace towards the beginning. Because of this, we can exploit a foundational result on reverse languages to get an exponential improvement, with respect to LTLf/LDLf, in computing the corresponding DFA. This exponential improvement is reflected in several forms of sequential decision making involving temporal specifications, such as planning and decision problems in non-deterministic and non-Markovian domains. Interestingly, PLTLf (resp. PLDLf) has the same expressive power as LTLf (resp. LDLf), but transforming a PLTLf (resp. PLDLf) formula into its LTLf (resp. LDLf) equivalent is quite expensive. Hence, to take advantage of the exponential improvement, properties of interest must be expressed directly in PLTLf/PLDLf.
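
    To make the backward interpretation concrete, here is a minimal evaluator sketch in Python (our illustration with an assumed tuple encoding of formulas; naive recursion, not the paper's automata construction):

        # Pure-past connectives: Y (yesterday), O (once), S (since).
        # A trace is a list of sets of atoms; holds() looks backward from position i.
        def holds(phi, trace, i):
            op = phi[0]
            if op == "atom":
                return phi[1] in trace[i]
            if op == "not":
                return not holds(phi[1], trace, i)
            if op == "and":
                return holds(phi[1], trace, i) and holds(phi[2], trace, i)
            if op == "Y":   # yesterday: phi held at the previous instant
                return i > 0 and holds(phi[1], trace, i - 1)
            if op == "O":   # once: phi held at some instant <= i
                return any(holds(phi[1], trace, j) for j in range(i + 1))
            if op == "S":   # since: phi2 held at some j <= i, phi1 at every instant after j up to i
                return any(holds(phi[2], trace, j)
                           and all(holds(phi[1], trace, k) for k in range(j + 1, i + 1))
                           for j in range(i + 1))
            raise ValueError(f"unknown connective: {op!r}")

        # Pure-past formulas are evaluated at the last instant of the trace:
        trace = [{"req"}, set(), {"grant"}]
        print(holds(("O", ("atom", "req")), trace, len(trace) - 1))  # True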

    High-level Programming via Generalized Planning and LTL Synthesis

    We look at program synthesis where the aim is to automatically synthesize a controller that operates on data structures and from which a concrete program can be easily derived. We do not aim at a fully automatic process or tool that produces a program meeting a given specification of the program's behaviour. Rather, we aim at the design of a clear and well-founded approach for supporting programmers at the design and implementation phases. Concretely, we first show that a program synthesis task can be modeled as a generalized planning problem. This is done at an abstraction level where the involved data structures are seen as black boxes that can be interfaced with actions and observations, the former corresponding to the operations and the latter to the queries provided by the data structure. The abstraction level is high enough to capture intuitive and common assumptions as well as general and simple strategies used by programmers, and yet it contains sufficient structure to support the automated generation of concrete solutions (in the form of controllers). From such controllers and the use of standard data structures, an actual program in a general-purpose language like C++ or Python can be easily obtained. Then, we discuss how the resulting generalized planning problem can be reduced to an LTL synthesis problem, thus making any LTL synthesis engine available for obtaining the controllers. We illustrate the effectiveness of the approach on a series of examples.
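
    A toy sketch of this abstraction level (ours; the observations, operations, and controller below are hypothetical, not taken from the paper): the data structure is a black box queried through observations and driven through operations, and the synthesized controller is a finite-state transducer from which a concrete program follows directly.

        # Hypothetical controller: scan a sequence for a target value.
        # (state, observation) -> (operation, next state)
        CONTROLLER = {
            ("scan", "at_target"): ("report_found", "done"),
            ("scan", "at_end"):    ("report_absent", "done"),
            ("scan", "other"):     ("advance", "scan"),
        }

        def run(controller, values, target):
            """Derive a concrete program by executing the controller on a list."""
            i, state = 0, "scan"
            while state != "done":
                if i >= len(values):
                    obs = "at_end"
                elif values[i] == target:
                    obs = "at_target"
                else:
                    obs = "other"
                op, state = controller[(state, obs)]
                if op == "advance":
                    i += 1
                elif op == "report_found":
                    return True
                elif op == "report_absent":
                    return False

        print(run(CONTROLLER, [3, 1, 4, 1, 5], 4))  # True
        print(run(CONTROLLER, [3, 1, 4, 1, 5], 9))  # False

    The same controller works unchanged for any container exposing these observations and operations, which is what the black-box abstraction buys.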

    Enabling Markovian Representations under Imperfect Information

    Markovian systems are widely used in reinforcement learning (RL) when the successful completion of a task depends exclusively on the last interaction between an autonomous agent and its environment. Unfortunately, real-world instructions are typically complex and often better described as non-Markovian. In this paper we present an extension method that allows solving partially observable non-Markovian reward decision processes (PONMRDPs) by solving equivalent Markovian models. This potentially enables state-of-the-art Markovian techniques, including RL, to find optimal behaviours for problems best described as PONMRDPs. We provide formal optimality guarantees for our extension method, together with a counterexample illustrating that naive extensions of existing techniques for fully observable environments cannot provide such guarantees.
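
    As a hedged illustration of the general idea (the standard state-extension construction, not necessarily the paper's exact method): a history-dependent reward becomes Markovian once the agent's state is augmented with the state of an automaton tracking the relevant history.

        # Non-Markovian reward "an 'a' was observed, then a 'b'", made Markovian
        # by pairing each step with the state of a small tracking automaton.
        DFA = {
            ("q0", "a"): "q1", ("q0", "b"): "q0",
            ("q1", "a"): "q1", ("q1", "b"): "q2",
            ("q2", "a"): "q2", ("q2", "b"): "q2",
        }

        def extended_step(q, obs):
            """Reward is a function of the extended state alone."""
            q_next = DFA[(q, obs)]
            reward = 1 if q_next == "q2" and q != "q2" else 0  # first completion only
            return q_next, reward

        q, total = "q0", 0
        for obs in ["b", "a", "a", "b", "b"]:
            q, r = extended_step(q, obs)
            total += r
        print(total)  # 1: the reward now depends only on the extended state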