    Sound Abstraction of Probabilistic Actions in The Constraint Mass Assignment Framework

    This paper provides a formal and practical framework for sound abstraction of probabilistic actions. We start by precisely defining the concept of sound abstraction within the context of finite-horizon planning (where each plan is a finite sequence of actions). Next we show that such abstraction cannot be performed within the traditional probabilistic action representation, which models a world with a single probability distribution over the state space. We then present the constraint mass assignment representation, which models the world with a set of probability distributions and is a generalization of mass assignment representations. Within this framework, we present sound abstraction procedures for three types of action abstraction. We end the paper with discussions and related work on sound and approximate abstraction. We give pointers to papers in which we discuss other sound abstraction-related issues, including applications, estimating loss due to abstraction, and automatically generating abstraction hierarchies. Comment: Appears in Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI 1996).
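
    A minimal sketch of the representational point made above, assuming an interval-style encoding (the states and bounds below are invented): a set of probability distributions can express total ignorance between states, which a single distribution cannot.

        def event_bounds(intervals, event):
            """Tight lower/upper probability of `event` (a set of states) under
            all normalized distributions consistent with per-state [lo, hi]
            probability intervals (assumed to admit at least one distribution)."""
            lo = sum(intervals[s][0] for s in event)
            hi = sum(intervals[s][1] for s in event)
            rest_lo = sum(intervals[s][0] for s in intervals if s not in event)
            rest_hi = sum(intervals[s][1] for s in intervals if s not in event)
            # Normalization (probabilities sum to 1) tightens the naive bounds.
            return max(lo, 1.0 - rest_hi), min(hi, 1.0 - rest_lo)

        # Total ignorance between s1 and s2: a single distribution must commit
        # (e.g., 0.5/0.5), while the set-based representation keeps both extremes.
        intervals = {"s1": (0.0, 1.0), "s2": (0.0, 1.0)}
        print(event_bounds(intervals, {"s1"}))  # -> (0.0, 1.0)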

    Structured Reachability Analysis for Markov Decision Processes

    Recent research in decision theoretic planning has focussed on making the solution of Markov decision processes (MDPs) more feasible. We develop a family of algorithms for structured reachability analysis of MDPs that are suitable when an initial state (or set of states) is known. Using compact, structured representations of MDPs (e.g., Bayesian networks), our methods, which vary in the tradeoff between complexity and accuracy, produce structured descriptions of (estimated) reachable states that can be used to eliminate variables or variable values from the problem description, reducing the size of the MDP and making it easier to solve. One contribution of our work is the extension of ideas from GRAPHPLAN to deal with the distributed nature of action representations typically embodied within Bayes nets and the problem of correlated action effects. We also demonstrate that our algorithm can be made more complete by using k-ary constraints instead of binary constraints. Another contribution is the illustration of how the compact representation of reachability constraints can be exploited by several existing (exact and approximate) abstraction algorithms for MDPs. Comment: Appears in Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI 1998).
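
    A hedged sketch of the flavor of such an analysis (the factored actions below are invented; correlations between variables are deliberately ignored, giving a GRAPHPLAN-style over-approximation of the reachable set): values a variable can never take may be deleted from the problem description.

        # Each action: (preconditions {var: required value}, effects {var: possible value}).
        actions = [
            ({"loc": "A"}, {"loc": "B"}),
            ({"loc": "B", "door": "open"}, {"loc": "C"}),
            ({}, {"door": "open"}),
        ]

        def reachable_values(init):
            """Fixed point over per-variable value sets; by ignoring correlations
            between variables, it over-approximates the truly reachable states."""
            vals = {v: {x} for v, x in init.items()}
            changed = True
            while changed:
                changed = False
                for pre, eff in actions:
                    if all(pre[v] in vals.get(v, set()) for v in pre):
                        for v, x in eff.items():
                            if x not in vals.setdefault(v, set()):
                                vals[v].add(x)
                                changed = True
            return vals

        print(reachable_values({"loc": "A", "door": "closed"}))
        # {'loc': {'A', 'B', 'C'}, 'door': {'closed', 'open'}}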

    Theoretical Foundations for Abstraction-Based Probabilistic Planning

    Modeling worlds and actions under uncertainty is one of the central problems in the framework of decision-theoretic planning. The representation must be general enough to capture real-world problems but at the same time it must provide a basis upon which theoretical results can be derived. The central notion in the framework we propose here is that of the affine-operator, which serves as a tool for constructing (convex) sets of probability distributions, and which can be considered as a generalization of belief functions and interval mass assignments. Uncertainty in the state of the world is modeled with sets of probability distributions, represented by affine-trees, while actions are defined as tree-manipulators. A small set of key properties of the affine-operator is presented, forming the basis for most existing operator-based definitions of probabilistic action projection and action abstraction. We derive and prove correct three projection rules, which vividly illustrate the precision-complexity tradeoff in plan projection. Finally, we show how the three types of action abstraction identified by Haddawy and Doan are manifested in the present framework. Comment: Appears in Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI 1996).
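
    Not the paper's affine-tree machinery, but a standard illustration of the convexity fact such frameworks exploit: action projection is linear in the distribution, so projecting the extreme points of a convex set of distributions projects the entire set. All numbers below are invented.

        import numpy as np

        # Two states; total ignorance = convex hull of the two point masses.
        extremes = np.array([[1.0, 0.0],
                             [0.0, 1.0]])

        # A stochastic action as a transition matrix T[s, s'] = P(s' | s).
        T = np.array([[0.9, 0.1],
                      [0.2, 0.8]])

        projected = extremes @ T          # images of the extreme points
        lower = projected.min(axis=0)     # pointwise lower probabilities
        upper = projected.max(axis=0)     # pointwise upper probabilities
        print(projected)                  # [[0.9 0.1] [0.2 0.8]]
        print(lower, upper)               # [0.2 0.1] [0.9 0.8]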

    Information-Theoretic Considerations in Batch Reinforcement Learning

    Value-function approximation methods that operate in batch mode have foundational importance to reinforcement learning (RL). Finite sample guarantees for these methods often crucially rely on two types of assumptions: (1) mild distribution shift, and (2) representation conditions that are stronger than realizability. However, the necessity ("why do we need them?") and the naturalness ("when do they hold?") of such assumptions have largely eluded the literature. In this paper, we revisit these assumptions and provide theoretical results towards answering the above questions, and make steps towards a deeper understanding of value-function approximation. Comment: Published in ICML 2019.
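
    For concreteness, a minimal fitted Q-iteration sketch, the canonical batch value-function approximation scheme this kind of analysis concerns; the toy dataset, feature map, and iteration count are illustrative assumptions. Realizability of the optimal Q-function in the linear feature class, and how far the batch distribution is from the ones the greedy policies induce, are exactly the conditions at issue.

        import numpy as np

        rng = np.random.default_rng(0)
        n, n_actions, gamma = 500, 2, 0.9

        # Batch of transitions (s, a, r, s') from some behavior policy; states in R.
        s = rng.uniform(-1, 1, n)
        a = rng.integers(0, n_actions, n)
        s2 = np.clip(s + (2 * a - 1) * 0.1 + rng.normal(0, 0.05, n), -1, 1)
        r = -np.abs(s2)                      # reward: stay near the origin

        def phi(states):
            """Linear-in-parameters features (1, s, s^2)."""
            return np.stack([np.ones_like(states), states, states**2], axis=1)

        W = np.zeros((n_actions, 3))         # one weight vector per action
        for _ in range(50):                  # FQI: repeatedly regress onto targets
            q_next = np.stack([phi(s2) @ W[b] for b in range(n_actions)]).max(axis=0)
            y = r + gamma * q_next
            for b in range(n_actions):       # least-squares regression per action
                mask = a == b
                W[b], *_ = np.linalg.lstsq(phi(s[mask]), y[mask], rcond=None)

        print(W)  # greedy policy: argmax_b phi(s) @ W[b]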

    Proximity-Based Non-uniform Abstractions for Approximate Planning

    In a deterministic world, a planning agent can be certain of the consequences of its planned sequence of actions. Not so, however, in dynamic, stochastic domains where Markov decision processes are commonly used. Unfortunately, these suffer from the curse of dimensionality: if the state space is a Cartesian product of many small sets (dimensions), planning is exponential in the number of those dimensions. Our new technique exploits the intuitive strategy of selectively ignoring various dimensions in different parts of the state space. The resulting non-uniformity has strong implications, since the approximation is no longer Markovian, requiring the use of a modified planner. We also use a spatial and temporal proximity measure, which responds to continued planning as well as movement of the agent through the state space, to dynamically adapt the abstraction as planning progresses. We present qualitative and quantitative results across a range of experimental domains showing that an agent exploiting this novel approximation method successfully finds solutions to the planning problem using much less than the full state space. We assess and analyse the features of domains which our method can exploit.
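
    A rough sketch of the non-uniformity idea under invented details: dimensions are kept or projected away depending on proximity to the agent, so the abstraction differs across the state space, and the set of kept dimensions changes as the agent moves (which is why the abstract process is no longer Markovian).

        def abstract(state, agent_pos, relevance_radius):
            """state: dict of dimension name -> (position, value). Dimensions
            whose associated position is far from the agent are projected away."""
            kept = {d: v for d, (p, v) in state.items()
                    if abs(p - agent_pos) <= relevance_radius}
            return frozenset(kept.items())   # hashable abstract state

        state = {"door_1": (2, "open"), "door_9": (9, "closed"), "key_3": (3, "held")}
        print(abstract(state, agent_pos=2, relevance_radius=2))
        # frozenset({('door_1', 'open'), ('key_3', 'held')})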

    State-Continuity Approximation of Markov Decision Processes via Finite Element Methods for Autonomous System Planning

    Motion planning under uncertainty for an autonomous system can be formulated as a Markov Decision Process with a continuous state space. In this paper, we propose a novel solution to this decision-theoretic planning problem that directly obtains the continuous value function with only the first and second moments of the transition probabilities, alleviating the requirement for an explicit transition model in the literature. We achieve this by expressing the value function as a linear combination of basis functions and approximating the Bellman equation by a partial differential equation, where the value function can be naturally constructed using a finite element method. We have validated our approach via extensive simulations, and the evaluations reveal that, compared to baseline methods, our solution leads to better performance in terms of path smoothness, travel distance, and time costs. Comment: 9 pages, 6 figures, accepted by RA-L.
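
    A one-dimensional sketch of the moment-based idea, with finite differences standing in for the paper's finite element discretization (the dynamics, reward, and discount rate below are invented, and a fixed policy is assumed): only the first moment (drift) and second moment (variance) of the transitions enter the PDE approximation of the Bellman equation.

        import numpy as np

        N, lo, hi = 201, -1.0, 1.0
        x = np.linspace(lo, hi, N)
        h = x[1] - x[0]
        rho = 0.5                    # continuous-time discount rate
        mu = -0.5 * x                # first moment of transitions: drift toward 0
        sigma2 = 0.05                # second central moment: transition variance
        r = -x**2                    # running reward

        # rho*V = r + mu*V' + 0.5*sigma2*V''  (linear in the grid values of V)
        A = rho * np.eye(N)
        for i in range(1, N - 1):
            A[i, i - 1] += mu[i] / (2 * h) - 0.5 * sigma2 / h**2
            A[i, i]     += sigma2 / h**2
            A[i, i + 1] += -mu[i] / (2 * h) - 0.5 * sigma2 / h**2
        # Crude reflecting boundaries: V'(lo) = V'(hi) = 0.
        A[0, :2] = [1.0, -1.0]
        A[-1, -2:] = [-1.0, 1.0]
        b = r.copy()
        b[0] = b[-1] = 0.0

        V = np.linalg.solve(A, b)
        print(V[N // 2])             # approximate value at x = 0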

    A Method for Planning Given Uncertain and Incomplete Information

    This paper describes ongoing research into planning in an uncertain environment. In particular, it introduces U-Plan, a planning system that constructs quantitatively ranked plans given an incomplete description of the state of the world. U-Plan uses a Dempster-Shafer interval to characterise uncertain and incomplete information about the state of the world. The planner takes as input what is known about the world, and constructs a number of possible initial states with representations at different abstraction levels. A plan is constructed for the initial state with the greatest support, and this plan is tested to see if it will work for other possible initial states. All, part, or none of the existing plans may be used in the generation of the plans for the remaining possible worlds. Planning takes place in an abstraction hierarchy where strategic decisions are made before tactical decisions. A super-plan is then constructed, based on merging the set of plans and the appropriately timed acquisition of essential knowledge, which is used to decide between plan alternatives. U-Plan usually produces a super-plan in less time than a classical planner would take to produce a set of plans, one for each possible world. Comment: Appears in Proceedings of the Ninth Conference on Uncertainty in Artificial Intelligence (UAI 1993).
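
    A small sketch of the Dempster-Shafer intervals U-Plan builds on (the mass assignment below is invented): belief sums the mass committed to subsets of a hypothesis, plausibility sums the mass consistent with it, and together they give the interval [Bel, Pl].

        def interval(mass, hypothesis):
            """mass: dict mapping tuples of possible worlds to mass values."""
            bel = sum(m for fs, m in mass.items() if set(fs) <= hypothesis)
            pl = sum(m for fs, m in mass.items() if set(fs) & hypothesis)
            return bel, pl

        # Mass on the set {"w1", "w2"} encodes ignorance between those worlds,
        # which a single probability distribution could not express.
        mass = {("w1",): 0.5, ("w1", "w2"): 0.3, ("w1", "w2", "w3"): 0.2}
        print(interval(mass, {"w1"}))   # (0.5, 1.0) -> greatest support
        print(interval(mass, {"w2"}))   # (0.0, 0.5)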

    Practical Linear Value-approximation Techniques for First-order MDPs

    Recent work on approximate linear programming (ALP) techniques for first-order Markov Decision Processes (FOMDPs) represents the value function linearly w.r.t. a set of first-order basis functions and uses linear programming techniques to determine suitable weights. This approach offers the advantage that it does not require simplification of the first-order value function, and allows one to solve FOMDPs independent of a specific domain instantiation. In this paper, we address several questions to enhance the applicability of this work: (1) Can we extend the first-order ALP framework to approximate policy iteration to address performance deficiencies of previous approaches? (2) Can we automatically generate basis functions and evaluate their impact on value function quality? (3) How can we decompose intractable problems with universally quantified rewards into tractable subproblems? We propose answers to these questions along with a number of novel optimizations and provide a comparative empirical evaluation on logistics problems from the ICAPS 2004 Probabilistic Planning Competition. Comment: Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI 2006).
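
    A toy ALP sketch for a propositional MDP, showing the LP structure that the first-order work lifts to FOMDPs; the random MDP, basis functions, and state-relevance weights are invented.

        import numpy as np
        from scipy.optimize import linprog

        gamma = 0.9
        nS, nA = 4, 2
        rng = np.random.default_rng(1)
        P = rng.dirichlet(np.ones(nS), size=(nA, nS))   # P[a, s, s'] transitions
        R = rng.uniform(0, 1, (nA, nS))                 # R[a, s] rewards

        Phi = np.stack([np.ones(nS), np.arange(nS, dtype=float)], axis=1)  # basis
        c = np.full(nS, 1.0 / nS)                       # state-relevance weights

        # min c^T Phi w  s.t.  Phi w >= R[a] + gamma * P[a] Phi w  for all a
        A_ub = np.vstack([gamma * P[a] @ Phi - Phi for a in range(nA)])
        b_ub = np.concatenate([-R[a] for a in range(nA)])
        res = linprog(c @ Phi, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * 2)

        w = res.x
        print("weights:", w)
        print("approximate values:", Phi @ w)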

    A Graph-Theoretic Analysis of Information Value

    We derive qualitative relationships about the informational relevance of variables in graphical decision models based on a consideration of the topology of the models. Specifically, we identify dominance relations for the expected value of information on chance variables in terms of their position and relationships in influence diagrams. The qualitative relationships can be harnessed to generate nonnumerical procedures for ordering uncertain variables in a decision model by their informational relevance. Comment: Appears in Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI 1996).
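
    A numeric companion to the qualitative analysis (probabilities and utilities invented): the expected value of information on a chance variable X is the gain from observing X before acting, and it is always nonnegative.

        import numpy as np

        p_x = np.array([0.7, 0.3])            # P(X = x) for a binary chance variable
        U = np.array([[10.0, -5.0],           # U[a, x]: utility of action a when X = x
                      [ 2.0,  2.0]])

        eu_no_info = (U @ p_x).max()                # act now: best expected utility
        eu_with_info = (p_x * U.max(axis=0)).sum()  # observe X first, then act

        print("EVOI:", eu_with_info - eu_no_info)
        # (0.7*10 + 0.3*2) - max(0.7*10 - 0.3*5, 2) = 7.6 - 5.5 = 2.1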

    Learning What Information to Give in Partially Observed Domains

    In many robotic applications, an autonomous agent must act within and explore a partially observed environment that is unobserved by its human teammate. We consider such a setting in which the agent can, while acting, transmit declarative information to the human that helps them understand aspects of this unseen environment. In this work, we address the algorithmic question of how the agent should plan out what actions to take and what information to transmit. Naturally, one would expect the human to have preferences, which we model information-theoretically by scoring transmitted information based on the change it induces in weighted entropy of the human's belief state. We formulate this setting as a belief MDP and give a tractable algorithm for solving it approximately. Then, we give an algorithm that allows the agent to learn the human's preferences online, through exploration. We validate our approach experimentally in simulated discrete and continuous partially observed search-and-recover domains. Visit http://tinyurl.com/chitnis-corl-18 for a supplementary video. Comment: CoRL 2018 final version.
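
    A sketch of the scoring idea described above, with invented weights and beliefs: transmitted information is scored by the drop it induces in the weighted entropy of the human's belief, so resolving uncertainty about high-weight states counts for more.

        import numpy as np

        def weighted_entropy(belief, weights):
            b = belief[belief > 0]
            w = weights[belief > 0]
            return -np.sum(w * b * np.log(b))

        weights = np.array([5.0, 1.0, 1.0, 1.0])   # human cares most about s0
        prior = np.array([0.25, 0.25, 0.25, 0.25])

        # A message ruling out s0 (Bayesian conditioning of the human's belief).
        posterior = prior * np.array([0.0, 1.0, 1.0, 1.0])
        posterior /= posterior.sum()

        score = weighted_entropy(prior, weights) - weighted_entropy(posterior, weights)
        print("information score:", score)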