Sound Abstraction of Probabilistic Actions in The Constraint Mass Assignment Framework
This paper provides a formal and practical framework for sound abstraction of
probabilistic actions. We start by precisely defining the concept of sound
abstraction within the context of finite-horizon planning (where each plan is a
finite sequence of actions). Next we show that such abstraction cannot be
performed within the traditional probabilistic action representation, which
models a world with a single probability distribution over the state space. We
then present the constraint mass assignment representation, which models the
world with a set of probability distributions and is a generalization of mass
assignment representations. Within this framework, we present sound abstraction
procedures for three types of action abstraction. We end the paper with
discussions and related work on sound and approximate abstraction. We give
pointers to papers in which we discuss other sound abstraction-related issues,
including applications, estimating loss due to abstraction, and automatically
generating abstraction hierarchies.
Comment: Appears in Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI 1996).
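To make the set-of-distributions idea concrete, here is a minimal sketch of why interval-style bounds survive state aggregation; the function and the bound arithmetic are illustrative assumptions, not the paper's constraint mass assignment procedure.

```python
# Hypothetical sketch: world uncertainty as lower/upper probability bounds
# (a simple set of distributions). Aggregating states by summing bounds is
# sound: every distribution consistent with the concrete bounds remains
# consistent with the abstract ones. A single point distribution offers no
# analogous guarantee once states with different probabilities are merged.

def merge_states(bounds, group):
    """bounds: dict state -> (lower, upper); group: states to collapse."""
    lo = sum(bounds[s][0] for s in group)            # under-approximation
    hi = min(1.0, sum(bounds[s][1] for s in group))  # over-approximation
    abstract = {s: b for s, b in bounds.items() if s not in group}
    abstract[frozenset(group)] = (lo, hi)
    return abstract

concrete = {"s1": (0.25, 0.5), "s2": (0.125, 0.25), "s3": (0.375, 0.625)}
print(merge_states(concrete, {"s1", "s2"}))
# the merged abstract state carries the bounds (0.375, 0.75)
```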
Structured Reachability Analysis for Markov Decision Processes
Recent research in decision theoretic planning has focussed on making the
solution of Markov decision processes (MDPs) more feasible. We develop a family
of algorithms for structured reachability analysis of MDPs that are suitable
when an initial state (or set of states) is known. Using compact, structured
representations of MDPs (e.g., Bayesian networks), our methods, which vary in
the tradeoff between complexity and accuracy, produce structured descriptions
of (estimated) reachable states that can be used to eliminate variables or
variable values from the problem description, reducing the size of the MDP and
making it easier to solve. One contribution of our work is the extension of
ideas from GRAPHPLAN to deal with the distributed nature of action
representations typically embodied within Bayes nets and the problem of
correlated action effects. We also demonstrate that our algorithm can be made
more complete by using k-ary constraints instead of binary constraints. Another
contribution is the illustration of how the compact representation of
reachability constraints can be exploited by several existing (exact and
approximate) abstraction algorithms for MDPs.
Comment: Appears in Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI 1998).
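As a rough illustration of the GRAPHPLAN-style idea, the toy sketch below computes per-variable reachable values for a factored action representation; the variable and action names are invented, and, unlike the paper's algorithms, it ignores correlated effects entirely (the binary- vs. k-ary-constraint refinement is also omitted).

```python
# Toy forward reachability over a factored state space: track, per variable,
# which values any action sequence can produce. Values never found reachable
# can be pruned from the MDP description before solving it.

def reachable_values(init, actions):
    """init: dict var -> set of initially possible values.
    actions: list of (preconditions, effects), each a dict var -> value."""
    vals = {v: set(vs) for v, vs in init.items()}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for pre, eff in actions:
            if all(val in vals[v] for v, val in pre.items()):
                for v, val in eff.items():
                    if val not in vals[v]:
                        vals[v].add(val)
                        changed = True
    return vals

init = {"loc": {"home"}, "has_key": {False}}
actions = [({"loc": "home"}, {"loc": "office"}),
           ({"loc": "office"}, {"has_key": True})]
print(reachable_values(init, actions))
# e.g. {'loc': {'home', 'office'}, 'has_key': {False, True}}
```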
Theoretical Foundations for Abstraction-Based Probabilistic Planning
Modeling worlds and actions under uncertainty is one of the central problems
in the framework of decision-theoretic planning. The representation must be
general enough to capture real-world problems but at the same time it must
provide a basis upon which theoretical results can be derived. The central
notion in the framework we propose here is that of the affine-operator, which
serves as a tool for constructing (convex) sets of probability distributions,
and which can be considered as a generalization of belief functions and
interval mass assignments. Uncertainty in the state of the worlds is modeled
with sets of probability distributions, represented by affine-trees while
actions are defined as tree-manipulators. A small set of key properties of the
affine-operator is presented, forming the basis for most existing
operator-based definitions of probabilistic action projection and action
abstraction. We derive and prove correct three projection rules, which vividly
illustrate the precision-complexity tradeoff in plan projection. Finally, we
show how the three types of action abstraction identified by Haddawy and Doan
are manifested in the present framework.
Comment: Appears in Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI 1996).
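The abstract does not spell the operator out, but an operator that "constructs (convex) sets of probability distributions" plausibly includes convex combination as its core case; the display below is a hedged reconstruction for orientation, not the paper's definition.

\[
\mathcal{A}_{\alpha}(P, Q) \;=\; \{\, \alpha\,p + (1-\alpha)\,q \;:\; p \in P,\; q \in Q \,\}, \qquad \alpha \in [0,1].
\]

Applied to singleton sets this reduces to ordinary probabilistic mixing; applied to interval-shaped sets it reproduces interval mass assignments, consistent with the abstract's claim that the operator generalizes belief functions and interval mass assignments.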
Information-Theoretic Considerations in Batch Reinforcement Learning
Value-function approximation methods that operate in batch mode have
foundational importance to reinforcement learning (RL). Finite sample
guarantees for these methods often crucially rely on two types of assumptions:
(1) mild distribution shift, and (2) representation conditions that are
stronger than realizability. However, the necessity ("why do we need them?")
and the naturalness ("when do they hold?") of such assumptions have largely
eluded the literature. In this paper, we revisit these assumptions and provide
theoretical results towards answering the above questions, and make steps
towards a deeper understanding of value-function approximation.
Comment: Published in ICML 2019.
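For readers outside this subarea, the two assumption families have standard formalizations in the batch-RL literature, sketched here as background rather than as the paper's exact statements: distribution shift is commonly bounded by a concentrability coefficient, and the representation condition strengthens realizability to Bellman completeness.

\[
\sup_{\pi}\,\sup_{s,a}\ \frac{d^{\pi}(s,a)}{\mu(s,a)} \;\le\; C \quad \text{(mild distribution shift / concentrability)},
\]
\[
Q^{*} \in \mathcal{F} \ \ \text{(realizability)} \qquad \text{vs.} \qquad \mathcal{T}\mathcal{F} \subseteq \mathcal{F} \ \ \text{(completeness, strictly stronger)},
\]

where \(\mu\) is the data distribution, \(d^{\pi}\) the state-action distribution induced by policy \(\pi\), \(\mathcal{F}\) the function class, and \(\mathcal{T}\) the Bellman optimality operator.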
Proximity-Based Non-uniform Abstractions for Approximate Planning
In a deterministic world, a planning agent can be certain of the consequences
of its planned sequence of actions. Not so, however, in dynamic, stochastic
domains where Markov decision processes are commonly used. Unfortunately these
suffer from the curse of dimensionality: if the state space is a Cartesian
product of many small sets (dimensions), planning is exponential in the number
of those dimensions.
Our new technique exploits the intuitive strategy of selectively ignoring
various dimensions in different parts of the state space. The resulting
non-uniformity has strong implications, since the approximation is no longer
Markovian, requiring the use of a modified planner. We also use a spatial and
temporal proximity measure, which responds to continued planning as well as
movement of the agent through the state space, to dynamically adapt the
abstraction as planning progresses.
We present qualitative and quantitative results across a range of
experimental domains showing that an agent exploiting this novel approximation
method successfully finds solutions to the planning problem using much less
than the full state space. We assess and analyse the features of domains which
our method can exploit.
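A minimal sketch of the "selectively ignore dimensions" idea follows; the dimension names, proximity scores, and wildcard masking are all illustrative assumptions, and the paper's proximity measure and modified planner are not reproduced.

```python
# Non-uniform abstraction sketch: mask out state dimensions that score low
# on proximity to the agent's current region, so distant detail is ignored
# and many concrete states collapse into one abstract state there.

def abstract_state(state, proximity, threshold):
    """state: dict dimension -> value; proximity: dict dimension -> score."""
    return tuple(state[d] if proximity[d] >= threshold else "*"
                 for d in sorted(state))

state = {"x": 3, "y": 7, "door_open": True, "far_switch": False}
prox = {"x": 1.0, "y": 1.0, "door_open": 0.8, "far_switch": 0.1}
print(abstract_state(state, prox, 0.5))
# (True, '*', 3, 7): far_switch is masked, the rest keep full detail
```

Because the masking varies across regions of the state space, the induced abstract process is no longer Markovian, which is why the approach pairs the abstraction with a modified planner.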
State-Continuity Approximation of Markov Decision Processes via Finite Element Methods for Autonomous System Planning
Motion planning under uncertainty for an autonomous system can be formulated
as a Markov Decision Process with a continuous state space. In this paper, we
propose a novel solution to this decision-theoretic planning problem that
directly obtains the continuous value function with only the first and second
moments of the transition probabilities, alleviating the requirement for an
explicit transition model in the literature. We achieve this by expressing the
value function as a linear combination of basis functions and approximating the
Bellman equation by a partial differential equation, where the value function
can be naturally constructed using a finite element method. We have validated
our approach via extensive simulations, and the evaluations reveal that, compared to baseline methods, our solution leads to better performance in terms of path smoothness, travel distance, and time costs.
Comment: 9 pages, 6 figures, accepted by RA-L.
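One plausible reading of the construction (a sketch under assumptions, not the paper's derivation): expand the value function in a basis and Taylor-expand the Bellman backup to second order, so that only the first two transition moments enter.

\[
V(s) = \sum_i w_i\,\phi_i(s), \qquad
\mathbb{E}\big[V(s') \mid s, a\big] \;\approx\; V(s) + \mu_a(s)^{\top}\nabla V(s) + \tfrac{1}{2}\operatorname{tr}\!\big(\Sigma_a(s)\,\nabla^{2} V(s)\big),
\]

where \(\mu_a\) and \(\Sigma_a\) are the first and second (central) moments of the transition. Substituting this expansion into the Bellman equation turns it into a second-order PDE in \(V\), which a finite element method then solves for the weights \(w_i\).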
A Method for Planning Given Uncertain and Incomplete Information
This paper describes ongoing research into planning in an uncertain
environment. In particular, it introduces U-Plan, a planning system that
constructs quantitatively ranked plans given an incomplete description of the
state of the world. U-Plan uses a Dempster-Shafer interval to characterise
uncertain and incomplete information about the state of the world. The planner
takes as input what is known about the world, and constructs a number of
possible initial states with representations at different abstraction levels. A
plan is constructed for the initial state with the greatest support, and this
plan is tested to see if it will work for other possible initial states. All,
part, or none of the existing plans may be used in the generation of the plans
for the remaining possible worlds. Planning takes place in an abstraction
hierarchy where strategic decisions are made before tactical decisions. A
super-plan is then constructed, based on merging the set of plans and the
appropriately timed acquisition of essential knowledge, which is used to decide
between plan alternatives. U-Plan usually produces a super-plan in less time
than a classical planner would take to produce a set of plans, one for each
possible world.
Comment: Appears in Proceedings of the Ninth Conference on Uncertainty in Artificial Intelligence (UAI 1993).
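The Dempster-Shafer interval the planner relies on is standard; the toy computation below shows the belief/plausibility bounds for a query over possible initial worlds (the worlds and masses are invented for illustration, and none of U-Plan's plan-merging machinery appears here).

```python
# Dempster-Shafer interval sketch: a mass assignment over *sets* of worlds
# induces, for any query set Q, the interval [belief(Q), plausibility(Q)].

def belief(masses, query):
    """Total mass of focal sets wholly contained in the query."""
    return sum(m for focal, m in masses.items() if focal <= query)

def plausibility(masses, query):
    """Total mass of focal sets that overlap the query."""
    return sum(m for focal, m in masses.items() if focal & query)

masses = {frozenset({"w1"}): 0.5,               # evidence pinning down w1
          frozenset({"w1", "w2"}): 0.25,        # coarser evidence
          frozenset({"w1", "w2", "w3"}): 0.25}  # total ignorance mass
q = frozenset({"w1", "w2"})
print(belief(masses, q), plausibility(masses, q))  # 0.75 1.0
```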
Practical Linear Value-approximation Techniques for First-order MDPs
Recent work on approximate linear programming (ALP) techniques for
first-order Markov Decision Processes (FOMDPs) represents the value function
linearly w.r.t. a set of first-order basis functions and uses linear
programming techniques to determine suitable weights. This approach offers the
advantage that it does not require simplification of the first-order value
function, and allows one to solve FOMDPs independent of a specific domain
instantiation. In this paper, we address several questions to enhance the
applicability of this work: (1) Can we extend the first-order ALP framework to
approximate policy iteration to address performance deficiencies of previous
approaches? (2) Can we automatically generate basis functions and evaluate
their impact on value function quality? (3) How can we decompose intractable
problems with universally quantified rewards into tractable subproblems? We
propose answers to these questions along with a number of novel optimizations
and provide a comparative empirical evaluation on logistics problems from the
ICAPS 2004 Probabilistic Planning Competition.
Comment: Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI 2006).
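For orientation, the following is a ground (propositional) approximate linear program on a randomly generated toy MDP; the paper's contribution is the lifted, first-order version of this LP plus policy iteration, basis generation, and reward decomposition, none of which is shown here.

```python
# Ground-level ALP sketch: V(s) = phi(s) @ w, minimize sum_s V(s) subject to
# the Bellman inequalities V(s) >= R[s,a] + gamma * P[s,a] @ V for all s, a.
# Any feasible w gives a pointwise upper bound on the optimal value function.
import numpy as np
from scipy.optimize import linprog

gamma, n_states, n_actions = 0.9, 3, 2
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a]
R = rng.random((n_states, n_actions))
phi = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])  # one row per state

c = phi.sum(axis=0)  # objective: sum of approximate values over states
A_ub, b_ub = [], []
for s in range(n_states):
    for a in range(n_actions):
        # phi[s] @ w >= R[s,a] + gamma * (P[s,a] @ phi) @ w, in <= form:
        A_ub.append(-(phi[s] - gamma * P[s, a] @ phi))
        b_ub.append(-R[s, a])
res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub,
              bounds=[(None, None)] * phi.shape[1])
print("weights:", res.x, "value upper bounds:", phi @ res.x)
```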
A Graph-Theoretic Analysis of Information Value
We derive qualitative relationships about the informational relevance of
variables in graphical decision models based on a consideration of the topology
of the models. Specifically, we identify dominance relations for the expected
value of information on chance variables in terms of their position and
relationships in influence diagrams. The qualitative relationships can be
harnessed to generate nonnumerical procedures for ordering uncertain variables
in a decision model by their informational relevance.
Comment: Appears in Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI 1996).
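The quantity whose ordering the paper characterizes qualitatively is the expected value of information; the numeric toy below computes it for a two-action, one-observation decision (the weather example is invented), whereas the paper orders variables without such numbers, from the influence diagram's topology alone.

```python
# Expected value of (perfect) information on a chance variable X:
# EVPI = E_x[ max_a U(a, x) ] - max_a E_x[ U(a, x) ].

def evpi(p_x, utility):
    """p_x: dict x -> probability; utility: dict (action, x) -> utility."""
    actions = {a for a, _ in utility}
    with_info = sum(p * max(utility[a, x] for a in actions)
                    for x, p in p_x.items())
    without_info = max(sum(p * utility[a, x] for x, p in p_x.items())
                       for a in actions)
    return with_info - without_info

p_x = {"rain": 0.25, "sun": 0.75}
u = {("umbrella", "rain"): 1.0, ("umbrella", "sun"): 0.25,
     ("none", "rain"): 0.0, ("none", "sun"): 1.0}
print(evpi(p_x, u))  # 0.25: observing the weather is worth 0.25 utiles
```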
Learning What Information to Give in Partially Observed Domains
In many robotic applications, an autonomous agent must act within and explore
a partially observed environment that is unobserved by its human teammate. We
consider such a setting in which the agent can, while acting, transmit
declarative information to the human that helps them understand aspects of this
unseen environment. In this work, we address the algorithmic question of how
the agent should plan out what actions to take and what information to
transmit. Naturally, one would expect the human to have preferences, which we
model information-theoretically by scoring transmitted information based on the
change it induces in weighted entropy of the human's belief state. We formulate
this setting as a belief MDP and give a tractable algorithm for solving it
approximately. Then, we give an algorithm that allows the agent to learn the
human's preferences online, through exploration. We validate our approach
experimentally in simulated discrete and continuous partially observed
search-and-recover domains. Visit http://tinyurl.com/chitnis-corl-18 for a
supplementary video.
Comment: CoRL 2018 final version.
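A minimal sketch of the scoring rule described in the abstract, assuming a weighted-entropy form H_w(b) = -sum_s w(s) b(s) log b(s); the state names, weights, and beliefs are illustrative, and the belief-MDP planning and online preference learning are not shown.

```python
# Score a candidate message by the reduction in *weighted* entropy of the
# human's belief; the weights encode which parts of the unseen environment
# the human cares about most.
import math

def weighted_entropy(belief, weights):
    return -sum(weights[s] * p * math.log(p)
                for s, p in belief.items() if p > 0)

def info_score(prior, posterior, weights):
    """Higher score = the message removes more of the uncertainty that
    matters to the human."""
    return (weighted_entropy(prior, weights)
            - weighted_entropy(posterior, weights))

prior = {"door_open": 0.5, "door_closed": 0.5}
posterior = {"door_open": 0.9, "door_closed": 0.1}  # belief after a message
weights = {"door_open": 2.0, "door_closed": 1.0}
print(info_score(prior, posterior, weights))
```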