Cognitive apprenticeship: teaching the craft of reading, writing, and mathematics
Includes bibliographical references (p. 25-27). This research was supported by the National Institute of Education under Contract No. US-NIE-C-400-81-0030 and the Office of Naval Research under Contract No. N00014-85-C-002
Learning Generalized Reactive Policies using Deep Neural Networks
We present a new approach to learning for planning, where knowledge acquired
while solving a given set of planning problems is used to plan faster in
related, but new problem instances. We show that a deep neural network can be
used to learn and represent a generalized reactive policy (GRP) that
maps a problem instance and a state to an action, and that the learned GRPs
efficiently solve large classes of challenging problem instances. In contrast
to prior efforts in this direction, our approach significantly reduces the
dependence of learning on handcrafted domain knowledge or feature selection.
Instead, the GRP is trained from scratch using a set of successful execution
traces. We show that our approach can also be used to automatically learn a
heuristic function that can be used in directed search algorithms. We evaluate
our approach using an extensive suite of experiments on two challenging
planning problem domains and show that our approach facilitates learning
complex decision making policies and powerful heuristic functions with minimal
human input. Videos of our results are available at goo.gl/Hpy4e3
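The idea of a reactive policy that maps a problem instance and a state to an action, fitted from successful execution traces, can be illustrated with a deliberately tiny stand-in. The sketch below is not the deep-network method the abstract describes: it swaps the network for a lookup table that stores the most frequent action seen for each (instance, state) pair, so it cannot generalize to unseen instances. The function name `fit_reactive_policy`, the trace format, and the toy grid data are all hypothetical.

```python
from collections import Counter, defaultdict


def fit_reactive_policy(traces):
    """Fit a tabular policy from execution traces (hypothetical format).

    Each trace is a list of (instance, state, action) triples taken from a
    successful plan execution. This is plain majority-vote imitation, used
    here only as a stand-in for the neural network in the abstract.
    """
    counts = defaultdict(Counter)
    for trace in traces:
        for instance, state, action in trace:
            counts[(instance, state)][action] += 1

    # Keep the most frequently demonstrated action for every seen pair.
    table = {key: c.most_common(1)[0][0] for key, c in counts.items()}

    def policy(instance, state):
        # Returns None for pairs never seen in the traces: a table cannot
        # generalize, which is exactly what a learned GRP is meant to do.
        return table.get((instance, state))

    return policy


# Toy traces on a hypothetical grid instance.
traces = [
    [("grid-A", (0, 0), "right"), ("grid-A", (0, 1), "down")],
    [("grid-A", (0, 0), "right"), ("grid-A", (0, 1), "right")],
    [("grid-A", (0, 0), "down"), ("grid-A", (0, 1), "down")],
]
policy = fit_reactive_policy(traces)
first_action = policy("grid-A", (0, 0))
```

Replacing the table with a function approximator over instance and state features is what would allow the same interface to handle new problem instances.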
Adaptive Information Gathering via Imitation Learning
In the adaptive information gathering problem, a policy is required to select
an informative sensing location using the history of measurements acquired thus
far. While there is an extensive amount of prior work investigating effective
practical approximations using variants of Shannon's entropy, the efficacy of
such policies heavily depends on the geometric distribution of objects in the
world. On the other hand, the principled approach of employing online POMDP
solvers is rendered impractical by the need to explicitly sample online from a
posterior distribution of world maps.
We present a novel data-driven imitation learning framework to efficiently
train information gathering policies. The policy imitates a clairvoyant oracle
- an oracle that at train time has full knowledge about the world map and can
compute maximally informative sensing locations. We analyze the learnt policy
by showing that offline imitation of a clairvoyant oracle is implicitly
equivalent to online oracle execution in conjunction with posterior sampling.
This observation allows us to obtain powerful near-optimality guarantees for
information gathering problems possessing an adaptive sub-modularity property.
As demonstrated on a spectrum of 2D and 3D exploration problems, the trained
policies enjoy the best of both worlds - they adapt to different world map
distributions while being computationally inexpensive to evaluate.
Comment: Robotics Science and Systems, 201
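The entropy-based baselines mentioned in the abstract above can be sketched as a greedy rule: pick the sensing location with the largest expected reduction in Shannon entropy of the belief over world maps. The sketch below is a generic illustration, not the paper's imitation learning framework. It assumes a discrete belief and a deterministic sensor model, and the names `expected_information_gain`, `pick_location`, and the toy `observe` function are hypothetical.

```python
import math


def entropy(belief):
    """Shannon entropy in bits of a dict mapping world -> probability."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)


def expected_information_gain(belief, observe, location):
    """Expected entropy reduction from sensing at `location`.

    Assumes `observe(world, location)` is deterministic, so worlds can be
    grouped by the observation they would produce.
    """
    groups = {}
    for world, p in belief.items():
        if p > 0:
            groups.setdefault(observe(world, location), {})[world] = p

    expected_posterior_entropy = 0.0
    for members in groups.values():
        mass = sum(members.values())  # probability of this observation
        posterior = {w: p / mass for w, p in members.items()}
        expected_posterior_entropy += mass * entropy(posterior)
    return entropy(belief) - expected_posterior_entropy


def pick_location(belief, observe, locations):
    """Greedy choice: the location with maximal expected information gain."""
    return max(
        locations,
        key=lambda loc: expected_information_gain(belief, observe, loc),
    )


# Toy problem: an object hides in one of four cells, uniform prior.
belief = {cell: 0.25 for cell in range(4)}


def observe(world, location):
    # Hypothetical sensor footprints: each location reports whether the
    # object lies inside the cells it covers.
    footprints = {"left": {0, 1}, "corner": {3}}
    return world in footprints[location]


gain_left = expected_information_gain(belief, observe, "left")
best = pick_location(belief, observe, ["left", "corner"])
```

In this toy case the "left" sensor splits the belief in half, while "corner" rules out only one cell most of the time, so the greedy rule prefers "left". How well such a rule performs depends on how objects are distributed, which is the limitation the abstract points out.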
Qualitative Analysis of POMDPs with Temporal Logic Specifications for Robotics Applications
We consider partially observable Markov decision processes (POMDPs), a
standard framework for modeling real-world uncertainty in robotics
applications, with temporal logic specifications. All specifications in
linear-time temporal logic (LTL) can be expressed as parity
objectives. We study the qualitative analysis problem for POMDPs with parity
objectives that asks whether there is a controller (policy) to ensure that the
objective holds with probability 1 (almost-surely). While the qualitative
analysis of POMDPs with parity objectives is undecidable, recent results show
that when restricted to finite-memory policies the problem is EXPTIME-complete.
Although the problem is intractable in theory, we present a practical approach
to solving it. We design several heuristics to deal with the exponential
complexity and apply our implementation to a number of well-known POMDP
examples for robotics applications. Our results provide the first practical
approach to the qualitative analysis of robot motion planning with LTL
properties in the presence of uncertainty.
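To make "the objective holds with probability 1" concrete, here is a minimal sketch of qualitative analysis in a much simpler setting than the one studied above: a fully observable MDP with a reachability objective, not a POMDP with a parity objective. It is the standard graph-based fixpoint for almost-sure reachability, not the authors' algorithm or heuristics. The function name, the `successors` encoding, and the toy model are hypothetical.

```python
def almost_sure_reach(states, actions, successors, targets):
    """States from which some policy reaches `targets` with probability 1.

    `successors[(s, a)]` is the set of states that action `a` reaches from
    `s` with positive probability. Exact probabilities are irrelevant for
    this qualitative question. Missing keys mean the action is unavailable.
    """
    safe = set(states)
    while True:
        # Inner fixpoint: states that can reach the targets using only
        # actions whose every possible outcome stays inside `safe`.
        reach = set(targets) & safe
        changed = True
        while changed:
            changed = False
            for s in safe - reach:
                for a in actions:
                    succ = successors.get((s, a), set())
                    if succ and succ <= safe and succ & reach:
                        reach.add(s)
                        changed = True
                        break
        if reach == safe:
            return safe
        safe = reach  # Shrink the candidate set and repeat.


# Toy model: "try" from s0 either stays put or reaches the goal, so
# retrying succeeds almost surely; "risky" may fall into an absorbing trap.
states = ["s0", "s1", "goal", "trap"]
actions = ["try", "risky"]
successors = {
    ("s0", "try"): {"s0", "goal"},
    ("s0", "risky"): {"goal", "trap"},
    ("s1", "risky"): {"goal", "trap"},
    ("trap", "try"): {"trap"},
    ("goal", "try"): {"goal"},
}
winning = almost_sure_reach(states, actions, successors, {"goal"})
```

Here s1 is excluded because its only action can end in the trap, while s0 wins by always choosing "try". Under partial observability the controller cannot condition on the exact state, which is what makes the POMDP version of the problem so much harder.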