Adaptive Information Gathering via Imitation Learning
In the adaptive information gathering problem, a policy is required to select
an informative sensing location using the history of measurements acquired thus
far. While there is an extensive amount of prior work investigating effective
practical approximations using variants of Shannon's entropy, the efficacy of
such policies heavily depends on the geometric distribution of objects in the
world. On the other hand, the principled approach of employing online POMDP
solvers is rendered impractical by the need to explicitly sample online from a
posterior distribution of world maps.
We present a novel data-driven imitation learning framework to efficiently
train information gathering policies. The policy imitates a clairvoyant oracle
- an oracle that at train time has full knowledge about the world map and can
compute maximally informative sensing locations. We analyze the learnt policy
by showing that offline imitation of a clairvoyant oracle is implicitly
equivalent to online oracle execution in conjunction with posterior sampling.
This observation allows us to obtain powerful near-optimality guarantees for
information gathering problems possessing an adaptive sub-modularity property.
As demonstrated on a spectrum of 2D and 3D exploration problems, the trained
policies enjoy the best of both worlds - they adapt to different world map
distributions while being computationally inexpensive to evaluate.
Comment: Robotics Science and Systems, 201
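The idea of collecting demonstrations from a clairvoyant oracle can be sketched on a toy problem. The sketch below is only an illustration under stated assumptions: the world is a set of occupied grid cells, the objective is plain coverage (a simple submodular stand-in, not the paper's objective), and the names `visible_cells`, `clairvoyant_oracle` and `oracle_rollout` are hypothetical. The oracle sees the full map, while each recorded demonstration pairs only the observation history with the oracle's choice, which is what a learner would imitate.

```python
from itertools import product


def visible_cells(location, world, radius=1):
    """Occupied cells of `world` within Chebyshev distance `radius` of `location`."""
    x, y = location
    return {(cx, cy) for (cx, cy) in world
            if abs(cx - x) <= radius and abs(cy - y) <= radius}


def clairvoyant_oracle(world, candidates, covered, radius=1):
    """Pick the sensing location with the largest marginal coverage gain.

    The oracle is clairvoyant: it evaluates gains against the full `world`.
    """
    return max(candidates,
               key=lambda loc: len(visible_cells(loc, world, radius) - covered))


def oracle_rollout(world, candidates, budget, radius=1):
    """Run the oracle for `budget` steps and record (history, action) pairs."""
    covered, demonstrations = set(), []
    for _ in range(budget):
        choice = clairvoyant_oracle(world, candidates, covered, radius)
        # The learner only ever sees what has been observed so far.
        demonstrations.append((frozenset(covered), choice))
        covered |= visible_cells(choice, world, radius)
    return demonstrations, covered


# Toy 5x5 world with two clusters of occupied cells (made-up data).
world = {(0, 0), (0, 1), (1, 1), (4, 4), (4, 3)}
candidates = list(product(range(5), repeat=2))
demos, covered = oracle_rollout(world, candidates, budget=2)
```

In a full pipeline the `(history, action)` pairs would be collected over many sampled worlds and used as supervised training data for the policy; that training step is omitted here.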
Learning to Resolve Conflicts for Multi-Agent Path Finding with Conflict-Based Search
Conflict-Based Search (CBS) is a state-of-the-art algorithm for multi-agent
path finding. At the high level, CBS repeatedly detects conflicts and resolves
one of them by splitting the current problem into two subproblems. Previous
work chooses the conflict to resolve by categorizing the conflict into three
classes and always picking a conflict from the highest-priority class. In this
work, we propose an oracle for conflict selection that results in smaller
search tree sizes than the one used in previous work. However, the computation
of the oracle is slow. Thus, we propose a machine-learning framework for
conflict selection that observes the decisions made by the oracle and learns a
conflict-selection strategy represented by a linear ranking function that
imitates the oracle's decisions accurately and quickly. Experiments on
benchmark maps indicate that our method significantly improves the success
rates, the search tree sizes and runtimes over the current state-of-the-art CBS
solver.
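A linear ranking function that imitates an oracle's conflict choices could be learned roughly as follows. This is a minimal sketch, not the authors' training procedure: it uses a pairwise perceptron update as a stand-in learner, and the two-dimensional feature vectors and the function names `score`, `train_ranker` and `select_conflict` are hypothetical.

```python
def score(weights, features):
    """Linear ranking score of one conflict's feature vector."""
    return sum(w * f for w, f in zip(weights, features))


def train_ranker(examples, n_features, epochs=20, lr=0.1):
    """Fit weights so the oracle's chosen conflict outscores the others.

    `examples` is a list of (candidate_feature_vectors, oracle_index) pairs.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for candidates, best in examples:
            for i, feats in enumerate(candidates):
                if i == best:
                    continue
                # Pairwise perceptron step: update only on a ranking mistake.
                if score(weights, candidates[best]) <= score(weights, feats):
                    weights = [w + lr * (b - f)
                               for w, b, f in zip(weights, candidates[best], feats)]
    return weights


def select_conflict(weights, candidates):
    """Return the index of the highest-scoring conflict."""
    return max(range(len(candidates)), key=lambda i: score(weights, candidates[i]))


# Made-up oracle decisions: each entry lists the conflicts at one search node.
examples = [
    ([[1.0, 0.2], [0.0, 0.9], [0.5, 0.5]], 0),
    ([[0.2, 0.8], [0.9, 0.1]], 1),
    ([[0.3, 0.3], [0.1, 1.0], [0.8, 0.0]], 2),
]
weights = train_ranker(examples, n_features=2)
predictions = [select_conflict(weights, cands) for cands, _ in examples]
```

At search time only `select_conflict` is called, which costs one dot product per conflict; this is the sense in which a learned ranker can be much cheaper than evaluating a slow oracle.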
Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners
In real-world applications of education, an effective teacher adaptively
chooses the next example to teach based on the learner's current state.
However, most existing work in algorithmic machine teaching focuses on the
batch setting, where adaptivity plays no role. In this paper, we study the case
of teaching consistent, version space learners in an interactive setting. At
any time step, the teacher provides an example, the learner performs an update,
and the teacher observes the learner's new state. We highlight that adaptivity
does not speed up the teaching process when considering existing models of
version space learners, such as "worst-case" (the learner picks the next
hypothesis randomly from the version space) and "preference-based" (the learner
picks a hypothesis according to some global preference). Inspired by human
teaching, we propose a new model where the learner picks hypotheses according
to some local preference defined by the current hypothesis. We show that our
model exhibits several desirable properties, e.g., adaptivity plays a key role,
and the learner's transitions over hypotheses are smooth/interpretable. We
develop efficient teaching algorithms and demonstrate our results via
simulation and user studies.
Comment: NeurIPS 2018 (extended version)
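The interaction between an adaptive teacher and a version space learner with a local preference can be sketched as below. Everything concrete here is an assumption made for illustration: hypotheses are integer thresholds `h` that label `x` positive when `x >= h`, the local preference is "move to the nearest consistent threshold", and the teacher is a simple one-step greedy rule rather than the paper's algorithm.

```python
def consistent(h, example):
    """A threshold hypothesis h labels x positive iff x >= h."""
    x, label = example
    return (x >= h) == label


def learner_update(current, version_space, example):
    """Shrink the version space; move only if the current hypothesis is ruled out."""
    version_space = [h for h in version_space if consistent(h, example)]
    if current in version_space:
        return current, version_space
    # Local preference (assumed): jump to the nearest consistent hypothesis.
    nxt = min(version_space, key=lambda h: (abs(h - current), h))
    return nxt, version_space


def teach(target, start, hypotheses, instances, max_steps=10):
    """Adaptive teacher: observe the learner's state, then pick the next example."""
    current, version_space, lesson = start, list(hypotheses), []
    while current != target and len(lesson) < max_steps:
        labelled = [(x, x >= target) for x in instances]

        def outcome(example):
            # Simulate the learner's reaction to a candidate example.
            nxt, vs = learner_update(current, version_space, example)
            return (abs(nxt - target), len(vs))

        example = min(labelled, key=outcome)
        current, version_space = learner_update(current, version_space, example)
        lesson.append(example)
    return lesson, current


lesson_low, final_low = teach(target=5, start=0, hypotheses=range(9), instances=range(8))
lesson_high, final_high = teach(target=5, start=8, hypotheses=range(9), instances=range(8))
```

The two runs differ only in the learner's starting hypothesis, and the teacher picks a different example in each case, which is the kind of state dependence that a batch (non-adaptive) teacher cannot express.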
Learning from Experience for Rapid Generation of Local Car Maneuvers
Being able to rapidly respond to the changing scenes and traffic situations
by generating feasible local paths is of pivotal importance for car autonomy.
We propose to train a deep neural network (DNN) to plan feasible and
nearly-optimal paths for kinematically constrained vehicles in small constant
time. Our DNN model is trained using a novel weakly supervised approach and a
gradient-based policy search. On real and simulated scenes and a large set of
local planning problems, we demonstrate that our approach outperforms the
existing planners with respect to the number of successfully completed tasks.
While the path generation time is about 40 ms, the generated paths are smooth
and comparable to those obtained from conventional path planners.