52 research outputs found
Active End-Effector Pose Selection for Tactile Object Recognition through Monte Carlo Tree Search
This paper considers the problem of active object recognition using touch
only. The focus is on adaptively selecting a sequence of wrist poses that
achieves accurate recognition by enclosure grasps. It seeks to minimize the
number of touches and maximize recognition confidence. The actions are
formulated as wrist poses relative to each other, making the algorithm
independent of absolute workspace coordinates. The optimal sequence is
approximated by Monte Carlo tree search. We demonstrate results in a physics
engine and on a real robot. In the physics engine, most object instances were
recognized in at most 16 grasps. On a real robot, our method recognized objects
in 2 to 9 grasps and outperformed a greedy baseline.
Comment: Accepted to International Conference on Intelligent Robots and Systems (IROS) 201
The Lovász Hinge: A Novel Convex Surrogate for Submodular Losses
Learning with non-modular losses is an important problem when sets of
predictions are made simultaneously. The main tools for constructing convex
surrogate loss functions for set prediction are margin rescaling and slack
rescaling. In this work, we show that these strategies lead to tight convex
surrogates iff the underlying loss function is increasing in the number of
incorrect predictions. However, gradient or cutting-plane computation for these
functions is NP-hard for non-supermodular loss functions. We propose instead a
novel surrogate loss function for submodular losses, the Lovász hinge, which
leads to O(p log p) complexity with O(p) oracle accesses to the loss function
to compute a gradient or cutting-plane. We prove that the Lovász hinge is
convex and yields an extension. As a result, we have developed the first
tractable convex surrogates in the literature for submodular losses. We
demonstrate the utility of this novel convex surrogate through several set
prediction tasks, including tasks on the PASCAL VOC and Microsoft COCO datasets.
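The O(p log p) cost with O(p) oracle accesses quoted above matches the standard way of evaluating the Lovász extension of a set function: sort the input vector once, then query the loss on a chain of nested sets. The sketch below illustrates only that textbook construction and is not the authors' implementation. The names `lovasz_extension`, `set_loss`, and `s` are hypothetical, and the thresholding that turns the extension into a hinge on margin violations is omitted.

```python
import math
from typing import Callable, FrozenSet, List, Sequence, Tuple


def lovasz_extension(
    set_loss: Callable[[FrozenSet[int]], float],
    s: Sequence[float],
) -> Tuple[float, List[float]]:
    """Evaluate the Lovász extension of `set_loss` at `s`.

    Returns the value and a (sub)gradient with respect to `s`.
    Assumes set_loss(frozenset()) == 0. For a submodular loss the
    extension is convex and the returned vector is a subgradient.
    """
    p = len(s)
    # One sort in decreasing order: the O(p log p) step.
    order = sorted(range(p), key=lambda i: s[i], reverse=True)

    value = 0.0
    grad = [0.0] * p
    chosen = set()
    prev = set_loss(frozenset())
    # p further oracle calls, one per nested set in the chain.
    # (Rebuilding the frozenset each time costs extra here; an
    # incremental oracle would avoid that overhead.)
    for i in order:
        chosen.add(i)
        cur = set_loss(frozenset(chosen))
        gain = cur - prev  # marginal loss of adding element i
        grad[i] = gain
        value += s[i] * gain
        prev = cur
    return value, grad


def sqrt_loss(errors: FrozenSet[int]) -> float:
    """Hypothetical submodular loss: concave function of the error count."""
    return math.sqrt(len(errors))


# At the indicator vector of a set, the extension recovers the set loss.
value, grad = lovasz_extension(sqrt_loss, [1.0, 0.0, 1.0])
```

Evaluating at an indicator vector, as in the last line, returns the loss of the corresponding set, which is the sense in which such a surrogate is an extension of the discrete loss.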
Adaptive Information Gathering via Imitation Learning
In the adaptive information gathering problem, a policy is required to select
an informative sensing location using the history of measurements acquired thus
far. While there is an extensive amount of prior work investigating effective
practical approximations using variants of Shannon's entropy, the efficacy of
such policies heavily depends on the geometric distribution of objects in the
world. On the other hand, the principled approach of employing online POMDP
solvers is rendered impractical by the need to explicitly sample online from a
posterior distribution of world maps.
We present a novel data-driven imitation learning framework to efficiently
train information gathering policies. The policy imitates a clairvoyant oracle
- an oracle that at train time has full knowledge about the world map and can
compute maximally informative sensing locations. We analyze the learnt policy
by showing that offline imitation of a clairvoyant oracle is implicitly
equivalent to online oracle execution in conjunction with posterior sampling.
This observation allows us to obtain powerful near-optimality guarantees for
information gathering problems possessing an adaptive sub-modularity property.
As demonstrated on a spectrum of 2D and 3D exploration problems, the trained
policies enjoy the best of both worlds - they adapt to different world map
distributions while being computationally inexpensive to evaluate.
Comment: Robotics Science and Systems, 201