More Than a Feeling: Learning to Grasp and Regrasp using Vision and Touch
For humans, the process of grasping an object relies heavily on rich tactile
feedback. Most recent robotic grasping work, however, has been based only on
visual input, and thus cannot easily benefit from feedback after initiating
contact. In this paper, we investigate how a robot can learn to use tactile
information to iteratively and efficiently adjust its grasp. To this end, we
propose an end-to-end action-conditional model that learns regrasping policies
from raw visuo-tactile data. This model -- a deep, multimodal convolutional
network -- predicts the outcome of a candidate grasp adjustment, and then
executes a grasp by iteratively selecting the most promising actions. Our
approach requires neither calibration of the tactile sensors, nor any
analytical modeling of contact forces, thus reducing the engineering effort
required to obtain efficient grasping policies. We train our model with data
from about 6,450 grasping trials on a two-finger gripper equipped with GelSight
high-resolution tactile sensors on each finger. Across extensive experiments,
our approach outperforms a variety of baselines at (i) estimating grasp
adjustment outcomes, (ii) selecting efficient grasp adjustments for quick
grasping, and (iii) reducing the amount of force applied at the fingers, while
maintaining competitive performance. Finally, we study the choices made by our
model and show that it has successfully acquired useful and interpretable
grasping behaviors.
Comment: 8 pages. Published in IEEE Robotics and Automation Letters (RA-L).
Website: https://sites.google.com/view/more-than-a-feelin
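
To make the action-conditional idea concrete, here is a minimal sketch of an outcome-prediction network with greedy regrasp selection, assuming a PyTorch-style multimodal CNN over an RGB image and a tactile image. The class and function names, network sizes, action parameterization, and candidate-sampling scheme are illustrative assumptions, not the paper's exact architecture.

```python
# Hypothetical sketch: score candidate grasp adjustments from visuo-tactile
# input, then greedily pick the most promising one.
import torch
import torch.nn as nn

class GraspOutcomeModel(nn.Module):
    """Predicts P(grasp success) from an RGB image, a tactile image,
    and a candidate grasp adjustment (a small end-effector motion)."""
    def __init__(self, action_dim=4):
        super().__init__()
        def cnn():
            return nn.Sequential(
                nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
                nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.vision = cnn()    # encodes the camera image
        self.tactile = cnn()   # encodes the tactile (e.g., GelSight) image
        self.head = nn.Sequential(
            nn.Linear(64 + 64 + action_dim, 128), nn.ReLU(),
            nn.Linear(128, 1))  # logit of grasp success

    def forward(self, rgb, tactile, action):
        feat = torch.cat([self.vision(rgb), self.tactile(tactile), action], dim=-1)
        return torch.sigmoid(self.head(feat))

def select_regrasp(model, rgb, tactile, num_candidates=64, action_dim=4):
    """Greedy regrasp selection: score sampled candidate adjustments and
    return the one the model deems most likely to succeed.
    rgb, tactile: (1, 3, H, W) tensors."""
    actions = torch.randn(num_candidates, action_dim) * 0.01  # small perturbations
    scores = model(rgb.expand(num_candidates, -1, -1, -1),
                   tactile.expand(num_candidates, -1, -1, -1),
                   actions).squeeze(-1)
    best = scores.argmax()
    return actions[best], scores[best]
```

In this sketch, iterative regrasping amounts to repeating select_regrasp after each adjustment until the predicted success probability is high enough to commit to lifting.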
Domain Randomization and Generative Models for Robotic Grasping
Deep learning-based robotic grasping has made significant progress thanks to
algorithmic improvements and increased data availability. However,
state-of-the-art models are often trained on as few as hundreds or thousands of
unique object instances, and as a result generalization can be a challenge.
In this work, we explore a novel data generation pipeline for training a deep
neural network to perform grasp planning that applies the idea of domain
randomization to object synthesis. We generate millions of unique, unrealistic
procedurally generated objects, and train a deep neural network to perform
grasp planning on these objects.
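
As a rough illustration of this kind of domain-randomized object synthesis, the sketch below composes random primitives into unrealistic composite meshes. The primitive set, parameter ranges, and the use of the trimesh library are assumptions for illustration, not the paper's actual pipeline.

```python
# Hypothetical sketch: procedurally generate unrealistic composite objects
# by scattering random primitives and merging them into one mesh.
import numpy as np
import trimesh

def random_object(rng=np.random.default_rng()):
    num_parts = rng.integers(3, 8)
    parts = []
    for _ in range(num_parts):
        kind = rng.choice(["box", "sphere", "cylinder"])
        if kind == "box":
            part = trimesh.creation.box(extents=rng.uniform(0.01, 0.06, size=3))
        elif kind == "sphere":
            part = trimesh.creation.icosphere(radius=rng.uniform(0.005, 0.03))
        else:
            part = trimesh.creation.cylinder(radius=rng.uniform(0.005, 0.02),
                                             height=rng.uniform(0.02, 0.08))
        part.apply_translation(rng.uniform(-0.04, 0.04, size=3))  # scatter parts
        parts.append(part)
    return trimesh.util.concatenate(parts)  # one unrealistic composite mesh

# e.g., build a large training set of unique random objects:
# meshes = [random_object() for _ in range(1_000_000)]
```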
Since the distribution of successful grasps for a given object can be highly
multimodal, we propose an autoregressive grasp planning model that maps sensor
inputs of a scene to a probability distribution over possible grasps. This
model allows us to sample grasps efficiently at test time (or avoid sampling
entirely).
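
One way to realize such an autoregressive model is to discretize each grasp dimension and factor the distribution as p(x, y, theta | image) = p(x | image) p(y | x, image) p(theta | x, y, image), sampling one dimension at a time. The sketch below assumes this factorization order, a simple image encoder, and categorical heads over bins; these are illustrative choices, not the paper's exact model.

```python
# Hypothetical sketch: autoregressive grasp sampler over discretized
# grasp coordinates, conditioned on an image.
import torch
import torch.nn as nn

class AutoregressiveGrasp(nn.Module):
    def __init__(self, num_bins=32, feat_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(            # image -> feature vector
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim))
        # one categorical head per grasp dimension, conditioned on earlier dims
        self.heads = nn.ModuleList(
            [nn.Linear(feat_dim + i, num_bins) for i in range(3)])
        self.num_bins = num_bins

    def sample(self, image):
        """image: (1, 3, H, W); returns discretized (x, y, theta) bins."""
        feat = self.encoder(image)
        dims = []
        for head in self.heads:
            context = torch.cat(
                [feat] + [d.float().unsqueeze(-1) / self.num_bins for d in dims],
                dim=-1)
            probs = torch.softmax(head(context), dim=-1)
            dims.append(torch.multinomial(probs, 1).squeeze(-1))  # sample a bin
        return torch.stack(dims, dim=-1)
```

Because each conditional is an explicit categorical, the same heads also give exact log-probabilities for training, and the mode can be read off directly when sampling is to be avoided.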
We evaluate our model architecture and data generation pipeline in simulation
and the real world. We find we can achieve a 90% success rate on previously
unseen realistic objects at test time in simulation despite having only been
trained on random objects. We also demonstrate an 80% success rate on
real-world grasp attempts despite having only been trained on random simulated
objects.
Comment: 8 pages, 11 figures. Submitted to the 2018 IEEE/RSJ International
Conference on Intelligent Robots and Systems (IROS 2018).
Construction of Latent Descriptor Space and Inference Model of Hand-Object Interactions
Appearance-based generic object recognition is a challenging problem because
all possible appearances of objects cannot be registered, especially as new
objects are produced every day. Object functions, however, have a
comparatively small number of prototypes. Therefore, function-based
classification of new objects could be a valuable tool for generic object
recognition. Object functions are closely related to hand-object interactions
during handling of a functional object; i.e., how the hand approaches the
object, which parts of the object contact the hand, and the shape of the
hand during interaction. Hand-object interactions are helpful for modeling
object functions. However, it is difficult to assign discrete labels to
interactions because an object shape and grasping hand-postures intrinsically
have continuous variations. To describe these interactions, we propose the
interaction descriptor space which is acquired from unlabeled appearances of
human hand-object interactions. By using interaction descriptors, we can
numerically describe the relation between an object's appearance and its
possible interaction with the hand. The model infers the quantitative state of
the interaction from the object image alone. It also identifies the parts of
objects designed for hand interactions such as grips and handles. We
demonstrate that the proposed method generates, without supervision, interaction
descriptors that form clusters corresponding to interaction types. We also
demonstrate that the model can infer possible hand-object interactions
- …
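
A descriptor space of this kind could, for instance, be obtained by training an autoencoder on unlabeled hand-object interaction images and clustering the latent codes into interaction types. The sketch below is illustrative only: the autoencoder architecture, the 64x64 input size, and the use of k-means are assumptions, not necessarily the paper's method.

```python
# Hypothetical sketch: learn a latent interaction-descriptor space from
# unlabeled interaction images, then cluster descriptors into types.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

class DescriptorAutoencoder(nn.Module):
    def __init__(self, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Flatten(), nn.Linear(64 * 16 * 16, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (64, 16, 16)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, x):                 # x: (N, 3, 64, 64) interaction images
        z = self.encoder(x)               # z is the interaction descriptor
        return self.decoder(z), z

# After training the autoencoder with a reconstruction loss on unlabeled
# interaction images, cluster the descriptors to discover interaction types:
# _, z = model(images)
# labels = KMeans(n_clusters=8).fit_predict(z.detach().numpy())
```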