More Than a Feeling: Learning to Grasp and Regrasp using Vision and Touch
For humans, the process of grasping an object relies heavily on rich tactile
feedback. Most recent robotic grasping work, however, has been based only on
visual input, and thus cannot easily benefit from feedback after initiating
contact. In this paper, we investigate how a robot can learn to use tactile
information to iteratively and efficiently adjust its grasp. To this end, we
propose an end-to-end action-conditional model that learns regrasping policies
from raw visuo-tactile data. This model -- a deep, multimodal convolutional
network -- predicts the outcome of a candidate grasp adjustment, and then
executes a grasp by iteratively selecting the most promising actions. Our
approach requires neither calibration of the tactile sensors, nor any
analytical modeling of contact forces, thus reducing the engineering effort
required to obtain efficient grasping policies. We train our model with data
from about 6,450 grasping trials on a two-finger gripper equipped with GelSight
high-resolution tactile sensors on each finger. Across extensive experiments,
our approach outperforms a variety of baselines at (i) estimating grasp
adjustment outcomes, (ii) selecting efficient grasp adjustments for quick
grasping, and (iii) reducing the amount of force applied at the fingers, while
maintaining competitive performance. Finally, we study the choices made by our
model and show that it has successfully acquired useful and interpretable
grasping behaviors.
Comment: 8 pages. Published in IEEE Robotics and Automation Letters (RAL).
Website: https://sites.google.com/view/more-than-a-feelin
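The core loop described above (predict the outcome of each candidate grasp adjustment, then execute the most promising one) can be sketched roughly as follows; the network layout, tensor shapes, and names such as GraspOutcomeNet are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of an action-conditional grasp-outcome predictor, assuming PyTorch.
import torch
import torch.nn as nn

class GraspOutcomeNet(nn.Module):
    """Predicts P(grasp success) from an RGB image, a tactile image, and a candidate action."""
    def __init__(self, action_dim=4):
        super().__init__()
        # Small convolutional encoders for the two modalities.
        def encoder():
            return nn.Sequential(
                nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
                nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
        self.rgb_enc = encoder()
        self.tactile_enc = encoder()
        self.head = nn.Sequential(
            nn.Linear(64 + 64 + action_dim, 128), nn.ReLU(),
            nn.Linear(128, 1),  # logit of grasp success
        )

    def forward(self, rgb, tactile, action):
        # rgb, tactile: (N, 3, H, W); action: (N, action_dim)
        feat = torch.cat([self.rgb_enc(rgb), self.tactile_enc(tactile), action], dim=-1)
        return torch.sigmoid(self.head(feat))

def select_regrasp(model, rgb, tactile, candidate_actions):
    """Greedily pick the candidate adjustment with the highest predicted success.

    rgb, tactile: (1, 3, H, W) current observations; candidate_actions: (K, action_dim).
    """
    with torch.no_grad():
        k = candidate_actions.shape[0]
        scores = model(rgb.expand(k, -1, -1, -1),
                       tactile.expand(k, -1, -1, -1),
                       candidate_actions)
    return candidate_actions[scores.argmax()], scores.max().item()
```

In practice the candidate adjustments would be sampled around the current gripper pose and the loop repeated until the predicted success probability is high enough to commit to lifting.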
Tactile Mapping and Localization from High-Resolution Tactile Imprints
This work studies the problem of shape reconstruction and object localization
using a vision-based tactile sensor, GelSlim. The main contributions are the
recovery of local shapes from contact, an approach to reconstruct the tactile
shape of objects from tactile imprints, and an accurate method for object
localization of previously reconstructed objects. The algorithms can be applied
to a large variety of 3D objects and provide accurate tactile feedback for
in-hand manipulation. Results show that by exploiting the dense tactile
information we can reconstruct the shape of objects with high accuracy and do
on-line object identification and localization, opening the door to reactive
manipulation guided by tactile sensing. We provide videos and supplemental
information on the project's website
http://web.mit.edu/mcube/research/tactile_localization.html.
Comment: ICRA 2019, 7 pages, 7 figures. Website:
http://web.mit.edu/mcube/research/tactile_localization.html Video:
https://youtu.be/uMkspjmDbq
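As a rough illustration of the localization step (registering a tactile imprint against a previously reconstructed object model), the following sketch assumes Open3D and hypothetical sensor calibration values; it is not the paper's pipeline.

```python
# Illustrative sketch: tactile depth imprint -> point cloud -> ICP against an object model.
import numpy as np
import open3d as o3d

def imprint_to_pointcloud(depth_mm, pixel_size_mm=0.05, contact_thresh_mm=0.1):
    """Convert a GelSlim-style depth imprint (H x W array, mm) to 3D points in metres."""
    ys, xs = np.nonzero(depth_mm > contact_thresh_mm)        # pixels in contact
    pts = np.stack([xs * pixel_size_mm,
                    ys * pixel_size_mm,
                    depth_mm[ys, xs]], axis=1) / 1000.0       # -> metres
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(pts)
    return pcd

def localize(imprint_pcd, object_model_pcd, init_pose=np.eye(4), max_dist=0.01):
    """Estimate the sensor-to-object pose by point-to-point ICP."""
    result = o3d.pipelines.registration.registration_icp(
        imprint_pcd, object_model_pcd, max_dist, init_pose,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.transformation, result.fitness
```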
Self-Supervised Visuo-Tactile Pretraining to Locate and Follow Garment Features
Humans make extensive use of vision and touch as complementary senses, with
vision providing global information about the scene and touch measuring local
information during manipulation without suffering from occlusions. While prior
work demonstrates the efficacy of tactile sensing for precise manipulation of
deformables, it typically relies on supervised, human-labeled datasets. We
propose Self-Supervised Visuo-Tactile Pretraining (SSVTP), a framework for
learning multi-task visuo-tactile representations in a self-supervised manner
through cross-modal supervision. We design a mechanism that enables a robot to
autonomously collect precisely spatially-aligned visual and tactile image
pairs, then train visual and tactile encoders to embed these pairs into a
shared latent space using cross-modal contrastive loss. We apply this latent
space to downstream perception and control of deformable garments on flat
surfaces, and evaluate the flexibility of the learned representations without
fine-tuning on 5 tasks: feature classification, contact localization, anomaly
detection, feature search from a visual query (e.g., garment feature
localization under occlusion), and edge following along cloth edges. The
pretrained representations achieve a 73-100% success rate on these 5 tasks.
Comment: RSS 2023, site: https://sites.google.com/berkeley.edu/ssvt
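The cross-modal contrastive objective mentioned above can be written compactly as a symmetric InfoNCE-style loss over aligned visual-tactile pairs; the sketch below assumes PyTorch and leaves the encoder architectures unspecified.

```python
# Minimal sketch of a cross-modal contrastive loss over aligned (visual, tactile) pairs.
import torch
import torch.nn.functional as F

def cross_modal_contrastive_loss(vis_emb, tac_emb, temperature=0.07):
    """Pull spatially aligned (visual, tactile) pairs together, push mismatched pairs apart.

    vis_emb, tac_emb: (N, D) embeddings of N aligned image pairs.
    """
    vis = F.normalize(vis_emb, dim=-1)
    tac = F.normalize(tac_emb, dim=-1)
    logits = vis @ tac.t() / temperature                      # (N, N) similarity matrix
    targets = torch.arange(vis.shape[0], device=vis.device)   # matching pair is on the diagonal
    # Symmetric loss: match vision -> touch and touch -> vision.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```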
Push to know! -- Visuo-Tactile based Active Object Parameter Inference with Dual Differentiable Filtering
For robotic systems to interact with objects in dynamic environments, it is
essential to perceive the physical properties of the objects such as shape,
friction coefficient, mass, center of mass, and inertia. This not only eases
the selection of manipulation actions but also ensures the task is performed as
desired. However, estimating the physical properties of objects, especially
novel ones, is challenging with either vision or tactile sensing alone. In
this work, we propose a novel framework to estimate key object parameters through
non-prehensile manipulation using vision and tactile sensing. Our proposed
active dual differentiable filtering (ADDF) approach as part of our framework
learns the object-robot interaction during non-prehensile pushing to infer
the object's parameters. Our proposed method enables the robotic system to
employ vision and tactile information to interactively explore a novel object
via non-prehensile pushes. The proposed N-step active formulation
within the differentiable filtering facilitates efficient learning of the
object-robot interaction model and, during inference, selects the next best
exploratory push actions (where to push, and how to push). We extensively
evaluated our framework in simulation and real-robot scenarios, yielding
superior performance to the state-of-the-art baseline.
Comment: 8 pages. Accepted at IROS 202
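A minimal sketch of the active "where/how to push" idea, assuming a Gaussian belief over the object parameters and a user-supplied function that simulates the filter update for a candidate push (the actual ADDF formulation is more involved):

```python
# Sketch: choose the push whose simulated filter update most reduces parameter uncertainty.
import numpy as np

def gaussian_entropy(cov):
    """Differential entropy of a Gaussian belief (up to an additive constant)."""
    sign, logdet = np.linalg.slogdet(cov)
    return 0.5 * logdet

def select_push(belief, candidate_pushes, simulate_update):
    """Pick the candidate push with the lowest expected posterior entropy.

    belief:            (mean, cov) of the current parameter estimate
    candidate_pushes:  list of push parameters (contact point, direction, ...)
    simulate_update:   function (belief, push) -> predicted posterior (mean, cov)
    """
    best_push, best_h = None, np.inf
    for push in candidate_pushes:
        _, cov_post = simulate_update(belief, push)
        h = gaussian_entropy(cov_post)
        if h < best_h:
            best_push, best_h = push, h
    return best_push
```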
Sensorimotor representation learning for an "active self" in robots: A model survey
Safe human-robot interactions require robots to be able to learn how to
behave appropriately in spaces populated by people
and thus to cope with the challenges posed by our dynamic and unstructured
environment, rather than being provided a rigid set of rules for operations. In
humans, these capabilities are thought to be related to our ability to perceive
our body in space, sensing the location of our limbs during movement, being
aware of other objects and agents, and controlling our body parts to interact
with them intentionally. Toward the next generation of robots with bio-inspired
capacities, in this paper, we first review the developmental processes of
underlying mechanisms of these abilities: The sensory representations of body
schema, peripersonal space, and the active self in humans. Second, we provide a
survey of robotics models of these sensory representations and robotics models
of the self; and we compare these models with the human counterparts. Finally,
we analyse what is missing from these robotics models and propose a theoretical
computational framework, which aims to allow the emergence of the sense of self
in artificial agents by developing sensory representations through
self-exploration.
3D Shape Perception from Monocular Vision, Touch, and Shape Priors
Perceiving accurate 3D object shape is important for robots to interact with
the physical world. Current research in this direction has relied primarily
on visual observations. Vision, however useful, has inherent
limitations due to occlusions and 2D-to-3D ambiguities, especially for
perception with a monocular camera. In contrast, touch gets precise local shape
information, though its efficiency for reconstructing the entire shape could be
low. In this paper, we propose a novel paradigm that efficiently perceives
accurate 3D object shape by incorporating visual and tactile observations, as
well as prior knowledge of common object shapes learned from large-scale shape
repositories. We use vision first, applying neural networks with learned shape
priors to predict an object's 3D shape from a single-view color image. We then
use tactile sensing to refine the shape; the robot actively touches the object
regions where the visual prediction has high uncertainty. Our method
efficiently builds the 3D shape of common objects from a color image and a
small number of tactile explorations (around 10). Our setup is easy to apply
and has the potential to help robots better perform grasping or manipulation tasks
on real-world objects.
Comment: IROS 2018. The first two authors contributed equally to this work.
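The uncertainty-guided touch selection described above can be illustrated as follows; the data layout and function names are assumptions made for the sketch, not the authors' implementation.

```python
# Sketch: touch where the visual shape prediction is least certain, then fuse the contact back in.
import numpy as np

def next_touch_point(predicted_points, uncertainty):
    """Return the surface point with the highest predicted uncertainty.

    predicted_points: (N, 3) points of the visually predicted shape
    uncertainty:      (N,) per-point uncertainty (e.g., predictive variance)
    """
    return predicted_points[np.argmax(uncertainty)]

def fuse_touch(predicted_points, uncertainty, touch_points, radius=0.01):
    """Snap predicted points near each touched location to the measured contact
    and zero out their uncertainty (a deliberately crude fusion rule)."""
    for tp in touch_points:
        mask = np.linalg.norm(predicted_points - tp, axis=1) < radius
        predicted_points[mask] = tp
        uncertainty[mask] = 0.0
    return predicted_points, uncertainty
```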
Haptic SLAM: An Ideal Observer Model for Bayesian Inference of Object Shape and Hand Pose from Contact Dynamics
Dynamic tactile exploration enables humans to seamlessly estimate the shape of objects and distinguish them from one another in the complete absence of visual information. Such blind tactile exploration allows integrating information about the hand pose and contacts on the skin to form a coherent representation of the object shape. A principled way to understand the underlying neural computations of human haptic perception is through normative modelling. We propose a Bayesian perceptual model for recursive integration of noisy proprioceptive hand pose with noisy skin-object contacts. The model simultaneously forms an optimal estimate of the true hand pose and a representation of the explored shape in an object-centred coordinate system. A classification algorithm can thus be applied to distinguish among different objects solely based on the similarity of their representations. This enables the comparison, in real time, of the shape of an object identified by human subjects with the shape of the same object predicted by our model using motion capture data. Therefore, our work provides a framework for a principled study of human haptic exploration of complex objects.
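A compact, translation-only sketch of the recursive integration idea (particles over hand pose, weighted by how well sensed contacts agree with the shape map built so far); the noise models, frame handling, and update rule are simplifying assumptions rather than the paper's model.

```python
# Sketch of one recursive step: propagate hand-pose particles, reweight by contact
# agreement with the object-centred shape map, then grow the map.
import numpy as np

def haptic_slam_step(particles, weights, proprio_pose, contacts_hand, shape_map,
                     proprio_std=0.01, contact_std=0.005):
    # 1. Propagate particles toward the noisy proprioceptive pose estimate.
    particles = particles + np.random.normal(0.0, proprio_std, particles.shape) \
                + 0.5 * (proprio_pose - particles)
    # 2. Reweight each particle by how close its contacts fall to the current map.
    for i, pose in enumerate(particles):
        pts = contacts_hand + pose                  # hand-frame contacts -> object frame (translation only)
        if len(shape_map):
            d = np.min(np.linalg.norm(pts[:, None] - np.asarray(shape_map)[None], axis=2), axis=1)
            weights[i] *= np.prod(np.exp(-0.5 * (d / contact_std) ** 2))
    weights /= weights.sum() + 1e-12
    # 3. Grow the object-centred shape map using the best pose hypothesis.
    best = particles[np.argmax(weights)]
    shape_map.extend((contacts_hand + best).tolist())
    return particles, weights, shape_map
```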