PoseFusion: Robust Object-in-Hand Pose Estimation with SelectLSTM
Accurate estimation of the relative pose between an object and a robot hand
is critical for many manipulation tasks. However, most existing
object-in-hand pose datasets use two-finger grippers and assume that the
object remains fixed in the hand without relative movement, which is not
representative of real-world scenarios. To address this issue, we propose a 6D
object-in-hand pose dataset collected through teleoperation of an
anthropomorphic Shadow Dexterous Hand. Our dataset comprises RGB-D images,
proprioceptive readings, and tactile data, covering diverse grasping poses, finger
contact states, and object occlusions. To overcome the significant hand
occlusion and limited tactile sensor contact in real-world scenarios, we
propose PoseFusion, a hybrid multi-modal fusion approach that integrates the
information from visual and tactile perception channels. PoseFusion generates
three candidate object poses from three estimators (tactile only, visual only,
and visuo-tactile fusion), which are then filtered by a SelectLSTM network to
select the optimal pose, avoiding inferior fusion poses resulting from modality
collapse. Extensive experiments demonstrate the robustness and advantages of
our framework. All data and code are available on the project website:
https://elevenjiang1.github.io/ObjectInHand-Dataset
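As a rough illustration of the selection step, the sketch below shows how an LSTM could score the three candidate poses at each time step and pick one; the layer sizes, pose parameterization, and scoring head are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of a SelectLSTM-style candidate-pose selector.
# Dimensions and layer choices are illustrative assumptions.
import torch
import torch.nn as nn

class SelectLSTM(nn.Module):
    def __init__(self, pose_dim=7, hidden_dim=64):
        super().__init__()
        # Each time step sees the three candidate poses concatenated:
        # tactile-only, visual-only, and visuo-tactile fusion.
        self.lstm = nn.LSTM(3 * pose_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 3)  # one score per candidate

    def forward(self, candidates):
        # candidates: (batch, time, 3, pose_dim), e.g. position + quaternion.
        b, t, _, _ = candidates.shape
        h, _ = self.lstm(candidates.reshape(b, t, -1))
        scores = self.head(h)           # (batch, time, 3)
        choice = scores.argmax(dim=-1)  # index of the selected estimator
        # Gather the selected candidate pose at each time step.
        idx = choice[..., None, None].expand(-1, -1, 1, candidates.size(-1))
        return candidates.gather(2, idx).squeeze(2), scores
```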
Enhancing Generalizable 6D Pose Tracking of an In-Hand Object with Tactile Sensing
While holding and manipulating an object, humans track the object states
through vision and touch in order to achieve complex tasks. However, most
robotics research today perceives object states from visual signals alone,
greatly limiting robotic manipulation capabilities. This work presents a
tactile-enhanced generalizable 6D pose tracking design named TEG-Track to track
previously unseen in-hand objects. TEG-Track extracts tactile kinematic cues of
an in-hand object from consecutive tactile sensing signals. Such cues are
incorporated into a geometric-kinematic optimization scheme to enhance existing
generalizable visual trackers. To test our method in real scenarios and enable
future studies on generalizable visual-tactile tracking, we collect a real
visual-tactile in-hand object pose tracking dataset. Experiments show that
TEG-Track significantly improves state-of-the-art generalizable 6D pose
trackers in both synthetic and real cases.
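The abstract does not spell out the geometric-kinematic optimization; the sketch below shows one simple way tactile kinematic cues could refine a visual pose track, by integrating an object twist estimated from touch and blending it with the visual estimate. The constant blending weight and the twist model are illustrative assumptions, not TEG-Track's actual scheme.

```python
# Minimal sketch: blend a tactile velocity cue with a visual pose track.
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def fuse_pose(prev_pose, visual_pose, tactile_twist, dt, alpha=0.5):
    """prev_pose, visual_pose: (3x3 rotation, 3-vector translation).
    tactile_twist: (angular velocity omega, linear velocity v) from touch."""
    (R_prev, t_prev), (R_vis, t_vis) = prev_pose, visual_pose
    omega, v = tactile_twist
    # Predict the pose by integrating the tactile twist over dt.
    R_pred = Rotation.from_rotvec(omega * dt).as_matrix() @ R_prev
    t_pred = t_prev + v * dt
    # Blend prediction and visual estimate: slerp on rotations,
    # linear interpolation on translations.
    key = Rotation.from_matrix(np.stack([R_pred, R_vis]))
    R_fused = Slerp([0.0, 1.0], key)(alpha).as_matrix()
    t_fused = (1 - alpha) * t_pred + alpha * t_vis
    return R_fused, t_fused
```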
Hierarchical Graph Neural Networks for Proprioceptive 6D Pose Estimation of In-hand Objects
Robotic manipulation, in particular in-hand object manipulation, often
requires an accurate estimate of the object's 6D pose. To improve the accuracy
of the estimated pose, state-of-the-art approaches in 6D object pose estimation
use observational data from one or more modalities, e.g., RGB images, depth,
and tactile readings. However, existing approaches make limited use of the
underlying geometric structure of the object captured by these modalities,
thereby increasing their reliance on visual features. This results in poor
performance when presented with objects that lack such visual features or when
visual features are simply occluded. Furthermore, current approaches do not
take advantage of the proprioceptive information embedded in the positions of
the fingers. To address these limitations, in this paper: (1) we introduce a
hierarchical graph neural network architecture for combining multimodal (vision
and touch) data that allows for a geometrically informed 6D object pose
estimation, (2) we introduce a hierarchical message passing operation that
propagates information within and across modalities to learn a graph-based
object representation, and (3) we introduce a method that accounts for the
proprioceptive information for in-hand object representation. We evaluate our
model on a diverse subset of objects from the YCB Object and Model Set, and
show that our method substantially outperforms existing state-of-the-art work
in accuracy and robustness to occlusion. We also deploy our proposed framework
on a real robot and qualitatively demonstrate successful transfer to real
settings.
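As an illustration of the two-stage scheme, the sketch below passes messages first within each modality's node set and then across modalities before pooling an object representation; the dense mean aggregation and layer choices are simplifying assumptions, not the paper's architecture.

```python
# Illustrative two-stage (within-, then cross-modality) message passing
# over vision and touch node features; fully connected for simplicity.
import torch
import torch.nn as nn

class HierarchicalMP(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.within = nn.Linear(2 * dim, dim)  # intra-modality messages
        self.across = nn.Linear(2 * dim, dim)  # cross-modality messages

    def _pass(self, dst, src, lin):
        # Aggregate the mean of source nodes, update each destination node.
        ctx = src.mean(dim=0, keepdim=True).expand_as(dst)
        return torch.relu(lin(torch.cat([dst, ctx], dim=-1)))

    def forward(self, vision_nodes, touch_nodes):
        # Stage 1: messages flow within each modality's subgraph.
        v = self._pass(vision_nodes, vision_nodes, self.within)
        t = self._pass(touch_nodes, touch_nodes, self.within)
        # Stage 2: messages flow across modalities.
        v = self._pass(v, t, self.across)
        t = self._pass(t, v, self.across)
        # Pool into one object representation for 6D pose regression.
        return torch.cat([v.mean(0), t.mean(0)], dim=-1)
```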
Active End-Effector Pose Selection for Tactile Object Recognition through Monte Carlo Tree Search
This paper considers the problem of active object recognition using touch
only. The focus is on adaptively selecting a sequence of wrist poses that
achieves accurate recognition by enclosure grasps. It seeks to minimize the
number of touches and maximize recognition confidence. The actions are
formulated as wrist poses relative to each other, making the algorithm
independent of absolute workspace coordinates. The optimal sequence is
approximated by Monte Carlo tree search. We demonstrate results in a physics
engine and on a real robot. In the physics engine, most object instances were
recognized in at most 16 grasps. On a real robot, our method recognized objects
in 2--9 grasps and outperformed a greedy baseline.
Comment: Accepted to the International Conference on Intelligent Robots and Systems (IROS) 201
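For readers unfamiliar with the search procedure, the sketch below shows a bare-bones Monte Carlo tree search over a discrete set of relative wrist poses; the rollout function standing in for grasp simulation and recognition confidence is a hypothetical placeholder.

```python
# Bare-bones MCTS over relative wrist poses. `rollout` is a placeholder
# for simulating an enclosure grasp and returning recognition confidence.
import math

class Node:
    def __init__(self, pose, parent=None):
        self.pose, self.parent = pose, parent
        self.children, self.visits, self.value = [], 0, 0.0

def ucb(node, c=1.4):
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits)

def mcts(root, actions, rollout, iters=1000):
    for _ in range(iters):
        node = root
        # Selection: descend by UCB while nodes are fully expanded.
        while node.children and len(node.children) == len(actions):
            node = max(node.children, key=ucb)
        # Expansion: try one untried relative wrist pose.
        if len(node.children) < len(actions):
            child = Node(actions[len(node.children)], parent=node)
            node.children.append(child)
            node = child
        # Rollout, then backpropagate the recognition confidence.
        reward = rollout(node)
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return max(root.children, key=lambda n: n.visits).pose
```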
ViHOPE: Visuotactile In-Hand Object 6D Pose Estimation with Shape Completion
In this letter, we introduce ViHOPE, a novel framework for estimating the 6D
pose of an in-hand object using visuotactile perception. Our key insight is
that the accuracy of the 6D object pose estimate can be improved by explicitly
completing the shape of the object. To this end, we introduce a novel
visuotactile shape completion module that uses a conditional Generative
Adversarial Network to complete the shape of an in-hand object based on
volumetric representation. This approach improves over prior works that
directly regress visuotactile observations to a 6D pose. By explicitly
completing the shape of the in-hand object and jointly optimizing the shape
completion and pose estimation tasks, we improve the accuracy of the 6D object
pose estimate. We train and test our model on a synthetic dataset and compare
it with the state-of-the-art. In the visuotactile shape completion task, we
outperform the state-of-the-art by 265% on the Intersection over Union metric
and achieve an 88% lower Chamfer Distance. In the visuotactile pose estimation
task, we present results that suggest our framework reduces position and
angular errors by 35% and 64%, respectively. Furthermore, we ablate our
framework to confirm the gain on the 6D object pose estimate from explicitly
completing the shape. Ultimately, we show that our framework produces models
that are robust to sim-to-real transfer on a real-world robot platform.
Comment: Accepted by RA-L
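For reference, the two metrics reported above are commonly computed as follows on occupancy grids and point clouds; this is a generic sketch, not ViHOPE's evaluation code.

```python
# Standard definitions of voxel IoU and symmetric Chamfer Distance.
import numpy as np

def voxel_iou(pred, gt):
    """pred, gt: boolean occupancy grids of identical shape."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union

def chamfer_distance(a, b):
    """a, b: (N, 3) and (M, 3) point clouds."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```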
Object Recognition and Localization: The Role of Tactile Sensors
Tactile sensors, because of their intrinsic insensitivity to lighting conditions and water turbidity, provide promising opportunities for augmenting the capabilities of vision sensors in applications involving object recognition and localization. This thesis presents two approaches for haptic object recognition and localization in ground and underwater environments. The first approach, called the Batch RANSAC and Iterative Closest Point augmented Sequential Filter (BRICPSF), is based on an innovative combination of a sequential filter, the Iterative Closest Point (ICP) algorithm, and a feature-based Random Sample Consensus (RANSAC) algorithm for database matching. It can handle a large database of 3D objects of complex shapes and performs complete six-degree-of-freedom localization of static objects. The algorithms are validated by experiments in simulation and on actual hardware. To our knowledge, this is the first instance of haptic object recognition and localization in underwater environments.

The second approach is biologically inspired and provides a close integration between exploration and recognition. An edge-following exploration strategy is developed that receives feedback from the current state of recognition. A recognition-by-parts approach is developed that uses BRICPSF for object-part recognition. Object exploration is either directed to explore a part until it is successfully recognized, or directed towards new parts to reinforce the current recognition belief. This approach is validated by simulation experiments.
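As a simplified illustration of the database-matching core, the sketch below refines an alignment with point-to-point ICP for each model in a database and keeps the best-scoring match; the sequential filter and feature-based RANSAC stages of BRICPSF are omitted, and all names here are assumptions.

```python
# Simplified flavor of ICP-based database matching for haptic recognition.
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares R, t aligning src to dst (Kabsch/SVD)."""
    cs, cd = src.mean(0), dst.mean(0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1, 1, np.sign(np.linalg.det(Vt.T @ U.T))])
    Rm = Vt.T @ D @ U.T
    return Rm, cd - Rm @ cs

def icp(src, dst, iters=30):
    """Point-to-point ICP with brute-force nearest neighbours."""
    Rm, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        moved = src @ Rm.T + t
        # Nearest neighbour in dst for every transformed source point.
        nn = np.linalg.norm(moved[:, None] - dst[None], axis=-1).argmin(1)
        Rm, t = best_rigid_transform(src, dst[nn])
    moved = src @ Rm.T + t
    nn = np.linalg.norm(moved[:, None] - dst[None], axis=-1).argmin(1)
    return Rm, t, np.linalg.norm(moved - dst[nn], axis=-1).mean()

def recognize(touch_points, database):
    """database: {name: (N, 3) model point cloud}; returns best match."""
    scores = {name: icp(touch_points, model)[2]
              for name, model in database.items()}
    return min(scores, key=scores.get)
```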