538 research outputs found

    PoseFusion: Robust Object-in-Hand Pose Estimation with SelectLSTM

    Accurate estimation of the relative pose between an object and a robot hand is critical for many manipulation tasks. However, most existing object-in-hand pose datasets use two-finger grippers and assume that the object remains fixed in the hand without any relative movement, which is not representative of real-world scenarios. To address this issue, a 6D object-in-hand pose dataset is proposed using a teleoperation method with an anthropomorphic Shadow Dexterous hand. Our dataset comprises RGB-D images, proprioception and tactile data, covering diverse grasping poses, finger contact states, and object occlusions. To overcome the significant hand occlusion and limited tactile sensor contact in real-world scenarios, we propose PoseFusion, a hybrid multi-modal fusion approach that integrates the information from visual and tactile perception channels. PoseFusion generates three candidate object poses from three estimators (tactile only, visual only, and visuo-tactile fusion), which are then filtered by a SelectLSTM network to select the optimal pose, avoiding inferior fusion poses resulting from modality collapse. Extensive experiments demonstrate the robustness and advantages of our framework. All data and code are available on the project website: https://elevenjiang1.github.io/ObjectInHand-Dataset

    Enhancing Generalizable 6D Pose Tracking of an In-Hand Object with Tactile Sensing

    While holding and manipulating an object, humans track the object's state through vision and touch in order to accomplish complex tasks. However, most current robotics research perceives object states from visual signals alone, which severely limits robotic manipulation abilities. This work presents a tactile-enhanced generalizable 6D pose tracking design named TEG-Track to track previously unseen in-hand objects. TEG-Track extracts tactile kinematic cues of an in-hand object from consecutive tactile sensing signals. Such cues are incorporated into a geometric-kinematic optimization scheme to enhance existing generalizable visual trackers. To test our method in real scenarios and enable future studies on generalizable visual-tactile tracking, we collect a real visual-tactile in-hand object pose tracking dataset. Experiments show that TEG-Track significantly improves state-of-the-art generalizable 6D pose trackers in both synthetic and real cases.

    Hierarchical Graph Neural Networks for Proprioceptive 6D Pose Estimation of In-hand Objects

    Robotic manipulation, in particular in-hand object manipulation, often requires an accurate estimate of the object's 6D pose. To improve the accuracy of the estimated pose, state-of-the-art approaches in 6D object pose estimation use observational data from one or more modalities, e.g., RGB images, depth, and tactile readings. However, existing approaches make limited use of the underlying geometric structure of the object captured by these modalities, thereby increasing their reliance on visual features. This results in poor performance when presented with objects that lack such visual features or when visual features are simply occluded. Furthermore, current approaches do not take advantage of the proprioceptive information embedded in the position of the fingers. To address these limitations, in this paper: (1) we introduce a hierarchical graph neural network architecture for combining multimodal (vision and touch) data that allows for a geometrically informed 6D object pose estimation, (2) we introduce a hierarchical message passing operation that flows the information within and across modalities to learn a graph-based object representation, and (3) we introduce a method that accounts for the proprioceptive information for in-hand object representation. We evaluate our model on a diverse subset of objects from the YCB Object and Model Set, and show that our method substantially outperforms existing state-of-the-art work in accuracy and robustness to occlusion. We also deploy our proposed framework on a real robot and qualitatively demonstrate successful transfer to real settings.

    Active End-Effector Pose Selection for Tactile Object Recognition through Monte Carlo Tree Search

    This paper considers the problem of active object recognition using touch only. The focus is on adaptively selecting a sequence of wrist poses that achieves accurate recognition by enclosure grasps. It seeks to minimize the number of touches and maximize recognition confidence. The actions are formulated as wrist poses relative to each other, making the algorithm independent of absolute workspace coordinates. The optimal sequence is approximated by Monte Carlo tree search. We demonstrate results in a physics engine and on a real robot. In the physics engine, most object instances were recognized in at most 16 grasps. On a real robot, our method recognized objects in 2--9 grasps and outperformed a greedy baseline. Comment: Accepted to International Conference on Intelligent Robots and Systems (IROS) 201

    ViHOPE: Visuotactile In-Hand Object 6D Pose Estimation with Shape Completion

    In this letter, we introduce ViHOPE, a novel framework for estimating the 6D pose of an in-hand object using visuotactile perception. Our key insight is that the accuracy of the 6D object pose estimate can be improved by explicitly completing the shape of the object. To this end, we introduce a novel visuotactile shape completion module that uses a conditional Generative Adversarial Network to complete the shape of an in-hand object based on volumetric representation. This approach improves over prior works that directly regress visuotactile observations to a 6D pose. By explicitly completing the shape of the in-hand object and jointly optimizing the shape completion and pose estimation tasks, we improve the accuracy of the 6D object pose estimate. We train and test our model on a synthetic dataset and compare it with the state-of-the-art. In the visuotactile shape completion task, we outperform the state-of-the-art by 265% using the Intersection over Union metric and achieve 88% lower Chamfer Distance. In the visuotactile pose estimation task, we present results that suggest our framework reduces position and angular errors by 35% and 64%, respectively. Furthermore, we ablate our framework to confirm the gain on the 6D object pose estimate from explicitly completing the shape. Ultimately, we show that our framework produces models that are robust to sim-to-real transfer on a real-world robot platform. Comment: Accepted by RA-
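    The two shape-completion metrics named in the abstract above can be illustrated with a minimal sketch. It assumes shapes are given as sets of occupied voxel indices (for Intersection over Union) or as lists of 3D points (for Chamfer Distance). The function names are hypothetical, and the Chamfer variant shown here (sum of the two mean nearest-neighbour distances, unsquared) is only one common convention; the abstract does not say which variant ViHOPE uses.

```python
import math


def voxel_iou(pred, target):
    """Intersection over Union of two collections of occupied voxel indices."""
    pred, target = set(pred), set(target)
    union = pred | target
    if not union:
        # Both grids empty: treated as perfect agreement (a convention assumed here).
        return 1.0
    return len(pred & target) / len(union)


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty point lists.

    Assumed variant: mean Euclidean nearest-neighbour distance from A to B
    plus the same from B to A. Brute force, O(len(A) * len(B)).
    """
    def one_way(src, dst):
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(points_a, points_b) + one_way(points_b, points_a)


if __name__ == "__main__":
    completed = {(0, 0, 0), (1, 0, 0), (2, 0, 0)}
    ground_truth = {(1, 0, 0), (2, 0, 0), (3, 0, 0)}
    print(voxel_iou(completed, ground_truth))              # 2 shared / 4 total = 0.5
    print(chamfer_distance([(0, 0, 0)], [(3, 4, 0)]))      # 5 + 5 = 10.0
```

    A higher IoU and a lower Chamfer Distance both indicate a completed shape closer to the ground truth, which is why the abstract reports a gain in the former and a reduction in the latter.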

    Object Recognition and Localization: the Role of Tactile Sensors

    Tactile sensors, because of their intrinsic insensitivity to lighting conditions and water turbidity, provide promising opportunities for augmenting the capabilities of vision sensors in applications involving object recognition and localization. This thesis presents two approaches for haptic object recognition and localization for ground and underwater environments. The first approach, called Batch Ransac and Iterative Closest Point augmented Sequential Filter (BRICPSF), is based on an innovative combination of a sequential filter, the Iterative Closest Point algorithm, and a feature-based Random Sampling and Consensus (RANSAC) algorithm for database matching. It can handle a large database of 3D objects of complex shapes and performs a complete six-degree-of-freedom localization of static objects. The algorithms are validated by experimentation in simulation and using actual hardware. To our knowledge, this is the first instance of haptic object recognition and localization in underwater environments. The second approach is biologically inspired and provides a close integration between exploration and recognition. An edge-following exploration strategy is developed that receives feedback from the current state of recognition. A recognition-by-parts approach is developed which uses BRICPSF for object part recognition. Object exploration is either directed to explore a part until it is successfully recognized, or is directed towards new parts to endorse the current recognition belief. This approach is validated by simulation experiments.