
    Dense 3D Object Reconstruction from a Single Depth View

    In this paper, we propose a novel approach, 3D-RecGAN++, which reconstructs the complete 3D structure of a given object from a single arbitrary depth view using generative adversarial networks. Unlike existing work, which typically requires multiple views of the same object or class labels to recover the full 3D geometry, the proposed 3D-RecGAN++ takes only the voxel grid representation of a depth view of the object as input, and is able to generate the complete 3D occupancy grid at a high resolution of 256^3 by recovering the occluded/missing regions. The key idea is to combine the generative capabilities of autoencoders with the conditional Generative Adversarial Network (GAN) framework to infer accurate and fine-grained 3D structures of objects in high-dimensional voxel space. Extensive experiments on large synthetic datasets and real-world Kinect datasets show that the proposed 3D-RecGAN++ significantly outperforms the state of the art in single-view 3D object reconstruction, and is able to reconstruct unseen types of objects.
    Comment: TPAMI 2018. Code and data are available at: https://github.com/Yang7879/3D-RecGAN-extended. This article extends from arXiv:1708.0796
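    The released code linked above is the authoritative implementation; as a rough orientation only, the following is a minimal PyTorch sketch of the general idea described in the abstract: a 3D encoder-decoder generator that completes a partial voxel grid, trained against a conditional discriminator together with a voxel-wise reconstruction term. The 64^3 resolution, layer widths, and loss weighting are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch (not the authors' code) of a 3D encoder-decoder generator
# with a conditional discriminator for voxel completion. Resolution (64^3),
# channel widths, and the loss weighting are illustrative assumptions.
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a partial occupancy grid (B, 1, D, H, W) to a completed grid."""
    def __init__(self, ch=32):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv3d(1, ch, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv3d(ch, ch * 2, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv3d(ch * 2, ch * 4, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose3d(ch * 4, ch * 2, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(ch * 2, ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(ch, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.dec(self.enc(x))

class Discriminator(nn.Module):
    """Scores (partial input, completion) pairs; conditioning via channel concat."""
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(2, ch, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv3d(ch, ch * 2, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv3d(ch * 2, 1, 4, stride=2, padding=1),
        )

    def forward(self, partial, full):
        score_map = self.net(torch.cat([partial, full], dim=1))
        return score_map.mean(dim=[1, 2, 3, 4])  # one scalar score per sample

def generator_loss(disc, partial, fake, target, lam=10.0):
    """Adversarial term plus a voxel-wise reconstruction term (a common
    combination for conditional completion GANs; weights are arbitrary here)."""
    adv = -disc(partial, fake).mean()
    rec = nn.functional.binary_cross_entropy(fake, target)
    return adv + lam * rec

if __name__ == "__main__":
    g, d = Generator(), Discriminator()
    partial = (torch.rand(1, 1, 64, 64, 64) > 0.95).float()  # sparse 2.5D voxels
    target = (torch.rand(1, 1, 64, 64, 64) > 0.90).float()   # full occupancy grid
    fake = g(partial)
    print(fake.shape, generator_loss(d, partial, fake, target).item())
```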

    Tactile Mapping and Localization from High-Resolution Tactile Imprints

    This work studies the problem of shape reconstruction and object localization using a vision-based tactile sensor, GelSlim. The main contributions are the recovery of local shapes from contact, an approach to reconstruct the tactile shape of objects from tactile imprints, and an accurate method for localizing previously reconstructed objects. The algorithms can be applied to a large variety of 3D objects and provide accurate tactile feedback for in-hand manipulation. Results show that, by exploiting the dense tactile information, we can reconstruct the shape of objects with high accuracy and perform on-line object identification and localization, opening the door to reactive manipulation guided by tactile sensing. We provide videos and supplemental information on the project's website: http://web.mit.edu/mcube/research/tactile_localization.html
    Comment: ICRA 2019, 7 pages, 7 figures. Website: http://web.mit.edu/mcube/research/tactile_localization.html Video: https://youtu.be/uMkspjmDbq
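    The paper's localization method is specific to dense GelSlim imprints; as a generic stand-in for that step only, the sketch below aligns a small synthetic "contact patch" point cloud to a previously reconstructed object model with a toy point-to-point ICP (nearest-neighbour matching plus Kabsch alignment). All data and parameters are illustrative; this is not the authors' pipeline.

```python
# Toy point-to-point ICP (NumPy/SciPy), as a generic stand-in for localizing a
# tactile "contact patch" against a previously reconstructed object model.
# Not the authors' pipeline; all data below is synthetic.
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src -> dst (Kabsch)."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, mu_d - R @ mu_s

def icp(imprint, model, iters=30):
    """Iteratively align the imprint point cloud to the model; returns (R, t)."""
    tree = cKDTree(model)
    R, t = np.eye(3), np.zeros(3)
    cur = imprint.copy()
    for _ in range(iters):
        _, idx = tree.query(cur)                   # closest model point per imprint point
        R_step, t_step = best_rigid_transform(cur, model[idx])
        cur = cur @ R_step.T + t_step
        R, t = R_step @ R, R_step @ t + t_step     # compose the incremental pose
    return R, t

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    model = rng.random((2000, 3))                  # stand-in reconstructed surface
    ang = 0.1                                      # small perturbation to recover
    true_R = np.array([[np.cos(ang), -np.sin(ang), 0.0],
                       [np.sin(ang),  np.cos(ang), 0.0],
                       [0.0,          0.0,         1.0]])
    imprint = model[:200] @ true_R.T + 0.02        # displaced "contact patch"
    R, t = icp(imprint, model)
    print("mean residual:", np.abs(imprint @ R.T + t - model[:200]).mean())
```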

    Autonomous task-based grasping for mobile manipulators

    A fully integrated grasping system is presented that enables a mobile manipulator to grasp an unknown object of interest (OI) in an unknown environment. The system autonomously scans its environment, models the OI, and plans and executes a grasp, while accounting for base pose uncertainty and obstacles on its way to the object. Due to inherent line-of-sight limitations in sensing, a single scan of the OI often does not reveal enough information to complete grasp analysis; as a result, the system autonomously builds a model of the object via multiple scans from different locations until a grasp can be performed. A volumetric next-best-view (NBV) algorithm is used to model an arbitrary object and terminates modelling when grasp poses are discovered on the partially observed object. Two key sets of experiments are presented: i) modelling and registration error in the OI point cloud model is reduced by selecting viewpoints with more scan overlap, and ii) model construction and grasping are successfully achieved under base pose uncertainty.

    A generalized algorithm is presented to discover grasp pose solutions for multiple grasp types for a multi-fingered mechanical gripper using sensed point clouds. The algorithm introduces two key ideas: 1) a histogram of finger contact normals is used to represent a grasp “shape” and guide a gripper orientation search in a histogram of object surface normals, and 2) voxel grid representations of the gripper and object(s) are cross-correlated to match finger contact points, i.e. grasp “size”, and thereby discover a grasp pose. Constraints, such as collisions with neighbouring objects, are incorporated in the cross-correlation computation. Simulations and preliminary experiments show that 1) grasp poses for three grasp types are found in near real time, 2) grasp pose solutions are consistent with respect to voxel resolution changes for both partial and complete point cloud scans, 3) a planned grasp pose is executed with a mechanical gripper, and 4) grasp overlap is presented as a feature to identify regions on a partial object model that are ideal for object transfer or securing the object.
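    As an illustration of the cross-correlation idea only (the orientation search over normal histograms is omitted), the sketch below scores every placement of a hypothetical two-fingertip contact template against an object occupancy grid for one fixed gripper orientation, while penalizing overlap between the gripper body and a neighbouring obstacle. The grid sizes, templates, and collision weight are made-up values, not the thesis' parameters.

```python
# Sketch of scoring gripper placements by 3D cross-correlation, for a single
# fixed gripper orientation. Contact/body templates, grid sizes, and the
# collision weight are made-up values for illustration only.
import numpy as np
from scipy.signal import fftconvolve

def grasp_scores(object_occ, obstacle_occ, contact_kernel, body_kernel, w=5.0):
    """Higher score = more fingertip-object overlap, less body-obstacle overlap."""
    flip = lambda k: k[::-1, ::-1, ::-1]           # correlation = convolution with flipped kernel
    contact = fftconvolve(object_occ, flip(contact_kernel), mode="same")
    collision = fftconvolve(obstacle_occ, flip(body_kernel), mode="same")
    return contact - w * collision

if __name__ == "__main__":
    object_occ = np.zeros((40, 40, 40))
    object_occ[15:25, 15:25, 15:25] = 1.0          # 10-voxel cube to be grasped
    obstacle_occ = np.zeros((40, 40, 40))
    obstacle_occ[27:33, 15:25, 15:25] = 1.0        # neighbouring object to avoid

    # Two fingertip patches 10 voxels apart (a parallel-jaw "pinch"), and a
    # body volume approximating the space the gripper occupies.
    contact_kernel = np.zeros((10, 3, 3))
    contact_kernel[0] = 1.0                        # fingertip 1
    contact_kernel[-1] = 1.0                       # fingertip 2
    body_kernel = np.ones((10, 3, 3))

    scores = grasp_scores(object_occ, obstacle_occ, contact_kernel, body_kernel)
    best = np.unravel_index(np.argmax(scores), scores.shape)
    print("best placement centre (voxel):", best, "score:", round(float(scores[best]), 2))
```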

    Hand Keypoint Detection in Single Images using Multiview Bootstrapping

    We present an approach that uses a multi-camera system to train fine-grained detectors for keypoints that are prone to occlusion, such as the joints of a hand. We call this procedure multiview bootstrapping: first, an initial keypoint detector is used to produce noisy labels in multiple views of the hand. The noisy detections are then triangulated in 3D using multiview geometry or marked as outliers. Finally, the reprojected triangulations are used as new labeled training data to improve the detector. We repeat this process, generating more labeled data in each iteration. We derive a result analytically relating the minimum number of views needed to achieve target true and false positive rates for a given detector. The method is used to train a hand keypoint detector for single images. The resulting keypoint detector runs in real time on RGB images and has accuracy comparable to methods that use depth sensors. The single-view detector, triangulated over multiple views, enables 3D markerless hand motion capture with complex object interactions.
    Comment: CVPR 201
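    To make the geometric core of multiview bootstrapping concrete, here is a small NumPy sketch of the triangulate-and-filter step: noisy 2D detections of one keypoint from several calibrated cameras are triangulated with a linear (DLT) solve, and views with large reprojection error are marked as outliers. The camera rig, noise level, and error threshold are illustrative assumptions; the full method additionally retrains the detector on the reprojected labels.

```python
# NumPy sketch of the triangulate-and-filter step: linear (DLT) triangulation
# of one keypoint from several calibrated views, then outlier marking by
# reprojection error. Cameras, noise, and the 0.01 threshold are illustrative.
import numpy as np

def triangulate(P_list, uv_list):
    """DLT triangulation of a single 3D point from >= 2 views.
    P_list: 3x4 projection matrices; uv_list: matching 2D detections."""
    A = []
    for P, (u, v) in zip(P_list, uv_list):
        A.append(u * P[2] - P[0])
        A.append(v * P[2] - P[1])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    X = Vt[-1]
    return X[:3] / X[3]                      # dehomogenize

def reprojection_errors(P_list, uv_list, X):
    """Per-view distance between the reprojected point and the detection."""
    Xh = np.append(X, 1.0)
    errs = []
    for P, uv in zip(P_list, uv_list):
        x = P @ Xh
        errs.append(np.linalg.norm(x[:2] / x[2] - uv))
    return np.array(errs)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_true = np.array([0.1, -0.2, 3.0])      # ground-truth keypoint
    P_list, uv_list = [], []
    for tx in (-1.0, -0.3, 0.4, 1.1):        # cameras translated along x
        P = np.hstack([np.eye(3), np.array([[tx], [0.0], [0.0]])])
        x = P @ np.append(X_true, 1.0)
        uv_list.append(x[:2] / x[2] + rng.normal(0.0, 0.002, 2))  # noisy detection
        P_list.append(P)

    X = triangulate(P_list, uv_list)
    inliers = reprojection_errors(P_list, uv_list, X) < 0.01
    print("estimate:", X.round(3), "inlier views:", int(inliers.sum()), "/", len(P_list))
```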