
    SHOWMe: Benchmarking Object-agnostic Hand-Object 3D Reconstruction

    Recent hand-object interaction datasets show limited variability in real objects and rely on fitting the MANO parametric model to obtain ground-truth hand shapes. To go beyond these limitations and spur further research, we introduce the SHOWMe dataset, which consists of 96 videos annotated with real and detailed hand-object 3D textured meshes. Following recent work, we consider a rigid hand-object scenario, in which the pose of the hand with respect to the object remains constant during the whole video sequence. This assumption allows us to register sub-millimetre-precise ground-truth 3D scans to the image sequences in SHOWMe. Although simplifying, this hypothesis makes sense for applications where accuracy and level of detail matter, e.g., object hand-over in human-robot collaboration, object scanning, or manipulation and contact-point analysis. Importantly, the rigidity of the hand-object system makes it possible to tackle video-based 3D reconstruction of unknown hand-held objects with a two-stage pipeline consisting of a rigid registration step followed by a multi-view reconstruction (MVR) step. We carefully evaluate a set of non-trivial baselines for these two stages and show that promising object-agnostic 3D hand-object reconstructions can be achieved by employing an SfM toolbox or a hand pose estimator to recover the rigid transforms, together with off-the-shelf MVR algorithms. However, these methods remain sensitive to the initial camera pose estimates, which may be imprecise due to a lack of texture on the objects or heavy occlusion of the hands, leaving room for improvement in the reconstruction. Code and dataset are available at https://europe.naverlabs.com/research/showme
    Comment: Paper and Appendix, accepted in the ACVR workshop at the ICCV conference
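    Under the rigid hand-object assumption described in the abstract, a per-frame hand pose can stand in for the camera pose with respect to the fixed hand-object system, which is what makes the two-stage pipeline possible. The sketch below only illustrates that idea; estimate_hand_pose and run_mvr are hypothetical placeholders, not the SHOWMe code.

        # Illustrative sketch, assuming a hand pose estimator and an off-the-shelf
        # MVR method are available; not the authors' implementation.
        import numpy as np

        def camera_from_hand_pose(R_hand, t_hand):
            """Given the hand pose (R, t) in camera coordinates, return the camera
            pose in the fixed hand-object frame, i.e. the inverse rigid transform.
            Because the object does not move relative to the hand, this frame is
            shared by every image in the sequence."""
            R_cam = R_hand.T
            t_cam = -R_hand.T @ t_hand
            return R_cam, t_cam

        def reconstruct(frames):
            poses = []
            for img in frames:
                R_h, t_h = estimate_hand_pose(img)   # placeholder: any hand pose estimator
                poses.append(camera_from_hand_pose(R_h, t_h))
            return run_mvr(frames, poses)            # placeholder: any multi-view reconstruction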

    Deformable Objects for Virtual Environments


    Self-Supervised Object-in-Gripper Segmentation from Robotic Motions

    Accurate object segmentation is a crucial task in the context of robotic manipulation. However, creating sufficient annotated training data for neural networks is particularly time-consuming and often requires manual labeling. To this end, we propose a simple yet robust solution for learning to segment unknown objects grasped by a robot. Specifically, we exploit motion and temporal cues in RGB video sequences. Using optical flow estimation, we first learn to predict segmentation masks of our given manipulator. These annotations are then used in combination with motion cues to automatically distinguish between the background, the manipulator, and the unknown grasped object. In contrast to existing systems, our approach is fully self-supervised and independent of precise camera calibration, 3D models, or potentially imperfect depth data. We perform a thorough comparison with alternative baselines and approaches from the literature. The object masks and views are shown to be suitable training data for segmentation networks that generalize to novel environments and also allow for watertight 3D reconstruction.
    Comment: 15 pages, 11 figures. Video: https://www.youtube.com/watch?v=srEwuuIIgz
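    A minimal sketch of the motion-cue idea, not the authors' pipeline: pixels that move between consecutive frames but do not belong to the predicted manipulator mask are treated as the grasped object. The manipulator_mask argument stands in for the learned robot-arm segmentation described in the abstract, and the motion threshold is illustrative.

        # Hedged sketch using dense optical flow (OpenCV) as the motion cue.
        import cv2
        import numpy as np

        def grasped_object_mask(prev_bgr, next_bgr, manipulator_mask, motion_thresh=1.0):
            prev_gray = cv2.cvtColor(prev_bgr, cv2.COLOR_BGR2GRAY)
            next_gray = cv2.cvtColor(next_bgr, cv2.COLOR_BGR2GRAY)
            # Dense optical flow between consecutive frames of the robot motion.
            flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            magnitude = np.linalg.norm(flow, axis=2)
            moving = magnitude > motion_thresh        # everything that moves with the arm
            # Moving pixels minus the manipulator itself approximate the held object.
            return np.logical_and(moving, ~manipulator_mask.astype(bool))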

    Integrating Vision and Physical Interaction for Discovery, Segmentation and Grasping of Unknown Objects

    In this work, image processing techniques and the ability of humanoid robots to physically interact with their environment are employed in close interplay to identify unknown objects, separate them from the background and from other objects, and ultimately grasp them. In the course of this interactive exploration, object properties such as appearance and shape are also determined.
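    The simplest form of such interactive exploration can be sketched as a push-then-diff step: the robot nudges the scene and the image difference before and after the push reveals the extent of the unknown object. This is an illustrative sketch, not the thesis pipeline; camera.capture() and robot.poke_scene() are hypothetical placeholders for the perception and pushing primitives.

        # Hypothetical sketch of segmentation through physical interaction.
        import cv2

        def segment_by_push(camera, robot):
            before = cv2.cvtColor(camera.capture(), cv2.COLOR_BGR2GRAY)
            robot.poke_scene()                        # small physical interaction (placeholder)
            after = cv2.cvtColor(camera.capture(), cv2.COLOR_BGR2GRAY)
            # Pixels that changed between the two images belong to whatever moved.
            diff = cv2.absdiff(before, after)
            _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
            # Morphological closing merges the changed pixels into one object region.
            kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
            return cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)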