2 research outputs found

    Camera-to-Robot Pose Estimation from a Single Image

    We present an approach for estimating the pose of an external camera with respect to a robot using a single RGB image of the robot. The image is processed by a deep neural network to detect 2D projections of keypoints (such as joints) associated with the robot. The network is trained entirely on simulated data using domain randomization to bridge the reality gap. Perspective-n-point (PnP) is then used to recover the camera extrinsics, assuming that the camera intrinsics and joint configuration of the robot manipulator are known. Unlike classic hand-eye calibration systems, our method does not require an off-line calibration step. Rather, it is capable of computing the camera extrinsics from a single frame, thus opening the possibility of on-line calibration. We show experimental results for three different robots and camera sensors, demonstrating that our approach is able to achieve accuracy with a single frame that is comparable to that of classic off-line hand-eye calibration using multiple frames. With additional frames from a static pose, accuracy improves even further. Code, datasets, and pretrained models for three widely-used robot manipulators are made available.
    Comment: ICRA 2020. Project page is at https://research.nvidia.com/publication/2020-03_DREA
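
The geometric step in the abstract above (2D keypoint detections plus known 3D keypoint positions and camera intrinsics yield the camera extrinsics) can be illustrated with a small sketch. This is not the authors' code: it only evaluates the reprojection error that iterative PnP solvers typically minimize, and all names (`project`, `transform`, `reprojection_error`) and the numeric values are assumptions chosen for illustration. In practice the pose itself would come from a library solver such as OpenCV's `cv2.solvePnP`.

```python
import math

def project(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a 3D point in the camera frame to pixel coordinates."""
    x, y, z = point_cam
    return (fx * x / z + cx, fy * y / z + cy)

def transform(R, t, p):
    """Apply the rigid transform R * p + t (R is a 3x3 nested list)."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))

def reprojection_error(R, t, keypoints_robot, detections_px, intrinsics):
    """Mean pixel distance between detected and projected keypoints."""
    total = 0.0
    for p, (u, v) in zip(keypoints_robot, detections_px):
        pu, pv = project(transform(R, t, p), *intrinsics)
        total += math.hypot(pu - u, pv - v)
    return total / len(keypoints_robot)

# Hypothetical setup: keypoints in the robot frame (as forward kinematics
# would provide from the known joint configuration) and made-up intrinsics.
keypoints = [(0.0, 0.0, 0.0), (0.3, 0.0, 0.1), (0.0, 0.4, 0.2), (0.2, 0.2, 0.5)]
intrinsics = (600.0, 600.0, 320.0, 240.0)  # fx, fy, cx, cy (assumed)
R_true = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_true = (0.0, 0.0, 2.0)

# Synthesize "detections" from the true pose; a real system would obtain
# these from the keypoint-detection network.
detections = [project(transform(R_true, t_true, p), *intrinsics) for p in keypoints]

err_true = reprojection_error(R_true, t_true, keypoints, detections, intrinsics)
err_off = reprojection_error(R_true, (0.05, 0.0, 2.0), keypoints, detections, intrinsics)
```

The correct pose reproduces the detections exactly, while a 5 cm translation error shifts every projected keypoint by several pixels, which is the signal a PnP solver uses to recover the extrinsics.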

    Visuospatial Skill Learning for Robots

    A novel skill learning approach is proposed that allows a robot to acquire human-like visuospatial skills for object manipulation tasks. Visuospatial skills are attained by observing spatial relationships among objects through demonstrations. The proposed Visuospatial Skill Learning (VSL) is a goal-based approach that focuses on achieving a desired goal configuration of objects relative to one another while maintaining the sequence of operations. VSL is capable of learning and generalizing multi-operation skills from a single demonstration, while requiring minimal prior knowledge about the objects and the environment. In contrast to many existing approaches, VSL offers simplicity, efficiency and user-friendly human-robot interaction. We also show that VSL can be easily extended towards 3D object manipulation tasks, simply by employing point cloud processing techniques. In addition, a robot learning framework, VSL-SP, is proposed by integrating VSL, Imitation Learning, and a conventional planning method. In VSL-SP, the sequence of performed actions is learned using VSL, while the sensorimotor skills are learned using a conventional trajectory-based learning approach. Such integration easily extends robot capabilities to novel situations, even by users without programming ability. In VSL-SP, the internal planner of VSL is integrated with an existing action-level symbolic planner. Using the underlying constraints of the task and the symbolic predicates extracted by VSL, the symbolic representation of the task is updated. Therefore, the planner maintains a generalized representation of each skill as a reusable action, which can be used in planning and performed independently during the learning phase. The proposed approach is validated through several real-world experiments.
    Comment: 24 pages, 36 figure
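
The core idea described in the abstract above, learning where objects go relative to one another and replaying the operations in the same order, can be illustrated with a toy sketch. This is not the VSL implementation: the actual method works from visual observations, whereas this sketch assumes object names and 2D positions are already given. The functions `learn_from_demo` and `reproduce`, the data layout, and the example objects are all hypothetical.

```python
def learn_from_demo(demo):
    """Turn one demonstration into an ordered list of relative placements.

    demo: list of (moved_object, position_after, landmark, landmark_position).
    Each step is stored as an offset from the landmark, not an absolute position.
    """
    skill = []
    for moved, pos_after, landmark, landmark_pos in demo:
        offset = (pos_after[0] - landmark_pos[0], pos_after[1] - landmark_pos[1])
        skill.append((moved, landmark, offset))
    return skill

def reproduce(skill, scene):
    """Replay the learned steps, in order, in a new scene.

    scene: dict mapping object name to (x, y).
    Returns a list of (object, pick_position, place_position) commands.
    """
    scene = dict(scene)  # work on a copy so the caller's scene is untouched
    plan = []
    for moved, landmark, offset in skill:
        lx, ly = scene[landmark]
        target = (lx + offset[0], ly + offset[1])
        plan.append((moved, scene[moved], target))
        scene[moved] = target  # later steps may use this object as a landmark
    return plan

# One demonstration: place the cup right of the plate, then the spoon above the cup.
demo = [("cup", (12, 5), "plate", (10, 5)),
        ("spoon", (12, 6), "cup", (12, 5))]
skill = learn_from_demo(demo)

# A new scene with every object somewhere else.
new_scene = {"plate": (0, 0), "cup": (5, 5), "spoon": (7, 1)}
plan = reproduce(skill, new_scene)
```

Because each step is stored as an offset from another object, the same single demonstration generalizes to the rearranged scene, and the spoon is placed relative to the cup's new position since the order of operations is preserved.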