    Deep Object-Centric Representations for Generalizable Robot Learning

    Robotic manipulation in complex open-world scenarios requires both reliable physical manipulation skills and effective, generalizable perception. In this paper, we propose a method where general-purpose pretrained visual models serve as an object-centric prior for the perception system of a learned policy. We devise an object-level attentional mechanism that can be used to determine relevant objects from a few trajectories or demonstrations, and then immediately incorporate those objects into a learned policy. A task-independent meta-attention locates possible objects in the scene, and a task-specific attention identifies which objects are predictive of the trajectories. The scope of the task-specific attention is easily adjusted by showing demonstrations with distractor objects or with diverse relevant objects. Our results indicate that this approach exhibits good generalization across object instances using very few samples, and can be used to learn a variety of manipulation tasks using reinforcement learning.

    O'Map: An assistive solution for identifying and localizing objects in a semi-structured environment

    A system capable of detecting and localizing objects of interest in a semi-structured environment will enhance the quality of life of people who are blind or visually impaired. Towards building such a system, this thesis presents a personalized real-time system called O'Map that finds misplaced or moved personal items and localizes them with respect to known landmarks. First, we adopted a participatory design approach to identify users' needs and the required functionalities of the system. Second, we used concepts from systems thinking and design thinking to develop a real-time object recognition engine optimized to run on small-form-factor devices. The object recognition engine finds robust correspondences between the query image and item templates using a k-d tree of invariant feature descriptors, a two-nearest-neighbor search, and a ratio test. Quantitative evaluation demonstrates that O'Map identifies objects of interest with an average F-measure of 0.9650.
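
The two-nearest-neighbor search with a ratio test mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: a brute-force search stands in for the k-d tree, descriptors are plain tuples of floats, and the function names and the 0.75 threshold are assumptions chosen for illustration.

```python
import math


def two_nearest(query, templates):
    """Return (index of nearest template, nearest distance, second-nearest distance).

    Brute-force stand-in for a k-d tree query; needs at least two templates.
    """
    dists = sorted((math.dist(query, t), i) for i, t in enumerate(templates))
    (d1, i1), (d2, _) = dists[0], dists[1]
    return i1, d1, d2


def ratio_test_matches(queries, templates, ratio=0.75):
    """Keep a match only if the nearest template is clearly closer than the second.

    `ratio` is a hypothetical threshold, not a value taken from the abstract.
    """
    matches = []
    for qi, q in enumerate(queries):
        ti, d1, d2 = two_nearest(q, templates)
        if d1 < ratio * d2:
            matches.append((qi, ti))
    return matches


if __name__ == "__main__":
    templates = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    queries = [(0.5, 0.0), (5.0, 0.1)]  # the second query is ambiguous
    print(ratio_test_matches(queries, templates))
```

The ambiguous query sits equally far from two templates, so the ratio test rejects it and only the first query yields a correspondence.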

    SegICP: Integrated Deep Semantic Segmentation and Pose Estimation

    Recent robotic manipulation competitions have highlighted that sophisticated robots still struggle to achieve fast and reliable perception of task-relevant objects in complex, realistic scenarios. To improve these systems' perceptive speed and robustness, we present SegICP, a novel integrated solution to object recognition and pose estimation. SegICP couples convolutional neural networks and multi-hypothesis point cloud registration to achieve both robust pixel-wise semantic segmentation as well as accurate and real-time 6-DOF pose estimation for relevant objects. Our architecture achieves 1 cm position error and $<5^\circ$ angle error in real time without an initial seed. We evaluate and benchmark SegICP against an annotated dataset generated by motion capture. Comment: IROS camera-read
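
The abstract reports position and angle errors but does not say how they are computed. The sketch below shows one common convention, assumed here rather than taken from the paper: position error as the Euclidean distance between translations, and angle error as the rotation angle of the relative rotation, $\theta = \arccos\big((\mathrm{tr}(R_{est}^{\top} R_{gt}) - 1)/2\big)$. All function names are hypothetical.

```python
import math


def position_error(t_est, t_gt):
    """Euclidean distance between estimated and ground-truth translations."""
    return math.dist(t_est, t_gt)


def rotation_angle_error_deg(R_est, R_gt):
    """Angle in degrees of the relative rotation between two 3x3 rotation matrices."""
    # tr(R_est^T R_gt) equals the sum of element-wise products of the two matrices.
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def rot_z(deg):
    """Rotation matrix about the z-axis, used here only to build test poses."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


if __name__ == "__main__":
    # Hypothetical poses: 1 cm translation offset and 3 degrees of rotation offset.
    print(position_error((0.006, 0.008, 0.0), (0.0, 0.0, 0.0)))
    print(rotation_angle_error_deg(rot_z(3.0), rot_z(0.0)))
```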

    Non-line-of-sight tracking of people at long range

    A remote-sensing system that can determine the position of hidden objects has applications in many critical real-life scenarios, such as search and rescue missions and safe autonomous driving. Previous work has shown the ability to range and image objects hidden from the direct line of sight, employing advanced optical imaging technologies aimed at small objects at short range. In this work we demonstrate a long-range tracking system based on single laser illumination and single-pixel single-photon detection. This enables us to track one or more people hidden from view at a stand-off distance of over 50 m. These results pave the way towards next-generation LiDAR systems that will reconstruct not only the direct-view scene but also the main elements hidden behind walls or corners.