3,469 research outputs found

    J-MOD2^{2}: Joint Monocular Obstacle Detection and Depth Estimation

    Full text link
    In this work, we propose an end-to-end deep architecture that jointly learns to detect obstacles and estimate their depth for MAV flight applications. Most of the existing approaches either rely on Visual SLAM systems or on depth estimation models to build 3D maps and detect obstacles. However, for the task of avoiding obstacles this level of complexity is not required. Recent works have proposed multi task architectures to both perform scene understanding and depth estimation. We follow their track and propose a specific architecture to jointly estimate depth and obstacles, without the need to compute a global map, but maintaining compatibility with a global SLAM system if needed. The network architecture is devised to exploit the joint information of the obstacle detection task, that produces more reliable bounding boxes, with the depth estimation one, increasing the robustness of both to scenario changes. We call this architecture J-MOD2^{2}. We test the effectiveness of our approach with experiments on sequences with different appearance and focal lengths and compare it to SotA multi task methods that jointly perform semantic segmentation and depth estimation. In addition, we show the integration in a full system using a set of simulated navigation experiments where a MAV explores an unknown scenario and plans safe trajectories by using our detection model

    Towards Monocular Vision based Obstacle Avoidance through Deep Reinforcement Learning

    Full text link
    Obstacle avoidance is a fundamental requirement for autonomous robots which operate in, and interact with, the real world. When perception is limited to monocular vision avoiding collision becomes significantly more challenging due to the lack of 3D information. Conventional path planners for obstacle avoidance require tuning a number of parameters and do not have the ability to directly benefit from large datasets and continuous use. In this paper, a dueling architecture based deep double-Q network (D3QN) is proposed for obstacle avoidance, using only monocular RGB vision. Based on the dueling and double-Q mechanisms, D3QN can efficiently learn how to avoid obstacles in a simulator even with very noisy depth information predicted from RGB image. Extensive experiments show that D3QN enables twofold acceleration on learning compared with a normal deep Q network and the models trained solely in virtual environments can be directly transferred to real robots, generalizing well to various new environments with previously unseen dynamic objects.Comment: Accepted by RSS 2017 workshop New Frontiers for Deep Learning in Robotic

    Monocular Vision as a Range Sensor

    Get PDF
    One of the most important abilities for a mobile robot is detecting obstacles in order to avoid collisions. Building a map of these obstacles is the next logical step. Most robots to date have used sensors such as passive or active infrared, sonar or laser range finders to locate obstacles in their path. In contrast, this work uses a single colour camera as the only sensor, and consequently the robot must obtain range information from the camera images. We propose simple methods for determining the range to the nearest obstacle in any direction in the robot’s field of view, referred to as the Radial Obstacle Profile. The ROP can then be used to determine the amount of rotation between two successive images, which is important for constructing a 360º view of the surrounding environment as part of map construction

    Vision and Learning for Deliberative Monocular Cluttered Flight

    Full text link
    Cameras provide a rich source of information while being passive, cheap and lightweight for small and medium Unmanned Aerial Vehicles (UAVs). In this work we present the first implementation of receding horizon control, which is widely used in ground vehicles, with monocular vision as the only sensing mode for autonomous UAV flight in dense clutter. We make it feasible on UAVs via a number of contributions: novel coupling of perception and control via relevant and diverse, multiple interpretations of the scene around the robot, leveraging recent advances in machine learning to showcase anytime budgeted cost-sensitive feature selection, and fast non-linear regression for monocular depth prediction. We empirically demonstrate the efficacy of our novel pipeline via real world experiments of more than 2 kms through dense trees with a quadrotor built from off-the-shelf parts. Moreover our pipeline is designed to combine information from other modalities like stereo and lidar as well if available
    corecore