355 research outputs found

    Vision and Learning for Deliberative Monocular Cluttered Flight

    Full text link
    Cameras provide a rich source of information while being passive, cheap and lightweight for small and medium Unmanned Aerial Vehicles (UAVs). In this work we present the first implementation of receding horizon control, which is widely used in ground vehicles, with monocular vision as the only sensing mode for autonomous UAV flight in dense clutter. We make it feasible on UAVs via a number of contributions: novel coupling of perception and control via relevant and diverse, multiple interpretations of the scene around the robot, leveraging recent advances in machine learning to showcase anytime budgeted cost-sensitive feature selection, and fast non-linear regression for monocular depth prediction. We empirically demonstrate the efficacy of our novel pipeline via real world experiments of more than 2 kms through dense trees with a quadrotor built from off-the-shelf parts. Moreover our pipeline is designed to combine information from other modalities like stereo and lidar as well if available

    Vision-based localization methods under GPS-denied conditions

    Full text link
    This paper reviews vision-based localization methods in GPS-denied environments and classifies the mainstream methods into Relative Vision Localization (RVL) and Absolute Vision Localization (AVL). For RVL, we discuss the broad application of optical flow in feature extraction-based Visual Odometry (VO) solutions and introduce advanced optical flow estimation methods. For AVL, we review recent advances in Visual Simultaneous Localization and Mapping (VSLAM) techniques, from optimization-based methods to Extended Kalman Filter (EKF) based methods. We also introduce the application of offline map registration and lane vision detection schemes to achieve Absolute Visual Localization. This paper compares the performance and applications of mainstream methods for visual localization and provides suggestions for future studies.Comment: 32 pages, 15 figure

    Self-supervised learning: When is fusion of the primary and secondary sensor cue useful?

    Get PDF
    Self-supervised learning (SSL) is a reliable learning mechanism in which a robot enhances its perceptual capabilities. Typically, in SSL a trusted, primary sensor cue provides supervised training data to a secondary sensor cue. In this article, a theoretical analysis is performed on the fusion of the primary and secondary cue in a minimal model of SSL. A proof is provided that determines the specific conditions under which it is favorable to perform fusion. In short, it is favorable when (i) the prior on the target value is strong or (ii) the secondary cue is sufficiently accurate. The theoretical findings are validated with computational experiments. Subsequently, a real-world case study is performed to investigate if fusion in SSL is also beneficial when assumptions of the minimal model are not met. In particular, a flying robot learns to map pressure measurements to sonar height measurements and then fuses the two, resulting in better height estimation. Fusion is also beneficial in the opposite case, when pressure is the primary cue. The analysis and results are encouraging to study SSL fusion also for other robots and sensors

    Visual Guidance for Unmanned Aerial Vehicles with Deep Learning

    Full text link
    Unmanned Aerial Vehicles (UAVs) have been widely applied in the military and civilian domains. In recent years, the operation mode of UAVs is evolving from teleoperation to autonomous flight. In order to fulfill the goal of autonomous flight, a reliable guidance system is essential. Since the combination of Global Positioning System (GPS) and Inertial Navigation System (INS) systems cannot sustain autonomous flight in some situations where GPS can be degraded or unavailable, using computer vision as a primary method for UAV guidance has been widely explored. Moreover, GPS does not provide any information to the robot on the presence of obstacles. Stereo cameras have complex architecture and need a minimum baseline to generate disparity map. By contrast, monocular cameras are simple and require less hardware resources. Benefiting from state-of-the-art Deep Learning (DL) techniques, especially Convolutional Neural Networks (CNNs), a monocular camera is sufficient to extrapolate mid-level visual representations such as depth maps and optical flow (OF) maps from the environment. Therefore, the objective of this thesis is to develop a real-time visual guidance method for UAVs in cluttered environments using a monocular camera and DL. The three major tasks performed in this thesis are investigating the development of DL techniques and monocular depth estimation (MDE), developing real-time CNNs for MDE, and developing visual guidance methods on the basis of the developed MDE system. A comprehensive survey is conducted, which covers Structure from Motion (SfM)-based methods, traditional handcrafted feature-based methods, and state-of-the-art DL-based methods. More importantly, it also investigates the application of MDE in robotics. Based on the survey, two CNNs for MDE are developed. In addition to promising accuracy performance, these two CNNs run at high frame rates (126 fps and 90 fps respectively), on a single modest power Graphical Processing Unit (GPU). As regards the third task, the visual guidance for UAVs is first developed on top of the designed MDE networks. To improve the robustness of UAV guidance, OF maps are integrated into the developed visual guidance method. A cross-attention module is applied to fuse the features learned from the depth maps and OF maps. The fused features are then passed through a deep reinforcement learning (DRL) network to generate the policy for guiding the flight of UAV. Additionally, a simulation framework is developed which integrates AirSim, Unreal Engine and PyTorch. The effectiveness of the developed visual guidance method is validated through extensive experiments in the simulation framework

    An Autonomous Surface Vehicle for Long Term Operations

    Full text link
    Environmental monitoring of marine environments presents several challenges: the harshness of the environment, the often remote location, and most importantly, the vast area it covers. Manual operations are time consuming, often dangerous, and labor intensive. Operations from oceanographic vessels are costly and limited to open seas and generally deeper bodies of water. In addition, with lake, river, and ocean shoreline being a finite resource, waterfront property presents an ever increasing valued commodity, requiring exploration and continued monitoring of remote waterways. In order to efficiently explore and monitor currently known marine environments as well as reach and explore remote areas of interest, we present a design of an autonomous surface vehicle (ASV) with the power to cover large areas, the payload capacity to carry sufficient power and sensor equipment, and enough fuel to remain on task for extended periods. An analysis of the design and a discussion on lessons learned during deployments is presented in this paper.Comment: In proceedings of MTS/IEEE OCEANS, 2018, Charlesto

    Vision-based Learning for Drones: A Survey

    Full text link
    Drones as advanced cyber-physical systems are undergoing a transformative shift with the advent of vision-based learning, a field that is rapidly gaining prominence due to its profound impact on drone autonomy and functionality. Different from existing task-specific surveys, this review offers a comprehensive overview of vision-based learning in drones, emphasizing its pivotal role in enhancing their operational capabilities under various scenarios. We start by elucidating the fundamental principles of vision-based learning, highlighting how it significantly improves drones' visual perception and decision-making processes. We then categorize vision-based control methods into indirect, semi-direct, and end-to-end approaches from the perception-control perspective. We further explore various applications of vision-based drones with learning capabilities, ranging from single-agent systems to more complex multi-agent and heterogeneous system scenarios, and underscore the challenges and innovations characterizing each area. Finally, we explore open questions and potential solutions, paving the way for ongoing research and development in this dynamic and rapidly evolving field. With growing large language models (LLMs) and embodied intelligence, vision-based learning for drones provides a promising but challenging road towards artificial general intelligence (AGI) in 3D physical world
    corecore