
    Motion control using optical flow of sparse image features

    Reactive motion planning and local navigation remain a significant challenge in the motion control of robotic vehicles. This thesis presents new results on vision-guided navigation using optical flow. By detecting key image features, calculating optical flow and leveraging time-to-transit (tau) as a feedback signal, control architectures can steer a vehicle so as to avoid obstacles while simultaneously using them as navigation beacons. Averaging and balancing tau over multiple image features successfully guides a vehicle along a corridor while avoiding looming objects in the periphery. In addition, the averaging strategy deemphasizes noise associated with rotationally induced flow fields, mitigating risks of positive feedback akin to the Larsen effect. A recently developed, biologically inspired binary keypoint descriptor, FREAK, offers processing speed-ups that make vision-based feedback signals achievable. A Parrot AR.Drone 2.0 has proven to be a reliable platform for testing the architecture and has demonstrated the control law's effectiveness in using time-to-transit calculations for real-time navigation.
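    The sketch below illustrates the tau-balancing idea in minimal form: per-feature time-to-transit is approximated as image position divided by horizontal optical flow, and the steering command is proportional to the difference between the average tau on the two image halves. The function names, gain, sign convention and the assumption of pure forward translation are illustrative, not the thesis' control architecture.

```python
import numpy as np

def time_to_transit(x, u, eps=1e-6):
    """Tau ~ x / (dx/dt): time until a feature crosses the camera's
    transverse plane, assuming (roughly) pure forward translation."""
    u_safe = np.where(np.abs(u) < eps, eps, u)   # avoid division by zero
    return x / u_safe

def balance_steering(feature_x, flow_x, img_width, gain=0.5):
    """Steer so that the average tau on the left and right image halves match.
    `feature_x` are feature x-coordinates (pixels) and `flow_x` their
    horizontal optical-flow components (pixels/frame), e.g. from a tracker
    such as cv2.calcOpticalFlowPyrLK (hypothetical inputs)."""
    x = feature_x - img_width / 2.0              # coordinates relative to image centre
    tau = np.abs(time_to_transit(x, flow_x))
    left, right = tau[x < 0], tau[x >= 0]
    if len(left) == 0 or len(right) == 0:
        return 0.0                               # no information: keep heading
    # Convention: positive command = turn right. If obstacles on the right
    # transit sooner (smaller tau), the error is negative and we turn left.
    return gain * (np.mean(right) - np.mean(left))
```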

    Deep Learning-Based 6-DoF Object Pose Estimation With Synthetic Data: A Case Study in Underwater Environments

    In this thesis we address the image-based 6-DoF (3D) pose estimation problem for Autonomous Underwater Vehicles (AUVs). The resulting object pose estimates can be used, for example, to estimate the global location of the AUV or to approach underwater infrastructure more accurately. An autonomous robot, or a team of autonomous robots, needs accurate localization skills to move safely and effectively within an underwater environment, where communications are sparse and unreliable, and to accomplish high-level tasks such as underwater exploration, mapping of the surrounding environment, multi-robot conveyance and many other multi-robot problems. Several state-of-the-art approaches are analysed and tested on real datasets. Collecting underwater images and providing them with an accurate ground-truth estimate of the object's pose is an expensive and extremely time-consuming activity. To this end, we address the problem using only synthetic datasets. The standard datasets used in the analysed papers could not be used, since their objects and conditions differ greatly from those in which AUVs operate. Hence, an unpaired image-to-image translation network is employed to bridge the gap between the rendered and the real images, producing photorealistic synthetic training images. Promising preliminary results confirm the soundness of these choices.
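    A minimal sketch of the training flow this abstract describes is given below: synthetic renders with known poses are passed through a pretrained unpaired image-to-image translation generator before being fed to a pose regressor. The names `render_batch`, `sim2real` and `PoseNet`, and the translation + quaternion loss, are assumptions for illustration only, not the thesis' implementation.

```python
import torch
import torch.nn as nn

def train_step(pose_net, sim2real, render_batch, optimizer):
    """One training step of a 6-DoF pose regressor on translated synthetic data.
    `render_batch()` (hypothetical) returns rendered images with ground-truth
    translations and unit quaternions; `sim2real` is a frozen CycleGAN-style
    generator bridging the rendered-to-real domain gap."""
    images, gt_t, gt_q = render_batch()
    with torch.no_grad():
        images = sim2real(images)            # photorealistic synthetic images
    pred_t, pred_q = pose_net(images)
    loss_t = nn.functional.l1_loss(pred_t, gt_t)
    # Normalise predicted quaternions before comparing orientations.
    pred_q = pred_q / pred_q.norm(dim=1, keepdim=True)
    loss_q = nn.functional.l1_loss(pred_q, gt_q)
    loss = loss_t + loss_q
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```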

    Basic level scene understanding: categories, attributes and structures

    A longstanding goal of computer vision is to build a system that can automatically understand a 3D scene from a single image. This requires extracting semantic concepts and 3D information from 2D images, which can depict an enormous variety of environments that comprise our visual world. This paper summarizes our recent efforts toward these goals. First, we describe the richly annotated SUN database, a collection of annotated images spanning 908 different scene categories with object, attribute, and geometric labels for many scenes. This database allows us to systematically study the space of scenes and to establish a benchmark for scene and object recognition. We augment the categorical SUN database with 102 scene attributes for every image and explore attribute recognition. Finally, we present an integrated system to extract the 3D structure of the scene and objects depicted in an image.
    Funding: Google U.S./Canada Ph.D. Fellowship in Computer Vision; National Science Foundation (U.S.) (grant 1016862); Google Faculty Research Award; National Science Foundation (U.S.) (Career Award 1149853); National Science Foundation (U.S.) (Career Award 0747120); United States Office of Naval Research, Multidisciplinary University Research Initiative (N000141010933).
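    Attribute recognition of the kind mentioned above is naturally cast as multi-label classification. The sketch below shows that framing on precomputed image features; the feature extractor, the feature matrix `X` and the binary attribute matrix `Y` (one column per attribute, in the spirit of the 102 SUN attributes) are assumed inputs, not part of the paper's pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

def train_attribute_classifiers(X, Y):
    """Fit one binary classifier per scene attribute.
    X: (n_images, d) precomputed features; Y: (n_images, n_attributes)
    binary indicator matrix. Both are hypothetical placeholders."""
    clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
    clf.fit(X, Y)
    return clf

# Usage: per-attribute probabilities for new images
# probs = train_attribute_classifiers(X, Y).predict_proba(X_new)
```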

    Deep neural networks for marine debris detection in sonar images

    Garbage and waste disposal is one of the biggest challenges currently faced by mankind. Proper waste disposal and recycling is a must in any sustainable community, and in many coastal areas there is significant water pollution in the form of floating or submerged garbage. This is called marine debris. It is estimated that 6.4 million tonnes of marine debris enter water environments every year [McIlgorm et al. 2008, APEC Marine Resource Conservation WG], with 8 million items entering each day. An unknown fraction of this sinks to the bottom of water bodies. Submerged marine debris threatens marine life, and for shallow coastal areas it can also threaten fishing vessels [Iñiguez et al. 2016, Renewable and Sustainable Energy Reviews]. Submerged marine debris typically stays in the environment for a long time (20+ years), and consists of materials that can be recycled, such as metals, plastics and glass. Many of these items should not be disposed of in water bodies, as this has a negative effect on the environment and human health. Encouraged by the advances in Computer Vision from the use of Deep Learning, we propose the use of Deep Neural Networks (DNNs) to survey and detect marine debris on the bottom of water bodies (seafloor, lake and river beds) from Forward-Looking Sonar (FLS) images. This thesis performs a comprehensive evaluation of DNNs for the problem of marine debris detection in FLS images, as well as related problems such as image classification, matching, and detection proposals. We do this on a dataset of 2069 FLS images that we captured with an ARIS Explorer 3000 sensor on marine debris objects lying on the floor of a small water tank. Issues with the sensor in a real-world underwater environment motivated the use of a water tank. The objects used to produce this dataset comprise typical household marine debris and distractor marine objects (tires, hooks, valves, etc.), divided into 10 classes plus a background class. Our results show that, for the evaluated tasks, DNNs are a superior technique to the corresponding state of the art, with large gains particularly for the matching and detection proposal tasks. We also study the effect of sample complexity and object size in many tasks, which is valuable information for practitioners. We expect that our results will advance the objective of using Autonomous Underwater Vehicles to automatically survey, detect and collect marine debris from underwater environments.
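    For the image classification task described above, a minimal sketch of a sonar patch classifier is shown below: a small convolutional network over single-channel crops with 11 output classes (10 debris classes plus background). The input size, depth and channel widths are illustrative choices and not the architectures evaluated in the thesis.

```python
import torch
import torch.nn as nn

class SonarPatchNet(nn.Module):
    """Toy classifier for single-channel FLS patches (assumed 96x96)."""
    def __init__(self, num_classes=11):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

# Usage: class logits for a batch of 8 patches
# logits = SonarPatchNet()(torch.randn(8, 1, 96, 96))
```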

    Online Self-Supervised Thermal Water Segmentation for Aerial Vehicles

    We present a new method to adapt an RGB-trained water segmentation network to target-domain aerial thermal imagery using online self-supervision, by leveraging texture and motion cues as supervisory signals. This new thermal capability enables current autonomous aerial robots operating in near-shore environments to perform tasks such as visual navigation, bathymetry, and flow tracking at night. Our method overcomes the problem of scarce and difficult-to-obtain near-shore thermal data that prevents the application of conventional supervised and unsupervised methods. In this work, we curate the first aerial thermal near-shore dataset, show that our approach outperforms fully-supervised segmentation models trained on limited target-domain thermal data, and demonstrate real-time capabilities onboard an Nvidia Jetson embedded computing platform. Code and datasets used in this work will be available at: https://github.com/connorlee77/uav-thermal-water-segmentation.
    Comment: 8 pages, 4 figures, 3 tables
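    The sketch below illustrates one possible shape of such an online self-supervision step, using only a texture cue: smooth (low-variance) thermal regions are pseudo-labelled as water, high-variance regions as non-water, and ambiguous pixels are ignored during the online update. The thresholds, the two-class network `net`, and the omission of the motion cue are assumptions for illustration, not the paper's method.

```python
import cv2
import numpy as np
import torch
import torch.nn.functional as F

def texture_pseudo_labels(thermal, k=15, low=5.0, high=25.0):
    """Pseudo-labels from local intensity standard deviation:
    1 = water (smooth), 0 = non-water (textured), 255 = ignore."""
    f = thermal.astype(np.float32)
    mean = cv2.blur(f, (k, k))
    var = cv2.blur(f * f, (k, k)) - mean * mean
    std = np.sqrt(np.maximum(var, 0.0))
    labels = np.full(thermal.shape, 255, np.uint8)
    labels[std < low] = 1
    labels[std > high] = 0
    return labels

def adapt_step(net, optimizer, thermal):
    """One online fine-tuning step on a single thermal frame (H x W, uint8).
    `net` is assumed to take a 1-channel image and output 2-class logits."""
    labels = torch.from_numpy(texture_pseudo_labels(thermal)).long()[None]
    x = torch.from_numpy(thermal.astype(np.float32) / 255.0)[None, None]
    loss = F.cross_entropy(net(x), labels, ignore_index=255)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```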

    Distributed scene reconstruction from multiple mobile platforms

    Recent research on mobile robotics has produced new designs that provide household robots with omnidirectional motion. The image sensor embedded in these devices motivates the application of 3D vision techniques for navigation and mapping. In addition, distributed cheap-sensing systems acting as a unitary entity have recently emerged as an efficient alternative to expensive mobile equipment. In this work we present an implementation of a visual reconstruction method, structure from motion (SfM), on a low-budget, omnidirectional mobile platform, and extend this method to distributed 3D scene reconstruction with several instances of such a platform. Our approach overcomes the challenges posed by the platform. The unprecedented levels of noise produced by the image compression typical of the platform are handled by our feature filtering methods, which ensure suitable feature-matching populations for epipolar geometry estimation by means of a strict quality-based feature selection. The robust pose estimation algorithms implemented, along with a novel feature tracking system, enable our incremental SfM approach to deal with ill-conditioned inter-image configurations caused by the omnidirectional motion. The feature tracking system efficiently manages the feature scarcity produced by noise and outputs quality feature tracks, which allow robust 3D mapping of a given scene even if, due to noise, their length is shorter than what is usually assumed for stable 3D reconstruction. The distributed reconstruction from multiple instances of SfM is attained by applying loop-closing techniques. Our multiple-reconstruction system merges individual 3D structures and resolves the global scale problem with minimal overlaps, whereas in the literature 3D mapping is obtained by overlapping stretches of sequences. The performance of this system is demonstrated in the two-session case. The management of noise, the stability against ill-conditioned configurations and the robustness of our SfM system are validated in a number of experiments and compared with state-of-the-art approaches. Possible future research areas are also discussed.
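    As a minimal illustration of the feature-filtering and epipolar-estimation stage, the sketch below uses OpenCV: ORB features, Lowe's ratio test as a simple quality-based filter, essential-matrix estimation with RANSAC, and relative pose recovery between two views. The thesis' own filtering, tracking and multi-platform merging are far more elaborate; `K` is the camera intrinsic matrix and the ratio threshold is an assumed value.

```python
import cv2
import numpy as np

def relative_pose(img1, img2, K, ratio=0.75):
    """Two-view relative pose from ratio-test-filtered ORB matches."""
    orb = cv2.ORB_create(4000)
    k1, d1 = orb.detectAndCompute(img1, None)
    k2, d2 = orb.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    pairs = [p for p in matcher.knnMatch(d1, d2, k=2) if len(p) == 2]
    good = [m for m, n in pairs if m.distance < ratio * n.distance]
    p1 = np.float32([k1[m.queryIdx].pt for m in good])
    p2 = np.float32([k2[m.trainIdx].pt for m in good])
    # Robust epipolar geometry, then decompose into rotation and translation.
    E, mask = cv2.findEssentialMat(p1, p2, K, cv2.RANSAC, 0.999, 1.0)
    _, R, t, _ = cv2.recoverPose(E, p1, p2, K, mask=mask)
    return R, t, int(mask.sum())
```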

    Photogrammetric suite to manage the survey workflow in challenging environments and conditions

    The present work is intended to provide new and innovative instruments to support the photogrammetric survey workflow during all its phases. A suite of tools has been conceived to manage the planning, acquisition, post-processing and restitution steps, with particular attention to the rigour of the approach and to the final precision. The main focus of the research has been the implementation of the tool MAGO, standing for Adaptive Mesh for Orthophoto Generation. Its novelty consists in the possibility of automatically reconstructing "unrolled" orthophotos of adjacent façades of a building using the point cloud, instead of the mesh, as the input source for the orthophoto reconstruction. The second tool has been conceived as a photogrammetric procedure based on Bundle Block Adjustment. The same issue is analysed from two mirrored perspectives: on the one hand, the use of moving cameras in a static scenario in order to manage real-time indoor navigation; on the other hand, the use of static cameras in a moving scenario in order to achieve the simultaneous reconstruction of the 3D model of the changing object. A third tool named U.Ph.O., standing for Unmanned Photogrammetric Office, has been integrated with a new module. The general aim is, on the one hand, to plan the photogrammetric survey considering the expected precision, computed on the basis of a network simulation, and, on the other hand, to check whether the completed survey was collected under the planned conditions. The provided integration concerns the treatment of surfaces with a generic orientation, in addition to those with a planimetric development. After a brief introduction, a general description of the photogrammetric principles is given in the first chapter of the dissertation; a chapter follows on the parallelism between Photogrammetry and Computer Vision and the contribution of the latter to the development of the described tools. The third chapter specifically regards the implemented software and tools, while the fourth contains the training tests and the validation. Finally, conclusions and future perspectives are reported.
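    To make the point-cloud-to-orthophoto idea concrete, the sketch below projects coloured 3D points onto a façade plane (given an origin and two in-plane unit axes) and splats them into a raster at a chosen pixel size. MAGO's adaptive meshing, occlusion handling and the "unrolling" across adjacent façades are not reproduced here; the plane parameterisation and the nearest-point splatting are assumptions for illustration only.

```python
import numpy as np

def facade_orthophoto(points, colors, origin, u, v, gsd=0.01):
    """Rasterise a coloured point cloud onto a façade plane.
    points: (N, 3) coordinates; colors: (N, 3) uint8 RGB;
    origin: (3,) point on the plane; u, v: (3,) orthonormal in-plane axes;
    gsd: pixel size in metres (ground sample distance)."""
    rel = points - origin
    s, t = rel @ u, rel @ v                      # plane coordinates (metres)
    cols = ((s - s.min()) / gsd).astype(int)
    rows = ((t.max() - t) / gsd).astype(int)     # flip so "up" is row 0
    img = np.zeros((rows.max() + 1, cols.max() + 1, 3), np.uint8)
    img[rows, cols] = colors                     # nearest-point splatting
    return img
```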