Motion control using optical flow of sparse image features
Reactive motion planning and local navigation of robots remains a significant challenge in the motion control of robotic vehicles. This thesis presents new results on vision guided navigation using optical flow. By detecting key image features, calculating optical flow and leveraging time-to-transit (tau) as a feedback signal, control architectures can steer a vehicle so as to avoid obstacles while simultaneously using them as navigation beacons.
Averaging and balancing tau over multiple image features successfully guides a vehicle along a corridor while avoiding looming objects in the periphery. In addition, the averaging strategy deemphasizes noise associated with rotationally induced flow fields, mitigating risks of positive feedback akin to the Larsen effect.
A recently developed, biologically inspired binary keypoint descriptor, FREAK, offers processing speed-ups that make vision-based feedback signals achievable. A Parrot ARDrone2 has proven to be a reliable platform for testing the architecture and has demonstrated the control law's effectiveness in using time-to-transit calculations for real-time navigation.
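The tau-balancing idea described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the control law from the thesis: the function names, the gain, the sign convention (positive means steer right), and the definition tau = x / x_dot with x measured from the image centre are all assumptions made for the example.

```python
import numpy as np


def time_to_transit(x, x_dot, eps=1e-6):
    """Return tau = x / x_dot per feature; NaN where the flow is too small.

    x     : horizontal image coordinates measured from the image centre
    x_dot : horizontal optical-flow components of the same features
    """
    x = np.asarray(x, dtype=float)
    x_dot = np.asarray(x_dot, dtype=float)
    tau = np.full(x.shape, np.nan)
    valid = np.abs(x_dot) > eps          # skip features with negligible flow
    tau[valid] = x[valid] / x_dot[valid]
    return tau


def steering_command(x, x_dot, gain=0.1):
    """Balance the mean tau of the left and right image halves.

    Assumed convention: a positive value steers right, away from a left side
    whose features will be passed sooner (smaller tau).
    """
    x = np.asarray(x, dtype=float)
    tau = time_to_transit(x, x_dot)
    ok = np.isfinite(tau) & (tau > 0)    # keep features moving outward only
    left = tau[ok & (x < 0)]
    right = tau[ok & (x > 0)]
    if left.size == 0 or right.size == 0:
        return 0.0                       # nothing to balance against
    return float(gain * (right.mean() - left.mean()))
```

Averaging over several features per side, as in this sketch, is what dampens the influence of any single noisy flow vector.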
Deep Learning-Based 6-DoF Object Pose Estimation With Synthetic Data: A Case Study in Underwater Environments
In this thesis we address the image-based 6-DoF pose estimation problem, or 3D pose estimation problem, for Autonomous Underwater Vehicles (AUVs). The results of the object pose estimation will be used, for example, to estimate the global location of the AUV or to approach underwater infrastructure more accurately. Indeed, an autonomous robot or a team of autonomous robots needs accurate localization capabilities to move safely and effectively within an underwater environment, where communications are sparse and unreliable, and to accomplish high-level tasks such as underwater exploration, mapping of the surrounding environment, multi-robot conveyance and many other multi-robot problems.
Several state-of-the-art approaches will be analysed and tested on real datasets.
Collecting underwater images and providing them with an accurate ground-truth estimate of the object's pose is an expensive and extremely time-consuming activity.
To this end, we addressed the problem using only synthetic datasets. It was not possible to use the standard datasets used in the analyzed papers, since their objects and conditions are very different from those in which AUVs operate. Hence, we employed an unpaired image-to-image translation network to bridge the gap between the rendered and the real images, producing photorealistic synthetic training images. Promising preliminary results support these choices.
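To make the notion of a 6-DoF pose concrete, the sketch below applies a pose (three rotational plus three translational parameters) to object points and projects them with a pinhole camera. It is an illustrative sketch only and is not taken from the thesis: the axis-angle parameterisation, the function names, and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions chosen for the example.

```python
import numpy as np


def rotation_from_rvec(rvec):
    """Rodrigues formula: axis-angle vector (radians) to a 3x3 rotation matrix."""
    rvec = np.asarray(rvec, dtype=float)
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)                 # no rotation
    k = rvec / theta                     # unit rotation axis
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def project(points, rvec, tvec, fx, fy, cx, cy):
    """Project object-frame 3D points to pixels for a given 6-DoF pose."""
    R = rotation_from_rvec(rvec)
    cam = np.asarray(points, dtype=float) @ R.T + np.asarray(tvec, dtype=float)
    u = fx * cam[:, 0] / cam[:, 2] + cx  # perspective division by depth
    v = fy * cam[:, 1] / cam[:, 2] + cy
    return np.column_stack((u, v))
```

A pose estimator can be scored by comparing such projections under the predicted and ground-truth poses, which is one reason accurate ground truth matters.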
Basic level scene understanding: categories, attributes and structures
A longstanding goal of computer vision is to build a system that can automatically understand a 3D scene from a single image. This requires extracting semantic concepts and 3D information from 2D images which can depict an enormous variety of environments that comprise our visual world. This paper summarizes our recent efforts toward these goals. First, we describe the richly annotated SUN database, which is a collection of annotated images spanning 908 different scene categories with object, attribute, and geometric labels for many scenes. This database allows us to systematically study the space of scenes and to establish a benchmark for scene and object recognition. We augment the categorical SUN database with 102 scene attributes for every image and explore attribute recognition. Finally, we present an integrated system to extract the 3D structure of the scene and objects depicted in an image.
Google U.S./Canada Ph.D. Fellowship in Computer Vision; National Science Foundation (U.S.) (grant 1016862); Google Faculty Research Award; National Science Foundation (U.S.) (Career Award 1149853); National Science Foundation (U.S.) (Career Award 0747120); United States. Office of Naval Research. Multidisciplinary University Research Initiative (N000141010933)
Deep neural networks for marine debris detection in sonar images
Garbage and waste disposal is one of the biggest challenges currently faced by mankind. Proper waste disposal and recycling is a must in any sustainable community, and in many coastal areas there is significant water pollution in the form of floating or submerged garbage. This is called marine debris. It is estimated that 6.4 million tonnes of marine debris enter water environments every year [McIlgorm et al. 2008, APEC Marine Resource Conservation WG], with 8 million items entering each day. An unknown fraction of this sinks to the bottom of water bodies. Submerged marine debris threatens marine life, and for shallow coastal areas, it can also threaten fishing vessels [Iñiguez et al. 2016, Renewable and Sustainable Energy Reviews]. Submerged marine debris typically stays in the environment for a long time (20+ years), and consists of materials that can be recycled, such as metals, plastics and glass. Many of these items should not be disposed of in water bodies, as this has a negative effect on the environment and human health. Encouraged by the advances in Computer Vision from the use of Deep Learning, we propose the use of Deep Neural Networks (DNNs) to survey and detect marine debris on the bottom of water bodies (seafloor, lake and river beds) from Forward-Looking Sonar (FLS) images. This thesis performs a comprehensive evaluation of the use of DNNs for the problem of marine debris detection in FLS images, as well as related problems such as image classification, matching, and detection proposals. We do this on a dataset of 2069 FLS images that we captured with an ARIS Explorer 3000 sensor on marine debris objects lying on the floor of a small water tank. We had issues with the sensor in a real-world underwater environment, which motivated the use of a water tank. The objects we used to produce this dataset contain typical household marine debris and distractor marine objects (tires, hooks, valves, etc.), divided into 10 classes plus a background class.
Our results show that, for the evaluated tasks, DNNs are a superior technique to the corresponding state of the art. There are large gains particularly for the matching and detection proposal tasks. We also study the effect of sample complexity and object size in many tasks, which is valuable information for practitioners. We expect that our results will advance the objective of using Autonomous Underwater Vehicles to automatically survey, detect and collect marine debris from underwater environments.
Online Self-Supervised Thermal Water Segmentation for Aerial Vehicles
We present a new method to adapt an RGB-trained water segmentation network to
target-domain aerial thermal imagery using online self-supervision by
leveraging texture and motion cues as supervisory signals. This new thermal
capability enables current autonomous aerial robots operating in near-shore
environments to perform tasks such as visual navigation, bathymetry, and flow
tracking at night. Our method overcomes the problem of scarce and
difficult-to-obtain near-shore thermal data that prevents the application of
conventional supervised and unsupervised methods. In this work, we curate the
first aerial thermal near-shore dataset, show that our approach outperforms
fully-supervised segmentation models trained on limited target-domain thermal
data, and demonstrate real-time capabilities onboard an Nvidia Jetson embedded
computing platform. Code and datasets used in this work will be available at:
https://github.com/connorlee77/uav-thermal-water-segmentation.
Comment: 8 pages, 4 figures, 3 tables
Damage detection and monitoring for tunnel inspection based on computer vision
The deterioration of the underground infrastructure of the major cities around the world, due to ageing, has become a topic of great concern among engineers. Visual inspection, as part of the routine maintenance procedures, is a common practice used in the condition assessment of infrastructure to ensure its safety and serviceability. This practice, however, is labour-intensive, costly and inaccurate and, therefore, a new system based on computer vision technology is presented in this thesis, aiming to tackle these inadequacies.
This thesis proposes a novel mosaicing system for inspection reporting, which can create an almost distortion-free mosaic of tunnels, thus allowing a large area of tunnels to be visualised. The system relies on Structure from Motion (SFM), which enables the system to cope with images with a general camera motion, in contrast to standard mosaicing software that can cope only with a strict camera motion. The system involves the automatic robust estimation of a 3D cylindrical surface using a Support Vector Machine to classify 3D points to improve the accuracy of the estimation. It is shown that some curvatures are observed in the mosaics when an inaccurate surface is used for mosaicing, while the mosaics from a surface estimated using the proposed method are almost distortion-free.
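The geometric core of mosaicing onto a cylindrical surface is the "unrolling" of that surface into a plane. The sketch below shows only this step, under strong simplifying assumptions that are not from the thesis: the cylinder has already been estimated, its axis coincides with the z-axis of the coordinate frame, and the function name `unroll_cylinder` is hypothetical. The robust surface estimation and the Support Vector Machine classification of 3D points are not shown.

```python
import numpy as np


def unroll_cylinder(points, radius):
    """Map 3D points near a cylinder (axis = z-axis) to 2D mosaic coordinates.

    Returns an (N, 2) array holding the arc length around the axis and the
    distance along the axis for every input point.
    """
    points = np.asarray(points, dtype=float)
    theta = np.arctan2(points[:, 1], points[:, 0])  # angle around the axis
    return np.column_stack((radius * theta, points[:, 2]))
```

If the assumed radius or axis is wrong, arc lengths are stretched unevenly across the mosaic, which is consistent with the curvature the abstract reports for inaccurate surfaces.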
New feature matching algorithms aiming to improve the performance of SFM systems are proposed. These algorithms apply a spatial consistency constraint to match features with a similar topography, in contrast to other matching algorithms that rely on matching based on the similar appearance of local image patches. The Shape Context and Random Forest algorithms are combined in the proposed algorithm, revealing promising results.
The final contribution is a new change detection system for monitoring cracks in multi-temporal images. The system can cope with images with a general camera motion, achieved by geometrical registration using SFM, unlike other systems that assume fixed or controlled cameras. The system performs photometric normalisation to cope with illumination variation in the images, and applies a motion-invariant change detection algorithm to handle deformable objects. It is shown that the results from the proposed change detection system are still impractical for use with tunnel images from a real environment, and further study is required.
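A very simple form of photometric normalisation followed by change detection can be sketched as follows. This is an assumption-laden illustration rather than the system in the thesis: it presumes the two images are already geometrically registered, uses a global zero-mean, unit-variance normalisation, and picks an arbitrary threshold. The motion-invariant part of the algorithm is not covered.

```python
import numpy as np


def normalise(image):
    """Shift and scale an image to zero mean and unit standard deviation."""
    image = np.asarray(image, dtype=float)
    std = image.std()
    if std == 0:
        return np.zeros_like(image)      # flat image: nothing to scale
    return (image - image.mean()) / std


def change_mask(before, after, threshold=1.0):
    """Flag pixels whose normalised intensities differ by more than threshold."""
    return np.abs(normalise(after) - normalise(before)) > threshold
```

With this normalisation, a global gain and offset change in illumination leaves the mask empty, while a strong local change still stands out.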
Distributed scene reconstruction from multiple mobile platforms
Recent research on mobile robotics has produced new designs that provide
household robots with omnidirectional motion. The image sensor embedded
in these devices motivates the application of 3D vision techniques to them
for navigation and mapping purposes. In addition, distributed cheap-sensing
systems acting as a unitary entity have recently emerged as an
efficient alternative to expensive mobile equipment.
In this work we present an implementation of a visual reconstruction method,
structure from motion (SfM), on a low-budget, omnidirectional mobile platform,
and extend this method to distributed 3D scene reconstruction with
several instances of such a platform.
Our approach overcomes the challenges posed by the platform. The unprecedented
levels of noise produced by the image compression typical of
the platform are handled by our feature filtering methods, which ensure
suitable feature matching populations for epipolar geometry estimation by
means of a strict quality-based feature selection. The robust pose estimation
algorithms implemented, along with a novel feature tracking system,
enable our incremental SfM approach to deal with ill-conditioned
inter-image configurations caused by the omnidirectional motion. The
feature tracking system developed efficiently manages the feature scarcity
produced by noise and outputs quality feature tracks, which allow robust
3D mapping of a given scene even if, due to noise, their length is shorter
than what is usually assumed necessary for stable 3D reconstruction.
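The epipolar geometry mentioned above constrains every correct match: for normalised homogeneous image points x1 and x2 and an essential matrix E = [t]_x R, the product x2^T E x1 should be zero. The sketch below checks that constraint and uses it to flag inconsistent matches. It illustrates the general constraint only and is not the feature filtering method of this work; the function names and the tolerance are assumptions.

```python
import numpy as np


def skew(t):
    """Return the 3x3 cross-product matrix [t]_x of a 3-vector t."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])


def epipolar_residuals(E, x1, x2):
    """Algebraic residual x2^T E x1 for each pair of matched points.

    x1, x2 : (N, 3) arrays of normalised homogeneous image points.
    """
    return np.einsum("ni,ij,nj->n", x2, E, x1)


def inlier_mask(E, x1, x2, tol=1e-3):
    """Mark matches whose residual magnitude is below an assumed tolerance."""
    return np.abs(epipolar_residuals(E, x1, x2)) < tol
```

In practice E is itself estimated from noisy matches, typically inside a robust loop, so a check like this is applied repeatedly rather than once.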
The distributed reconstruction from multiple instances of SfM is attained
by applying loop-closing techniques. Our multiple reconstruction system
merges individual 3D structures and resolves the global scale problem with
minimal overlaps, whereas in the literature 3D mapping is obtained by overlapping
stretches of sequences. The performance of this system is demonstrated
in the 2-session case.
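Because each SfM session recovers structure only up to an unknown scale, merging two reconstructions requires a relative scale factor. One simple way to obtain it from a small set of 3D points seen in both sessions is sketched below. This is a minimal sketch under stated assumptions, not the merging procedure of this work: the points are assumed to be correctly matched and noise-free enough for a ratio of spreads to be meaningful, and the function name is hypothetical.

```python
import numpy as np


def relative_scale(points_a, points_b):
    """Estimate s such that points_b is close to s * R @ points_a + t.

    The spread of each point set about its centroid is unaffected by the
    rotation R and translation t, so the ratio of spreads isolates the scale.
    """
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    spread_a = np.linalg.norm(a - a.mean(axis=0))
    spread_b = np.linalg.norm(b - b.mean(axis=0))
    if spread_a == 0:
        raise ValueError("points_a must contain at least two distinct points")
    return spread_b / spread_a
```

Only the overlapping points enter the estimate, so a small overlap can suffice as long as those points are well spread out.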
The management of noise, the stability against ill-conditioned configurations
and the robustness of our SfM system are validated in a number of experiments
and compared with state-of-the-art approaches. Possible future research areas
are also discussed.
Photogrammetric suite to manage the survey workflow in challenging environments and conditions
The present work aims to provide new and innovative instruments to support the photogrammetric survey workflow during all its phases. A suite of tools has been conceived in order to manage the planning, the acquisition, the post-processing and the restitution steps, with particular attention to the rigorousness of the approach and to the final precision.
The main focus of the research has been the implementation of the tool MAGO, standing for Adaptive Mesh for Orthophoto Generation. Its novelty consists in the possibility to automatically reconstruct “unrolled” orthophotos of adjacent façades of a building using the point cloud, instead of the mesh, as the input source for the orthophoto reconstruction.
The second tool has been conceived as a photogrammetric procedure based on Bundle Block Adjustment. The same issue is analysed from two mirrored perspectives: on the one hand, the use of moving cameras in a static scenario in order to manage real-time indoor navigation; on the other hand, the use of static cameras in a moving scenario in order to achieve the simultaneous reconstruction of the 3D model of the changing object.
A third tool named U.Ph.O., standing for Unmanned Photogrammetric Office, has been integrated with a new module. The general aim is, on the one hand, to plan the photogrammetric survey considering the expected precision, computed on the basis of a network simulation, and, on the other hand, to check whether the achieved survey has been collected in accordance with the planned conditions. The new module extends the treatment to surfaces with a generic orientation, beyond those with a planimetric development.
After a brief introduction, a general description of the photogrammetric principles is given in the first chapter of the dissertation; a chapter follows on the parallelism between Photogrammetry and Computer Vision and the contribution of the latter to the development of the described tools. The third chapter covers the implemented software and tools, while the fourth contains the training test and the validation. Finally, conclusions and future perspectives are reported.