Survey on deep learning based computer vision for sonar imagery
Research on the automatic analysis of sonar images has long focused on classical (i.e., non-deep-learning-based) approaches. Over the past 15 years, however, the application of deep learning in this research field has grown steadily. This paper gives a broad overview of past and current research involving deep learning for feature extraction, classification, detection and segmentation of sidescan and synthetic aperture sonar imagery. Most research in this field has been directed towards the investigation of convolutional neural networks (CNNs) for feature extraction and classification tasks, with the result that even small CNNs with up to four layers outperform conventional methods. The purpose of this work is twofold. On the one hand, given the rapid development of deep learning, it serves as an introduction for researchers who are either just starting their work in this specific field or have worked on classical methods in past years, and helps them learn about recent achievements. On the other hand, our main goal is to guide further research in this field by identifying the main research gaps to bridge. We propose to advance research in this field by combining available data into an open-source dataset and by carrying out comparative studies of the deep learning methods developed so far.
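As a rough illustration of the scale involved, a CNN of the kind the survey describes can be written in a few lines. This is a minimal sketch; the layer widths, input size and class count are illustrative assumptions, not values taken from any surveyed paper:

```python
# Minimal sketch of a small four-conv-layer CNN for sonar patch
# classification, of the kind the survey reports outperforming
# conventional methods. All sizes here are illustrative assumptions.
import torch
import torch.nn as nn

class SmallSonarCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, H, W) single-channel sonar intensity patches
        return self.classifier(self.features(x).flatten(1))

logits = SmallSonarCNN()(torch.randn(4, 1, 64, 64))  # -> (4, 10)
```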
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, RS inevitably draws on many of the same theories as CV, e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing DL systems.

Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing.
Coherent, super resolved radar beamforming using self-supervised learning
High-resolution automotive radar sensors are required to meet the demanding needs and regulations of autonomous vehicles. However, current radar systems are limited in their angular resolution, creating a technological gap. An industry and academic trend to improve angular resolution by increasing the number of physical channels also increases system complexity, requires sensitive calibration processes, lowers robustness to hardware malfunctions and drives higher costs. We offer an alternative approach, named Radar signal Reconstruction using Self Supervision (R2-S2), which significantly improves the angular resolution of a given radar array without increasing the number of physical channels. R2-S2 is a family of algorithms that use a Deep Neural Network (DNN) with complex range-Doppler radar data as input, trained in a self-supervised manner using a loss function that operates in multiple data representation spaces. A 4x improvement in angular resolution was demonstrated using a real-world dataset collected in urban and highway environments during clear and rainy weather conditions.

Comment: 28 pages, 10 figures.
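A minimal sketch of the self-supervision pattern described above, assuming PyTorch and a simple channel-axis FFT as a stand-in for beamforming; the masking scheme, loss terms and tensor shapes are illustrative assumptions, not the paper's actual R2-S2 algorithms:

```python
# Hedged sketch: train a network to restore hidden channels of a radar
# array, with a loss evaluated in more than one representation space
# (raw channel domain and an FFT-based angular domain). Shapes and the
# exact loss composition are illustrative assumptions.
import torch

def multi_domain_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # pred/target: (batch, channels, range, doppler) complex tensors.
    # Term 1: error in the raw channel (range-Doppler) domain.
    l_chan = (pred - target).abs().mean()
    # Term 2: error in the angular domain, via an FFT across the
    # channel axis (a simple stand-in for digital beamforming).
    l_ang = (torch.fft.fft(pred, dim=1).abs()
             - torch.fft.fft(target, dim=1).abs()).abs().mean()
    return l_chan + l_ang

# Self-supervision: hide some channels, ask the network to restore them.
full = torch.randn(2, 16, 64, 32, dtype=torch.complex64)
masked = full.clone()
masked[:, ::2] = 0                      # keep only every other channel
pred = masked                           # placeholder for a DNN's output
loss = multi_domain_loss(pred, full)    # no external labels required
```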
Deep Perception Without a Camera: Enabling 3D Reconstruction and Object Recognition using Lidar and Sonar Sensing
Deep learning has recently revolutionized robot perception in many canonical robotic applications, such as autonomous driving. However, a similar transformation has yet to occur in harsher environments, including underwater and underground. This is due in part to the difficulty of deploying robots in these environments, which lack large real training datasets and often necessitate the use of non-traditional sensors for deep learning (e.g. imaging sonars and lidars). In this dissertation we demonstrate that by explicitly accounting for the sensor noise induced by challenging environments and by incorporating synthetic data in the training process, the power of deep learning can be leveraged for deployment in these harsh environments.
In our first contribution we develop a framework that enables the real-time 3D reconstruction of underwater environments using features from 2D sonar images. Because sonar imagery is noisy and low-resolution compared with standard camera imagery, accurate sonar image analysis requires the explicit consideration of noise. While deep learning using Convolutional Neural Networks (CNNs) has been applied to sonar images, previous CNN-based methods do not explicitly consider the noise (from factors such as multi-pathing or irregular surfaces) often present in the images. In this contribution our key insight is to use atrous convolution, which has a larger field of context than standard convolution and is thus not misled as much by localized noise. We demonstrate that atrous convolution, as well as human-in-the-loop feature annotation, provides real-time reconstruction capability on datasets captured onboard our underwater vehicle while operating in a variety of environments.
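For readers unfamiliar with atrous convolution, the brief sketch below (shapes illustrative) shows the key property: a dilated 3x3 kernel samples a 9x9 context while producing the same output size as a standard 3x3 convolution.

```python
# Standard vs. atrous (dilated) convolution: the dilated kernel covers
# a wider context, which is what makes it less sensitive to localized
# sonar noise. Channel counts and input size are illustrative.
import torch
import torch.nn as nn

x = torch.randn(1, 1, 64, 64)                    # one sonar image
std = nn.Conv2d(1, 8, kernel_size=3, padding=1)  # 3x3 context
atrous = nn.Conv2d(1, 8, kernel_size=3, padding=4, dilation=4)  # 9x9 context
assert std(x).shape == atrous(x).shape           # same output size,
                                                 # larger field of view
```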
In our second contribution we remove the human from the loop and develop an approach that leverages deep learning for a fully automated 3D underwater reconstruction algorithm using 2D sonar images as input. Our algorithm produces accurate estimates even when common physical models break down due to phenomena such as non-diffuse reflections. Inspired by our success in the previous contribution, we propose the utilization of CNNs as a powerful method to extract meaningful information without being misled by noisy data. To ensure training convergence, we also introduce a self-supervised method that uses the physics of the sonar sensor to train the network on real data without ground-truth information. Our method can produce accurate 3D estimates given only a single image. We demonstrate that our method produces 3D reconstructions with an 80% reduction in Root Mean Square Error compared to previous approaches, both in simulation and on real data.
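The sketch below illustrates the general self-supervised pattern described here, with a deliberately trivial placeholder standing in for the dissertation's sonar sensor model: a differentiable forward model re-renders the network's 3D estimate, and the loss compares that rendering with the real input image, so no ground-truth 3D data is needed.

```python
# Hedged sketch of physics-based self-supervision. The forward model is
# a toy placeholder (intensity falling off with surface slope), not the
# dissertation's sensor model; shapes are illustrative.
import torch

def sonar_forward_model(elevation: torch.Tensor) -> torch.Tensor:
    # Placeholder physics: brighter returns where the surface is flat.
    grad_r = elevation.diff(dim=-2, prepend=elevation[..., :1, :])
    return torch.exp(-grad_r.abs())

def self_supervised_loss(pred_elev, observed_image):
    # Compare the re-rendered image against the real observation.
    return (sonar_forward_model(pred_elev) - observed_image).abs().mean()

pred = torch.rand(1, 1, 96, 128, requires_grad=True)  # network output
obs = torch.rand(1, 1, 96, 128)                       # real sonar image
self_supervised_loss(pred, obs).backward()            # trains without GT
```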
We then extend this approach to leverage the series of images the robot collects as it moves through the environment. Specifically, we develop two CNNs that take as input multiple images captured at different points in time and output a more accurate prediction than is possible using a single image as input. To our knowledge this is the first such multi-sonar-image CNN designed for the 3D underwater reconstruction task. We validate this extension on synthetic and real data and show up to a 5% improvement over competing methods.
Finally, we develop an improved method for incorporating synthetic data into the training process. This takes our previous contribution a step further by more tightly coupling synthetic and real point cloud feature extraction. We develop an adversarial training technique which, alongside the standard object detection loss, provides a training signal that encourages similar feature extraction from both synthetic and real clouds. This brings the training process closer to the preferred scenario, in which the synthetic point clouds contain features very similar to those found in the real scans. We validate our approach in the context of the data-limited DARPA Subterranean Challenge and demonstrate that our 3D adversarial training architecture improves 3D object detection performance by up to 15% depending on the data representation.
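One common way to realize such adversarial feature alignment is a domain discriminator behind a gradient-reversal layer; the sketch below follows that generic pattern. The module sizes, the discriminator, and the omitted detection loss are illustrative assumptions, not the dissertation's architecture:

```python
# Hedged sketch of adversarial synthetic-to-real feature alignment:
# a discriminator learns to tell synthetic from real features while a
# gradient-reversal layer pushes the backbone to make them match.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, g):
        return -g                      # reverse the gradient sign

backbone = nn.Sequential(nn.Linear(128, 64), nn.ReLU())  # feature extractor
discriminator = nn.Linear(64, 1)                         # synthetic vs real
bce = nn.BCEWithLogitsLoss()

feat_syn = backbone(torch.randn(8, 128))   # synthetic cloud features
feat_real = backbone(torch.randn(8, 128))  # real cloud features
feats = GradReverse.apply(torch.cat([feat_syn, feat_real]))
domain = torch.cat([torch.zeros(8, 1), torch.ones(8, 1)])
adv_loss = bce(discriminator(feats), domain)
# total = detection_loss + lambda * adv_loss  (detection loss omitted)
adv_loss.backward()
```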
Novel Hybrid-Learning Algorithms for Improved Millimeter-Wave Imaging Systems
Increasing attention is being paid to millimeter-wave (mmWave, 30 GHz to 300 GHz) and terahertz (THz, 300 GHz to 10 THz) sensing applications, including security sensing, industrial packaging, medical imaging, and non-destructive testing. Traditional methods for perception and imaging are being challenged by novel data-driven algorithms that offer improved resolution, localization, and detection rates. Over the past decade, deep learning technology has garnered substantial popularity, particularly in perception and computer vision applications. Whereas conventional signal processing techniques are more easily generalized to various applications, hybrid approaches, in which signal processing and learning-based algorithms are interleaved, pose a promising compromise between performance and generalizability. Furthermore, such hybrid algorithms improve model training by leveraging the known characteristics of radio frequency (RF) waveforms, thus yielding more efficiently trained deep learning algorithms and offering higher performance than conventional methods. This dissertation introduces novel hybrid-learning algorithms for improved mmWave imaging systems applicable to a host of problems in perception and sensing. Various problem spaces are explored, including static and dynamic gesture classification; precise hand localization for human-computer interaction; high-resolution near-field mmWave imaging using forward synthetic aperture radar (SAR); SAR under irregular scanning geometries; mmWave image super-resolution using deep neural network (DNN) and Vision Transformer (ViT) architectures; and data-level multiband radar fusion using a novel hybrid-learning architecture. Furthermore, we introduce several novel approaches for deep learning model training and dataset synthesis.

Comment: PhD Dissertation Submitted to UTD ECE Department.
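The hybrid pattern described here, i.e. a conventional signal-processing stage feeding a learned refinement stage rather than learning end-to-end from raw samples, can be sketched as follows; the FFT-based focusing step and the small refinement network are illustrative stand-ins, not the dissertation's algorithms:

```python
# Hedged sketch of a hybrid signal-processing + learning pipeline:
# a classical image-formation step (here a toy 2D-FFT magnitude) is
# followed by a small learned enhancement network.
import torch
import torch.nn as nn

def classical_focus(raw: torch.Tensor) -> torch.Tensor:
    # Stand-in for SAR image formation: 2D FFT magnitude of raw returns.
    return torch.fft.fft2(raw).abs().unsqueeze(1)

refiner = nn.Sequential(                 # learned refinement stage
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)

raw = torch.randn(2, 64, 64, dtype=torch.complex64)  # raw mmWave samples
image = refiner(classical_focus(raw))                # hybrid pipeline output
```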
Deep neural networks for marine debris detection in sonar images
Garbage and waste disposal is one of the biggest challenges currently faced by mankind. Proper waste disposal and recycling is a must in any sustainable community, and in many coastal areas there is significant water pollution in the form of floating or submerged garbage, known as marine debris. It is estimated that 6.4 million tonnes of marine debris enter water environments every year [McIlgorm et al. 2008, APEC Marine Resource Conservation WG], with 8 million items entering each day. An unknown fraction of this sinks to the bottom of water bodies. Submerged marine debris threatens marine life and, in shallow coastal areas, can also threaten fishing vessels [Iñiguez et al. 2016, Renewable and Sustainable Energy Reviews]. Submerged marine debris typically stays in the environment for a long time (20+ years) and consists of materials that can be recycled, such as metals, plastics, glass, etc. Many of these items should not be disposed of in water bodies, as this has a negative effect on the environment and human health.

Encouraged by the advances in Computer Vision from the use of Deep Learning, we propose the use of Deep Neural Networks (DNNs) to survey and detect marine debris on the bottom of water bodies (seafloor, lake and river beds) from Forward-Looking Sonar (FLS) images. This thesis performs a comprehensive evaluation of the use of DNNs for the problem of marine debris detection in FLS images, as well as related problems such as image classification, matching, and detection proposals. We do this on a dataset of 2069 FLS images that we captured with an ARIS Explorer 3000 sensor on marine debris objects lying on the floor of a small water tank; issues with the sensor in a real-world underwater environment motivated the use of a water tank. The objects used to produce this dataset comprise typical household marine debris and distractor marine objects (tires, hooks, valves, etc.), divided into 10 classes plus a background class.

Our results show that, for the evaluated tasks, DNNs are a superior technique to the corresponding state of the art, with large gains particularly for the matching and detection proposal tasks. We also study the effect of sample complexity and object size across many tasks, which is valuable information for practitioners. We expect that our results will advance the objective of using Autonomous Underwater Vehicles to automatically survey, detect and collect marine debris from underwater environments.
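As an illustration of the matching task, where the thesis reports some of its largest gains, a siamese embedding network for FLS patch pairs might look like the sketch below; the architecture and distance threshold are illustrative assumptions, not the thesis's models:

```python
# Hedged sketch of sonar patch matching: a shared CNN embeds two FLS
# patches and a distance threshold decides whether they show the same
# object. All sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

embed = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(32, 16),
)

a, b = torch.randn(4, 1, 96, 96), torch.randn(4, 1, 96, 96)
dist = F.pairwise_distance(embed(a), embed(b))   # small -> same object
same = dist < 1.0                                # assumed threshold
```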
Physics-Informed Computer Vision: A Review and Perspectives
The incorporation of physical information into machine learning frameworks is opening up and transforming many application domains. Here the learning process is augmented through the induction of fundamental knowledge and governing physical laws. In this work we explore their utility for computer vision tasks in interpreting and understanding visual data. We present a systematic literature review of the formulation of, and approaches to, computer vision tasks guided by physical laws. We begin by decomposing the popular computer vision pipeline into a taxonomy of stages and investigate approaches to incorporating governing physical equations in each stage. Existing approaches in each task are analyzed with regard to which governing physical processes are modeled, how they are formulated, and how they are incorporated, i.e. by modifying data (observation bias), modifying networks (inductive bias), or modifying losses (learning bias). The taxonomy offers a unified view of the application of physics-informed capabilities, highlighting where physics-informed learning has been conducted and where the gaps and opportunities are. Finally, we highlight open problems and challenges to inform future research. While still in its early days, the study of physics-informed computer vision promises to yield computer vision models with improved physical plausibility, accuracy, data efficiency and generalization in increasingly realistic applications.
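As a concrete example of the "modify losses (learning bias)" category, a physics-informed loss that penalizes the residual of a governing equation can be sketched as follows; the toy 1D advection equation, network, and weighting are illustrative assumptions, not drawn from the review:

```python
# Hedged sketch of a learning-bias physics term: the residual of
# u_t + c * u_x = 0 is evaluated with autograd at collocation points
# and added to the usual data-fit loss.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))
c = 1.0                                           # assumed wave speed

xt = torch.rand(256, 2, requires_grad=True)       # (x, t) collocation points
u = net(xt)
grads = torch.autograd.grad(u.sum(), xt, create_graph=True)[0]
u_x, u_t = grads[:, 0:1], grads[:, 1:2]
physics_loss = ((u_t + c * u_x) ** 2).mean()      # governing-law residual
# total = data_loss + lambda * physics_loss       # "learning bias" term
```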