Optimal dictionary learning with application to underwater target detection from synthetic aperture sonar imagery
2014 Spring. Includes bibliographical references. K-SVD is a relatively new method for creating a dictionary matrix that best fits a set of training data vectors, with the intent of using it for sparse representation of a data vector. K-SVD is flexible in that it can be used in conjunction with any preferred pursuit method of sparse coding, including the orthogonal matching pursuit (OMP) method considered in this thesis. Using adaptive filter theory, a new fast OMP method has been proposed to reduce the computation time of the sparse pursuit phase of K-SVD, as well as of on-line implementation, without sacrificing the accuracy of the sparse pursuit method. Due to the matrix inversion required in the standard OMP, the amount of time required to sparsely represent a signal grows quickly as the sparsity restriction is relaxed. The speedup in the proposed method is achieved by replacing this computationally demanding matrix inversion with a series of recursive "time-order" update equations based on the orthogonal projection updating used in adaptive filter theory. The geometric perspective of this new learning is also provided. Additionally, a recursive method for faster dictionary learning is discussed, which can be used instead of the singular value decomposition (SVD) process in the K-SVD method. A significant bottleneck in K-SVD is the computation of the SVD of the reduced error matrix during the update of each dictionary atom. The SVD operation is replaced with an efficient recursive update, which allows limited in-situ learning to update dictionaries as the system is exposed to new signals. Further, structured data formatting has allowed a multi-channel extension of K-SVD to merge multiple data sources into a single dictionary capable of creating a single sparse vector representing a variety of multi-channel data.
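For orientation, the standard OMP baseline described above can be sketched as follows. This is a minimal illustration, not the thesis's fast recursive variant: the function names `dot`, `solve` and `omp` are hypothetical, atoms are assumed to be unit-norm and linearly independent, and the least-squares step is solved afresh through the normal equations at every iteration, which is the cost the abstract says grows as the sparsity restriction is relaxed.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def solve(A, b):
    """Solve the small square system A x = b by Gaussian elimination
    with partial pivoting (assumes A is non-singular)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def omp(atoms, y, k):
    """Standard OMP. `atoms` is a list of unit-norm atom vectors,
    `y` the signal, `k` the number of atoms to select."""
    residual = list(y)
    support, coeffs = [], []
    for _ in range(k):
        # Select the unused atom most correlated with the residual.
        scores = [-1.0 if j in support else abs(dot(a, residual))
                  for j, a in enumerate(atoms)]
        support.append(max(range(len(atoms)), key=lambda i: scores[i]))
        # Re-fit all selected coefficients by least squares. This full
        # re-solve is the step a recursive update would avoid.
        G = [[dot(atoms[p], atoms[q]) for q in support] for p in support]
        rhs = [dot(atoms[p], y) for p in support]
        coeffs = solve(G, rhs)
        # Residual is the part of y not explained by the selected atoms.
        approx = [sum(c * atoms[p][i] for c, p in zip(coeffs, support))
                  for i in range(len(y))]
        residual = [yi - ai for yi, ai in zip(y, approx)]
    return support, coeffs, residual
```

Each pass re-solves a system whose size equals the current support, so the per-iteration cost rises with the number of selected atoms.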
Another contribution of this work is the application of the developed methods to an underwater target detection problem using coregistered dual-channel (namely broadband and high-frequency) side-scan sonar imagery data. Here, K-SVD is used to create dictionaries that are better suited to reconstructing target and non-target image snippets, each from its respective dictionary. The ratio of the reconstruction errors is used as a likelihood ratio for target detection. The proposed methods were then applied and benchmarked against other detection methods for detecting mine-like objects from two dual-channel sonar datasets. Comparison of the results in terms of the receiver operating characteristic (ROC) curve indicates that the dual-channel K-SVD based detector provides a detection rate of PD = 99% and a false alarm rate of PFA = 1% on the first dataset, and PD = 95% and PFA = 5% on the second dataset at the knee point of the ROC. The single-channel K-SVD based detector, on the other hand, provides PD = 96% and PFA = 4% on the first dataset, and PD = 96% and PFA = 4% on the second dataset at the knee point of the ROC. The degradation in performance for the second dataset is attributed to the fact that the system was trained on a limited number of samples from the first dataset. The coherence-based detector provides PD = 87% and PFA = 13% on the first dataset and PD = 86% and PFA = 14% on the second dataset. These results show excellent performance of the proposed dictionary learning and sparse coding methods for underwater target detection using both dual-channel sonar imagery
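The reconstruction-error ratio test mentioned in this abstract can be illustrated with a small sketch. Several things here are assumptions rather than details from the thesis: the names `one_sparse_error`, `error_ratio` and `detect` are hypothetical, the ratio is taken as non-target error over target error, the threshold of 1.0 is arbitrary, and a 1-sparse fit with unit-norm atoms stands in for the full OMP coding step.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def one_sparse_error(atoms, y):
    """Squared error when y is reconstructed from the single best
    unit-norm atom (a 1-sparse stand-in for a full sparse coder)."""
    best = max(dot(a, y) ** 2 for a in atoms)
    return max(0.0, dot(y, y) - best)


def error_ratio(target_atoms, clutter_atoms, y, eps=1e-12):
    """Ratio of non-target to target reconstruction error (assumed
    orientation). Large values mean the target dictionary fits better."""
    e_target = one_sparse_error(target_atoms, y)
    e_clutter = one_sparse_error(clutter_atoms, y)
    return (e_clutter + eps) / (e_target + eps)


def detect(target_atoms, clutter_atoms, y, threshold=1.0):
    """Declare a target when the error ratio exceeds the threshold."""
    return error_ratio(target_atoms, clutter_atoms, y) > threshold


# Toy dictionaries: one target atom, two non-target atoms.
target_atoms = [[1.0, 0.0, 0.0]]
clutter_atoms = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
is_target = detect(target_atoms, clutter_atoms, [2.0, 0.1, 0.0])
is_clutter = detect(target_atoms, clutter_atoms, [0.1, 2.0, 0.0])
```

Sweeping `threshold` over a range of values and recording detections and false alarms at each setting is what would trace out an ROC curve of the kind reported above.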
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as they relate to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing
Semi-supervised multiscale dual-encoding method for faulty traffic data detection
Inspired by the recent success of deep learning in multiscale information
encoding, we introduce a variational autoencoder (VAE) based semi-supervised
method for detection of faulty traffic data, which is cast as a classification
problem. Continuous wavelet transform (CWT) is applied to the time series of
traffic volume data to obtain rich features embodied in time-frequency
representation, followed by a twin of VAE models to separately encode normal
data and faulty data. The resulting multiscale dual encodings are concatenated
and fed to an attention-based classifier, consisting of a self-attention module
and a multilayer perceptron. For comparison, the proposed architecture is
evaluated against five different encoding schemes, including (1) VAE with only
normal data encoding, (2) VAE with only faulty data encoding, (3) VAE with both
normal and faulty data encodings, but without attention module in the
classifier, (4) siamese encoding, and (5) cross-vision transformer (CViT)
encoding. The first four encoding schemes adopted the same convolutional neural
network (CNN) architecture while the fifth encoding scheme follows the
transformer architecture of CViT. Our experiments show that the proposed
architecture with the dual encoding scheme, coupled with attention module,
outperforms other encoding schemes and results in classification accuracy of
96.4%, precision of 95.5%, and recall of 97.7%.
Comment: 16 pages, 8 figures
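The first stage of the pipeline described in this abstract, turning a traffic-volume time series into a time-frequency representation with the CWT, can be sketched as below. This is only an illustration under stated assumptions: the abstract does not name the mother wavelet, so a Ricker (Mexican hat) wavelet is assumed, and the function names `ricker` and `cwt` are hypothetical. The VAE encoders and the attention-based classifier are not shown.

```python
import math


def ricker(t, scale):
    """Ricker (Mexican hat) wavelet at time offset t for a given scale.
    The choice of wavelet is an assumption, not taken from the paper."""
    x = t / scale
    return (1.0 - x * x) * math.exp(-0.5 * x * x) / math.sqrt(scale)


def cwt(signal, scales):
    """Return a scalogram as a list of rows, one row of wavelet
    coefficients per scale, each as long as the input signal."""
    n = len(signal)
    rows = []
    for s in scales:
        half = int(5 * s)  # the wavelet is negligible beyond ~5 scales
        row = []
        for t in range(n):
            acc = 0.0
            for k in range(-half, half + 1):
                if 0 <= t + k < n:  # treat samples outside the series as zero
                    acc += signal[t + k] * ricker(k, s)
            row.append(acc)
        rows.append(row)
    return rows


# Toy input: a single spike in an otherwise flat series.
series = [0.0] * 21
series[10] = 1.0
scalogram = cwt(series, [1.0, 2.0])
```

The resulting two-dimensional array (scales by time) is the kind of image-like input that a convolutional encoder could consume.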
Anomaly detection & object classification using multi-spectral LiDAR and sonar
In this thesis, we present the theory of high-dimensional signal approximation of multifrequency signals. We also present both linear and non-linear compressive sensing (CS)
algorithms that generate encoded representations of time-correlated single photon counting (TCSPC) light detection and ranging (LiDAR) data, side-scan sonar (SSS) and synthetic aperture sonar (SAS). The main contributions of this thesis are summarised as
follows:
1. Research is carried out studying full-waveform (FW) LiDARs, in particular
TCSPC data capture, storage and processing.
2. FW-LiDARs are capable of capturing large quantities of photon-counting data in
real-time. However, the real-time processing of the raw LiDAR waveforms hasn’t
been widely exploited. This thesis answers some of the fundamental questions:
• can semantic information be extracted and encoded from raw multi-spectral
FW-LiDAR signals?
• can these encoded representations then be used for object segmentation and
classification?
3. Research is carried out into signal approximation and compressive sensing techniques, their limitations and their application domains.
4. Research is also carried out in 3D point cloud processing, combining geometric features with material spectra (spectral-depth representation), for object segmentation
and classification.
5. Extensive experiments have been carried out with publicly available datasets, e.g.
the Washington RGB Image and Depth (RGB-D) dataset [108], the YaleB face dataset¹
[110], real-world multi-frequency aerial laser scans (ALS)² and an underwater multifrequency (16 wavelengths) TCSPC dataset collected using custom-built targets
especially for this thesis.
6. The multi-spectral measurements were made underwater on targets with different shapes and materials. A novel spectral-depth representation is presented with
strong discrimination characteristics on target signatures. Several custom-made
and realistically scaled exemplars with known and unknown targets have been investigated using a multi-spectral single photon counting LiDAR system.
7. In this work, we also present a new approach to peak modelling and classification
for waveform enabled LiDAR systems. Not all existing approaches perform peak
modelling and classification simultaneously in real-time. This was tested on both
simulated waveform enabled LiDAR data and real ALS data².
This PhD also led to an industrial secondment at Carbomap, Edinburgh, where some of
the waveform modelling algorithms were implemented in C++ and CUDA for Nvidia TX1
boards for real-time performance.
¹ http://vision.ucsd.edu/~leekc/ExtYaleDatabase/
² This dataset was captured in collaboration with Carbomap Ltd., Edinburgh, UK. The data was
collected during one of the trials in Austria using commercial-off-the-shelf (COTS) sensors
- …