242 research outputs found

    Optimal dictionary learning with application to underwater target detection from synthetic aperture sonar imagery

    2014 Spring. Includes bibliographical references. K-SVD is a relatively new method used to create a dictionary matrix that best fits a set of training data vectors, formed with the intent of using it for sparse representation of a data vector. K-SVD is flexible in that it can be used in conjunction with any preferred pursuit method of sparse coding, including the orthogonal matching pursuit (OMP) method considered in this thesis. Using adaptive filter theory, a new fast OMP method has been proposed to reduce the computational time of the sparse pursuit phase of K-SVD, as well as during on-line implementation, without sacrificing the accuracy of the sparse pursuit method. Due to the matrix inversion required in the standard OMP, the amount of time required to sparsely represent a signal grows quickly as the sparsity restriction is relaxed. The speedup in the proposed method was accomplished by replacing this computationally demanding matrix inversion with a series of recursive "time-order" update equations based on the orthogonal projection updating used in adaptive filter theory. The geometric perspective of this new learning is also provided. Additionally, a recursive method for faster dictionary learning is discussed, which can be used instead of the singular value decomposition (SVD) process in the K-SVD method. A significant bottleneck in K-SVD is the computation of the SVD of the reduced error matrix during the update of each dictionary atom. The SVD operation is replaced with an efficient recursive update, which will allow limited in-situ learning to update dictionaries as the system is exposed to new signals. Further, structured data formatting has allowed a multi-channel extension of K-SVD to merge multiple data sources into a single dictionary capable of creating a single sparse vector representing a variety of multi-channel data.
Another contribution of this work is the application of the developed methods to an underwater target detection problem using coregistered dual-channel (namely broadband and high-frequency) side-scan sonar imagery data. Here, K-SVD is used to create a more optimal dictionary in the sense of reconstructing target and non-target image snippets using their respective dictionaries. The ratio of the reconstruction errors is used as a likelihood ratio for target detection. The proposed methods were then applied and benchmarked against other detection methods for detecting mine-like objects from two dual-channel sonar datasets. Comparison of the results in terms of the receiver operating characteristic (ROC) curve indicates that the dual-channel K-SVD based detector provides a detection rate of PD = 99% and a false alarm rate of PFA = 1% on the first dataset, and PD = 95% and PFA = 5% on the second dataset, at the knee point of the ROC. The single-channel K-SVD based detector, on the other hand, provides PD = 96% and PFA = 4% on the first dataset, and PD = 96% and PFA = 4% on the second dataset, at the knee point of the ROC. The degradation in performance for the second dataset is attributed to the fact that the system was trained on a limited number of samples from the first dataset. The coherence-based detector provides PD = 87% and PFA = 13% on the first dataset and PD = 86% and PFA = 14% on the second dataset. These results show excellent performance of the proposed dictionary learning and sparse coding methods for underwater target detection using both dual-channel sonar imagery
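The abstract above describes two ideas that a short sketch can make concrete: greedy sparse coding with OMP, and a detector that compares reconstruction errors under a target dictionary and a non-target dictionary. The Python below is a minimal illustration under stated assumptions, not the thesis's algorithm. It avoids the explicit matrix inversion by keeping an orthonormal basis of the selected atoms (a Gram-Schmidt variant of OMP), which is not the thesis's recursive "time-order" update. The function names, the toy dictionaries, and the direction of the ratio (non-target error over target error) are all hypothetical choices made for the example.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def omp_residual_norm(atoms, x, sparsity):
    """Greedy OMP sketch. `atoms` is a list of dictionary columns.

    Returns (indices of selected atoms, norm of the final residual).
    An orthonormal basis of the selected atoms is maintained so that no
    matrix inversion is needed (assumption: Gram-Schmidt variant).
    """
    residual = list(x)
    basis = []      # orthonormal vectors spanning the selected atoms
    selected = []
    for _ in range(sparsity):
        # Select the unused atom most correlated with the residual.
        best, best_score = None, 1e-12
        for j, atom in enumerate(atoms):
            if j in selected:
                continue
            norm = math.sqrt(dot(atom, atom))
            if norm == 0.0:
                continue
            score = abs(dot(atom, residual)) / norm
            if score > best_score:
                best, best_score = j, score
        if best is None:
            break  # nothing left that explains the residual
        # Orthogonalize the new atom against the current basis.
        q = list(atoms[best])
        for b in basis:
            c = dot(b, q)
            q = [qi - c * bi for qi, bi in zip(q, b)]
        q_norm = math.sqrt(dot(q, q))
        if q_norm < 1e-12:
            break
        q = [qi / q_norm for qi in q]
        basis.append(q)
        selected.append(best)
        # Remove the component of the residual along the new direction.
        c = dot(q, residual)
        residual = [ri - c * qi for ri, qi in zip(residual, q)]
    return selected, math.sqrt(dot(residual, residual))


def error_ratio(x, target_atoms, nontarget_atoms, sparsity, eps=1e-12):
    """Ratio of reconstruction errors (assumed form: non-target / target).

    A large value means the target dictionary explains x better.
    """
    _, err_target = omp_residual_norm(target_atoms, x, sparsity)
    _, err_nontarget = omp_residual_norm(nontarget_atoms, x, sparsity)
    return err_nontarget / (err_target + eps)


# Toy dictionaries, purely illustrative.
target_atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
nontarget_atoms = [[0.0, 0.0, 1.0]]
ratio_target_like = error_ratio([1.0, 1.0, 0.0], target_atoms, nontarget_atoms, 2)
ratio_clutter_like = error_ratio([0.0, 0.0, 1.0], target_atoms, nontarget_atoms, 2)
```

In practice the ratio would be compared with a threshold, and sweeping that threshold traces out an ROC curve like the one the abstract refers to.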

    Proceedings of the Detection and Classification of Acoustic Scenes and Events 2016 Workshop (DCASE2016)

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, inevitably RS draws from many of the same theories as CV; e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL. Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing

    Semi-supervised multiscale dual-encoding method for faulty traffic data detection

    Inspired by the recent success of deep learning in multiscale information encoding, we introduce a variational autoencoder (VAE) based semi-supervised method for detection of faulty traffic data, which is cast as a classification problem. The continuous wavelet transform (CWT) is applied to the time series of traffic volume data to obtain rich features embodied in a time-frequency representation, followed by twin VAE models that separately encode normal data and faulty data. The resulting multiscale dual encodings are concatenated and fed to an attention-based classifier consisting of a self-attention module and a multilayer perceptron. For comparison, the proposed architecture is evaluated against five different encoding schemes: (1) VAE with only normal data encoding, (2) VAE with only faulty data encoding, (3) VAE with both normal and faulty data encodings, but without the attention module in the classifier, (4) siamese encoding, and (5) cross-vision transformer (CViT) encoding. The first four encoding schemes adopt the same convolutional neural network (CNN) architecture, while the fifth follows the transformer architecture of CViT. Our experiments show that the proposed architecture with the dual encoding scheme, coupled with the attention module, outperforms the other encoding schemes and results in a classification accuracy of 96.4%, precision of 95.5%, and recall of 97.7%. Comment: 16 pages, 8 figures
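As a rough illustration of the first step in the abstract above (turning a traffic-volume time series into a time-frequency representation), the following Python sketch computes a small CWT scalogram by direct convolution. It is only a sketch under assumptions: the abstract does not name the mother wavelet, so the Ricker (Mexican hat) wavelet, the truncation of its support at five scales, and the zero padding at the series boundaries are all choices made for this example. The function names `ricker` and `cwt` are likewise hypothetical.

```python
import math


def ricker(t, scale):
    """Ricker (Mexican hat) wavelet evaluated at offset t for a given scale."""
    u = t / scale
    norm = 2.0 / (math.sqrt(3.0 * scale) * math.pi ** 0.25)
    return norm * (1.0 - u * u) * math.exp(-0.5 * u * u)


def cwt(signal, scales):
    """Return a scalogram: one row per scale, one column per time step."""
    n = len(signal)
    rows = []
    for s in scales:
        half = int(math.ceil(5 * s))  # truncate the wavelet at +/- 5 scales
        kernel = [ricker(k, s) for k in range(-half, half + 1)]
        row = []
        for t in range(n):
            acc = 0.0
            for k, w in enumerate(kernel):
                idx = t + k - half
                if 0 <= idx < n:  # samples outside the series count as zero
                    acc += signal[idx] * w
            row.append(acc)
        rows.append(row)
    return rows


# A single spike in an otherwise flat series, purely illustrative.
series = [0.0] * 21
series[10] = 1.0
scalogram = cwt(series, [1.0, 2.0])
```

Each row of `scalogram` responds most strongly where the series has structure at that scale. In the pipeline the abstract describes, a two-dimensional array of this kind would be the input to the encoders.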

    Anomaly detection & object classification using multi-spectral LiDAR and sonar

    In this thesis, we present the theory of high-dimensional signal approximation of multifrequency signals. We also present both linear and non-linear compressive sensing (CS) algorithms that generate encoded representations of time-correlated single photon counting (TCSPC) light detection and ranging (LiDAR) data, side-scan sonar (SSS) and synthetic aperture sonar (SAS). The main contributions of this thesis are summarised as follows:
    1. Research is carried out studying full-waveform (FW) LiDARs, in particular TCSPC data capture, storage and processing.
    2. FW-LiDARs are capable of capturing large quantities of photon-counting data in real time. However, the real-time processing of the raw LiDAR waveforms has not been widely exploited. This thesis answers some of the fundamental questions:
       • Can semantic information be extracted and encoded from raw multi-spectral FW-LiDAR signals?
       • Can these encoded representations then be used for object segmentation and classification?
    3. Research is carried out into signal approximation and compressive sensing techniques, their limitations and their application domains.
    4. Research is also carried out in 3D point cloud processing, combining geometric features with material spectra (spectral-depth representation), for object segmentation and classification.
    5. Extensive experiments have been carried out with publicly available datasets, e.g. the Washington RGB Image and Depth (RGB-D) dataset [108], the YaleB face dataset [110] (http://vision.ucsd.edu/~leekc/ExtYaleDatabase/), real-world multi-frequency aerial laser scans (ALS), and an underwater multifrequency (16 wavelengths) TCSPC dataset collected using custom-built targets especially for this thesis. The ALS dataset was captured in collaboration with Carbomap Ltd., Edinburgh, UK; the data was collected during one of the trials in Austria using commercial-off-the-shelf (COTS) sensors.
    6. The multi-spectral measurements were made underwater on targets with different shapes and materials. A novel spectral-depth representation is presented with strong discrimination characteristics on target signatures. Several custom-made and realistically scaled exemplars with known and unknown targets have been investigated using a multi-spectral single photon counting LiDAR system.
    7. In this work, we also present a new approach to peak modelling and classification for waveform-enabled LiDAR systems. Not all existing approaches perform peak modelling and classification simultaneously in real time. This was tested on both simulated waveform-enabled LiDAR data and real ALS data.
    This PhD also led to an industrial secondment at Carbomap, Edinburgh, where some of the waveform modelling algorithms were implemented in C++ and CUDA for Nvidia TX1 boards for real-time performance.