
    Optimal dictionary learning with application to underwater target detection from synthetic aperture sonar imagery

    Spring 2014. Includes bibliographical references. K-SVD is a relatively new method for creating a dictionary matrix that best fits a set of training data vectors, formed with the intent of using it for sparse representation of a data vector. K-SVD is flexible in that it can be used in conjunction with any preferred sparse-coding pursuit method, including the orthogonal matching pursuit (OMP) method considered in this thesis. Using adaptive filter theory, a new fast OMP method is proposed that reduces the computational time of the sparse pursuit phase of K-SVD, as well as of on-line implementation, without sacrificing the accuracy of the sparse pursuit. Due to the matrix inversion required in standard OMP, the time needed to sparsely represent a signal grows quickly as the sparsity restriction is relaxed. The speed-up in the proposed method is achieved by replacing this computationally demanding matrix inversion with a series of recursive "time-order" update equations based on the orthogonal projection updating used in adaptive filter theory. A geometric perspective of this new learning is also provided. Additionally, a recursive method for faster dictionary learning is discussed, which can be used in place of the singular value decomposition (SVD) step in K-SVD. A significant bottleneck in K-SVD is the computation of the SVD of the reduced error matrix during the update of each dictionary atom. The SVD operation is replaced with an efficient recursive update that allows limited in-situ learning, so that dictionaries can be updated as the system is exposed to new signals. Further, structured data formatting has allowed a multi-channel extension of K-SVD that merges multiple data sources into a single dictionary capable of producing a single sparse vector representing a variety of multi-channel data.
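    The OMP step described above can be sketched as follows (a minimal numpy illustration using the standard least-squares inversion; the thesis replaces that inversion with recursive "time-order" update equations, which are not reproduced here, and all names are my own, not the thesis code):

    ```python
    import numpy as np

    def omp(D, x, k):
        """Orthogonal matching pursuit: greedily select k atoms of the
        dictionary D (columns, assumed unit-norm) to sparsely represent x."""
        residual = x.copy()
        support = []
        coeffs = np.zeros(0)
        for _ in range(k):
            # pick the atom most correlated with the current residual
            j = int(np.argmax(np.abs(D.T @ residual)))
            support.append(j)
            # least-squares fit on the selected atoms -- this is the matrix
            # inversion whose cost grows as the sparsity constraint is relaxed
            Ds = D[:, support]
            coeffs, *_ = np.linalg.lstsq(Ds, x, rcond=None)
            residual = x - Ds @ coeffs
        s = np.zeros(D.shape[1])
        s[support] = coeffs
        return s
    ```

    Replacing the per-iteration `lstsq` with a recursive order update is what removes the dominant cost in the method described above.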
Another contribution of this work is the application of the developed methods to an underwater target detection problem using coregistered dual-channel (namely broadband and high-frequency) side-scan sonar imagery data. Here, K-SVD is used to create a more optimal dictionary in the sense of reconstructing target and non-target image snippets using their respective dictionaries. The ratio of the reconstruction errors is used as a likelihood ratio for target detection. The proposed methods were then applied and benchmarked against other detection methods for detecting mine-like objects in two dual-channel sonar datasets. Comparison of the results in terms of receiver operating characteristic (ROC) curves indicates that the dual-channel K-SVD-based detector provides a detection rate of PD = 99% at a false alarm rate of PFA = 1% on the first dataset, and PD = 95% at PFA = 5% on the second dataset, at the knee point of the ROC. The single-channel K-SVD-based detector, on the other hand, provides PD = 96% and PFA = 4% on both datasets at the knee point of the ROC. The degradation in performance on the second dataset is attributed to the fact that the system was trained on a limited number of samples from the first dataset. The coherence-based detector provides PD = 87% and PFA = 13% on the first dataset and PD = 86% and PFA = 14% on the second dataset. These results demonstrate the excellent performance of the proposed dictionary learning and sparse coding methods for underwater target detection on both dual-channel sonar datasets.
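The reconstruction-error likelihood-ratio test described above can be illustrated roughly as follows (a hypothetical sketch, not the thesis code; `detect`, the dictionary names, and the unit threshold are all my own assumptions):

```python
import numpy as np

def reconstruction_error(D, x, k):
    # sparse-code x against dictionary D with a plain OMP
    # (greedy atom selection + least squares) and return the residual norm
    residual, support = x.copy(), []
    for _ in range(k):
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        Ds = D[:, support]
        c, *_ = np.linalg.lstsq(Ds, x, rcond=None)
        residual = x - Ds @ c
    return np.linalg.norm(residual)

def detect(D_target, D_clutter, x, k, threshold=1.0):
    """Declare 'target' when the clutter dictionary reconstructs the
    snippet worse than the target dictionary (error ratio as likelihood ratio)."""
    ratio = reconstruction_error(D_clutter, x, k) / (
        reconstruction_error(D_target, x, k) + 1e-12)
    return ratio > threshold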

    Fast Low-rank Representation based Spatial Pyramid Matching for Image Classification

    Spatial Pyramid Matching (SPM) and its variants have achieved considerable success in image classification. The main difference among them is their encoding schemes. For example, ScSPM incorporates Sparse Coding (SC) instead of Vector Quantization (VQ) into the framework of SPM. Although these methods achieve higher recognition rates than traditional SPM, they take more time to encode the local descriptors extracted from the image. In this paper, we propose using Low-Rank Representation (LRR) to encode the descriptors under the framework of SPM. Different from SC, LRR considers the group effect among data points instead of sparsity. Benefiting from this property, the proposed method (i.e., LrrSPM) can offer better performance. To further improve generalizability and robustness, we reformulate the rank-minimization problem as a truncated projection problem. Extensive experimental studies show that LrrSPM is more efficient than its counterparts (e.g., ScSPM) while achieving competitive recognition rates on nine image data sets.
    Comment: accepted into Knowledge-Based Systems, 201
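    The spatial-pyramid pooling step shared by all of these encoding schemes (VQ, SC, LRR) can be sketched as follows (a minimal illustration assuming max pooling over a three-level pyramid; the names and normalized-coordinate convention are my own):

    ```python
    import numpy as np

    def spatial_pyramid_pool(codes, positions, levels=(1, 2, 4)):
        """Max-pool per-descriptor codes over a spatial pyramid.
        codes: (n, d) encoding of n local descriptors (VQ, SC, or LRR columns);
        positions: (n, 2) descriptor locations normalized to [0, 1)."""
        pooled = []
        for g in levels:
            cells = np.floor(positions * g).astype(int)  # grid cell per descriptor
            idx = cells[:, 0] * g + cells[:, 1]          # flattened cell index
            for c in range(g * g):
                mask = idx == c
                pooled.append(codes[mask].max(axis=0) if mask.any()
                              else np.zeros(codes.shape[1]))
        return np.concatenate(pooled)  # length d * sum(g*g for g in levels)
    ```

    The final vector (of length 21·d for the three-level pyramid above) is what gets fed to the classifier; the encoders only differ in how `codes` is produced.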

    Deep Networks for Image Super-Resolution with Sparse Prior

    Deep learning techniques have been successfully applied in many areas of computer vision, including low-level image restoration problems. For image super-resolution, several models based on deep neural networks have been recently proposed and attained superior performance that overshadows all previous handcrafted models. The question then arises whether large-capacity and data-driven models have become the dominant solution to the ill-posed super-resolution problem. In this paper, we argue that domain expertise represented by the conventional sparse coding model is still valuable, and it can be combined with the key ingredients of deep learning to achieve further improved results. We show that a sparse coding model particularly designed for super-resolution can be incarnated as a neural network, and trained in a cascaded structure from end to end. The interpretation of the network based on sparse coding leads to much more efficient and effective training, as well as a reduced model size. Our model is evaluated on a wide range of images, and shows clear advantage over existing state-of-the-art methods in terms of both restoration accuracy and human subjective quality.
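    The idea of incarnating a sparse coding model as a network can be illustrated with a plain unrolled ISTA (a sketch of the general LISTA-style construction, not the paper's trained cascade; in the actual model the weights `W` and `S` would be learned end to end rather than derived from a fixed dictionary):

    ```python
    import numpy as np

    def soft_threshold(z, theta):
        # elementwise shrinkage -- the "activation function" of a sparse-coding layer
        return np.sign(z) * np.maximum(np.abs(z) - theta, 0.0)

    def unrolled_ista(D, x, n_layers=10, lam=0.1):
        """ISTA for min_z 0.5*||x - D z||^2 + lam*||z||_1, written as a
        fixed-depth cascade where each iteration plays the role of one layer."""
        L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
        W = D.T / L                            # "encoder" weights
        S = np.eye(D.shape[1]) - D.T @ D / L   # "recurrent" weights
        z = soft_threshold(W @ x, lam / L)     # first layer (z initialized at 0)
        for _ in range(n_layers - 1):
            z = soft_threshold(W @ x + S @ z, lam / L)
        return z
    ```

    Truncating the iteration to a fixed depth and then training `W`, `S`, and the thresholds by backpropagation is what turns the sparse-coding prior into a compact, trainable network.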

    Low-rank and Sparse Soft Targets to Learn Better DNN Acoustic Models

    Conventional deep neural networks (DNN) for speech acoustic modeling rely on Gaussian mixture models (GMM) and hidden Markov models (HMM) to obtain binary class labels as the targets for DNN training. Subword classes in speech recognition systems correspond to context-dependent tied states, or senones. The present work addresses some limitations of GMM-HMM senone alignments for DNN training. We hypothesize that the senone probabilities obtained from a DNN trained with binary labels can provide more accurate targets for learning better acoustic models. However, DNN outputs bear inaccuracies which are exhibited as high-dimensional unstructured noise, whereas the informative components are structured and low-dimensional. We exploit principal component analysis (PCA) and sparse coding to characterize the senone subspaces. Enhanced probabilities obtained from low-rank and sparse reconstructions are used as soft targets for DNN acoustic modeling, which also enables training with untranscribed data. Experiments conducted on the AMI corpus show a 4.6% relative reduction in word error rate.
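    The low-rank (PCA) enhancement of senone posteriors can be sketched as follows (a simplified illustration of the PCA branch only; the paper's sparse-coding variant and exact normalization choices are not reproduced, and the function name is my own):

    ```python
    import numpy as np

    def low_rank_soft_targets(P, rank):
        """Denoise a matrix of DNN senone posteriors P (frames x classes) by
        projecting onto its top principal components, then renormalize rows
        to sum to one so they remain usable as soft targets."""
        mean = P.mean(axis=0)
        U, s, Vt = np.linalg.svd(P - mean, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank] + mean
        low_rank = np.clip(low_rank, 1e-12, None)  # keep probabilities positive
        return low_rank / low_rank.sum(axis=1, keepdims=True)
    ```

    The resulting rows replace the one-hot GMM-HMM alignment labels as training targets, which is also what makes it possible to generate targets for untranscribed data.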