Optimal dictionary learning with application to underwater target detection from synthetic aperture sonar imagery
Spring 2014. Includes bibliographical references.
K-SVD is a relatively new method for learning a dictionary matrix that best fits a set of training data vectors, with the intent of using it for sparse representation of data. K-SVD is flexible in that it can be used in conjunction with any preferred sparse coding pursuit method, including the orthogonal matching pursuit (OMP) method considered in this thesis. Using adaptive filter theory, a new fast OMP method has been proposed to reduce the computational time of the sparse pursuit phase of K-SVD, as well as during online implementation, without sacrificing the accuracy of the sparse pursuit. Due to the matrix inversion required in standard OMP, the time required to sparsely represent a signal grows quickly as the sparsity restriction is relaxed. The speed-up in the proposed method was accomplished by replacing this computationally demanding matrix inversion with a series of recursive "time-order" update equations based on the orthogonal projection updating used in adaptive filter theory. A geometric perspective on this new method is also provided. Additionally, a recursive method for faster dictionary learning is discussed, which can be used instead of the singular value decomposition (SVD) in the K-SVD method. A significant bottleneck in K-SVD is the computation of the SVD of the reduced error matrix during the update of each dictionary atom. The SVD operation is replaced with an efficient recursive update that allows limited in-situ learning, so dictionaries can be updated as the system is exposed to new signals. Further, structured data formatting has allowed a multi-channel extension of K-SVD that merges multiple data sources into a single dictionary capable of producing a single sparse vector representing a variety of multi-channel data.
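As context for the pursuit step described above, here is a minimal NumPy sketch of standard OMP; the least-squares solve inside the loop is the matrix inversion the thesis replaces with recursive time-order updates (not shown here). All names and dimensions are illustrative, not taken from the thesis.

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal matching pursuit: greedily select k atoms of the
    column-normalized dictionary D to sparsely represent y."""
    residual = y.copy()
    support = []
    x_s = np.zeros(0)
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        support.append(j)
        # Re-fit coefficients on the selected atoms; this least-squares
        # solve is the costly step the thesis replaces with recursive
        # orthogonal-projection updates.
        x_s, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ x_s
    x = np.zeros(D.shape[1])
    x[support] = x_s
    return x

# Usage: recover a 2-sparse vector from an overcomplete dictionary.
rng = np.random.default_rng(0)
D = rng.standard_normal((100, 256))
D /= np.linalg.norm(D, axis=0)
x_true = np.zeros(256)
x_true[[3, 77]] = [1.5, -2.0]
x_hat = omp(D, D @ x_true, k=2)
```

In the noiseless, well-conditioned case above, two greedy iterations recover the true support exactly, after which the least-squares fit reproduces the coefficients.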
Another contribution of this work is the application of the developed methods to an underwater target detection problem using coregistered dual-channel (namely broadband and high-frequency) side-scan sonar imagery. Here, K-SVD is used to create dictionaries that are optimal in the sense of reconstructing target and non-target image snippets; the ratio of the two reconstruction errors is used as a likelihood ratio for target detection. The proposed methods were then applied and benchmarked against other detection methods for detecting mine-like objects in two dual-channel sonar datasets. Comparison of the results in terms of receiver operating characteristic (ROC) curves indicates that the dual-channel K-SVD-based detector provides a detection rate of PD = 99% and a false alarm rate of PFA = 1% on the first dataset, and PD = 95% and PFA = 5% on the second dataset, at the knee point of the ROC. The single-channel K-SVD-based detector, on the other hand, provides PD = 96% and PFA = 4% on both datasets at the knee point of the ROC. The degradation in performance on the second dataset is attributed to the fact that the system was trained on a limited number of samples from the first dataset. The coherence-based detector provides PD = 87% and PFA = 13% on the first dataset and PD = 86% and PFA = 14% on the second. These results show the excellent performance of the proposed dictionary learning and sparse coding methods for underwater target detection using dual-channel sonar imagery.
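The reconstruction-error-ratio detection rule described above can be sketched as follows. For simplicity this illustration scores a snippet with a plain least-squares reconstruction from each dictionary rather than a sparse code; the dictionaries, dimensions, and threshold are all hypothetical.

```python
import numpy as np

def reconstruction_error(D, y):
    """Error of the least-squares reconstruction of y from dictionary D
    (a simplified stand-in for the sparse-coding reconstruction)."""
    x, *_ = np.linalg.lstsq(D, y, rcond=None)
    return np.linalg.norm(y - D @ x)

def detect(D_target, D_clutter, snippet, threshold=1.0):
    """Declare a target when the clutter dictionary explains the snippet
    worse than the target dictionary does (error ratio as likelihood ratio)."""
    ratio = reconstruction_error(D_clutter, snippet) / \
            (reconstruction_error(D_target, snippet) + 1e-12)
    return bool(ratio > threshold)

# Toy usage: the two dictionaries span different random subspaces.
rng = np.random.default_rng(1)
D_t = rng.standard_normal((30, 5))
D_c = rng.standard_normal((30, 5))
snippet = D_t @ rng.standard_normal(5)   # lies in the target subspace
is_target = detect(D_t, D_c, snippet)
```

Sweeping the threshold traces out the ROC curve from which the knee-point operating figures quoted above are read.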
Fast Low-rank Representation based Spatial Pyramid Matching for Image Classification
Spatial Pyramid Matching (SPM) and its variants have achieved a lot of
success in image classification. The main difference among them is their
encoding schemes. For example, ScSPM incorporates Sparse Code (SC) instead of
Vector Quantization (VQ) into the framework of SPM. Although the methods
achieve a higher recognition rate than the traditional SPM, they consume more
time to encode the local descriptors extracted from the image. In this paper,
we propose using Low Rank Representation (LRR) to encode the descriptors under
the framework of SPM. Different from SC, LRR considers the group effect among
data points instead of sparsity. Benefiting from this property, the proposed
method (i.e., LrrSPM) can offer better performance. To further improve the
generalizability and robustness, we reformulate the rank-minimization problem
as a truncated projection problem. Extensive experimental studies show that
LrrSPM is more efficient than its counterparts (e.g., ScSPM) while achieving
competitive recognition rates on nine image data sets.
Comment: accepted into Knowledge-Based Systems, 201
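One simple way to realize "rank minimization as a truncated projection", in the spirit (though not necessarily the exact formulation) of LrrSPM, is to compute codes for all descriptors jointly and then project the coefficient matrix onto a fixed rank; everything below is an illustrative sketch.

```python
import numpy as np

def low_rank_codes(D, Y, r):
    """Illustrative low-rank encoding: ridge codes for all descriptors
    jointly, followed by a rank-r truncated-SVD projection (one simple
    way to replace rank minimization with a truncated projection)."""
    lam = 1e-3
    # Joint ridge codes: X = (D^T D + lam I)^{-1} D^T Y
    X = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ Y)
    # Project onto the top-r singular subspace.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s[r:] = 0.0  # keep only the r largest singular values
    return U @ np.diag(s) @ Vt

# Toy usage: 100 local descriptors encoded over a 32-atom codebook.
rng = np.random.default_rng(2)
D = rng.standard_normal((64, 32))     # codebook
Y = rng.standard_normal((64, 100))    # descriptors as columns
X_lr = low_rank_codes(D, Y, r=4)
rank = np.linalg.matrix_rank(X_lr)
```

Because the codes are obtained by one linear solve and one SVD rather than per-descriptor sparse optimization, this style of encoding is much cheaper than SC-based schemes, which matches the efficiency claim above.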
Deep Networks for Image Super-Resolution with Sparse Prior
Deep learning techniques have been successfully applied in many areas of
computer vision, including low-level image restoration problems. For image
super-resolution, several models based on deep neural networks have been
recently proposed and attained superior performance that overshadows all
previous handcrafted models. The question then arises whether large-capacity
and data-driven models have become the dominant solution to the ill-posed
super-resolution problem. In this paper, we argue that domain expertise
represented by the conventional sparse coding model is still valuable, and it
can be combined with the key ingredients of deep learning to achieve further
improved results. We show that a sparse coding model particularly designed for
super-resolution can be incarnated as a neural network, and trained in a
cascaded structure from end to end. The interpretation of the network based on
sparse coding leads to much more efficient and effective training, as well as a
reduced model size. Our model is evaluated on a wide range of images, and shows
clear advantage over existing state-of-the-art methods in terms of both
restoration accuracy and human subjective quality.
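The idea of a sparse coding model "incarnated as a neural network" can be illustrated by unrolling ISTA iterations into feed-forward layers (the LISTA construction underlying such networks). In the paper's model the layer parameters are trained end to end; this sketch merely fixes them analytically, and all dimensions are illustrative.

```python
import numpy as np

def soft_threshold(v, theta):
    """Elementwise shrinkage: the nonlinearity of each unrolled layer."""
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def unrolled_ista(D, y, n_layers=20, theta=0.05):
    """ISTA iterations written as feed-forward layers; in a learned
    (LISTA-style) network, W, S, and the threshold become trainable."""
    L = np.linalg.norm(D, 2) ** 2           # Lipschitz constant of the gradient
    W = D.T / L                              # input transform
    S = np.eye(D.shape[1]) - (D.T @ D) / L   # recurrent transform
    x = soft_threshold(W @ y, theta / L)
    for _ in range(n_layers - 1):
        x = soft_threshold(W @ y + S @ x, theta / L)
    return x

# Usage: approximate the sparse code of a synthetic measurement.
rng = np.random.default_rng(3)
D = rng.standard_normal((40, 80))
D /= np.linalg.norm(D, axis=0)
x_true = np.zeros(80)
x_true[[5, 30, 60]] = [1.0, -1.0, 0.8]
y = D @ x_true
x_hat = unrolled_ista(D, y)
```

Each layer is one proximal-gradient step, so a fixed, small number of layers gives a fast approximate sparse code; training the layer weights is what yields the efficiency and accuracy gains claimed in the abstract.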
Low-rank and Sparse Soft Targets to Learn Better DNN Acoustic Models
Conventional deep neural networks (DNN) for speech acoustic modeling rely on
Gaussian mixture models (GMM) and hidden Markov model (HMM) to obtain binary
class labels as the targets for DNN training. Subword classes in speech
recognition systems correspond to context-dependent tied states or senones. The
present work addresses some limitations of GMM-HMM senone alignments for DNN
training. We hypothesize that the senone probabilities obtained from a DNN
trained with binary labels can provide more accurate targets to learn better
acoustic models. However, DNN outputs bear inaccuracies which are exhibited as
high-dimensional unstructured noise, whereas the informative components are
structured and low-dimensional. We exploit principal component analysis (PCA)
and sparse coding to characterize the senone subspaces. Enhanced probabilities
obtained from low-rank and sparse reconstructions are used as soft targets for
DNN acoustic modeling, which also enables training with untranscribed data.
Experiments conducted on the AMI corpus show a 4.6% relative reduction in word
error rate.
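A minimal sketch of the low-rank part of this enhancement, assuming a PCA projection of noisy posterior vectors followed by renormalization; the sparse-coding variant is omitted and all dimensions are illustrative.

```python
import numpy as np

def low_rank_soft_targets(posteriors, n_components):
    """Project noisy posterior vectors (rows) onto their top principal
    components, reconstruct, and renormalize to a distribution:
    an illustrative version of the low-rank enhancement described above."""
    mean = posteriors.mean(axis=0)
    centered = posteriors - mean
    # Principal directions from the SVD of the centered data.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    P = Vt[:n_components]                 # (r, n_classes)
    recon = centered @ P.T @ P + mean     # low-rank reconstruction
    recon = np.clip(recon, 1e-8, None)    # keep probabilities positive
    return recon / recon.sum(axis=1, keepdims=True)

# Toy "posteriors": 200 frames over 10 senone classes.
rng = np.random.default_rng(4)
logits = rng.standard_normal((200, 10))
post = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
soft = low_rank_soft_targets(post, n_components=3)
```

The reconstructed rows remain valid probability distributions, so they can be used directly as soft targets in cross-entropy training, including on untranscribed data whose posteriors come from a first-pass model.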