Structured Dictionary Learning for Classification
Sparsity-driven signal processing has gained tremendous popularity in the
last decade. At its core is the assumption that the signal of interest is
sparse with respect to either a fixed transformation or a signal-dependent
dictionary. To better capture the data characteristics, various dictionary
learning methods have been proposed for both reconstruction and classification
tasks. For classification particularly, most approaches proposed so far have
focused on designing explicit constraints on the sparse code to improve
classification accuracy while simply adopting the l0-norm or l1-norm for
sparsity regularization. Motivated by the success of structured sparsity in the
area of Compressed Sensing, we propose a structured dictionary learning
framework (StructDL) that incorporates the structure information on both group
and task levels in the learning process. Its benefits are two-fold: (i) the
label consistency between dictionary atoms and training data is implicitly
enforced; and (ii) the classification performance is more robust in cases
of a small dictionary size or limited training data than other techniques.
Using the subspace model, we derive the conditions for StructDL to guarantee
its performance and show theoretically that StructDL is superior to l0-norm-
or l1-norm-regularized dictionary learning for classification. Extensive
experiments have been performed on both synthetic simulations and real world
applications, such as face recognition and object classification, to
demonstrate the validity of the proposed DL framework.
Supervised Dictionary Learning and Sparse Representation - A Review
Dictionary learning and sparse representation (DLSR) is a recent and
successful mathematical model for data representation that achieves
state-of-the-art performance in various fields such as pattern recognition,
machine learning, computer vision, and medical imaging. The original
formulation for DLSR is based on the minimization of the reconstruction error
between the original signal and its sparse representation in the space of the
learned dictionary. Although this formulation is optimal for solving problems
such as denoising, inpainting, and coding, it may not lead to an optimal solution
in classification tasks, where the ultimate goal is to make the learned
dictionary and corresponding sparse representation as discriminative as
possible. This motivated the emergence of a new category of techniques, which
is appropriately called supervised dictionary learning and sparse
representation (S-DLSR), leading to dictionaries and sparse representations
better suited to classification tasks. Despite many research efforts on
S-DLSR, the literature lacks a comprehensive view of these techniques, their
connections, advantages and shortcomings. In this paper, we address this gap
and provide a review of the recently proposed algorithms for S-DLSR. We first
present a taxonomy that groups these algorithms into six categories based on
the approach taken to incorporate label information into the learning of the dictionary
and/or sparse representation. For each category, we draw connections between
the algorithms in this category and present a unified framework for them. We
then provide guidelines for applied researchers on how to represent and learn
the building blocks of an S-DLSR solution based on the problem at hand. This
review provides a broad, yet deep, view of the state-of-the-art methods for
S-DLSR and allows for the advancement of research and development in this
emerging area.
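The original reconstructive DLSR formulation described above can be sketched as an alternating minimization. The following toy NumPy implementation (a few ISTA sparse-coding steps alternated with a MOD-style least-squares dictionary update) is purely illustrative and does not correspond to any specific algorithm in the review:

```python
import numpy as np

def learn_dictionary(X, n_atoms, lam=0.1, n_iter=20, step=0.05, seed=0):
    """Toy reconstructive dictionary learning: alternately minimize
    0.5*||X - D A||_F^2 + lam*||A||_1 over codes A and dictionary D."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    A = np.zeros((n_atoms, X.shape[1]))
    for _ in range(n_iter):
        # Sparse coding step: a few ISTA iterations with soft thresholding.
        for _ in range(10):
            Z = A - step * (D.T @ (D @ A - X))
            A = np.sign(Z) * np.maximum(np.abs(Z) - step * lam, 0.0)
        # Dictionary update step: least squares, then renormalize atoms
        # (rescaling the codes so the product D @ A is unchanged).
        D = X @ np.linalg.pinv(A)
        norms = np.linalg.norm(D, axis=0) + 1e-12
        D /= norms
        A *= norms[:, None]
    return D, A

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 50))        # 50 toy signals in R^8
D, A = learn_dictionary(X, n_atoms=12)  # over-complete: 12 atoms
err = np.linalg.norm(X - D @ A) / np.linalg.norm(X)
print(f"relative reconstruction error: {err:.3f}")
```

The S-DLSR methods surveyed here modify this purely reconstructive objective by injecting label information into the coding step, the dictionary update, or both.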
Learning Discriminative Multilevel Structured Dictionaries for Supervised Image Classification
Sparse representations using overcomplete dictionaries have proved to be a
powerful tool in many signal processing applications such as denoising,
super-resolution, inpainting, compression or classification. The sparsity of
the representation very much depends on how well the dictionary is adapted to
the data at hand. In this paper, we propose a method for learning structured
multilevel dictionaries with discriminative constraints to make them well
suited for the supervised pixelwise classification of images. A multilevel
tree-structured discriminative dictionary is learnt for each class, with a
learning objective based on the reconstruction errors of the image patches
around the pixels over each class-representative dictionary. After the initial
assignment of the class labels to image pixels based on their sparse
representations over the learnt dictionaries, the final classification is
achieved by smoothing the label image with a graph cut method and an erosion
method. Applied to a common set of texture images, our supervised
classification method shows results competitive with the state of the art.
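The per-class reconstruction-error rule used for the initial label assignment can be made concrete with a minimal NumPy sketch; for brevity, a plain least-squares fit stands in for sparse coding over the learnt class dictionaries, and the names and dimensions are ours:

```python
import numpy as np

def classify_by_reconstruction(x, class_dicts):
    """Assign x to the class whose dictionary reconstructs it best.
    A least-squares projection stands in for sparse coding here."""
    errs = []
    for D in class_dicts:
        coef, *_ = np.linalg.lstsq(D, x, rcond=None)
        errs.append(np.linalg.norm(x - D @ coef))
    return int(np.argmin(errs))  # index of the best-fitting class

rng = np.random.default_rng(0)
D0 = rng.standard_normal((10, 3))    # class-0 dictionary (toy)
D1 = rng.standard_normal((10, 3))    # class-1 dictionary (toy)
x = D1 @ np.array([1.0, -2.0, 0.5])  # a signal lying in class 1's span
print(classify_by_reconstruction(x, [D0, D1]))
```

In the method above, this pixelwise decision produces the initial label image, which is then smoothed by the graph-cut and erosion steps.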
Sparse Dictionary-based Attributes for Action Recognition and Summarization
We present an approach for dictionary learning of action attributes via
information maximization. We unify the class distribution and appearance
information into an objective function for learning a sparse dictionary of
action attributes. The objective function maximizes the mutual information
between what has been learned and what remains to be learned in terms of
appearance information and class distribution for each dictionary atom. We
propose a Gaussian Process (GP) model for sparse representation to optimize the
dictionary objective function. The sparse coding property allows a kernel with
compact support in GP to realize a very efficient dictionary learning process.
Hence we can describe an action video by a set of compact and discriminative
action attributes. More importantly, we can recognize modeled action categories
in a sparse feature space, which can be generalized to unseen and unmodeled
action categories. Experimental results demonstrate the effectiveness of our
approach in action recognition and summarization.
Discriminative Bayesian Dictionary Learning for Classification
We propose a Bayesian approach to learn discriminative dictionaries for
sparse representation of data. The proposed approach infers probability
distributions over the atoms of a discriminative dictionary using a Beta
Process. It also computes sets of Bernoulli distributions that associate class
labels to the learned dictionary atoms. This association signifies the
selection probabilities of the dictionary atoms in the expansion of
class-specific data. Furthermore, the non-parametric character of the proposed
approach allows it to infer the correct size of the dictionary. We exploit the
aforementioned Bernoulli distributions in separately learning a linear
classifier. The classifier uses the same hierarchical Bayesian model as the
dictionary, which we present along with the analytical inference solution for Gibbs
sampling. For classification, a test instance is first sparsely encoded over
the learned dictionary and the codes are fed to the classifier. We performed
experiments on face and action recognition, and object and scene-category
classification using five public datasets and compared the results with
state-of-the-art discriminative sparse representation approaches. Experiments
show that the proposed Bayesian approach consistently outperforms the existing
approaches. (Comment: 15 pages)
Greedy Deep Dictionary Learning
In this work we propose a new deep learning tool called deep dictionary
learning. Multi-level dictionaries are learnt in a greedy fashion, one layer at
a time. This requires solving a simple (shallow) dictionary learning problem
whose solution is well known. We apply the proposed technique to several
benchmark deep learning datasets. We compare our results with other deep
learning tools such as stacked autoencoders and deep belief networks, and with
state-of-the-art supervised dictionary learning tools such as discriminative
KSVD and label-consistent KSVD. Our method yields better results than all of
them.
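The greedy layer-wise scheme can be sketched as follows: each layer solves one shallow dictionary learning problem on the previous layer's sparse codes. This toy NumPy version uses ISTA coding with least-squares dictionary updates; it is a hedged illustration under our own simplifications, not the authors' implementation:

```python
import numpy as np

def shallow_dl(X, n_atoms, lam=0.05, n_iter=15, step=0.05, seed=0):
    # One shallow dictionary learning problem: ISTA sparse coding
    # alternated with a least-squares dictionary update.
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    A = np.zeros((n_atoms, X.shape[1]))
    for _ in range(n_iter):
        for _ in range(10):
            Z = A - step * (D.T @ (D @ A - X))
            A = np.sign(Z) * np.maximum(np.abs(Z) - step * lam, 0.0)
        D = X @ np.linalg.pinv(A)
        D /= np.linalg.norm(D, axis=0) + 1e-12
    # Final coding pass so the returned codes match the final dictionary.
    for _ in range(10):
        Z = A - step * (D.T @ (D @ A - X))
        A = np.sign(Z) * np.maximum(np.abs(Z) - step * lam, 0.0)
    return D, A

def greedy_deep_dl(X, layer_sizes):
    # Greedy layer-wise training: each layer's sparse codes become
    # the input to the next (deeper) dictionary.
    dictionaries, H = [], X
    for n_atoms in layer_sizes:
        D, H = shallow_dl(H, n_atoms)
        dictionaries.append(D)
    return dictionaries, H

rng = np.random.default_rng(2)
X = rng.standard_normal((16, 40))      # 40 toy signals in R^16
dicts, H = greedy_deep_dl(X, [12, 8])  # two layers, learnt greedily
print([D.shape for D in dicts], H.shape)
```

Because each layer is trained independently of the ones above it, no end-to-end backpropagation is needed; the deepest codes H serve as the learned representation.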
Discriminative Nonlinear Analysis Operator Learning: When Cosparse Model Meets Image Classification
The dictionary learning framework based on the linear synthesis model has
achieved remarkable performance in image classification over the last decade.
As a generative feature model, however, it suffers from some intrinsic
deficiencies. In this paper, we propose a novel parametric nonlinear analysis
cosparse model (NACM) with which a unique feature vector can be extracted much
more efficiently. Additionally, we show that NACM is capable of simultaneously
learning the task-adapted feature transformation and regularization, encoding
our preferences, domain prior knowledge, and task-oriented supervised
information into the features. The proposed NACM serves the classification task
as a discriminative feature model and yields a novel discriminative nonlinear
analysis operator learning framework (DNAOL). Theoretical analysis and
experimental results clearly demonstrate that DNAOL not only achieves better,
or at least competitive, classification accuracies compared with
state-of-the-art algorithms, but also dramatically reduces the time complexity
of both the training and testing phases. (Comment: Accepted to IEEE TIP)
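The efficiency argument for the analysis (cosparse) model can be made concrete: whereas a synthesis model must solve an inverse problem per sample to obtain a code, an analysis operator extracts features with a single forward transform followed by a nonlinearity. A minimal sketch in our own notation, with soft thresholding standing in for the learned nonlinearity:

```python
import numpy as np

def analysis_features(x, Omega, lam=0.5):
    # Analysis (cosparse) feature extraction: one forward transform
    # followed by a pointwise nonlinearity -- no per-sample optimization.
    z = Omega @ x
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

rng = np.random.default_rng(0)
Omega = rng.standard_normal((8, 5))  # analysis operator (random stand-in
                                     # for a learned, task-adapted one)
x = rng.standard_normal(5)
f = analysis_features(x, Omega)
print(f.shape)
```

The forward pass is a single matrix-vector product, which is the source of the training- and testing-time savings the abstract claims over synthesis-model sparse coding.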
Jointly Learning Non-negative Projection and Dictionary with Discriminative Graph Constraints for Classification
Sparse coding with dictionary learning (DL) has shown excellent
classification performance. Despite the considerable number of existing works,
how to obtain features on top of which dictionaries can be better learned
remains an open and interesting question. Many current prevailing DL methods
directly adopt well-performing hand-crafted features. While such a strategy may
work well empirically, it ignores the intrinsic relationship between
dictionaries and features. We propose a framework where features and
dictionaries are jointly learned and optimized. The framework, named joint
non-negative projection and dictionary learning (JNPDL), enables interaction
between the input features and the dictionaries. The non-negative projection
leads to discriminative parts-based object features while DL seeks a more
suitable representation. Discriminative graph constraints are further imposed
to simultaneously maximize intra-class compactness and inter-class
separability. Experiments on both image and image set classification show the
excellent performance of JNPDL by outperforming several state-of-the-art
approaches. (Comment: To appear in BMVC 201)
Scalable Block-Diagonal Locality-Constrained Projective Dictionary Learning
We propose a novel structured discriminative block-diagonal dictionary
learning method, referred to as scalable Locality-Constrained Projective
Dictionary Learning (LC-PDL), for efficient representation and classification.
To improve the scalability by saving both training and testing time, our LC-PDL
aims at learning a structured discriminative dictionary and a block-diagonal
representation without using the costly l0/l1-norm. Besides, it avoids the
extra time-consuming sparse reconstruction process over the well-trained
dictionary for new samples that many existing models require. More importantly,
LC-PDL avoids using
the complementary data matrix to learn the sub-dictionary over each class. To
enhance the performance, we incorporate a locality constraint of atoms into the
DL procedures to keep local information and obtain the codes of samples over
each class separately. A block-diagonal discriminative approximation term is
also derived to learn a discriminative projection to bridge data with their
codes by extracting special block-diagonal features from the data, which
ensures that the approximation coefficients are clearly associated with their
label information. Then, a robust multiclass classifier is trained on the
extracted
block-diagonal codes for accurate label predictions. Experimental results
verify the effectiveness of our algorithm. (Comment: Accepted at the 28th
International Joint Conference on Artificial Intelligence, IJCAI 2019)
When Dictionary Learning Meets Deep Learning: Deep Dictionary Learning and Coding Network for Image Recognition with Limited Data
We present a new Deep Dictionary Learning and Coding Network (DDLCN) for
image recognition tasks with limited data. The proposed DDLCN has most of the
standard deep learning layers (e.g., input/output, pooling, fully connected,
etc.), but the fundamental convolutional layers are replaced by our proposed
compound dictionary learning and coding layers. The dictionary learning layer
learns an over-complete dictionary for the input training data. At the deep coding layer,
a locality constraint is added to guarantee that the activated dictionary bases
are close to each other. Then the activated dictionary atoms are assembled and
passed to the compound dictionary learning and coding layers. In this way, the
activated atoms in the first layer can be represented by the deeper atoms in
the second dictionary. Intuitively, the second dictionary is designed to learn
the fine-grained components shared among the input dictionary atoms, so that a
more informative and discriminative low-level representation of the dictionary
atoms can be obtained. We empirically compare DDLCN with several leading
dictionary learning methods and deep learning models. Experimental results on
five popular datasets show that DDLCN achieves competitive results compared
with state-of-the-art methods when the training data is limited. Code is
available at https://github.com/Ha0Tang/DDLCN. (Comment: Accepted to TNNLS; an
extended version of a paper published in WACV 2019. arXiv admin note:
substantial text overlap with arXiv:1809.0418)
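The locality constraint at the coding layer, which keeps the activated dictionary bases close to the input, can be illustrated with an LLC-style sketch: code each sample over only its k nearest atoms. The function names and sizes below are ours for illustration, not the DDLCN implementation:

```python
import numpy as np

def locality_coding(x, D, k=3):
    """Locality-constrained coding sketch: activate only the k dictionary
    atoms closest to x, then solve a small least-squares fit over them."""
    dists = np.linalg.norm(D - x[:, None], axis=0)  # distance to each atom
    idx = np.argsort(dists)[:k]                     # k nearest atoms
    coef, *_ = np.linalg.lstsq(D[:, idx], x, rcond=None)
    code = np.zeros(D.shape[1])
    code[idx] = coef                                # sparse, local code
    return code

rng = np.random.default_rng(0)
D = rng.standard_normal((5, 20))
D /= np.linalg.norm(D, axis=0)  # unit-norm atoms
x = rng.standard_normal(5)
code = locality_coding(x, D)
print(np.count_nonzero(code))   # at most k active atoms
```

Restricting the code to nearby atoms keeps the activated bases close to each other and to the input, which is the property the compound coding layers rely on when passing activations to the deeper dictionary.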