Scalable Block-Diagonal Locality-Constrained Projective Dictionary Learning
We propose a novel structured discriminative block-diagonal dictionary
learning method, referred to as scalable Locality-Constrained Projective
Dictionary Learning (LC-PDL), for efficient representation and classification.
To improve scalability by saving both training and testing time, our LC-PDL
aims at learning a structured discriminative dictionary and a block-diagonal
representation without using the costly l0/l1-norm. Besides, it avoids the
extra time-consuming sparse reconstruction process over the well-trained
dictionary that many existing models require for each new sample. More importantly, LC-PDL avoids using
the complementary data matrix to learn the sub-dictionary over each class. To
enhance the performance, we incorporate a locality constraint of atoms into the
DL procedures to keep local information and obtain the codes of samples over
each class separately. A block-diagonal discriminative approximation term is
also derived to learn a discriminative projection to bridge data with their
codes by extracting the special block-diagonal features from data, which
ensures that the approximate coefficients are clearly associated with their
label information. Then, a robust multiclass classifier is trained over the
extracted block-diagonal codes for accurate label predictions. Experimental results
verify the effectiveness of our algorithm.
Comment: Accepted at the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019)
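A locality constraint of this kind weights each atom by its distance to the sample, so nearby atoms dominate the code. The following is a minimal sketch of the general idea only (an LLC-style weighted ridge coder with a closed-form solution); the function name, penalty form, and parameters are illustrative assumptions, not the LC-PDL algorithm itself.

```python
import numpy as np

def locality_constrained_code(x, D, lam=0.1):
    """Code sample x over dictionary D (d x k atoms as columns), penalizing
    atoms far from x more heavily. Generic LLC-style sketch, not LC-PDL."""
    d, k = D.shape
    # locality adaptor: distance from x to each atom
    w = np.linalg.norm(D - x[:, None], axis=0)            # shape (k,)
    # solve min_a ||x - D a||^2 + lam * ||diag(w) a||^2 in closed form
    A = D.T @ D + lam * np.diag(w ** 2)
    return np.linalg.solve(A, D.T @ x)

rng = np.random.default_rng(0)
D = rng.standard_normal((8, 12))
x = rng.standard_normal(8)
a = locality_constrained_code(x, D)
```

Because the quadratic penalty replaces an l0/l1 term, the code is a single linear solve per sample, which is where the claimed training/testing speedup of such Frobenius-style formulations comes from.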
Jointly Learning Structured Analysis Discriminative Dictionary and Analysis Multiclass Classifier
In this paper, we propose an analysis-mechanism-based structured Analysis
Discriminative Dictionary Learning (ADDL) framework. ADDL seamlessly integrates
the analysis discriminative dictionary learning, analysis representation and
analysis classifier training into a unified model. The applied analysis
mechanism ensures that the learnt dictionaries, representations and linear
classifiers over different classes are as independent and discriminative as
possible. The dictionary is obtained by minimizing a reconstruction
error and an analytical incoherence promoting term that encourages the
sub-dictionaries associated with different classes to be independent. To obtain
the representation coefficients, ADDL imposes a sparse l2,1-norm constraint on
the coding coefficients instead of the l0- or l1-norm, since the l0/l1-norm
constraint applied in most existing DL criteria makes the training phase
time-consuming. The codes-extraction projection that bridges data with the sparse
codes by extracting special features from the given samples is calculated via
minimizing a sparse codes approximation term. Then we compute a linear
classifier based on the approximated sparse codes by an analysis mechanism to
simultaneously consider the classification and representation powers. Thus, the
classification approach of our model is very efficient, because it avoids the
extra time-consuming sparse reconstruction process over the trained dictionary
that most existing DL algorithms require for each new test sample. Simulations
on real image databases demonstrate that our ADDL model can obtain superior
performance over other state-of-the-art methods.
Comment: Accepted by IEEE TNNL
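The l2,1-norm that ADDL adopts in place of the l0/l1-norm sums the l2 norms of a matrix's rows; its proximal operator shrinks whole rows to zero, which is what delivers row-sparse codes cheaply. A generic sketch of the norm and its prox (not ADDL's actual optimizer):

```python
import numpy as np

def l21_norm(Z):
    """l2,1-norm: sum of the l2 norms of the rows of Z."""
    return np.linalg.norm(Z, axis=1).sum()

def l21_prox(Z, tau):
    """Proximal operator of tau * ||.||_{2,1}: shrinks entire rows,
    which makes the penalty a row-sparsity (group) regularizer."""
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * Z

Z = np.array([[3.0, 4.0], [0.1, 0.1], [0.0, 2.0]])
# row norms are 5, sqrt(0.02) and 2; the small middle row is
# zeroed out entirely by the prox with tau = 0.5
P = l21_prox(Z, 0.5)
```

Unlike an l1 prox, which zeroes individual entries, this operator removes whole coefficient rows, so entire atoms can be deactivated per sample group.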
Jointly Learning Non-negative Projection and Dictionary with Discriminative Graph Constraints for Classification
Sparse coding with dictionary learning (DL) has shown excellent
classification performance. Despite the considerable number of existing works,
how to obtain features on top of which dictionaries can be better learned
remains an open and interesting question. Many current prevailing DL methods
directly adopt well-performing crafted features. While such a strategy may
empirically work well, it ignores certain intrinsic relationships between
dictionaries and features. We propose a framework where features and
dictionaries are jointly learned and optimized. The framework, named joint
non-negative projection and dictionary learning (JNPDL), enables interaction
between the input features and the dictionaries. The non-negative projection
leads to discriminative parts-based object features while DL seeks a more
suitable representation. Discriminative graph constraints are further imposed
to simultaneously maximize intra-class compactness and inter-class
separability. Experiments on both image and image set classification show the
excellent performance of JNPDL by outperforming several state-of-the-art
approaches.
Comment: To appear in BMVC 201
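A non-negative projection yields parts-based features in the same way non-negative matrix factorization does. The sketch below uses generic NMF multiplicative updates as an illustrative stand-in; it is not JNPDL's joint formulation, and the function name and parameters are assumptions.

```python
import numpy as np

def nmf(X, r, iters=200, eps=1e-9):
    """Factorize a non-negative X (n x m) as W @ H with W, H >= 0 using
    the classic multiplicative updates. Illustrative only."""
    rng = np.random.default_rng(0)
    n, m = X.shape
    W = rng.random((n, r)) + eps
    H = rng.random((r, m)) + eps
    for _ in range(iters):
        # multiplicative updates keep every entry non-negative
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

X = np.abs(np.random.default_rng(4).standard_normal((10, 8)))
W, H = nmf(X, 3)
```

The non-negativity is what produces additive, parts-based components; JNPDL's contribution is learning such a projection jointly with the dictionary rather than in isolation as here.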
Learning Structured Twin-Incoherent Twin-Projective Latent Dictionary Pairs for Classification
In this paper, we extend the popular dictionary pair learning (DPL) into the
scenario of twin-projective latent flexible DPL under a structured
twin-incoherence. Technically, a novel framework called Twin-Projective Latent
Flexible DPL (TP-DPL) is proposed, which minimizes the twin-incoherence
constrained flexibly-relaxed reconstruction error to avoid the possible
over-fitting issue and produce accurate reconstruction. In this setting, our
TP-DPL integrates the twin-incoherence based latent flexible DPL and the joint
embedding of codes as well as salient features by twin-projection into a
unified model in an adaptive neighborhood-preserving manner. As a result,
TP-DPL unifies the salient feature extraction, representation and
classification. The twin-incoherence constraint on codes and features can
explicitly ensure high intra-class compactness and inter-class separation over
them. TP-DPL also integrates the adaptive weighting to preserve the local
neighborhood of the coefficients and salient features within each class
explicitly. For efficiency, TP-DPL uses Frobenius-norm and abandons the costly
l0/l1-norm for group sparse representation. Another byproduct is that TP-DPL
can directly apply the class-specific twin-projective reconstruction residual
to compute the label of data. Extensive results on public databases show that
TP-DPL can deliver state-of-the-art performance.
Comment: Accepted by ICDM 2019 as a regular paper
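Classification by class-specific projective reconstruction residual can be sketched as follows: given a synthesis dictionary D_k and an analysis projection P_k per class, a sample goes to the class that reconstructs it best. The random pairs below are placeholders standing in for learnt ones; this is the generic dictionary-pair decision rule, not TP-DPL's full model.

```python
import numpy as np

def dpl_classify(x, pairs):
    """Assign x to the class whose (D_k, P_k) pair reconstructs it best."""
    residuals = [np.linalg.norm(x - D @ (P @ x)) for D, P in pairs]
    return int(np.argmin(residuals))

rng = np.random.default_rng(1)
d, m = 10, 5
pairs = []
for _ in range(3):                         # three toy classes
    D = rng.standard_normal((d, m))        # synthesis dictionary
    P = np.linalg.pinv(D)                  # toy analysis operator
    pairs.append((D, P))

# a sample lying in class 1's column space reconstructs exactly there
x = pairs[1][0] @ rng.standard_normal(m)
```

Because the residual is computed directly from the pair, no per-sample sparse coding is needed at test time, which is the efficiency argument such models make.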
Discriminative Local Sparse Representation by Robust Adaptive Dictionary Pair Learning
In this paper, we propose a structured Robust Adaptive Dictionary Pair
Learning (RA-DPL) framework for discriminative sparse representation
learning. To achieve powerful representation ability of the available samples,
the setting of RA-DPL seamlessly integrates the robust projective dictionary
pair learning, locality-adaptive sparse representations and discriminative
coding coefficients learning into a unified learning framework. Specifically,
RA-DPL improves existing projective dictionary pair learning in four
perspectives. First, it applies a sparse l2,1-norm based metric to encode the
reconstruction error to deliver robust projective dictionary pairs, and the
l2,1-norm has the potential to minimize the error. Second, it imposes the
robust l2,1-norm clearly on the analysis dictionary to ensure the sparse
property of the coding coefficients rather than using the costly l0/l1-norm.
As such, the robustness of the data representation and the efficiency of the
learning process are jointly considered to guarantee the efficacy of our
RA-DPL. Third, RA-DPL conceives a structured reconstruction weight learning
paradigm to preserve the local structures of the coding coefficients within
each class clearly in an adaptive manner, which encourages producing
locality-preserving representations. Fourth, it also considers improving the
discriminating ability of coding coefficients and dictionary by incorporating a
discriminating function, which can ensure high intra-class compactness and
inter-class separation in the code space. Extensive experiments show that our
RA-DPL can obtain superior performance over other state-of-the-art methods.
Comment: Accepted by IEEE TNNL
Joint Subspace Recovery and Enhanced Locality Driven Robust Flexible Discriminative Dictionary Learning
We propose a joint subspace recovery and enhanced locality based robust
flexible label consistent dictionary learning method called Robust Flexible
Discriminative Dictionary Learning (RFDDL). RFDDL mainly improves the data
representation and classification abilities by enhancing the robust property to
sparse errors and encoding the locality, reconstruction error and label
consistency more accurately. First, for the robustness to noise and sparse
errors in data and atoms, RFDDL aims at recovering the underlying clean data
and clean atom subspaces jointly, and then performs DL and encodes the locality
in the recovered subspaces. Second, to potentially handle data sampled from a
nonlinear manifold and obtain accurate reconstruction while avoiding
overfitting, RFDDL minimizes the reconstruction error in a flexible manner.
Third, to encode the label consistency accurately, RFDDL
involves a discriminative flexible sparse code error to encourage the
coefficients to be soft. Fourth, to encode the locality well, RFDDL defines the
Laplacian matrix over recovered atoms, includes label information of atoms in
terms of intra-class compactness and inter-class separation, and associates
with group sparse codes and classifier to obtain the accurate discriminative
locality-constrained coefficients and classifier. Extensive results on public
databases show the effectiveness of our RFDDL.
Comment: Accepted by IEEE TCSVT 201
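A Laplacian matrix over dictionary atoms, of the kind RFDDL regularizes with, can be built from pairwise atom affinities. The Gaussian kernel and bandwidth below are generic choices, not the paper's exact construction:

```python
import numpy as np

def graph_laplacian(D, sigma=1.0):
    """Unnormalized graph Laplacian over the atoms (columns) of D,
    with a Gaussian-kernel affinity. Generic sketch."""
    sq = np.sum(D ** 2, axis=0)
    # pairwise squared distances between atoms
    dist2 = sq[:, None] + sq[None, :] - 2.0 * (D.T @ D)
    W = np.exp(-np.maximum(dist2, 0.0) / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(axis=1)) - W

rng = np.random.default_rng(2)
D = rng.standard_normal((6, 4))
L = graph_laplacian(D)
# a^T L a = (1/2) * sum_ij W_ij (a_i - a_j)^2 >= 0, so penalizing
# codes with a^T L a keeps codes of nearby atoms similar
```

Adding label information, as RFDDL does, amounts to reweighting W so intra-class affinities are strengthened and inter-class ones suppressed.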
Transductive Zero-Shot Learning with a Self-training dictionary approach
As an important and challenging problem in computer vision, zero-shot
learning (ZSL) aims at automatically recognizing the instances from unseen
object classes without training data. To address this problem, ZSL is usually
carried out in the following two aspects: 1) capturing the domain distribution
connections between seen classes data and unseen classes data; and 2) modeling
the semantic interactions between the image feature space and the label
embedding space. Motivated by these observations, we propose a bidirectional
mapping based semantic relationship modeling scheme that seeks cross-modal
knowledge transfer by simultaneously projecting the image features and label
embeddings into a common latent space. Namely, we have a bidirectional
connection relationship that takes place from the image feature space to the
latent space as well as from the label embedding space to the latent space. To
deal with the domain shift problem, we further present a transductive learning
approach that formulates the class prediction problem in an iterative refining
process, where the object classification capacity is progressively reinforced
through bootstrapping-based model updating over highly reliable instances.
Experimental results on three benchmark datasets (AwA, CUB and SUN) demonstrate
the effectiveness of the proposed approach against the state-of-the-art
approaches.
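The bootstrapping-based model updating over highly reliable instances can be sketched as a simple self-training loop: assign labels by nearest class prototype, fold only the most confident predictions back into the prototypes, and repeat. This toy version illustrates the transductive refinement step only and omits the paper's bidirectional latent-space model; all names and thresholds are assumptions.

```python
import numpy as np

def self_train(X, prototypes, rounds=3, top_frac=0.5):
    """Iteratively refine class prototypes using only the most
    confident (nearest) unseen instances. Illustrative sketch."""
    protos = prototypes.astype(float).copy()
    for _ in range(rounds):
        dist = np.linalg.norm(X[:, None, :] - protos[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        conf = -dist.min(axis=1)                  # higher = more reliable
        keep = np.argsort(conf)[-max(1, int(top_frac * len(X))):]
        for k in range(len(protos)):              # update each prototype
            sel = keep[labels[keep] == k]         # with reliable members only
            if len(sel):
                protos[k] = X[sel].mean(axis=0)
    dist = np.linalg.norm(X[:, None, :] - protos[None, :, :], axis=2)
    return dist.argmin(axis=1)

rng = np.random.default_rng(0)
X = np.vstack([rng.standard_normal((20, 2)) * 0.3,          # class 0
               rng.standard_normal((20, 2)) * 0.3 + 5.0])   # class 1
seed_protos = np.array([[0.5, 0.5], [4.0, 4.0]])            # imperfect seeds
pred = self_train(X, seed_protos)
```

Even with deliberately offset seed prototypes, the confident instances pull the prototypes toward the true cluster centers, which is the intuition behind progressive reinforcement under domain shift.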
Learning Hybrid Representation by Robust Dictionary Learning in Factorized Compressed Space
In this paper, we investigate the robust dictionary learning (DL) to discover
the hybrid salient low-rank and sparse representation in a factorized
compressed space. A Joint Robust Factorization and Projective Dictionary
Learning (J-RFDL) model is presented. The setting of J-RFDL aims at improving
the data representations by enhancing the robustness to outliers and noise in
data, encoding the reconstruction error more accurately and obtaining hybrid
salient coefficients with accurate reconstruction ability. Specifically, J-RFDL
performs the robust representation by DL in a factorized compressed space to
eliminate the negative effects of noise and outliers on the results, which can
also make the DL process efficient. To make the encoding process robust to
noise in data, J-RFDL clearly uses the sparse l2,1-norm, which can potentially
minimize the factorization and reconstruction errors jointly by forcing rows of
the reconstruction errors to be zeros. To deliver salient coefficients with
good structures to reconstruct given data well, J-RFDL imposes the joint
low-rank and sparse constraints on the embedded coefficients with a synthesis
dictionary. Based on the hybrid salient coefficients, we also extend J-RFDL for
the joint classification and propose a discriminative J-RFDL model, which can
improve the discriminating abilities of learnt coefficients by minimizing the
classification error jointly. Extensive experiments on public datasets
demonstrate that our formulations can deliver superior performance over other
state-of-the-art methods.
Comment: Accepted by IEEE TI
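Joint low-rank and sparse constraints on coefficients are commonly enforced with two proximal operators: singular value thresholding for the nuclear norm and elementwise soft-thresholding for the l1-norm. A generic sketch of the two operators (not J-RFDL's solver):

```python
import numpy as np

def svt(Z, tau):
    """Prox of tau * ||Z||_* : shrink singular values toward zero,
    which lowers the rank of Z."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def soft(Z, tau):
    """Prox of tau * ||Z||_1 : elementwise shrinkage toward zero,
    which sparsifies Z."""
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)

rng = np.random.default_rng(3)
Z = rng.standard_normal((6, 5))
Z_lowrank = svt(Z, 2.0)     # small singular values are removed
Z_sparse = soft(Z, 0.5)     # small entries are removed
```

An alternating scheme that applies these two operators between reconstruction updates is the standard way such hybrid low-rank-plus-sparse coefficient structures are obtained.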
Convolutional Dictionary Pair Learning Network for Image Representation Learning
Both the Dictionary Learning (DL) and Convolutional Neural Networks (CNN) are
powerful image representation learning systems based on different mechanisms
and principles; however, whether we can seamlessly integrate them to improve
performance is worth exploring. To address this issue, we propose a novel
generalized end-to-end representation learning architecture, dubbed
Convolutional Dictionary Pair Learning Network (CDPL-Net) in this paper, which
integrates the learning schemes of the CNN and dictionary pair learning into a
unified framework. Generally, the architecture of CDPL-Net includes two
convolutional/pooling layers and two dictionary pair learning (DPL) layers in
the representation learning module. Besides, it uses two fully-connected layers
as the multi-layer perception layer in the nonlinear classification module. In
particular, the DPL layer can jointly formulate the discriminative synthesis
and analysis representations driven by minimizing the batch based
reconstruction error over the flatted feature maps from the convolution/pooling
layer. Moreover, the DPL layer uses the l1-norm on the analysis dictionary so
that sparse representations can be delivered, and the embedding process will
also be robust to noise. To speed up the training of the DPL layer, efficient
stochastic gradient descent is used. Extensive simulations on real databases
show that our CDPL-Net can deliver enhanced performance over other
state-of-the-art methods.
Comment: Accepted by the 24th European Conference on Artificial Intelligence (ECAI 2020)
Elastic Functional Coding of Riemannian Trajectories
Visual observations of dynamic phenomena, such as human actions, are often
represented as sequences of smoothly-varying features. In cases where the
feature spaces can be structured as Riemannian manifolds, the corresponding
representations become trajectories on manifolds. Analysis of these
trajectories is challenging due to non-linearity of underlying spaces and
high-dimensionality of trajectories. In vision problems, given the nature of
physical systems involved, these phenomena are better characterized on a
low-dimensional manifold compared to the space of Riemannian trajectories. For
instance, in data involving human action analysis, if one does not impose the
physical constraints of the human body, the resulting representation space will
have highly redundant features. Learning an effective, low-dimensional
embedding for action representations will have a huge impact in the areas of
search and retrieval, visualization, learning, and recognition. The difficulty
lies in inherent non-linearity of the domain and temporal variability of
actions that can distort any traditional metric between trajectories. To
overcome these issues, we use the framework based on transported square-root
velocity fields (TSRVF); this framework has several desirable properties,
including a rate-invariant metric and vector space representations. We propose
to learn an embedding such that each action trajectory is mapped to a single
point in a low-dimensional Euclidean space, and the trajectories that differ
only in temporal rates map to the same point. We utilize the TSRVF
representation, and accompanying statistical summaries of Riemannian
trajectories, to extend existing coding methods such as PCA, KSVD and Label
Consistent KSVD to Riemannian trajectories or more generally to Riemannian
functions.
Comment: Under major revision at IEEE T-PAMI, 201
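The square-root velocity idea underlying the TSRVF can be shown on a simple 1-D Euclidean curve: represent c(t) by q(t) = c'(t) / sqrt(|c'(t)|), and compare curves through their q-representations, which (after alignment over reparameterizations) yields a rate-invariant metric. The discretization below is a toy sketch under that Euclidean simplification; the transported, manifold-valued construction the paper uses is substantially more involved.

```python
import numpy as np

def srvf(curve, t):
    """Square-root velocity representation of a discretized 1-D curve:
    q(t) = c'(t) / sqrt(|c'(t)|). Euclidean toy version only."""
    v = np.gradient(curve, t)
    return v / np.sqrt(np.maximum(np.abs(v), 1e-12))

t = np.linspace(0.0, 1.0, 200)
q_sin = srvf(np.sin(2.0 * np.pi * t), t)   # a smoothly varying feature curve
q_lin = srvf(2.0 * t, t)                   # constant-speed line: q = sqrt(2)
```

Mapping each trajectory to such a vector-space representation is what allows standard coding tools (PCA, KSVD, Label Consistent KSVD) to be applied to trajectory data at all.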