A Nonlinear Orthogonal Non-Negative Matrix Factorization Approach to Subspace Clustering
A recent theoretical analysis shows the equivalence between non-negative
matrix factorization (NMF) and spectral-clustering-based approaches to
subspace clustering. As NMF and many of its variants are essentially linear, we
introduce a nonlinear NMF with explicit orthogonality and derive general
kernel-based orthogonal multiplicative update rules to solve the subspace
clustering problem. Within this nonlinear orthogonal NMF framework, we propose
two kernel-based non-negative subspace clustering algorithms, KNSC-Ncut and
KNSC-Rcut, and establish their connections with spectral normalized-cut and
ratio-cut clustering. We further extend the nonlinear
orthogonal NMF framework and introduce a graph regularization to obtain a
factorization that respects a local geometric structure of the data after the
nonlinear mapping. The proposed NMF-based approach to subspace clustering takes
into account the nonlinear nature of the manifold, as well as its intrinsic
local geometry, which considerably improves clustering performance compared
with several recently proposed state-of-the-art methods.
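Though the proposed update rules are kernel-based and orthogonality-constrained, they build on the classic multiplicative-update pattern of linear NMF. A minimal sketch of that pattern (the Lee-Seung updates, illustrative only, not the paper's kernel rules; all names are ours):

```python
import numpy as np

def nmf_multiplicative(X, k, n_iter=200, eps=1e-10):
    """Classic multiplicative updates for X ~ W @ H with X, W, H
    non-negative (linear NMF; the kernelized, orthogonality-constrained
    rules in the paper extend this idea)."""
    m, n = X.shape
    rng = np.random.default_rng(0)
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Each factor is multiplied by a non-negative ratio, so
        # non-negativity is preserved automatically at every step.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

X = np.abs(np.random.default_rng(1).random((20, 30)))
W, H = nmf_multiplicative(X, k=5)
err = np.linalg.norm(X - W @ H)
```

The multiplicative form is what makes these methods easy to constrain: orthogonality and kernel terms enter as extra factors in the update ratio.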
Discriminative Sparse Coding on Multi-Manifold for Data Representation and Classification
Sparse coding has been popularly used as an effective data representation
method in various applications, such as computer vision, medical imaging and
bioinformatics. However, conventional sparse coding algorithms and their
manifold-regularized variants (graph sparse coding and Laplacian sparse
coding) learn the codebook and codes in an unsupervised manner and neglect the
class information available in the training set. To address this problem, in
this paper we propose a novel discriminative sparse coding method based on
multi-manifold, by learning discriminative class-conditional codebooks and
sparse codes from both data feature space and class labels. First, the entire
training set is partitioned into multiple manifolds according to the class
labels. Then, we formulate the sparse coding as a manifold-manifold matching
problem and learn class-conditional codebooks and codes to maximize the
manifold margins of different classes. Lastly, we present a data point-manifold
matching error based strategy to classify the unlabeled data point.
Experimental results on somatic mutation identification and on breast tumor
classification in ultrasonic images demonstrate the efficacy of the proposed
data representation and classification approach. Comment: This paper has been
withdrawn by the author due to the terrible writing.
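The point-to-manifold matching-error strategy in the last step can be sketched as follows; the least-squares codes here are a simplified stand-in for the paper's sparse codes, and all names and data are illustrative:

```python
import numpy as np

def classify_by_residual(x, codebooks):
    """Assign x to the class whose codebook reconstructs it best.
    codebooks: dict label -> (d, k) matrix of class-conditional atoms.
    Least-squares codes stand in for sparse codes in this sketch."""
    best_label, best_err = None, np.inf
    for label, D in codebooks.items():
        code, *_ = np.linalg.lstsq(D, x, rcond=None)
        err = np.linalg.norm(x - D @ code)  # point-to-manifold matching error
        if err < best_err:
            best_label, best_err = label, err
    return best_label

rng = np.random.default_rng(0)
D0 = rng.random((10, 3))  # hypothetical class-0 codebook
D1 = rng.random((10, 3))  # hypothetical class-1 codebook
x = D0 @ np.array([0.5, 1.0, 0.2])  # a point lying on class 0's manifold
pred = classify_by_residual(x, {0: D0, 1: D1})
```

Because x lies exactly in the span of D0's atoms, its class-0 matching error is zero and the point is assigned to class 0.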
Robust Unsupervised Flexible Auto-weighted Local-Coordinate Concept Factorization for Image Clustering
We investigate the high-dimensional data clustering problem by proposing a
novel and unsupervised representation learning model called Robust Flexible
Auto-weighted Local-coordinate Concept Factorization (RFA-LCF). RFA-LCF
integrates the robust flexible CF, robust sparse local-coordinate coding and
the adaptive reconstruction weighting learning into a unified model. The
adaptive weighting is driven by including the joint manifold preserving
constraints on the recovered clean data, basis concepts and new representation.
Specifically, our RFA-LCF uses an L2,1-norm based flexible residue to encode the
mismatch between clean data and its reconstruction, and also applies the robust
adaptive sparse local-coordinate coding to represent the data using a few
nearby basis concepts, which can make the factorization more accurate and
robust to noise. The robust flexible factorization is also performed in the
recovered clean data space for enhancing representations. RFA-LCF also
considers preserving the local manifold structures of clean data space, basis
concept space and the new coordinate space jointly in an adaptive manner.
Extensive comparisons show that RFA-LCF can deliver enhanced clustering
results. Comment: Accepted at the 44th IEEE International Conference on
Acoustics, Speech, and Signal Processing (ICASSP 2019).
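The L2,1-norm that RFA-LCF uses as its flexible residue is simply the sum of row-wise Euclidean norms (some papers apply it column-wise; the convention here is rows). A small self-contained illustration:

```python
import numpy as np

def l21_norm(E):
    """||E||_{2,1}: sum of the Euclidean norms of the rows of E.
    Unlike the squared Frobenius norm, it grows linearly (not
    quadratically) with the magnitude of a corrupted row, so a few
    gross sample-wise errors dominate the objective far less --
    this is what makes the residue robust."""
    return float(np.sum(np.linalg.norm(E, axis=1)))

E = np.array([[3.0, 4.0],   # row norm 5
              [0.0, 0.0],   # row norm 0
              [0.0, 5.0]])  # row norm 5
val = l21_norm(E)  # 5 + 0 + 5 = 10
```

Minimizing this quantity over a residual matrix encourages entire rows (i.e., whole samples) of the residual to vanish, which matches the sample-wise corruption model.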
A Survey on Multi-View Clustering
With advances in information acquisition technologies, multi-view data have
become ubiquitous. Multi-view learning has thus become increasingly popular in
the machine learning and data mining fields. Multi-view unsupervised and
semi-supervised learning methods, such as co-training and co-regularization,
have gained considerable attention. Although multi-view clustering (MVC)
methods have developed rapidly in recent years, no survey has yet summarized
and analyzed the current progress. Therefore, this paper reviews the common
strategies for combining multiple views of data and, based on this summary, we
propose a novel taxonomy of the MVC approaches. We further discuss the
relationships between MVC and multi-view representation, ensemble clustering,
multi-task clustering, multi-view supervised and semi-supervised learning.
Several representative real-world applications are elaborated. To promote
future development of MVC, we envision several open problems that may require
further investigation and thorough examination. Comment: 17 pages, 4 figures.
Incomplete Multi-view Clustering via Graph Regularized Matrix Factorization
Clustering with incomplete views is a challenge in multi-view clustering. In
this paper, we provide a novel and simple method to address this issue.
Specifically, the proposed method simultaneously exploits the local information
of each view and the complementary information among views to learn the common
latent representation for all samples, which can greatly improve the
compactness and discriminability of the obtained representation. Compared with
the conventional graph embedding methods, the proposed method does not
introduce any extra regularization term and corresponding penalty parameter to
preserve the local structure of data, and thus does not increase the burden of
extra parameter selection. By imposing the orthogonal constraint on the basis
matrix of each view, the proposed method is able to handle out-of-sample data.
Moreover, the proposed method can be viewed as a unified framework for
multi-view learning since it can handle both incomplete and complete multi-view
clustering and classification tasks. Extensive experiments conducted on several
multi-view datasets prove that the proposed method can significantly improve
the clustering performance. Comment: ECCV 2018 International Workshop on
Compact and Efficient Feature Representation and Learning in Computer Vision
(CEFRL).
Sparse Deep Nonnegative Matrix Factorization
Nonnegative matrix factorization is a powerful technique to realize dimension
reduction and pattern recognition through single-layer data representation
learning. Deep learning, however, with its carefully designed hierarchical
structure, is able to combine hidden features to form more representative
features for pattern recognition. In this paper, we propose sparse deep
nonnegative matrix factorization models to analyze complex data for more
accurate classification and better feature interpretation. Such models are
designed to learn localized features or generate more discriminative
representations for samples in distinct classes by imposing an L1-norm penalty
on the columns of certain factors. By extending the one-layer model into a
multi-layer one with sparsity, we provide a hierarchical way to analyze big
data and, thanks to nonnegativity, extract hidden features intuitively. We
adopt Nesterov's accelerated gradient algorithm to speed up the computation,
achieving a convergence rate of O(1/k^2) after k iterations. We also analyze
the computational complexity of our framework to demonstrate its efficiency.
To improve performance on linearly inseparable data, we also incorporate
popular nonlinear functions into this framework and explore their performance.
We apply our models to two
benchmarking image datasets, demonstrating our models can achieve competitive
or better classification performance and produce intuitive interpretations
compared with the typical NMF and competing multi-layer models. Comment: 13
pages, 8 figures.
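Nesterov's accelerated gradient scheme mentioned above can be sketched on a generic smooth convex objective; this is an illustrative sketch on least squares, not the paper's NMF-specific solver, and all names are ours:

```python
import numpy as np

def nesterov_agd(grad, x0, L, n_iter=100):
    """Nesterov's accelerated gradient for a convex objective with
    L-Lipschitz gradient; achieves an O(1/k^2) rate on the objective
    gap, versus O(1/k) for plain gradient descent."""
    x = y = np.asarray(x0, dtype=float)
    t = 1.0
    for _ in range(n_iter):
        x_next = y - grad(y) / L                        # gradient step at lookahead point
        t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2       # momentum schedule
        y = x_next + ((t - 1) / t_next) * (x_next - x)  # extrapolation
        x, t = x_next, t_next
    return x

# Minimize f(x) = 0.5 * ||A x - b||^2; grad f(x) = A^T (A x - b), L = ||A^T A||_2.
rng = np.random.default_rng(0)
A = rng.random((30, 10))
b = rng.random(30)
L = np.linalg.norm(A.T @ A, 2)
x_star = nesterov_agd(lambda x: A.T @ (A @ x - b), np.zeros(10), L, n_iter=500)
```

The same momentum-extrapolation pattern carries over to structured problems like NMF, where the gradient step is followed by a projection onto the non-negative orthant.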
Feature Concatenation Multi-view Subspace Clustering
Multi-view clustering aims to achieve more promising clustering results than
single-view clustering by exploring the multi-view information. Because the
statistical properties of different views are diverse, or even incompatible,
few approaches implement multi-view clustering directly on concatenated
features, even though feature concatenation is a natural way to combine
multiple views. To
this end, this paper proposes a novel multi-view subspace clustering approach
dubbed Feature Concatenation Multi-view Subspace Clustering (FCMSC).
Specifically, by exploring the consensus information, multi-view data are
first concatenated into a joint representation; then, an L2,1-norm term is
integrated into the objective function to deal with the sample-specific and
cluster-specific corruptions of multiple views for benefiting the clustering
performance. Furthermore, by introducing graph Laplacians of multiple views, a
graph regularized FCMSC is also introduced to explore both the consensus
information and complementary information for clustering. It is noteworthy that
the obtained coefficient matrix is not derived by simply applying Low-Rank
Representation (LRR) to the joint view representation. Finally, an effective
algorithm based on the Augmented Lagrangian Multiplier (ALM) method is designed
to optimize the objective functions. Comprehensive experiments on six
real world datasets illustrate the superiority of the proposed methods over
several state-of-the-art approaches for multi-view clustering.
Global and Local Structure Preserving Sparse Subspace Learning: An Iterative Approach to Unsupervised Feature Selection
As we aim at alleviating the curse of high-dimensionality, subspace learning
is becoming more popular. Existing approaches use information about either the
global or the local structure of the data; few studies focus on both
simultaneously, even though each contains important information.
In this paper, we propose a global and local structure preserving sparse
subspace learning (GLoSS) model for unsupervised feature selection. The model
can simultaneously realize feature selection and subspace learning. In
addition, we develop a greedy algorithm to establish a generic combinatorial
model, and an iterative strategy based on an accelerated block coordinate
descent is used to solve the GLoSS problem. We also provide a convergence
analysis of the whole iterate sequence of the proposed algorithm. Extensive
experiments are conducted on real-world datasets to show the superiority of the
proposed approach over several state-of-the-art unsupervised feature selection
approaches. Comment: 32 pages, 6 figures, and 60 references.
Joint Label Prediction based Semi-Supervised Adaptive Concept Factorization for Robust Data Representation
Constrained Concept Factorization (CCF) yields enhanced representation
ability over CF by incorporating label information as additional constraints,
but it cannot classify and group unlabeled data appropriately. Minimizing the
difference between the original data and its reconstruction directly can enable
CCF to model a small noisy perturbation, but is not robust to gross sparse
errors. Besides, CCF cannot explicitly preserve the manifold structures in the
new representation space, especially in an adaptive manner. In this
paper, we propose a joint label prediction based Robust Semi-Supervised
Adaptive Concept Factorization (RS2ACF) framework. To obtain robust
representation, RS2ACF relaxes the factorization to make it simultaneously
stable to small entrywise noise and robust to sparse errors. To enrich prior
knowledge and enhance the discrimination, RS2ACF explicitly uses the class
information of labeled data and, more importantly, propagates it to unlabeled
data by jointly
learning an explicit label indicator for unlabeled data. By the label
indicator, RS2ACF can ensure the unlabeled data of the same predicted label to
be mapped into the same class in feature space. Besides, RS2ACF incorporates
the joint neighborhood reconstruction error over the new representations and
predicted labels of both labeled and unlabeled data, so the manifold structures
can be preserved explicitly and adaptively in the representation space and
label space at the same time. Owing to the adaptive manner, the tricky process
of determining the neighborhood size or kernel width can be avoided. Extensive
results on public databases verify that our RS2ACF can deliver state-of-the-art
data representation, compared with other related methods. Comment: Accepted at
IEEE TKDE.
Maximum mutual information regularized classification
In this paper, a novel pattern classification approach is proposed by
regularizing the classifier learning to maximize mutual information between the
classification response and the true class label. We argue that, with the
learned classifier, the uncertainty about the true class label of a data sample
should be reduced as much as possible by knowing its classification response.
The reduced uncertainty is measured by the mutual information between the
classification response and the true class label. To this end, when learning a
linear classifier, we propose to maximize the mutual information between
classification responses and true class labels of training samples, besides
minimizing the classification error and reducing the classifier complexity.
An objective function is constructed by modeling mutual information with
entropy estimation, and it is optimized by a gradient descent method in an
iterative algorithm. Experiments on two real-world pattern classification
problems show the significant improvements achieved by maximum mutual
information regularization.
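The regularizer's central quantity, the mutual information between the classification response and the true label, can be estimated from a discretized joint distribution. A small self-contained illustration (not the paper's entropy estimator):

```python
import numpy as np

def mutual_information(joint):
    """Mutual information I(Y; Z) in nats from a joint probability
    table p(y, z); higher values mean the classification response
    carries more information about the true label."""
    joint = joint / joint.sum()                  # normalize to a distribution
    py = joint.sum(axis=1, keepdims=True)        # marginal over labels
    pz = joint.sum(axis=0, keepdims=True)        # marginal over responses
    nz = joint > 0                               # avoid log(0) on empty cells
    return float(np.sum(joint[nz] * np.log(joint[nz] / (py @ pz)[nz])))

# A perfectly informative response: I = H(label) = log 2 nats.
perfect = np.array([[0.5, 0.0],
                    [0.0, 0.5]])
# An uninformative (independent) response: I = 0.
independent = np.array([[0.25, 0.25],
                        [0.25, 0.25]])
mi_hi = mutual_information(perfect)
mi_lo = mutual_information(independent)
```

Maximizing this quantity during training pushes the classifier's responses toward the perfectly informative case, which is exactly the intuition the regularizer formalizes.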