A Nonlinear Orthogonal Non-Negative Matrix Factorization Approach to Subspace Clustering
A recent theoretical analysis shows the equivalence between non-negative
matrix factorization (NMF) and spectral-clustering-based approaches to
subspace clustering. As NMF and many of its variants are essentially linear, we
introduce a nonlinear NMF with explicit orthogonality and derive general
kernel-based orthogonal multiplicative update rules to solve the subspace
clustering problem. Within this nonlinear orthogonal NMF framework, we propose
two kernel-based non-negative subspace clustering algorithms, KNSC-Ncut and
KNSC-Rcut, and establish their connections with spectral normalized-cut and
ratio-cut clustering. We further extend the nonlinear
orthogonal NMF framework and introduce a graph regularization to obtain a
factorization that respects a local geometric structure of the data after the
nonlinear mapping. The proposed NMF-based approach to subspace clustering takes
into account the nonlinear nature of the manifold, as well as its intrinsic
local geometry, which considerably improves clustering performance compared
with several recently proposed state-of-the-art methods.
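Though the proposed update rules are kernel-based and orthogonality-constrained, they build on the classic multiplicative-update pattern of linear NMF. A minimal sketch of that pattern (the Lee-Seung updates, illustrative only, not the paper's kernel rules; all names are ours):

```python
import numpy as np

def nmf_multiplicative(X, k, n_iter=200, eps=1e-10):
    """Classic multiplicative updates for X ~ W @ H with X, W, H
    non-negative (linear NMF; the kernelized, orthogonality-constrained
    rules in the paper extend this idea)."""
    m, n = X.shape
    rng = np.random.default_rng(0)
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Each factor is multiplied by a non-negative ratio, so
        # non-negativity is preserved automatically at every step.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

X = np.abs(np.random.default_rng(1).random((20, 30)))
W, H = nmf_multiplicative(X, k=5)
err = np.linalg.norm(X - W @ H)
```

The multiplicative form is what makes these methods easy to constrain: orthogonality and kernel terms enter as extra factors in the update ratio.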
Discriminative Sparse Coding on Multi-Manifold for Data Representation and Classification
Sparse coding has been popularly used as an effective data representation
method in various applications, such as computer vision, medical imaging and
bioinformatics. However, conventional sparse coding algorithms and their
manifold-regularized variants (graph sparse coding and Laplacian sparse
coding) learn the codebook and codes in an unsupervised manner and neglect the
class information available in the training set. To address this problem, in
this paper we propose a novel discriminative sparse coding method based on
multi-manifold, by learning discriminative class-conditional codebooks and
sparse codes from both data feature space and class labels. First, the entire
training set is partitioned into multiple manifolds according to the class
labels. Then, we formulate the sparse coding as a manifold-manifold matching
problem and learn class-conditional codebooks and codes to maximize the
manifold margins of different classes. Lastly, we present a data point-manifold
matching error based strategy to classify the unlabeled data point.
Experimental results on somatic mutation identification and on breast tumor
classification in ultrasonic images demonstrate the efficacy of the proposed
data representation and classification approach. Comment: This paper has been
withdrawn by the author due to the terrible writing.
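The point-to-manifold matching-error strategy in the last step can be sketched as follows; the least-squares codes here are a simplified stand-in for the paper's sparse codes, and all names and data are illustrative:

```python
import numpy as np

def classify_by_residual(x, codebooks):
    """Assign x to the class whose codebook reconstructs it best.
    codebooks: dict label -> (d, k) matrix of class-conditional atoms.
    Least-squares codes stand in for sparse codes in this sketch."""
    best_label, best_err = None, np.inf
    for label, D in codebooks.items():
        code, *_ = np.linalg.lstsq(D, x, rcond=None)
        err = np.linalg.norm(x - D @ code)  # point-to-manifold matching error
        if err < best_err:
            best_label, best_err = label, err
    return best_label

rng = np.random.default_rng(0)
D0 = rng.random((10, 3))  # hypothetical class-0 codebook
D1 = rng.random((10, 3))  # hypothetical class-1 codebook
x = D0 @ np.array([0.5, 1.0, 0.2])  # a point lying on class 0's manifold
pred = classify_by_residual(x, {0: D0, 1: D1})
```

Because x lies exactly in the span of D0's atoms, its class-0 matching error is zero and the point is assigned to class 0.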
Robust Unsupervised Flexible Auto-weighted Local-Coordinate Concept Factorization for Image Clustering
We investigate the high-dimensional data clustering problem by proposing a
novel and unsupervised representation learning model called Robust Flexible
Auto-weighted Local-coordinate Concept Factorization (RFA-LCF). RFA-LCF
integrates the robust flexible CF, robust sparse local-coordinate coding and
the adaptive reconstruction weighting learning into a unified model. The
adaptive weighting is driven by including the joint manifold preserving
constraints on the recovered clean data, basis concepts and new representation.
Specifically, our RFA-LCF uses an L2,1-norm based flexible residue to encode the
mismatch between clean data and its reconstruction, and also applies the robust
adaptive sparse local-coordinate coding to represent the data using a few
nearby basis concepts, which can make the factorization more accurate and
robust to noise. The robust flexible factorization is also performed in the
recovered clean data space for enhancing representations. RFA-LCF also
considers preserving the local manifold structures of clean data space, basis
concept space and the new coordinate space jointly in an adaptive manner.
Extensive comparisons show that RFA-LCF can deliver enhanced clustering
results. Comment: Accepted at the 44th IEEE International Conference on
Acoustics, Speech, and Signal Processing (ICASSP 2019).
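The L2,1-norm that RFA-LCF uses as its flexible residue is simply the sum of row-wise Euclidean norms (some papers apply it column-wise; the convention here is rows). A small self-contained illustration:

```python
import numpy as np

def l21_norm(E):
    """||E||_{2,1}: sum of the Euclidean norms of the rows of E.
    Unlike the squared Frobenius norm, it grows linearly (not
    quadratically) with the magnitude of a corrupted row, so a few
    gross sample-wise errors dominate the objective far less --
    this is what makes the residue robust."""
    return float(np.sum(np.linalg.norm(E, axis=1)))

E = np.array([[3.0, 4.0],   # row norm 5
              [0.0, 0.0],   # row norm 0
              [0.0, 5.0]])  # row norm 5
val = l21_norm(E)  # 5 + 0 + 5 = 10
```

Minimizing this quantity over a residual matrix encourages entire rows (i.e., whole samples) of the residual to vanish, which matches the sample-wise corruption model.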
A Survey on Multi-View Clustering
With advances in information acquisition technologies, multi-view data have
become ubiquitous. Multi-view learning has thus become increasingly popular in
the machine learning and data mining fields. Multi-view unsupervised and
semi-supervised learning methods, such as co-training and co-regularization,
have gained considerable attention. Although multi-view clustering (MVC)
methods have developed rapidly in recent years, no survey has yet summarized
and analyzed the current progress. Therefore, this paper reviews the common
strategies for combining multiple views of data and, based on this summary, we
propose a novel taxonomy of the MVC approaches. We further discuss the
relationships between MVC and multi-view representation, ensemble clustering,
multi-task clustering, multi-view supervised and semi-supervised learning.
Several representative real-world applications are elaborated. To promote
future development of MVC, we envision several open problems that may require
further investigation and thorough examination. Comment: 17 pages, 4 figures.
Incomplete Multi-view Clustering via Graph Regularized Matrix Factorization
Clustering with incomplete views is a challenge in multi-view clustering. In
this paper, we provide a novel and simple method to address this issue.
Specifically, the proposed method simultaneously exploits the local information
of each view and the complementary information among views to learn the common
latent representation for all samples, which can greatly improve the
compactness and discriminability of the obtained representation. Compared with
the conventional graph embedding methods, the proposed method does not
introduce any extra regularization term and corresponding penalty parameter to
preserve the local structure of data, and thus does not increase the burden of
extra parameter selection. By imposing the orthogonal constraint on the basis
matrix of each view, the proposed method is able to handle out-of-sample data.
Moreover, the proposed method can be viewed as a unified framework for
multi-view learning since it can handle both incomplete and complete multi-view
clustering and classification tasks. Extensive experiments conducted on several
multi-view datasets prove that the proposed method can significantly improve
the clustering performance. Comment: ECCV 2018 International Workshop on
Compact and Efficient Feature Representation and Learning in Computer Vision
(CEFRL).
Sparse Deep Nonnegative Matrix Factorization
Nonnegative matrix factorization is a powerful technique to realize dimension
reduction and pattern recognition through single-layer data representation
learning. Deep learning, however, with its carefully designed hierarchical
structure, is able to combine hidden features to form more representative
features for pattern recognition. In this paper, we propose sparse deep
nonnegative matrix factorization models to analyze complex data for more
accurate classification and better feature interpretation. Such models are
designed to learn localized features or generate more discriminative
representations for samples in distinct classes by imposing an L1-norm penalty
on the columns of certain factors. By extending the one-layer model into a
multi-layer one with sparsity, we provide a hierarchical way to analyze big
data and, thanks to nonnegativity, extract hidden features intuitively. We
adopt Nesterov's accelerated gradient algorithm to speed up the computation,
achieving a convergence rate of O(1/k^2) after k iterations. We also analyze
the computational complexity of our framework to demonstrate its efficiency.
To improve performance on linearly inseparable data, we also incorporate
popular nonlinear functions into this framework and explore their performance.
We apply our models to two
benchmarking image datasets, demonstrating our models can achieve competitive
or better classification performance and produce intuitive interpretations
compared with the typical NMF and competing multi-layer models. Comment: 13
pages, 8 figures.
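Nesterov's accelerated gradient scheme mentioned above can be sketched on a generic smooth convex objective; this is an illustrative sketch on least squares, not the paper's NMF-specific solver, and all names are ours:

```python
import numpy as np

def nesterov_agd(grad, x0, L, n_iter=100):
    """Nesterov's accelerated gradient for a convex objective with
    L-Lipschitz gradient; achieves an O(1/k^2) rate on the objective
    gap, versus O(1/k) for plain gradient descent."""
    x = y = np.asarray(x0, dtype=float)
    t = 1.0
    for _ in range(n_iter):
        x_next = y - grad(y) / L                        # gradient step at lookahead point
        t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2       # momentum schedule
        y = x_next + ((t - 1) / t_next) * (x_next - x)  # extrapolation
        x, t = x_next, t_next
    return x

# Minimize f(x) = 0.5 * ||A x - b||^2; grad f(x) = A^T (A x - b), L = ||A^T A||_2.
rng = np.random.default_rng(0)
A = rng.random((30, 10))
b = rng.random(30)
L = np.linalg.norm(A.T @ A, 2)
x_star = nesterov_agd(lambda x: A.T @ (A @ x - b), np.zeros(10), L, n_iter=500)
```

The same momentum-extrapolation pattern carries over to structured problems like NMF, where the gradient step is followed by a projection onto the non-negative orthant.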
Feature Concatenation Multi-view Subspace Clustering
Multi-view clustering aims to achieve more promising clustering results than
single-view clustering by exploring the multi-view information. Because the
statistical properties of different views are diverse, or even incompatible,
few approaches implement multi-view clustering directly on concatenated
features, even though feature concatenation is a natural way to combine
multiple views. To
this end, this paper proposes a novel multi-view subspace clustering approach
dubbed Feature Concatenation Multi-view Subspace Clustering (FCMSC).
Specifically, by exploring the consensus information, multi-view data are
first concatenated into a joint representation; then, an L2,1-norm term is
integrated into the objective function to deal with the sample-specific and
cluster-specific corruptions of multiple views for benefiting the clustering
performance. Furthermore, by introducing graph Laplacians of multiple views, a
graph regularized FCMSC is also introduced to explore both the consensus
information and complementary information for clustering. It is noteworthy that
the obtained coefficient matrix is not derived by simply applying Low-Rank
Representation (LRR) to the joint view representation. Finally, an effective
algorithm based on the Augmented Lagrangian Multiplier (ALM) method is designed
to optimize the objective functions. Comprehensive experiments on six
real world datasets illustrate the superiority of the proposed methods over
several state-of-the-art approaches for multi-view clustering.
Global and Local Structure Preserving Sparse Subspace Learning: An Iterative Approach to Unsupervised Feature Selection
As we aim at alleviating the curse of high-dimensionality, subspace learning
is becoming more popular. Existing approaches use information about either the
global or the local structure of the data; few studies focus on both
simultaneously, even though each contains important information.
In this paper, we propose a global and local structure preserving sparse
subspace learning (GLoSS) model for unsupervised feature selection. The model
can simultaneously realize feature selection and subspace learning. In
addition, we develop a greedy algorithm to establish a generic combinatorial
model, and an iterative strategy based on an accelerated block coordinate
descent is used to solve the GLoSS problem. We also provide a convergence
analysis of the whole iterate sequence of the proposed algorithm. Extensive
experiments are conducted on real-world datasets to show the superiority of the
proposed approach over several state-of-the-art unsupervised feature selection
approaches. Comment: 32 pages, 6 figures, and 60 references.
Joint Label Prediction based Semi-Supervised Adaptive Concept Factorization for Robust Data Representation
Constrained Concept Factorization (CCF) yields enhanced representation
ability over CF by incorporating label information as additional constraints,
but it cannot classify and group unlabeled data appropriately. Minimizing the
difference between the original data and its reconstruction directly can enable
CCF to model a small noisy perturbation, but is not robust to gross sparse
errors. Besides, CCF cannot explicitly preserve the manifold structures in the
new representation space, especially in an adaptive manner. In this
paper, we propose a joint label prediction based Robust Semi-Supervised
Adaptive Concept Factorization (RS2ACF) framework. To obtain robust
representation, RS2ACF relaxes the factorization to make it simultaneously
stable to small entrywise noise and robust to sparse errors. To enrich prior
knowledge and enhance the discrimination, RS2ACF explicitly uses the class
information of labeled data and, more importantly, propagates it to unlabeled
data by jointly
learning an explicit label indicator for unlabeled data. By the label
indicator, RS2ACF can ensure the unlabeled data of the same predicted label to
be mapped into the same class in feature space. Besides, RS2ACF incorporates
the joint neighborhood reconstruction error over the new representations and
predicted labels of both labeled and unlabeled data, so the manifold structures
can be preserved explicitly and adaptively in the representation space and
label space at the same time. Owing to the adaptive manner, the tricky process
of determining the neighborhood size or kernel width can be avoided. Extensive
results on public databases verify that our RS2ACF can deliver state-of-the-art
data representation, compared with other related methods. Comment: Accepted at
IEEE TKDE.
Maximum mutual information regularized classification
In this paper, a novel pattern classification approach is proposed by
regularizing the classifier learning to maximize mutual information between the
classification response and the true class label. We argue that, with the
learned classifier, the uncertainty about the true class label of a data sample
should be reduced as much as possible by knowing its classification response.
The reduced uncertainty is measured by the mutual information between the
classification response and the true class label. To this end, when learning a
linear classifier, we propose to maximize the mutual information between
classification responses and true class labels of training samples, besides
minimizing the classification error and reducing the classifier complexity.
An objective function is constructed by modeling mutual information with
entropy estimation, and it is optimized by a gradient descent method in an
iterative algorithm. Experiments on two real-world pattern classification
problems show the significant improvements achieved by maximum mutual
information regularization.
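The regularizer's central quantity, the mutual information between the classification response and the true label, can be estimated from a discretized joint distribution. A small self-contained illustration (not the paper's entropy estimator):

```python
import numpy as np

def mutual_information(joint):
    """Mutual information I(Y; Z) in nats from a joint probability
    table p(y, z); higher values mean the classification response
    carries more information about the true label."""
    joint = joint / joint.sum()                  # normalize to a distribution
    py = joint.sum(axis=1, keepdims=True)        # marginal over labels
    pz = joint.sum(axis=0, keepdims=True)        # marginal over responses
    nz = joint > 0                               # avoid log(0) on empty cells
    return float(np.sum(joint[nz] * np.log(joint[nz] / (py @ pz)[nz])))

# A perfectly informative response: I = H(label) = log 2 nats.
perfect = np.array([[0.5, 0.0],
                    [0.0, 0.5]])
# An uninformative (independent) response: I = 0.
independent = np.array([[0.25, 0.25],
                        [0.25, 0.25]])
mi_hi = mutual_information(perfect)
mi_lo = mutual_information(independent)
```

Maximizing this quantity during training pushes the classifier's responses toward the perfectly informative case, which is exactly the intuition the regularizer formalizes.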