10 research outputs found

    Subspace clustering via good neighbors

    Finding the informative clusters of a high-dimensional dataset is at the core of numerous applications in computer vision, where spectral-based subspace clustering is arguably the most widely studied class of methods due to its empirical performance and provable guarantees under various assumptions. It is well known that the sparsity and connectivity of the affinity graph play important roles in effective subspace clustering. However, it is difficult to optimize both factors simultaneously because of their conflicting nature, and most existing methods are designed to handle only one of them. In this paper, we propose an algorithm that optimizes both sparsity and connectivity by finding good neighbors, which induce key connections among samples within a subspace. First, an initial coefficient matrix is generated from the input dataset. For each sample, we find its good neighbors, which not only have large coefficients but are also strongly connected to each other. We reassign the coefficients of the good neighbors and eliminate the other entries to generate a new coefficient matrix, which can then be used by spectral clustering methods. Experiments on five benchmark datasets show that the proposed algorithm performs favorably against state-of-the-art methods in terms of accuracy, with a negligible increase in running time.
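
    The abstract describes a concrete pipeline (coefficient matrix, good-neighbor selection, spectral clustering). As a rough, hypothetical sketch of that idea, and not the authors' implementation, the code below builds an SSC-style self-expressive coefficient matrix, keeps for each sample only its largest coefficients whose connections are reciprocal, and hands the result to spectral clustering; the neighbor criterion here is a deliberate simplification.

        # Illustrative sketch of the good-neighbor idea (simplified, hypothetical).
        import numpy as np
        from sklearn.linear_model import Lasso
        from sklearn.cluster import SpectralClustering

        def self_expressive_coefficients(X, alpha=0.01):
            """Represent each sample as a sparse combination of the others (SSC-style)."""
            n = X.shape[0]
            C = np.zeros((n, n))
            for i in range(n):
                others = np.delete(np.arange(n), i)
                C[i, others] = Lasso(alpha=alpha, max_iter=5000).fit(X[others].T, X[i]).coef_
            return np.abs(C)

        def keep_good_neighbors(C, k=5):
            """Keep each sample's k largest coefficients, then retain only
            reciprocal (mutually connected) entries as 'good neighbors'."""
            topk = np.zeros_like(C)
            for i in range(C.shape[0]):
                idx = np.argsort(C[i])[-k:]
                topk[i, idx] = C[i, idx]
            return np.minimum(topk, topk.T)   # keep a connection only if it is mutual

        X = np.random.randn(60, 20)           # placeholder data: 60 samples, 20 dims
        W = keep_good_neighbors(self_expressive_coefficients(X))
        labels = SpectralClustering(n_clusters=3, affinity='precomputed').fit_predict(W + 1e-12)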

    Convex Subspace Clustering by Adaptive Block Diagonal Representation

    Subspace clustering is an extensively studied class of clustering methods, and the spectral-type approaches form an important subclass whose key first step is to learn a coefficient matrix with block diagonal structure. To realize this step, sparse subspace clustering (SSC), low rank representation (LRR) and block diagonal representation (BDR) were successively proposed and have become the state of the art (SOTA). Among them, the former two minimize convex objectives by imposing sparsity and low-rankness on the coefficient matrix respectively, but the desired block diagonality cannot necessarily be guaranteed in practice, while the latter designs a block-diagonal-matrix-induced regularizer but sacrifices convexity. To resolve this dilemma, inspired by convex biclustering, we propose a simple yet efficient spectral-type subspace clustering method named Adaptive Block Diagonal Representation (ABDR), which pursues the desired block diagonality, as BDR does, by coercively fusing the columns/rows of the coefficient matrix via a specially designed convex regularizer. Consequently, ABDR naturally enjoys the merits of both and can adaptively form more desirable block diagonality than the SOTA methods, without needing to fix the number of blocks in advance as BDR does. Finally, experimental results on synthetic and real benchmarks demonstrate the superiority of ABDR.
    Comment: 13 pages, 11 figures, 8 tables
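
    In the spirit of convex biclustering, the column/row fusion described above can be written as a convex program of the following general shape (an illustrative guess at the form such an objective takes, not ABDR's actual formulation):

        \min_{C}\; \tfrac{1}{2}\,\|X - XC\|_F^2
                 + \lambda_1 \sum_{i<j} w_{ij}\,\|C_{\cdot i} - C_{\cdot j}\|_2
                 + \lambda_2 \sum_{i<j} \tilde{w}_{ij}\,\|C_{i\cdot} - C_{j\cdot}\|_2

    Every term is convex, and the weighted pairwise column/row difference penalties coerce whole columns/rows of the coefficient matrix to fuse, so blocks can emerge adaptively instead of being fixed in number beforehand.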

    Efficient Multi-View Graph Clustering with Local and Global Structure Preservation

    Anchor-based multi-view graph clustering (AMVGC) has received considerable attention owing to its high efficiency and its ability to capture complementary structural information across multiple views. Intuitively, a high-quality anchor graph plays an essential role in the success of AMVGC. However, existing AMVGC methods consider only single-structure information, i.e., local or global structure, which provides insufficient information for the learning task. Specifically, an over-scattered global structure leads to learned anchors that fail to depict the cluster partition well, while a local structure built on an improper similarity measure results in potentially inaccurate anchor assignment, ultimately leading to sub-optimal clustering performance. To tackle this issue, we propose a novel anchor-based multi-view graph clustering framework termed Efficient Multi-View Graph Clustering with Local and Global Structure Preservation (EMVGC-LG). Specifically, a unified framework with a theoretical guarantee is designed to capture both local and global information. Besides, EMVGC-LG jointly optimizes anchor construction and graph learning to enhance clustering quality. In addition, EMVGC-LG inherits the linear complexity of existing AMVGC methods with respect to the sample number, so it is economical in time and scales well with the data size. Extensive experiments demonstrate the effectiveness and efficiency of the proposed method.
    Comment: arXiv admin note: text overlap with arXiv:2308.1654
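
    As a minimal sketch of the anchor-graph machinery that AMVGC methods share (a generic baseline, not EMVGC-LG's joint anchor/graph optimization; all parameters are placeholders), the code below builds per-view sample-to-anchor similarity graphs, concatenates them, and clusters through the bipartite graph's spectral embedding in time linear in the sample count.

        # Generic anchor-based multi-view clustering sketch (hypothetical baseline).
        import numpy as np
        from sklearn.cluster import KMeans

        def anchor_graph(X, anchors, sigma=1.0):
            """Row-normalized Gaussian similarities from samples to anchors."""
            d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
            Z = np.exp(-d2 / (2 * sigma ** 2))
            return Z / Z.sum(axis=1, keepdims=True)

        def multi_view_anchor_clustering(views, n_anchors=20, n_clusters=3):
            Zs = []
            for X in views:                    # one anchor graph per view
                anchors = KMeans(n_clusters=n_anchors, n_init=4).fit(X).cluster_centers_
                Zs.append(anchor_graph(X, anchors))
            Z = np.hstack(Zs)                  # n x (n_anchors * n_views)
            # spectral embedding of the bipartite sample-anchor graph, O(n * m^2)
            Zn = Z / np.sqrt(Z.sum(axis=0) + 1e-12)
            U, _, _ = np.linalg.svd(Zn, full_matrices=False)
            return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(U[:, :n_clusters])

        views = [np.random.randn(200, 10), np.random.randn(200, 15)]  # placeholder views
        labels = multi_view_anchor_clustering(views)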

    K-Deep Simplex: Deep Manifold Learning via Local Dictionaries

    We propose K-Deep Simplex (KDS), a unified optimization framework for nonlinear dimensionality reduction that combines the strengths of manifold learning and sparse dictionary learning. Our approach learns local dictionaries that represent a data point with reconstruction coefficients supported on the probability simplex. The dictionaries are learned using algorithm unrolling, an increasingly popular technique for structured deep learning. KDS enjoys tremendous computational advantages over related approaches and is both interpretable and flexible. In particular, KDS is quasilinear in the number of data points, with scaling that depends on intrinsic geometric properties of the data. We apply KDS to the unsupervised clustering problem and prove theoretical performance guarantees. Experiments show that the algorithm is highly efficient and performs competitively on synthetic and real data sets.
    Comment: 14 pages, 6 figures
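
    The distinguishing constraint in KDS is that the reconstruction coefficients live on the probability simplex. The sketch below shows the standard sort-based Euclidean simplex projection such a model needs, wrapped in a plain projected-gradient coder (a generic building block under that constraint, not KDS's unrolled network; names and step sizes are placeholders).

        import numpy as np

        def project_simplex(v):
            """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
            u = np.sort(v)[::-1]
            css = np.cumsum(u) - 1.0
            rho = np.nonzero(u - css / np.arange(1, len(v) + 1) > 0)[0][-1]
            return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

        def simplex_code(D, x, steps=200, lr=0.01):
            """Code x over dictionary D with simplex-constrained coefficients
            via projected gradient descent (illustrative only)."""
            w = np.full(D.shape[1], 1.0 / D.shape[1])
            for _ in range(steps):
                w = project_simplex(w - lr * (D.T @ (D @ w - x)))
            return w

        D = np.random.randn(10, 8)   # placeholder local dictionary (atoms as columns)
        w = simplex_code(D, np.random.randn(10))
        assert w.min() >= 0 and abs(w.sum() - 1.0) < 1e-8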

    Revisiting data augmentation for subspace clustering

    Subspace clustering is the classical problem of clustering a collection of data samples that lie approximately around several low-dimensional subspaces. The current state-of-the-art approaches to this problem are based on the self-expressive model, which represents each sample as a linear combination of other samples. However, these approaches require sufficiently well-spread samples for accurate representation, which may not be available in many applications. In this paper, we shed light on this commonly neglected issue and argue that the data distribution within each subspace plays a critical role in the success of self-expressive models. Our proposed solution is motivated by the central role of data augmentation in the generalization power of deep neural networks. We propose two subspace clustering frameworks, for the unsupervised and semi-supervised settings, that use augmented samples as an enlarged dictionary to improve the quality of the self-expressive representation. For the semi-supervised problem, we present an automatic augmentation strategy that uses a few labeled samples and relies on the fact that the data samples lie in a union of multiple linear subspaces. Experimental results confirm the effectiveness of data augmentation, as it significantly improves the performance of general self-expressive models.
    Comment: 38 pages (including 10 pages of supplementary material)
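
    A minimal sketch of the enlarged-dictionary idea (generic self-expressive coding against original plus augmented samples; the noise augmentation and all parameters are placeholders, not the paper's automatic strategy):

        import numpy as np
        from sklearn.linear_model import Lasso

        def self_expressive_with_augmentation(X, X_aug, alpha=0.01):
            """Code each original sample against a dictionary that stacks
            the original and augmented samples (self excluded)."""
            D = np.vstack([X, X_aug])          # enlarged dictionary, rows are atoms
            n, m = X.shape[0], D.shape[0]
            C = np.zeros((n, m))
            for i in range(n):
                keep = np.ones(m, dtype=bool)
                keep[i] = False                # exclude the trivial self-representation
                C[i, keep] = Lasso(alpha=alpha, max_iter=5000).fit(D[keep].T, X[i]).coef_
            return C

        X = np.random.randn(40, 12)                     # placeholder data
        X_aug = X + 0.05 * np.random.randn(*X.shape)    # toy additive-noise augmentation
        C = self_expressive_with_augmentation(X, X_aug)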

    A three-step classification framework to handle complex data distribution for radar UAV detection

    Unmanned aerial vehicles (UAVs) are used in a wide range of applications and have become an increasingly important radar target. To better model radar data and to tackle the curse of dimensionality, a three-step classification framework is proposed for UAV detection. First, we propose to use greedy subspace clustering to handle potential outliers and the complex sample distribution of radar data. The parameters of the resulting multi-Gaussian model, especially the covariance matrices, cannot be reliably estimated due to insufficient training samples and the high dimensionality. Thus, in the second step, a multi-Gaussian subspace reliability analysis is proposed to handle the unreliable feature dimensions of these covariance matrices. To address the challenges of classifying samples with the complex multi-Gaussian model and of fusing the distances of a sample to different clusters at different dimensionalities, a subspace-fusion scheme is proposed in the third step. The proposed approach is validated on a large benchmark dataset and significantly outperforms state-of-the-art approaches.
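
    As a loose illustration of the second and third steps (per-cluster Gaussians with shrinkage-regularized covariances, plus a simple distance-fusion rule; simplified stand-ins for the paper's reliability analysis and subspace fusion, with all names hypothetical):

        import numpy as np

        def fit_cluster_gaussians(X, labels, shrink=0.1):
            """One Gaussian per cluster; shrink each covariance toward the
            identity to cope with few samples in high dimension."""
            models = []
            for k in np.unique(labels):
                Xk = X[labels == k]
                cov = (1 - shrink) * np.cov(Xk, rowvar=False) + shrink * np.eye(X.shape[1])
                models.append((Xk.mean(axis=0), np.linalg.inv(cov)))
            return models

        def classify(x, models):
            """Assign x to the cluster at smallest Mahalanobis distance."""
            return int(np.argmin([(x - mu) @ P @ (x - mu) for mu, P in models]))

        X = np.random.randn(120, 8)                    # placeholder radar features
        labels = np.random.randint(0, 3, size=120)     # placeholder cluster labels
        models = fit_cluster_gaussians(X, labels)
        pred = classify(X[0], models)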