160 research outputs found

    Attribute Graph Clustering via Learnable Augmentation

    Full text link
    Contrastive deep graph clustering (CDGC) utilizes contrastive learning to group nodes into different clusters. Better augmentation techniques benefit the quality of the contrastive samples, thus being one of key factors to improve performance. However, the augmentation samples in existing methods are always predefined by human experiences, and agnostic from the downstream task clustering, thus leading to high human resource costs and poor performance. To this end, we propose an Attribute Graph Clustering method via Learnable Augmentation (\textbf{AGCLA}), which introduces learnable augmentors for high-quality and suitable augmented samples for CDGC. Specifically, we design two learnable augmentors for attribute and structure information, respectively. Besides, two refinement matrices, including the high-confidence pseudo-label matrix and the cross-view sample similarity matrix, are generated to improve the reliability of the learned affinity matrix. During the training procedure, we notice that there exist differences between the optimization goals for training learnable augmentors and contrastive learning networks. In other words, we should both guarantee the consistency of the embeddings as well as the diversity of the augmented samples. Thus, an adversarial learning mechanism is designed in our method. Moreover, a two-stage training strategy is leveraged for the high-confidence refinement matrices. Extensive experimental results demonstrate the effectiveness of AGCLA on six benchmark datasets

    Visual Tactile Fusion Object Clustering

    Full text link
    Object clustering, aiming at grouping similar objects into one cluster with an unsupervised strategy, has been extensivelystudied among various data-driven applications. However, most existing state-of-the-art object clustering methods (e.g., single-view or multi-view clustering methods) only explore visual information, while ignoring one of most important sensing modalities, i.e., tactile information which can help capture different object properties and further boost the performance of object clustering task. To effectively benefit both visual and tactile modalities for object clustering, in this paper, we propose a deep Auto-Encoder-like Non-negative Matrix Factorization framework for visual-tactile fusion clustering. Specifically, deep matrix factorization constrained by an under-complete Auto-Encoder-like architecture is employed to jointly learn hierarchical expression of visual-tactile fusion data, and preserve the local structure of data generating distribution of visual and tactile modalities. Meanwhile, a graph regularizer is introduced to capture the intrinsic relations of data samples within each modality. Furthermore, we propose a modality-level consensus regularizer to effectively align thevisual and tactile data in a common subspace in which the gap between visual and tactile data is mitigated. For the model optimization, we present an efficient alternating minimization strategy to solve our proposed model. Finally, we conduct extensive experiments on public datasets to verify the effectiveness of our framework.Comment: 8 pages, 5 figure

    Latent Distribution Preserving Deep Subspace Clustering

    Get PDF

    Unsupervised Learning from Shollow to Deep

    Get PDF
    Machine learning plays a pivotal role in most state-of-the-art systems in many application research domains. With the rising of deep learning, massive labeled data become the solution of feature learning, which enables the model to learn automatically. Unfortunately, the trained deep learning model is hard to adapt to other datasets without fine-tuning, and the applicability of machine learning methods is limited by the amount of available labeled data. Therefore, the aim of this thesis is to alleviate the limitations of supervised learning by exploring algorithms to learn good internal representations, and invariant feature hierarchies from unlabelled data. Firstly, we extend the traditional dictionary learning and sparse coding algorithms onto hierarchical image representations in a principled way. To achieve dictionary atoms capture additional information from extended receptive fields and attain improved descriptive capacity, we present a two-pass multi-resolution cascade framework for dictionary learning and sparse coding. This cascade method allows collaborative reconstructions at different resolutions using only the same dimensional dictionary atoms. The jointly learned dictionary comprises atoms that adapt to the information available at the coarsest layer, where the support of atoms reaches a maximum range, and the residual images, where the supplementary details refine progressively a reconstruction objective. Our method generates flexible and accurate representations using only a small number of coefficients, and is efficient in computation. In the following work, we propose to incorporate the traditional self-expressiveness property into deep learning to explore better representation for subspace clustering. This architecture is built upon deep auto-encoders, which non-linearly map the input data into a latent space. Our key idea is to introduce a novel self-expressive layer between the encoder and the decoder to mimic the ``self-expressiveness'' property that has proven effective in traditional subspace clustering. Being differentiable, our new self-expressive layer provides a simple but effective way to learn pairwise affinities between all data points through a standard back-propagation procedure. Being nonlinear, our neural-network based method is able to cluster data points having complex (often nonlinear) structures. However, Subspace clustering algorithms are notorious for their scalability issues because building and processing large affinity matrices are demanding. We propose two methods to tackle this problem. One method is based on kk-Subspace Clustering, where we introduce a method that simultaneously learns an embedding space along subspaces within it to minimize a notion of reconstruction error, thus addressing the problem of subspace clustering in an end-to-end learning paradigm. This in turn frees us from the need of having an affinity matrix to perform clustering. The other way starts from using a feed forward network to replace the spectral clustering and learn the affinities of each data from "self-expressive" layer. We introduce the Neural Collaborative Subspace Clustering, where it benefits from a classifier which determines whether a pair of points lies on the same subspace under supervision of "self-expressive" layer. Essential to our model is the construction of two affinity matrices, one from the classifier and the other from a notion of subspace self-expressiveness, to supervise training in a collaborative scheme. In summary, we make constributions on how to perform the unsupervised learning in several tasks in this thesis. It starts from traditional sparse coding and dictionary learning perspective in low-level vision. Then, we exploit how to incorporate unsupervised learning in convolutional neural networks without label information and make subspace clustering to large scale dataset. Furthermore, we also extend the clustering on dense prediction task (saliency detection)
    • …
    corecore