160 research outputs found
Attribute Graph Clustering via Learnable Augmentation
Contrastive deep graph clustering (CDGC) utilizes contrastive learning to
group nodes into different clusters. Better augmentation techniques benefit the
quality of the contrastive samples, thus being one of the key factors to improve
performance. However, the augmentation samples in existing methods are always
predefined by human experience and agnostic to the downstream clustering
task, thus leading to high human resource costs and poor performance. To
this end, we propose an Attribute Graph Clustering method via Learnable
Augmentation (AGCLA), which introduces learnable augmentors to generate
high-quality and suitable augmented samples for CDGC. Specifically, we design
two learnable augmentors for attribute and structure information, respectively.
Besides, two refinement matrices, including the high-confidence pseudo-label
matrix and the cross-view sample similarity matrix, are generated to improve
the reliability of the learned affinity matrix. During the training procedure,
we notice that there exist differences between the optimization goals for
training learnable augmentors and contrastive learning networks. In other
words, we should guarantee both the consistency of the embeddings and
the diversity of the augmented samples. Thus, an adversarial learning mechanism
is designed in our method. Moreover, a two-stage training strategy is leveraged
for the high-confidence refinement matrices. Extensive experimental results
demonstrate the effectiveness of AGCLA on six benchmark datasets.
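
To make the cross-view idea in the abstract above concrete, here is a minimal sketch of a cross-view similarity matrix and a standard InfoNCE contrastive loss over two views of node embeddings. This is not the AGCLA implementation: the function names, the temperature value, and the choice of InfoNCE are assumptions made for illustration. The learnable augmentors and refinement matrices are not modeled.

```python
import numpy as np

def cross_view_similarity(z1, z2):
    """Cosine similarity between every node in view 1 and every node in view 2."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    return z1 @ z2.T

def info_nce(z1, z2, temperature=0.5):
    """InfoNCE loss (assumed here): node i in view 1 should match node i in view 2."""
    logits = cross_view_similarity(z1, z2) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positives sit on the diagonal of the cross-view matrix.
    return float(-np.mean(np.diag(log_prob)))
```

The loss is lowest when the two views agree node by node, which is the consistency side of the trade-off the abstract describes. An augmentor trained adversarially would push in the opposite direction to keep the views diverse.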
Visual Tactile Fusion Object Clustering
Object clustering, aiming at grouping similar objects into one cluster with
an unsupervised strategy, has been extensively studied among various data-driven
applications. However, most existing state-of-the-art object clustering methods
(e.g., single-view or multi-view clustering methods) only explore visual
information, while ignoring one of the most important sensing modalities, i.e.,
tactile information which can help capture different object properties and
further boost the performance of the object clustering task. To effectively exploit
both visual and tactile modalities for object clustering, in this paper, we
propose a deep Auto-Encoder-like Non-negative Matrix Factorization framework
for visual-tactile fusion clustering. Specifically, deep matrix factorization
constrained by an under-complete Auto-Encoder-like architecture is employed to
jointly learn a hierarchical representation of visual-tactile fusion data, and
preserve the local structure of the data-generating distribution of visual and
tactile modalities. Meanwhile, a graph regularizer is introduced to capture the
intrinsic relations of data samples within each modality. Furthermore, we
propose a modality-level consensus regularizer to effectively align the visual
and tactile data in a common subspace in which the gap between visual and
tactile data is mitigated. For the model optimization, we present an efficient
alternating minimization strategy to solve our proposed model. Finally, we
conduct extensive experiments on public datasets to verify the effectiveness of
our framework. Comment: 8 pages, 5 figures
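
As background for the factorization model described above, the following is a minimal sketch of plain non-negative matrix factorization with the classical multiplicative updates. It is a single-layer, single-modality building block only. The auto-encoder-like constraint, the graph regularizer, and the consensus regularizer from the abstract are not included, and the function name and defaults are assumptions.

```python
import numpy as np

def nmf(X, rank, n_iter=200, seed=0, eps=1e-10):
    """Factor a non-negative matrix X (features x samples) as X ~ W @ H.

    Uses multiplicative updates for the Frobenius reconstruction error;
    W and H stay non-negative because every update multiplies by a
    non-negative ratio.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = X.shape
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # Alternate: update H with W fixed, then W with H fixed.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

The alternating structure mirrors the alternating minimization strategy mentioned in the abstract, although the updates of the actual model would differ.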
Unsupervised Learning from Shallow to Deep
Machine learning plays a pivotal role in most state-of-the-art systems across many application domains. With the rise of deep learning, massive labeled data has become the standard route to feature learning, enabling models to learn features automatically. Unfortunately, a trained deep learning model is hard to adapt to other datasets without fine-tuning, and the applicability of machine learning methods is limited by the amount of available labeled data. Therefore, the aim of this thesis is to alleviate the limitations of supervised learning by exploring algorithms that learn good internal representations and invariant feature hierarchies from unlabelled data.
Firstly, we extend traditional dictionary learning and sparse coding algorithms to hierarchical image representations in a principled way. To let dictionary atoms capture additional information from extended receptive fields and attain improved descriptive capacity, we present a two-pass multi-resolution cascade framework for dictionary learning and sparse coding. This cascade method allows collaborative reconstructions at different resolutions using dictionary atoms of the same dimension. The jointly learned dictionary comprises atoms that adapt to the information available at the coarsest layer, where the support of atoms reaches a maximum range, and to the residual images, where the supplementary details progressively refine a reconstruction objective. Our method generates flexible and accurate representations using only a small number of coefficients, and is computationally efficient.
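
For readers unfamiliar with sparse coding, here is a minimal sketch of the single-resolution problem that such a cascade builds on: given a fixed dictionary, find a sparse code with ISTA (iterative soft-thresholding). The choice of ISTA, the function names, and the parameters are assumptions for illustration. The thesis's multi-resolution cascade and dictionary update are not reproduced.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(D, x, lam, n_iter=100):
    """Approximately solve min_a 0.5 * ||x - D a||^2 + lam * ||a||_1."""
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the smooth term
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)  # gradient of the reconstruction term
        a = soft_threshold(a - grad / L, lam / L)
    return a
```

With an identity dictionary the solution reduces to soft-thresholding the signal itself, which gives a quick sanity check: small entries are zeroed and only a few coefficients survive.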
In the following work, we propose to incorporate the traditional self-expressiveness property into deep learning to learn better representations for subspace clustering. The architecture is built upon deep auto-encoders, which non-linearly map the input data into a latent space. Our key idea is to introduce a novel self-expressive layer between the encoder and the decoder to mimic the "self-expressiveness" property that has proven effective in traditional subspace clustering. Being differentiable, our new self-expressive layer provides a simple but effective way to learn pairwise affinities between all data points through a standard back-propagation procedure. Being nonlinear, our neural-network-based method is able to cluster data points having complex (often nonlinear) structures.
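
The self-expressiveness property states that each point can be written as a combination of other points from the same subspace. A minimal linear sketch follows, assuming a ridge-regularized objective with a closed-form solution on raw features. The thesis instead learns the coefficients as a layer in the latent space of an auto-encoder, and the function names and the regularizer here are assumptions.

```python
import numpy as np

def self_expressive_coefficients(X, lam=0.1):
    """Solve min_C ||X - X C||_F^2 + lam * ||C||_F^2 for X of shape (d, n).

    Columns of X are data points. Setting the gradient to zero gives
    (X^T X + lam * I) C = X^T X, which is solved directly.
    """
    gram = X.T @ X
    n = gram.shape[0]
    return np.linalg.solve(gram + lam * np.eye(n), gram)

def affinity(C):
    """Symmetric pairwise affinity built from the coefficient magnitudes."""
    return np.abs(C) + np.abs(C).T
```

For points drawn from mutually orthogonal subspaces the Gram matrix is block diagonal, so the resulting affinity connects only points within the same subspace. Spectral clustering on this affinity is the usual final step.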
However, subspace clustering algorithms are notorious for their scalability issues, because building and processing large affinity matrices is demanding. We propose two methods to tackle this problem. The first is based on k-Subspace Clustering: we introduce a method that simultaneously learns an embedding space along with subspaces within it to minimize a notion of reconstruction error, thus addressing the problem of subspace clustering in an end-to-end learning paradigm. This in turn frees us from the need for an affinity matrix to perform clustering. The second uses a feed-forward network to replace spectral clustering and learns the affinities of each data point from the "self-expressive" layer. We introduce Neural Collaborative Subspace Clustering, which benefits from a classifier that determines whether a pair of points lies on the same subspace under the supervision of the "self-expressive" layer. Essential to our model is the construction of two affinity matrices, one from the classifier and the other from a notion of subspace self-expressiveness, to supervise training in a collaborative scheme.
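
The affinity-free route can be illustrated with the classical k-subspaces iteration, which alternates between assigning points to their nearest subspace and refitting each subspace by SVD. This is a sketch on fixed features under assumed names and defaults. It does not learn an embedding jointly, as the thesis method does.

```python
import numpy as np

def residuals(X, bases):
    """Squared distance from each column of X to each subspace; shape (k, n)."""
    return np.stack([((X - U @ (U.T @ X)) ** 2).sum(axis=0) for U in bases])

def k_subspaces(X, k, dim, n_iter=20, seed=0):
    """Cluster the columns of X (d x n) into k linear subspaces of dimension dim."""
    rng = np.random.default_rng(seed)
    d, n = X.shape
    # Start from random orthonormal bases (an assumed initialization).
    bases = [np.linalg.qr(rng.normal(size=(d, dim)))[0] for _ in range(k)]
    history = []
    for _ in range(n_iter):
        res = residuals(X, bases)
        labels = res.argmin(axis=0)  # assignment step
        history.append(float(res.min(axis=0).sum()))
        for j in range(k):  # refit step
            members = X[:, labels == j]
            if members.shape[1] >= dim:  # keep the old basis for tiny clusters
                U, _, _ = np.linalg.svd(members, full_matrices=False)
                bases[j] = U[:, :dim]
    return labels, bases, history
```

No n-by-n affinity matrix is ever formed, so memory grows linearly with the number of points. The total reconstruction error recorded in `history` never increases, but the result depends on initialization, so several restarts are common in practice.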
In summary, this thesis makes contributions on how to perform unsupervised learning in several tasks. It starts from the traditional sparse coding and dictionary learning perspective in low-level vision. Then, we explore how to incorporate unsupervised learning into convolutional neural networks without label information and scale subspace clustering to large-scale datasets. Furthermore, we also extend clustering to a dense prediction task (saliency detection).
- …