Deep Clustering With Intra-class Distance Constraint for Hyperspectral Images
The high dimensionality of hyperspectral images often results in the
degradation of clustering performance. Owing to the power of deep networks for
feature extraction and non-linear representation, deep-learning-based
clustering has become a hot research topic in hyperspectral remote sensing.
However, most deep clustering algorithms for hyperspectral images use deep
neural networks as feature extractors without considering prior-knowledge
constraints suited to clustering. To
solve this problem, we propose an intra-class distance constrained deep
clustering algorithm for high-dimensional hyperspectral images. The proposed
algorithm constrains the feature mapping procedure of the auto-encoder network
by intra-class distance so that raw images are transformed from the original
high-dimensional space to a low-dimensional feature space that is more
conducive to clustering. Furthermore, the related learning process is treated
as a joint optimization problem of deep feature extraction and clustering.
Experimental results demonstrate that the proposed algorithm is strongly
competitive with state-of-the-art clustering methods for hyperspectral images.
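The core idea admits a short sketch. The PyTorch fragment below is illustrative
only: the layer sizes, the weight 0.1, and the names IntraClassAE and
intra_class_distance are assumptions rather than the paper's specification, and
the cluster centers here are random placeholders where the paper would use
actual clustering estimates.

    import torch
    import torch.nn as nn

    class IntraClassAE(nn.Module):
        """Minimal auto-encoder; layer sizes are illustrative."""
        def __init__(self, in_dim, latent_dim):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                         nn.Linear(256, latent_dim))
            self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                         nn.Linear(256, in_dim))

        def forward(self, x):
            z = self.encoder(x)
            return z, self.decoder(z)

    def intra_class_distance(z, assignments, centers):
        """Mean squared distance of each embedding to its cluster center."""
        return ((z - centers[assignments]) ** 2).sum(dim=1).mean()

    model = IntraClassAE(in_dim=200, latent_dim=16)  # e.g. 200 spectral bands
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x = torch.randn(64, 200)                         # a batch of pixel spectra
    z, x_hat = model(x)
    with torch.no_grad():                            # placeholder centers and
        centers = z[torch.randperm(64)[:8]].clone()  # assignments standing in
        assignments = torch.cdist(z, centers).argmin(dim=1)  # for k-means
    loss = nn.functional.mse_loss(x_hat, x) \
           + 0.1 * intra_class_distance(z, assignments, centers)
    opt.zero_grad(); loss.backward(); opt.step()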
Learning A Task-Specific Deep Architecture For Clustering
While sparse coding-based clustering methods have been shown to be successful,
their bottlenecks in both efficiency and scalability limit their practical
usage. In recent years, deep learning has proven to be a highly effective,
efficient and scalable feature learning tool. In this paper, we propose to
emulate the sparse coding-based clustering pipeline in the context of deep
learning, leading to a carefully crafted deep model benefiting from both. A
feed-forward network structure, named TAGnet, is constructed based on a
graph-regularized sparse coding algorithm. It is then trained with
task-specific loss functions from end to end. We discover that connecting deep
learning to sparse coding benefits not only the model performance, but also its
initialization and interpretation. Moreover, by introducing auxiliary
clustering tasks to the intermediate feature hierarchy, we formulate DTAGnet
and obtain a further performance boost. Extensive experiments demonstrate that
the proposed model gains remarkable margins over several state-of-the-art
methods
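The unfolding idea behind TAGnet can be made concrete with a minimal
LISTA-style sketch in which each network layer replays one iteration of a
learned soft-thresholding recurrence; the graph-regularization term that TAGnet
builds into this recurrence is omitted here, and all names and sizes are
assumptions.

    import torch
    import torch.nn as nn

    class UnfoldedSparseCoder(nn.Module):
        """Each 'layer' is one iteration of z <- soft(Wx + Sz) with learned
        W, S, and per-unit thresholds theta (LISTA-style unfolding)."""
        def __init__(self, in_dim, code_dim, n_layers=3):
            super().__init__()
            self.W = nn.Linear(in_dim, code_dim, bias=False)
            self.S = nn.Linear(code_dim, code_dim, bias=False)
            self.theta = nn.Parameter(torch.full((code_dim,), 0.1))
            self.n_layers = n_layers

        def soft_threshold(self, u):
            return torch.sign(u) * torch.clamp(u.abs() - self.theta, min=0.0)

        def forward(self, x):
            b = self.W(x)                     # fixed input injection
            z = self.soft_threshold(b)
            for _ in range(self.n_layers - 1):
                z = self.soft_threshold(b + self.S(z))
            return z                          # code fed to a task-specific loss

    codes = UnfoldedSparseCoder(in_dim=784, code_dim=128)(torch.randn(32, 784))

Because every layer is differentiable, the whole pipeline can be trained end to
end with a clustering loss, which is what distinguishes it from running the
original iterative sparse coder.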
Deep Multimodal Subspace Clustering Networks
We present convolutional neural network (CNN) based approaches for
unsupervised multimodal subspace clustering. The proposed framework consists of
three main stages - multimodal encoder, self-expressive layer, and multimodal
decoder. The encoder takes multimodal data as input and fuses them into a latent
space representation. The self-expressive layer is responsible for enforcing
the self-expressiveness property and acquiring an affinity matrix corresponding
to the data points. The decoder reconstructs the original input data. The
network uses the distance between the decoder's reconstruction and the original
input in its training. We investigate early, late and intermediate fusion
techniques and propose three different encoders corresponding to them for
spatial fusion. The self-expressive layers and multimodal decoders are
essentially the same for different spatial fusion-based approaches. In addition
to various spatial fusion-based methods, an affinity fusion-based network is
also proposed in which the self-expressive layer corresponding to different
modalities is enforced to be the same. Extensive experiments on three datasets
show that the proposed methods significantly outperform the state-of-the-art
multimodal subspace clustering methods.
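The self-expressive layer admits a compact sketch. The recipe below (a
trainable coefficient matrix C with zeroed diagonal, an L1 penalty, and
|C| + |C|^T as the affinity) follows the standard self-expressiveness
formulation; the penalty weight and all names are illustrative, not the
paper's.

    import torch
    import torch.nn as nn

    class SelfExpressiveLayer(nn.Module):
        """Learns C such that each latent code is reconstructed from the
        others (Z is approximated by C Z); the zeroed diagonal rules out the
        trivial solution C = I."""
        def __init__(self, n_samples):
            super().__init__()
            self.C = nn.Parameter(1e-4 * torch.randn(n_samples, n_samples))

        def forward(self, z):                 # z: (n_samples, latent_dim)
            C = self.C - torch.diag(torch.diag(self.C))
            return C @ z, C

    n = 500
    layer = SelfExpressiveLayer(n)
    z = torch.randn(n, 64)                    # fused multimodal latent codes
    z_hat, C = layer(z)
    loss = ((z_hat - z) ** 2).sum() + 1.0 * C.abs().sum()  # fit + L1 penalty
    affinity = (C.abs() + C.abs().t()).detach()  # input to spectral clustering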
Spectral Clustering via Ensemble Deep Autoencoder Learning (SC-EDAE)
Recently, a number of works have studied clustering strategies that combine
classical clustering algorithms and deep learning methods. These approaches
follow either a sequential way, where a deep representation is learned using a
deep autoencoder before obtaining clusters with k-means, or a simultaneous way,
where deep representation and clusters are learned jointly by optimizing a
single objective function. Both strategies improve clustering performance;
however, the robustness of these approaches is impeded by several deep
autoencoder settings, among them the weight initialization, the width and
number of layers, and the number of epochs. To alleviate the impact of such
hyperparameter settings on clustering performance, we propose a new model
which combines the spectral clustering and deep autoencoder strengths in an
ensemble learning framework. Extensive experiments on various benchmark
datasets demonstrate the potential and robustness of our approach compared to
state-of-the-art deep clustering methods.
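One simple way to realize such an ensemble, sketched here with scikit-learn:
build an affinity matrix from each auto-encoder's embedding and average them
before spectral clustering. The RBF affinity, the averaging rule, and all names
are assumptions; the paper's fusion scheme may differ.

    import numpy as np
    from sklearn.metrics.pairwise import rbf_kernel
    from sklearn.cluster import SpectralClustering

    def ensemble_spectral_clustering(embeddings, n_clusters, gamma=1.0):
        """embeddings: list of (n_samples, d_i) arrays, one per auto-encoder
        trained with a different seed, depth, width, or epoch budget."""
        affinity = np.mean([rbf_kernel(Z, gamma=gamma) for Z in embeddings],
                           axis=0)
        model = SpectralClustering(n_clusters=n_clusters,
                                   affinity="precomputed")
        return model.fit_predict(affinity)

    rng = np.random.default_rng(0)    # three hypothetical embeddings of the
    embeddings = [rng.normal(size=(1000, d)) for d in (16, 32, 64)]  # same data
    labels = ensemble_spectral_clustering(embeddings, n_clusters=10)

Averaging over many independently configured auto-encoders is what dampens the
sensitivity to any single initialization or architecture choice.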
Image Representation Learning Using Graph Regularized Auto-Encoders
We consider the problem of image representation for the tasks of unsupervised
learning and semi-supervised learning. In these tasks, raw image vectors may
not adequately represent the data's intrinsic structure because of their
highly dense feature space. To overcome this problem, the raw
image vectors should be mapped to a proper representation space which can
capture the latent structure of the original data and represent the data
explicitly for further learning tasks such as clustering.
Inspired by recent research on deep neural networks and representation
learning, in this paper we introduce the multi-layer auto-encoder into image
representation. We also apply the local-invariance idea to our image
representation with auto-encoders and propose a novel method called the Graph
regularized Auto-Encoder (GAE). GAE can provide a compact
representation which uncovers the hidden semantics and simultaneously respects
the intrinsic geometric structure.
Extensive experiments on image clustering show encouraging results of the
proposed algorithm in comparison to the state-of-the-art algorithms on
real-world cases.
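The graph-regularized objective reduces to a reconstruction term plus a
Laplacian smoothness term on the hidden codes. A minimal sketch, assuming an
unnormalized Laplacian and a hypothetical weight lam:

    import torch
    import torch.nn as nn

    def graph_laplacian(W):
        """Unnormalized Laplacian L = D - W of a symmetric affinity W."""
        return torch.diag(W.sum(dim=1)) - W

    def gae_loss(x, x_hat, h, L, lam=0.1):
        """Reconstruction plus the smoothness term tr(H^T L H) on the
        hidden codes h of the auto-encoder."""
        recon = nn.functional.mse_loss(x_hat, x)
        graph = torch.trace(h.t() @ L @ h) / h.shape[0]
        return recon + lam * graph

Since tr(H^T L H) equals one half of sum_ij W_ij ||h_i - h_j||^2, minimizing it
keeps the codes of neighboring images (large W_ij) close together, which is
precisely the locally invariant behavior described above.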
Learning Robust Representations for Computer Vision
Unsupervised learning techniques in computer vision often require learning
latent representations, such as low-dimensional linear and non-linear
subspaces. Noise and outliers in the data can frustrate these approaches by
obscuring the latent spaces.
Our main goal is a deeper understanding and new development of robust
approaches for representation learning. We provide a new interpretation for
existing robust approaches and present two specific contributions: a new robust
PCA approach, which can separate foreground features from dynamic background,
and a novel robust spectral clustering method that can cluster facial images
with high accuracy. Both contributions show superior performance to standard
methods on real-world test sets.
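For the robust PCA contribution, the classical low-rank-plus-sparse split can
be sketched with two proximal steps. This simplified alternating scheme is a
generic heuristic, not the paper's algorithm; lam follows the common
1/sqrt(max(n, m)) default.

    import numpy as np

    def svt(M, tau):
        """Singular value thresholding: the prox of the nuclear norm."""
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

    def soft(M, tau):
        """Elementwise soft threshold: the prox of the L1 norm."""
        return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

    def robust_pca(X, lam=None, tau=1.0, n_iter=50):
        """Split X into low-rank L (e.g. static background) and sparse S
        (e.g. moving foreground) by alternating the two proximal steps."""
        lam = lam or 1.0 / np.sqrt(max(X.shape))
        L = np.zeros_like(X)
        S = np.zeros_like(X)
        for _ in range(n_iter):
            L = svt(X - S, tau)
            S = soft(X - L, lam * tau)
        return L, S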
Convergent Learning: Do different neural networks learn the same representations?
Recent success in training deep neural networks has prompted active
investigation into the features learned on their intermediate layers. Such
research is difficult because it requires making sense of non-linear
computations performed by millions of parameters, but valuable because it
increases our ability to understand current models and create improved versions
of them. In this paper we investigate the extent to which neural networks
exhibit what we call convergent learning, which is when the representations
learned by multiple nets converge to a set of features which are either
individually similar between networks or where subsets of features span similar
low-dimensional spaces. We propose a specific method of probing
representations: training multiple networks and then comparing and contrasting
their individual, learned representations at the level of neurons or groups of
neurons. We begin research into this question using three techniques to
approximately align different neural networks on a feature level: a bipartite
matching approach that makes one-to-one assignments between neurons, a sparse
prediction approach that finds one-to-many mappings, and a spectral clustering
approach that finds many-to-many mappings. This initial investigation reveals a
few previously unknown properties of neural networks, and we argue that future
research into the question of convergent learning will yield many more. The
insights described here include (1) that some features are learned reliably in
multiple networks, yet other features are not consistently learned; (2) that
units learn to span low-dimensional subspaces and, while these subspaces are
common to multiple networks, the specific basis vectors learned are not; (3)
that the representation codes show evidence of being a mix between a local code
and slightly, but not fully, distributed codes across multiple units.
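The one-to-one alignment step can be sketched directly: correlate every unit
of one network with every unit of the other over a shared set of inputs, then
solve the resulting assignment problem. The data and names below are
hypothetical.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def match_units(acts_a, acts_b):
        """acts_a, acts_b: (n_inputs, n_units) activations of one layer from
        two independently trained networks on the same inputs."""
        n = acts_a.shape[1]
        corr = np.corrcoef(acts_a.T, acts_b.T)[:n, n:]  # cross-network block
        rows, cols = linear_sum_assignment(-corr)       # maximize correlation
        return cols, corr[rows, cols]

    rng = np.random.default_rng(0)        # toy data: network B's units are a
    a = rng.normal(size=(10000, 128))     # permuted, noisy copy of A's
    b = a @ rng.permutation(np.eye(128)) + 0.1 * rng.normal(size=(10000, 128))
    matches, scores = match_units(a, b)   # high scores = reliably learned

Units whose best match carries high correlation correspond to the reliably
learned features in finding (1); low-scoring units are the ones learned by one
network but not the other.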
Image Annotation using Multi-Layer Sparse Coding
Automatic annotation of images with descriptive words is a challenging
problem with vast applications in the areas of image search and retrieval. This
problem can be viewed as a label-assignment problem by a classifier dealing
with a very large set of labels, i.e., the vocabulary set. We propose a novel
annotation method that employs two layers of sparse coding and performs
coarse-to-fine labeling. Themes extracted from the training data are treated as
coarse labels. Each theme is a set of training images that share a common
subject in their visual and textual contents. Our system extracts coarse labels
for training and test images without requiring any prior knowledge. Vocabulary
words are the fine labels to be associated with images. Most of the annotation
methods achieve low recall due to the large number of available fine labels,
i.e., vocabulary words. These systems also tend to achieve high precision only
for highly frequent words, although relatively rare words matter more for
search and retrieval purposes. Our system not only outperforms various
previously proposed annotation systems, but also achieves a symmetric response
in terms of precision and recall. It maintains high precision
for words with a wide range of frequencies. Such behavior is achieved by
intelligently reducing the number of available fine labels or words for each
image based on coarse labels assigned to it.
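The coarse-to-fine mechanism can be sketched with scikit-learn's SparseCoder:
a first coding pass over a theme dictionary selects coarse labels, which then
restrict the word dictionary used in the second pass. Dictionary shapes, OMP as
the solver, and the names (theme_to_words, k_themes, ...) are assumptions for
illustration.

    import numpy as np
    from sklearn.decomposition import SparseCoder

    def coarse_to_fine_labels(x, theme_dict, word_dict, theme_to_words,
                              k_themes=2, k_words=5):
        """x: (d,) image feature; theme_dict: (n_themes, d) first-layer
        dictionary; word_dict: (n_words, d) second-layer dictionary;
        theme_to_words maps a theme index to its candidate word indices."""
        coder = SparseCoder(dictionary=theme_dict, transform_algorithm="omp",
                            transform_n_nonzero_coefs=k_themes)
        theme_code = coder.transform(x[None, :])[0]
        themes = np.argsort(-np.abs(theme_code))[:k_themes]
        # Restrict the fine vocabulary to words under the selected themes.
        cands = sorted({w for t in themes for w in theme_to_words[t]})
        coder = SparseCoder(dictionary=word_dict[cands],
                            transform_algorithm="omp",
                            transform_n_nonzero_coefs=min(k_words, len(cands)))
        word_code = coder.transform(x[None, :])[0]
        return [cands[i] for i in np.argsort(-np.abs(word_code))[:k_words]]

Shrinking the candidate label set per image is what lifts recall for rare
words: the second-layer coder competes over a handful of theme-consistent
candidates instead of the full vocabulary.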
Self-Supervised Convolutional Subspace Clustering Network
Subspace clustering methods based on data self-expression have become very
popular for learning from data that lie in a union of low-dimensional linear
subspaces. However, the applicability of subspace clustering has been limited
because practical visual data in raw form do not necessarily lie in such linear
subspaces. On the other hand, while the Convolutional Neural Network (ConvNet) has
been demonstrated to be a powerful tool for extracting discriminative features
from visual data, training such a ConvNet usually requires a large amount of
labeled data, which are unavailable in subspace clustering applications. To
achieve simultaneous feature learning and subspace clustering, we propose an
end-to-end trainable framework, called Self-Supervised Convolutional Subspace
Clustering Network (SConvSCN), that combines a ConvNet module (for feature
learning), a self-expression module (for subspace clustering) and a spectral
clustering module (for self-supervision) into a joint optimization framework.
Particularly, we introduce a dual self-supervision that exploits the output of
spectral clustering to supervise the training of the feature learning module
(via a classification loss) and the self-expression module (via a spectral
clustering loss). Our experiments on four benchmark datasets show the
effectiveness of the dual self-supervision and demonstrate superior performance
of our proposed approach.
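The dual self-supervision can be sketched as two losses driven by pseudo-labels
obtained from a spectral-clustering pass over the affinity |C| + |C|^T: a
cross-entropy that supervises the feature-learning head, and a penalty on
self-expression coefficients that cross pseudo-cluster boundaries. Both loss
forms below are illustrative, not the paper's exact formulation.

    import torch
    import torch.nn as nn

    def dual_self_supervision(logits, C, pseudo_labels):
        """logits: (n, k) classifier outputs; C: (n, n) self-expression
        coefficients; pseudo_labels: (n,) ids from spectral clustering."""
        cls_loss = nn.functional.cross_entropy(logits, pseudo_labels)
        same = (pseudo_labels[:, None] == pseudo_labels[None, :]).float()
        # Discourage self-expression links across different pseudo-clusters.
        se_loss = (C.abs() * (1.0 - same)).sum() / C.numel()
        return cls_loss, se_loss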
Deep Discriminative Clustering Analysis
Traditional clustering methods often perform clustering with low-level
indiscriminative representations and ignore relationships between patterns,
resulting in limited progress in the era of deep learning. To handle this
problem, we develop Deep Discriminative Clustering (DDC) that models the
clustering task by investigating relationships between patterns with a deep
neural network. Technically, a global constraint is introduced to adaptively
estimate the relationships, and a local constraint is developed to endow the
network with the capability of learning high-level discriminative
representations. By iteratively training the network and estimating the
relationships in a mini-batch manner, DDC converges in theory, and the trained
network can generate a group of discriminative representations that can be
treated as cluster centers for straightforward clustering. Extensive
experiments strongly demonstrate that DDC outperforms current methods on eight
image, text, and audio datasets simultaneously.
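Relationship-driven training of this kind can be sketched with a pairwise loss
on feature similarities (closer in spirit to generic pairwise deep clustering
than to DDC's exact global and local constraints): confidently similar pairs
are pulled together, confidently dissimilar pairs pushed apart, and ambiguous
pairs ignored. Thresholds and the loss form are illustrative.

    import torch
    import torch.nn as nn

    def pairwise_relation_loss(logits, upper=0.9, lower=0.4):
        """Treat cosine similarities of softmaxed, L2-normalized features
        above `upper` as same-cluster pairs and those below `lower` as
        different-cluster pairs; pairs in between contribute nothing."""
        p = nn.functional.normalize(logits.softmax(dim=1), dim=1)
        sim = (p @ p.t()).clamp(1e-6, 1 - 1e-6)  # rows nonneg, so sim in (0, 1)
        pos, neg = sim > upper, sim < lower      # estimated relationships
        loss = -(sim[pos].log().sum() + (1 - sim[neg]).log().sum())
        return loss / (pos.sum() + neg.sum()).clamp(min=1)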