A Framework for Deep Constrained Clustering -- Algorithms and Advances
The area of constrained clustering has been extensively explored by
researchers and used by practitioners. Constrained clustering formulations
exist for popular algorithms such as k-means, mixture models, and spectral
clustering but have several limitations. A fundamental strength of deep
learning is its flexibility, and here we explore a deep learning framework for
constrained clustering, in particular how it can extend the field. We show
that our framework can not only handle standard
together/apart constraints (without the well-documented negative effects
reported earlier) generated from labeled side information but also more complex
constraints generated from new types of side information, such as continuous
values and high-level domain knowledge.
Comment: Updated for ECML/PKDD 201
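To make the together/apart notion concrete, here is a minimal sketch of one standard way to express such constraints as a differentiable loss over soft cluster assignments. This is our own illustration under assumed tensor shapes, not code from the paper; the function and argument names are hypothetical.

```python
import torch

def pairwise_constraint_loss(q_i, q_j, must_link, eps=1e-8):
    """Differentiable together/apart loss on soft cluster assignments.

    q_i, q_j: (batch, K) softmax outputs for the two instances in each pair.
    must_link: (batch,) bool tensor, True for "together" constraints.
    """
    # Probability that both instances fall in the same cluster.
    agreement = (q_i * q_j).sum(dim=1).clamp(eps, 1 - eps)
    ml_loss = -torch.log(agreement)       # together: reward same-cluster mass
    cl_loss = -torch.log(1 - agreement)   # apart: penalize same-cluster mass
    return torch.where(must_link, ml_loss, cl_loss).mean()
```

Because the loss acts only on network outputs, it can be added to any deep clustering objective without changing the architecture.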
Learning to cluster in order to transfer across domains and tasks
This paper introduces a novel method to perform transfer learning across
domains and tasks, formulating it as a problem of learning to cluster. The key
insight is that, in addition to features, we can transfer similarity
information and this is sufficient to learn a similarity function and
clustering network to perform both domain adaptation and cross-task transfer
learning. We begin by reducing categorical information to pairwise constraints,
which only consider whether two instances belong to the same class or not.
This similarity is category-agnostic and can be learned from data in the source
domain using a similarity network. We then present two novel approaches for
performing transfer learning using this similarity function. First, for
unsupervised domain adaptation, we design a new loss function to regularize
classification with a constrained clustering loss, hence learning a clustering
network with the transferred similarity metric generating the training inputs.
Second, for cross-task learning (i.e., unsupervised clustering with unseen
categories), we propose a framework to reconstruct and estimate the number of
semantic clusters, again using the clustering network. Since the similarity
network is noisy, the key is to use a robust clustering algorithm, and we show
that our formulation is more robust than the alternative constrained and
unconstrained clustering approaches. Using this method, we first show
state-of-the-art results for the challenging cross-task problem, applied to Omniglot and
ImageNet. Our results show that we can reconstruct semantic clusters with high
accuracy. We then evaluate the performance of cross-domain transfer using
images from the Office-31 and SVHN-MNIST tasks and present top accuracy on both
datasets. Our approach does not explicitly deal with domain discrepancy; if
combined with a domain adaptation loss, it shows further improvement.
Comment: ICLR 201
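The abstract describes training a clustering network from a transferred, noisy pairwise similarity function. A minimal sketch of that idea, assuming the similarity network emits scores in [0, 1] that are used as soft pairwise labels (our reading, not the paper's exact loss):

```python
import torch
import torch.nn.functional as F

def transfer_clustering_loss(p_i, p_j, sim):
    """Fit a clustering net to transferred pairwise similarity.

    p_i, p_j: (batch, K) soft cluster assignments from the clustering network.
    sim: (batch,) similarity scores in [0, 1] from the (noisy) similarity
         network, treated here as soft pairwise targets.
    """
    # Probability that the pair lands in the same cluster.
    same_cluster = (p_i * p_j).sum(dim=1).clamp(1e-7, 1 - 1e-7)
    return F.binary_cross_entropy(same_cluster, sim)
```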
Deep Clustering With Intra-class Distance Constraint for Hyperspectral Images
The high dimensionality of hyperspectral images often results in the
degradation of clustering performance. Owing to their powerful capabilities for
deep feature extraction and non-linear feature representation, clustering
algorithms based on deep learning have become a hot research topic in the field
of hyperspectral remote sensing. However, most deep clustering algorithms for
hyperspectral images utilize deep neural networks as feature extractors without
considering prior-knowledge constraints that are suitable for clustering. To
solve this problem, we propose an intra-class distance constrained deep
clustering algorithm for high-dimensional hyperspectral images. The proposed
algorithm constrains the feature mapping procedure of the auto-encoder network
by intra-class distance so that raw images are transformed from the original
high-dimensional space to the low-dimensional feature space that is more
conducive to clustering. Furthermore, the related learning process is treated
as a joint optimization problem of deep feature extraction and clustering.
Experimental results demonstrate that the proposed algorithm is highly
competitive with state-of-the-art clustering methods for hyperspectral images.
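One plausible reading of the joint objective is a reconstruction loss plus an intra-class distance penalty on the latent code, as sketched below. The function signature and the choice of a hard-assignment k-means-style penalty are our assumptions for illustration.

```python
import torch

def joint_loss(x, x_recon, z, centroids, assign, lam=0.1):
    """Autoencoder reconstruction plus intra-class distance on the latent code.

    z: (batch, d) encoder outputs; centroids: (K, d) current cluster centers;
    assign: (batch,) hard cluster assignment for each sample in the batch.
    """
    recon = torch.mean((x - x_recon) ** 2)
    # Squared distance of each embedded point to its assigned center.
    intra = torch.mean(((z - centroids[assign]) ** 2).sum(dim=1))
    return recon + lam * intra
```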
Deep Clustering via Joint Convolutional Autoencoder Embedding and Relative Entropy Minimization
Image clustering is one of the most important computer vision applications and
has been extensively studied in the literature. However, current clustering
methods mostly suffer from a lack of efficiency and scalability when dealing with
large-scale and high-dimensional data. In this paper, we propose a new
clustering model, called DEeP Embedded RegularIzed ClusTering (DEPICT), which
efficiently maps data into a discriminative embedding subspace and precisely
predicts cluster assignments. DEPICT generally consists of a multinomial
logistic regression function stacked on top of a multi-layer convolutional
autoencoder. We define a clustering objective function using relative entropy
(KL divergence) minimization, regularized by a prior for the frequency of
cluster assignments. An alternating strategy is then derived to optimize the
objective by updating parameters and estimating cluster assignments.
Furthermore, we employ the reconstruction loss functions in our autoencoder, as
a data-dependent regularization term, to prevent the deep embedding function
from overfitting. In order to benefit from end-to-end optimization and
eliminate the necessity for layer-wise pretraining, we introduce a joint
learning framework to minimize the unified clustering and reconstruction loss
functions together and train all network layers simultaneously. Experimental
results indicate the superiority and faster running time of DEPICT in
real-world clustering tasks, where no labeled data is available for
hyper-parameter tuning.
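The sketch below illustrates a KL-based clustering loss with a cluster-frequency prior in the spirit described above: the target distribution sharpens the predictions while down-weighting large clusters to avoid collapse. This is a generic re-creation of the idea, not DEPICT's published code, and the exact normalization is an assumption.

```python
import torch
import torch.nn.functional as F

def frequency_regularized_kl_loss(p):
    """KL clustering loss with a balanced-frequency prior.

    p: (batch, K) softmax outputs of the classifier on the embedding.
    """
    with torch.no_grad():
        f = p.sum(dim=0, keepdim=True)        # soft cluster frequencies
        q = p / torch.sqrt(f)                 # penalize oversized clusters
        q = q / q.sum(dim=1, keepdim=True)    # renormalize rows to a target
    # Minimize KL(q || p); q is held fixed in the alternating scheme.
    return F.kl_div(torch.log(p.clamp_min(1e-8)), q, reduction="batchmean")
```

In practice this loss would be summed with the autoencoder reconstruction terms the abstract mentions, so the embedding is regularized while all layers train end to end.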
FI-GRL: Fast Inductive Graph Representation Learning via Projection-Cost Preservation
Graph representation learning aims at transforming graph data into meaningful
low-dimensional vectors to facilitate the employment of machine learning and
data mining algorithms designed for general data. Most current graph
representation learning approaches are transductive, which means that they
require all the nodes in the graph to be known when learning graph
representations; these approaches cannot naturally generalize to unseen
nodes. In this paper, we present a Fast Inductive Graph Representation Learning
framework (FI-GRL) to learn nodes' low-dimensional representations. Our
approach can obtain accurate representations for seen nodes with provable
theoretical guarantees and can easily generalize to unseen nodes. Specifically,
in order to explicitly decouple nodes' relations expressed by the graph, we
transform nodes into a randomized subspace spanned by a random projection
matrix. This stage is guaranteed to preserve the projection-cost of the
normalized random walk matrix which is highly related to the normalized cut of
the graph. Then feature extraction is achieved by conducting singular value
decomposition on the obtained matrix sketch. By leveraging the property of
projection-cost preservation on the matrix sketch, the obtained representation
result is nearly optimal. To deal with unseen nodes, we utilize a folding-in
technique to learn their meaningful representations. Empirically, when the
number of seen nodes is larger than the number of unseen nodes, FI-GRL
consistently achieves excellent results. Our algorithm is fast, simple to implement and
theoretically guaranteed. Extensive experiments on real datasets demonstrate
the superiority of our algorithm, in both efficacy and efficiency, on
macroscopic-level (clustering) and microscopic-level (structural hole
detection) applications.
Comment: ICDM 2018, Full Version
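The two-stage pipeline the abstract describes (random projection of the normalized random walk matrix, then SVD of the sketch) can be approximated in a few lines of NumPy. This is a hypothetical dense re-implementation for intuition; the paper's exact scaling and the folding-in step for unseen nodes are omitted.

```python
import numpy as np

def fi_grl_style_embedding(adj, dim, seed=0):
    """Sketch-then-SVD node embedding on a dense adjacency matrix.

    adj: (n, n) adjacency matrix; dim: target representation size.
    """
    rng = np.random.default_rng(seed)
    deg = adj.sum(axis=1)
    # Row-normalized random walk matrix D^{-1} A.
    walk = adj / np.maximum(deg[:, None], 1e-12)
    # Random projection into a low-dimensional subspace.
    proj = rng.standard_normal((adj.shape[0], dim)) / np.sqrt(dim)
    sketch = walk @ proj                    # projection-cost-preserving sketch
    u, s, _ = np.linalg.svd(sketch, full_matrices=False)
    return u * s                            # node representations
```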
A flexible, extensible software framework for model compression based on the LC algorithm
We propose a software framework based on the ideas of the
Learning-Compression (LC) algorithm that allows a user to compress a neural
network or other machine learning model using different compression schemes
with minimal effort. Currently, the supported compression schemes include pruning,
quantization, low-rank methods (including automatically learning the layer
ranks), and combinations of those, and the user can choose different
compression types for different parts of a neural network.
The LC algorithm alternates two types of steps until convergence: a learning
(L) step, which trains a model on a dataset (using an algorithm such as SGD);
and a compression (C) step, which compresses the model parameters (using a
compression scheme such as low-rank or quantization). This decoupling of the
"machine learning" aspect from the "signal compression" aspect means that
changing the model or the compression type amounts to calling the corresponding
subroutine in the L or C step, respectively. The library fully supports this by
design, which makes it flexible and extensible. This does not come at the
expense of performance: the runtime needed to compress a model is comparable to
that of training the model in the first place; and the compressed model is
competitive in terms of prediction accuracy and compression ratio with other
algorithms (which are often specialized for specific models or compression
schemes). The library is written in Python and PyTorch and is available on GitHub.
Comment: 15 pages, 4 figures, 2 tables
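A skeleton of the L/C alternation described above is shown below. It is a sketch of the control flow only, not the library's API: `train_step` and `compress` are user-supplied callables standing in for the L and C subroutines, and the penalty schedule is an assumption.

```python
def lc_alternation(model, train_step, compress, mu=1e-3, rounds=10):
    """Alternate learning (L) and compression (C) steps until convergence.

    train_step(model, theta_c, mu): one L step, training the model with a
        quadratic penalty mu * ||theta - theta_c||^2 toward the compressed
        parameters (e.g., with SGD).
    compress(model): one C step, returning compressed copies of the
        parameters (e.g., low-rank factors or quantized weights).
    """
    theta_c = compress(model)            # initial compression
    for _ in range(rounds):
        train_step(model, theta_c, mu)   # L step: learn under the penalty
        theta_c = compress(model)        # C step: re-compress the weights
        mu *= 1.5                        # increase the penalty each round
    return theta_c
```

The decoupling is visible in the code: swapping the model touches only `train_step`, and swapping the compression scheme touches only `compress`.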
Image Representation Learning Using Graph Regularized Auto-Encoders
We consider the problem of image representation for the tasks of unsupervised
learning and semi-supervised learning. In these learning tasks, the raw image
vectors may not provide enough representation of their intrinsic structures,
due to their highly dense feature space. To overcome this problem, the raw
image vectors should be mapped to a proper representation space which can
capture the latent structure of the original data and represent the data
explicitly for further learning tasks such as clustering.
Inspired by recent research on deep neural networks and representation
learning, in this paper we introduce the multiple-layer auto-encoder into
image representation. We also apply the locally invariant idea to our image
representation with auto-encoders and propose a novel method, called Graph
regularized Auto-Encoder (GAE). GAE can provide a compact representation
which uncovers the hidden semantics and simultaneously respects the intrinsic
geometric structure.
Extensive experiments on image clustering show encouraging results for the
proposed algorithm in comparison to state-of-the-art algorithms on real-world cases.
Comment: 9 pages
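A common way to realize such graph regularization is to add a Laplacian-style smoothness term over a batch affinity matrix to the reconstruction loss, as in the sketch below. The batch-wise formulation and the name `gae_loss` are our assumptions, not the paper's definition.

```python
import torch

def gae_loss(x, x_recon, z, W, lam=0.01):
    """Autoencoder reconstruction plus a graph smoothness regularizer.

    W: (batch, batch) affinity matrix over the batch (e.g., k-NN weights).
    The regularizer pulls embeddings of graph-neighboring images together.
    """
    recon = torch.mean((x - x_recon) ** 2)
    dist = torch.cdist(z, z) ** 2        # pairwise squared latent distances
    graph = (W * dist).sum() / W.numel()
    return recon + lam * graph
```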
Deep Multimodal Subspace Clustering Networks
We present convolutional neural network (CNN) based approaches for
unsupervised multimodal subspace clustering. The proposed framework consists of
three main stages - multimodal encoder, self-expressive layer, and multimodal
decoder. The encoder takes multimodal data as input and fuses them to a latent
space representation. The self-expressive layer is responsible for enforcing
the self-expressiveness property and acquiring an affinity matrix corresponding
to the data points. The decoder reconstructs the original input data. The
network uses the distance between the decoder's reconstruction and the original
input in its training. We investigate early, late and intermediate fusion
techniques and propose three different encoders corresponding to them for
spatial fusion. The self-expressive layers and multimodal decoders are
essentially the same for different spatial fusion-based approaches. In addition
to various spatial fusion-based methods, an affinity fusion-based network is
also proposed in which the self-expressive layer corresponding to different
modalities is enforced to be the same. Extensive experiments on three datasets
show that the proposed methods significantly outperform the state-of-the-art
multimodal subspace clustering methods.
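The self-expressive layer the abstract refers to can be sketched as a learnable coefficient matrix with a suppressed diagonal, trained so that each latent point is reconstructed from the others; the (absolute) coefficients then serve as the affinity matrix. A minimal single-modality version, with hypothetical names and a Frobenius-norm regularizer assumed for simplicity:

```python
import torch
import torch.nn as nn

class SelfExpressive(nn.Module):
    """Self-expressive layer: z ~ C z with a zeroed diagonal on C."""

    def __init__(self, n):
        super().__init__()
        self.C = nn.Parameter(1e-4 * torch.randn(n, n))

    def forward(self, z):
        # Zero the diagonal to forbid trivial self-reconstruction.
        C = self.C - torch.diag(torch.diag(self.C))
        return C @ z, C

def self_expressive_loss(z, z_se, C, lam=1.0):
    """Reconstruction in latent space plus a regularizer on C."""
    return torch.sum((z - z_se) ** 2) + lam * torch.sum(C ** 2)
```

In the affinity-fusion variant described above, the same `C` would be shared across the per-modality latent representations.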
Survey of state-of-the-art mixed data clustering algorithms
Mixed data comprises both numeric and categorical features, and mixed
datasets occur frequently in many domains, such as health, finance, and
marketing. Clustering is often applied to mixed datasets to find structures and
to group similar objects for further analysis. However, clustering mixed data
is challenging because it is difficult to directly apply mathematical
operations, such as summation or averaging, to the feature values of these
datasets. In this paper, we present a taxonomy for the study of mixed data
clustering algorithms by identifying five major research themes. We then
present a state-of-the-art review of the research works within each research
theme. We analyze the strengths and weaknesses of these methods with pointers
for future research directions. Lastly, we present an in-depth analysis of the
overall challenges in this field, highlight open research questions and discuss
guidelines to make progress in the field.
Comment: 20 pages, 2 columns, 6 tables, 209 references
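As a concrete instance of the difficulty the abstract raises (summation and averaging are undefined on categorical values), many mixed-data methods fall back on a split dissimilarity of the k-prototypes kind, sketched below; the function name and weighting are illustrative, not from the survey.

```python
import numpy as np

def mixed_dissimilarity(a, b, num_idx, cat_idx, gamma=1.0):
    """k-prototypes-style dissimilarity for mixed records.

    Squared Euclidean distance on numeric features plus a gamma-weighted
    mismatch count on categorical features.
    """
    num = np.sum((a[num_idx].astype(float) - b[num_idx].astype(float)) ** 2)
    cat = np.sum(a[cat_idx] != b[cat_idx])
    return num + gamma * cat
```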
Deep Transductive Semi-supervised Maximum Margin Clustering
Semi-supervised clustering is a very important topic in machine learning and
computer vision. The key challenge of this problem is how to learn a metric
such that instances sharing the same label are more likely to be close to each
other in the embedded space. However, little attention has been paid to
learning better representations when the data lie on a non-linear manifold.
Fortunately, deep learning has recently led to great success in feature
learning. Inspired by these advances, we propose a deep transductive
semi-supervised maximum margin clustering approach. More specifically, given
pairwise constraints, we exploit both labeled and unlabeled data to learn a
non-linear mapping under maximum margin framework for clustering analysis.
Thus, our model unifies transductive learning, feature learning and maximum
margin techniques in the semi-supervised clustering framework. We pretrain the
deep network structure with restricted Boltzmann machines (RBMs) layer by layer
greedily, and optimize our objective function with gradient descent. By
checking the most violated constraints, our approach updates the model
parameters through error backpropagation, in which deep features are learned
automatically. The experimental results show that our model is significantly
better than the state of the art on semi-supervised clustering.
Comment: 1
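One common way to turn pairwise constraints into a max-margin objective on learned features is a hinge loss on embedded distances, sketched below; this contrastive reading is our assumption for illustration, not the paper's exact formulation.

```python
import torch

def max_margin_pairwise_loss(z_i, z_j, must_link, margin=1.0):
    """Hinge losses on embedded distances for pairwise constraints.

    z_i, z_j: (batch, d) deep features of paired instances.
    must_link: (batch,) bool; True for same-label pairs.
    """
    d = torch.norm(z_i - z_j, dim=1)
    ml = torch.clamp(d - margin, min=0)   # must-link: keep pairs close
    cl = torch.clamp(margin - d, min=0)   # cannot-link: push pairs apart
    return torch.where(must_link, ml, cl).mean()
```

Checking which pairs violate the margin most also indicates which constraints drive the backpropagated updates, matching the "most violated constraints" step described above.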