    A Framework for Deep Constrained Clustering -- Algorithms and Advances

    The area of constrained clustering has been extensively explored by researchers and used by practitioners. Constrained clustering formulations exist for popular algorithms such as k-means, mixture models, and spectral clustering but have several limitations. A fundamental strength of deep learning is its flexibility, and here we explore a deep learning framework for constrained clustering and in particular explore how it can extend the field of constrained clustering. We show that our framework can handle not only standard together/apart constraints (without the well-documented negative effects reported earlier) generated from labeled side information but also more complex constraints generated from new types of side information such as continuous values and high-level domain knowledge. Comment: Updated for ECML/PKDD 201

    Learning to cluster in order to transfer across domains and tasks

    This paper introduces a novel method to perform transfer learning across domains and tasks, formulating it as a problem of learning to cluster. The key insight is that, in addition to features, we can transfer similarity information and this is sufficient to learn a similarity function and clustering network to perform both domain adaptation and cross-task transfer learning. We begin by reducing categorical information to pairwise constraints, which only consider whether two instances belong to the same class or not. This similarity is category-agnostic and can be learned from data in the source domain using a similarity network. We then present two novel approaches for performing transfer learning using this similarity function. First, for unsupervised domain adaptation, we design a new loss function to regularize classification with a constrained clustering loss, hence learning a clustering network with the transferred similarity metric generating the training inputs. Second, for cross-task learning (i.e., unsupervised clustering with unseen categories), we propose a framework to reconstruct and estimate the number of semantic clusters, again using the clustering network. Since the similarity network is noisy, the key is to use a robust clustering algorithm, and we show that our formulation is more robust than the alternative constrained and unconstrained clustering approaches. Using this method, we first show state-of-the-art results for the challenging cross-task problem, applied on Omniglot and ImageNet. Our results show that we can reconstruct semantic clusters with high accuracy. We then evaluate the performance of cross-domain transfer using images from the Office-31 and SVHN-MNIST tasks and present top accuracy on both datasets. Our approach doesn't explicitly deal with domain discrepancy. If we combine with a domain adaptation loss, it shows further improvement. Comment: ICLR 201
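
    The reduction of categorical labels to pairwise constraints described in this abstract can be illustrated with a minimal sketch. The function name `labels_to_pairwise` and the toy labels are hypothetical and are not taken from the paper; the sketch only shows how class labels collapse into category-agnostic same/different signals.

```python
from itertools import combinations


def labels_to_pairwise(labels):
    """Turn class labels into (i, j, same) triples.

    `same` is 1 when instances i and j share a class and 0 otherwise,
    so the class identities themselves are discarded.
    """
    return [
        (i, j, int(labels[i] == labels[j]))
        for i, j in combinations(range(len(labels)), 2)
    ]


# Hypothetical toy labels, used only for illustration.
pairs = labels_to_pairwise(["cat", "dog", "cat"])
print(pairs)
```

    Because only the same/different bit survives, a similarity function trained on such pairs does not depend on which categories were present in the source data.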

    Deep Clustering With Intra-class Distance Constraint for Hyperspectral Images

    The high dimensionality of hyperspectral images often results in the degradation of clustering performance. Due to the powerful ability of deep feature extraction and non-linear feature representation, the clustering algorithm based on deep learning has become a hot research topic in the field of hyperspectral remote sensing. However, most deep clustering algorithms for hyperspectral images utilize deep neural networks as feature extractors without considering prior knowledge constraints that are suitable for clustering. To solve this problem, we propose an intra-class distance constrained deep clustering algorithm for high-dimensional hyperspectral images. The proposed algorithm constrains the feature mapping procedure of the auto-encoder network by intra-class distance so that raw images are transformed from the original high-dimensional space to a low-dimensional feature space that is more conducive to clustering. Furthermore, the related learning process is treated as a joint optimization problem of deep feature extraction and clustering. Experimental results demonstrate the strong competitiveness of the proposed algorithm in comparison with state-of-the-art clustering methods for hyperspectral images.

    Deep Clustering via Joint Convolutional Autoencoder Embedding and Relative Entropy Minimization

    Image clustering is one of the most important computer vision applications, which has been extensively studied in the literature. However, current clustering methods mostly suffer from lack of efficiency and scalability when dealing with large-scale and high-dimensional data. In this paper, we propose a new clustering model, called DEeP Embedded RegularIzed ClusTering (DEPICT), which efficiently maps data into a discriminative embedding subspace and precisely predicts cluster assignments. DEPICT generally consists of a multinomial logistic regression function stacked on top of a multi-layer convolutional autoencoder. We define a clustering objective function using relative entropy (KL divergence) minimization, regularized by a prior for the frequency of cluster assignments. An alternating strategy is then derived to optimize the objective by updating parameters and estimating cluster assignments. Furthermore, we employ the reconstruction loss functions in our autoencoder, as a data-dependent regularization term, to prevent the deep embedding function from overfitting. In order to benefit from end-to-end optimization and eliminate the necessity for layer-wise pretraining, we introduce a joint learning framework to minimize the unified clustering and reconstruction loss functions together and train all network layers simultaneously. Experimental results indicate the superiority and faster running time of DEPICT in real-world clustering tasks, where no labeled data is available for hyper-parameter tuning.
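
    As a rough illustration of a relative-entropy clustering objective, the sketch below computes the average KL divergence between target and predicted soft cluster assignments, together with the empirical cluster frequencies that a frequency prior would act on. This is a generic sketch, not DEPICT's actual objective: the function names, the `eps` smoothing term, and the toy distributions are all assumptions made for illustration.

```python
import math


def kl_divergence(q, p, eps=1e-12):
    """KL(q || p) for two discrete distributions given as lists.

    `eps` is an assumed smoothing constant that avoids log(0).
    """
    return sum(
        qi * math.log((qi + eps) / (pi + eps))
        for qi, pi in zip(q, p)
        if qi > 0
    )


def clustering_kl_loss(targets, predictions):
    """Average KL divergence over all samples."""
    total = sum(kl_divergence(q, p) for q, p in zip(targets, predictions))
    return total / len(targets)


def cluster_frequencies(assignments):
    """Mean soft assignment per cluster, i.e. empirical cluster frequency."""
    n = len(assignments)
    k = len(assignments[0])
    return [sum(row[j] for row in assignments) / n for j in range(k)]


# Hypothetical soft assignments for two samples and two clusters.
targets = [[1.0, 0.0], [0.0, 1.0]]
predictions = [[0.5, 0.5], [0.5, 0.5]]
loss = clustering_kl_loss(targets, predictions)
freqs = cluster_frequencies(targets)
```

    A regularizer on `freqs` (for example, pulling it toward a uniform prior) is one way to discourage degenerate solutions in which all samples fall into a single cluster.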

    FI-GRL: Fast Inductive Graph Representation Learning via Projection-Cost Preservation

    Graph representation learning aims at transforming graph data into meaningful low-dimensional vectors to facilitate the employment of machine learning and data mining algorithms designed for general data. Most current graph representation learning approaches are transductive, which means that they require all the nodes in the graph to be known when learning graph representations, and these approaches cannot naturally generalize to unseen nodes. In this paper, we present a Fast Inductive Graph Representation Learning framework (FI-GRL) to learn nodes' low-dimensional representations. Our approach can obtain accurate representations for seen nodes with provable theoretical guarantees and can easily generalize to unseen nodes. Specifically, in order to explicitly decouple nodes' relations expressed by the graph, we transform nodes into a randomized subspace spanned by a random projection matrix. This stage is guaranteed to preserve the projection-cost of the normalized random walk matrix, which is highly related to the normalized cut of the graph. Then feature extraction is achieved by conducting singular value decomposition on the obtained matrix sketch. By leveraging the property of projection-cost preservation on the matrix sketch, the obtained representation result is nearly optimal. To deal with unseen nodes, we utilize a folding-in technique to learn their meaningful representations. Empirically, when the number of seen nodes is larger than that of unseen nodes, FI-GRL always achieves excellent results. Our algorithm is fast, simple to implement and theoretically guaranteed. Extensive experiments on real datasets demonstrate the superiority of our algorithm in both efficacy and efficiency on both macroscopic level (clustering) and microscopic level (structural hole detection) applications. Comment: ICDM 2018, Full Versio

    A flexible, extensible software framework for model compression based on the LC algorithm

    We propose a software framework based on the ideas of the Learning-Compression (LC) algorithm, that allows a user to compress a neural network or other machine learning model using different compression schemes with minimal effort. Currently, the supported compressions include pruning, quantization, low-rank methods (including automatically learning the layer ranks), and combinations of those, and the user can choose different compression types for different parts of a neural network. The LC algorithm alternates two types of steps until convergence: a learning (L) step, which trains a model on a dataset (using an algorithm such as SGD); and a compression (C) step, which compresses the model parameters (using a compression scheme such as low-rank or quantization). This decoupling of the "machine learning" aspect from the "signal compression" aspect means that changing the model or the compression type amounts to calling the corresponding subroutine in the L or C step, respectively. The library fully supports this by design, which makes it flexible and extensible. This does not come at the expense of performance: the runtime needed to compress a model is comparable to that of training the model in the first place; and the compressed model is competitive in terms of prediction accuracy and compression ratio with other algorithms (which are often specialized for specific models or compression schemes). The library is written in Python and PyTorch and is available on GitHub. Comment: 15 pages, 4 figures, 2 table
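
    The alternation between L and C steps described in this abstract can be sketched on a toy problem. Everything below is an illustrative assumption rather than the library's API: the function names, the quadratic toy loss, the quadratic penalty that couples the weights to their compressed version, the `mus` schedule, and the choice of magnitude pruning as the compression scheme.

```python
def l_step(w, theta, targets, mu, lr=0.1, iters=200):
    """Learning step: gradient descent on a toy loss sum((w - t)^2)
    plus an assumed penalty (mu / 2) * sum((w - theta)^2)."""
    w = list(w)
    for _ in range(iters):
        for i in range(len(w)):
            grad = 2.0 * (w[i] - targets[i]) + mu * (w[i] - theta[i])
            w[i] -= lr * grad
    return w


def c_step(w, keep):
    """Compression step: magnitude pruning that keeps the `keep`
    largest-magnitude weights and zeroes the rest."""
    order = sorted(range(len(w)), key=lambda i: abs(w[i]), reverse=True)
    kept = set(order[:keep])
    return [w[i] if i in kept else 0.0 for i in range(len(w))]


def lc_compress(targets, keep, mus=(0.01, 0.1, 1.0, 10.0)):
    """Alternate L and C steps while the penalty weight mu grows."""
    w = list(targets)          # stand-in for a pretrained model
    theta = c_step(w, keep)    # initial compressed weights
    for mu in mus:
        w = l_step(w, theta, targets, mu)
        theta = c_step(w, keep)
    return theta


# Hypothetical "pretrained" weights; keep the two largest.
compressed = lc_compress([3.0, -0.2, 1.5, 0.1], keep=2)
print(compressed)
```

    Swapping `c_step` for a quantization or low-rank routine leaves `l_step` and the outer loop untouched, which is the decoupling the abstract emphasizes.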

    Image Representation Learning Using Graph Regularized Auto-Encoders

    We consider the problem of image representation for the tasks of unsupervised learning and semi-supervised learning. In those learning tasks, the raw image vectors may not provide enough representation for their intrinsic structures due to their highly dense feature space. To overcome this problem, the raw image vectors should be mapped to a proper representation space which can capture the latent structure of the original data and represent the data explicitly for further learning tasks such as clustering. Inspired by recent research on deep neural networks and representation learning, in this paper we introduce the multiple-layer auto-encoder into image representation. We also apply the locally invariant idea to our image representation with auto-encoders and propose a novel method, called Graph regularized Auto-Encoder (GAE). GAE can provide a compact representation which uncovers the hidden semantics and simultaneously respects the intrinsic geometric structure. Extensive experiments on image clustering show encouraging results of the proposed algorithm in comparison to the state-of-the-art algorithms on real-world cases. Comment: 9page

    Deep Multimodal Subspace Clustering Networks

    We present convolutional neural network (CNN) based approaches for unsupervised multimodal subspace clustering. The proposed framework consists of three main stages: a multimodal encoder, a self-expressive layer, and a multimodal decoder. The encoder takes multimodal data as input and fuses them to a latent space representation. The self-expressive layer is responsible for enforcing the self-expressiveness property and acquiring an affinity matrix corresponding to the data points. The decoder reconstructs the original input data. The network uses the distance between the decoder's reconstruction and the original input in its training. We investigate early, late and intermediate fusion techniques and propose three different encoders corresponding to them for spatial fusion. The self-expressive layers and multimodal decoders are essentially the same for different spatial fusion-based approaches. In addition to various spatial fusion-based methods, an affinity fusion-based network is also proposed in which the self-expressive layer corresponding to different modalities is enforced to be the same. Extensive experiments on three datasets show that the proposed methods significantly outperform the state-of-the-art multimodal subspace clustering methods.

    Survey of state-of-the-art mixed data clustering algorithms

    Mixed data comprises both numeric and categorical features, and mixed datasets occur frequently in many domains, such as health, finance, and marketing. Clustering is often applied to mixed datasets to find structures and to group similar objects for further analysis. However, clustering mixed data is challenging because it is difficult to directly apply mathematical operations, such as summation or averaging, to the feature values of these datasets. In this paper, we present a taxonomy for the study of mixed data clustering algorithms by identifying five major research themes. We then present a state-of-the-art review of the research works within each research theme. We analyze the strengths and weaknesses of these methods with pointers for future research directions. Lastly, we present an in-depth analysis of the overall challenges in this field, highlight open research questions and discuss guidelines to make progress in the field. Comment: 20 Pages, 2 columns, 6 Tables, 209 Reference

    Deep Transductive Semi-supervised Maximum Margin Clustering

    Semi-supervised clustering is a very important topic in machine learning and computer vision. The key challenge of this problem is how to learn a metric such that the instances sharing the same label are more likely to be close to each other in the embedded space. However, little attention has been paid to learning better representations when the data lie on a non-linear manifold. Fortunately, deep learning has led to great success in feature learning recently. Inspired by the advances of deep learning, we propose a deep transductive semi-supervised maximum margin clustering approach. More specifically, given pairwise constraints, we exploit both labeled and unlabeled data to learn a non-linear mapping under a maximum margin framework for clustering analysis. Thus, our model unifies transductive learning, feature learning and maximum margin techniques in the semi-supervised clustering framework. We pretrain the deep network structure with restricted Boltzmann machines (RBMs) layer by layer greedily, and optimize our objective function with gradient descent. By checking the most violated constraints, our approach updates the model parameters through error backpropagation, in which deep features are learned automatically. The experimental results show that our model is significantly better than the state of the art on semi-supervised clustering. Comment: 1
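
    The idea of checking the most violated pairwise constraint can be sketched with a generic hinge-style margin on embedding distances. This is not the paper's objective: the function names, the `near` and `far` thresholds, and the toy embeddings are assumptions made for illustration only.

```python
import math


def distance(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def violation(emb, i, j, must_link, near=0.5, far=1.5):
    """Hinge-style violation of one pairwise constraint.

    Must-link pairs should be closer than `near`; cannot-link pairs
    should be farther apart than `far`. Both thresholds are assumed.
    """
    d = distance(emb[i], emb[j])
    return max(0.0, d - near) if must_link else max(0.0, far - d)


def most_violated(emb, constraints):
    """Return the constraint with the largest violation and its value."""
    scored = [(violation(emb, i, j, ml), (i, j, ml)) for i, j, ml in constraints]
    worst_value, worst = max(scored, key=lambda s: s[0])
    return worst, worst_value


# Hypothetical 2-D embeddings and (i, j, must_link) constraints.
emb = [[0.0, 0.0], [0.1, 0.0], [3.0, 0.0]]
constraints = [(0, 1, True), (0, 2, True), (1, 2, False)]
worst, worst_value = most_violated(emb, constraints)
```

    In a training loop, the gradient of such a violation with respect to the embeddings would be backpropagated through the network that produces them.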