11 research outputs found

    Deep Divergence-Based Approach to Clustering

    A promising direction in deep learning research consists in learning representations and simultaneously discovering cluster structure in unlabeled data by optimizing a discriminative loss function. As opposed to supervised deep learning, this line of research is in its infancy, and how to design and optimize suitable loss functions to train deep neural networks for clustering is still an open question. Our contribution to this emerging field is a new deep clustering network that leverages the discriminative power of information-theoretic divergence measures, which have been shown to be effective in traditional clustering. We propose a novel loss function that incorporates geometric regularization constraints, thus avoiding degenerate structures of the resulting clustering partition. Experiments on synthetic benchmarks and real datasets show that the proposed network achieves competitive performance with respect to other state-of-the-art methods, scales well to large datasets, and does not require pre-training steps.

    Reducing Objective Function Mismatch in Deep Clustering with the Unsupervised Companion Objective

    Preservation of local similarity structure is a key challenge in deep clustering. Many recent deep clustering methods therefore use autoencoders to help guide the model's neural network towards an embedding which is more reflective of the input space geometry. However, recent work has shown that autoencoder-based deep clustering models can suffer from objective function mismatch (OFM). In order to improve the preservation of local similarity structure while keeping the OFM low, we develop a new auxiliary objective function for deep clustering. Our Unsupervised Companion Objective (UCO) encourages a consistent clustering structure at intermediate layers in the network, helping the network learn an embedding which is more reflective of the similarity structure in the input space. Since a clustering-based auxiliary objective has the same goal as the main clustering objective, it is less prone to introduce objective function mismatch between itself and the main objective. Our experiments show that attaching the UCO to a deep clustering model improves the performance of the model and yields a lower OFM than an analogous autoencoder-based model.

    A Clustering-guided Contrastive Fusion for Multi-view Representation Learning

    The past two decades have seen increasingly rapid advances in multi-view representation learning, which extracts useful information from diverse domains to facilitate the development of multi-view applications. However, the community faces two challenges: i) how to learn robust representations from a large amount of unlabeled data that withstand noise or incomplete-view settings, and ii) how to balance view consistency and complementarity for various downstream tasks. To this end, we utilize a deep fusion network to fuse view-specific representations into a view-common representation, extracting high-level semantics to obtain a robust representation. In addition, we employ a clustering task to guide the fusion network and prevent it from reaching trivial solutions. To balance consistency and complementarity, we design an asymmetrical contrastive strategy that aligns the view-common representation and each view-specific representation. These modules are incorporated into a unified method known as CLustering-guided cOntrastiVE fusioN (CLOVEN). We quantitatively and qualitatively evaluate the proposed method on five datasets, demonstrating that CLOVEN outperforms 11 competitive multi-view learning methods in clustering and classification. In the incomplete-view scenario, our proposed method resists noise interference better than its competitors. Furthermore, the visualization analysis shows that CLOVEN can preserve the intrinsic structure of the view-specific representations while also improving the compactness of the view-common representation. Our source code will be available soon at https://github.com/guanzhou-ke/cloven. Comment: 13 pages, 9 figures

    Joint Optimization of an Autoencoder for Clustering and Embedding

    Incorporating k-means-like clustering techniques into (deep) autoencoders is an interesting idea, as the clustering may exploit the learned similarities in the embedding to compute a non-linear grouping of the data at hand. Unfortunately, the resulting contributions are often limited by ad-hoc choices, decoupled optimization problems, and other issues. We present a theoretically-driven deep clustering approach that does not suffer from these limitations and allows for joint optimization of clustering and embedding. The network in its simplest form is derived from a Gaussian mixture model and can be incorporated seamlessly into deep autoencoders for state-of-the-art performance.

    Exploring Cybertechnology Standards Through Bibliometrics: Case of National Institute of Standards and Technology

    Cybersecurity is a topic of growing importance. As a hot research topic that will clearly have an impact in many areas, including social, economic, environmental, and political ones, it is necessary to determine the basic components, basic dynamics, and main actors of the cybersecurity issue. The literature shows that it has become a trend-forming research subject followed by institutions and organizations that produce R&D policy, starting at the level of governments. In this study, cybersecurity research is examined in the context of the 5 basic cybersecurity functions specified in the cybersecurity standard (CSF) defined by the National Institute of Standards and Technology (NIST). The aim is to determine the research topics emerging in the international literature, to identify the most productive countries, to determine how these countries rank for each function, and to determine the research clusters and research focuses. Several quantitative methods were used, especially scientometrics, social network analysis (SNA) line theory and structural hole analysis. Statistical tests (Log-Likelihood Ratio) were used to reveal the prominent areas, and text mining was also used. We first defined a workflow according to the “Identify”, “Protect”, “Detect”, “Respond” and “Recover” setups, and conducted an online search on the Web of Science (WoS) to access information on publications on the relevant topics. Actors, institutions, and research show different densities across geographical regions in the 5 functions defined within the cybersecurity framework. Although smart grids are among the most prominent research topics, infiltration detection, the internet of things, and artificial intelligence are also among the prominent research focuses. In the first clustering analysis we performed, 17 clusters are formed, particularly under the definition function. The largest of these clusters, labeled decision-making models, has 32 data points.

    The Conditional Cauchy-Schwarz Divergence with Applications to Time-Series Data and Sequential Decision Making

    The Cauchy-Schwarz (CS) divergence was developed by Príncipe et al. in 2000. In this paper, we extend the classic CS divergence to quantify the closeness between two conditional distributions and show that the developed conditional CS divergence can be simply estimated by a kernel density estimator from given samples. We illustrate the advantages (e.g., a rigorous faithfulness guarantee, lower computational complexity, higher statistical power, and greater flexibility in a wide range of applications) of our conditional CS divergence over previous proposals, such as the conditional KL divergence and the conditional maximum mean discrepancy. We also demonstrate the compelling performance of the conditional CS divergence in two machine learning tasks related to time series data and sequential inference, namely time series clustering and uncertainty-guided exploration for sequential decision making. Comment: 23 pages, 7 figures
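
For orientation, here is a minimal sketch of how the classic (unconditional) CS divergence can be estimated from two sample sets with a Gaussian kernel. It is not the conditional estimator proposed in the paper. The function names and the kernel width `sigma` are assumptions made for illustration. The estimate is log(mean K(x, x')) + log(mean K(y, y')) - 2 log(mean K(x, y)), in which the kernel normalization constants cancel.

```python
import math


def gaussian_kernel(a, b, sigma):
    """Unnormalized Gaussian kernel between two points (sequences of floats)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_kernel(xs, ys, sigma):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def cs_divergence(xs, ys, sigma=1.0):
    """Sample estimate of the classic CS divergence (hypothetical helper).

    The value is non-negative and is zero when both sample sets coincide.
    Note: very distant samples can underflow the cross term to 0.0,
    which would make math.log raise a ValueError.
    """
    cross = mean_kernel(xs, ys, sigma)
    within_x = mean_kernel(xs, xs, sigma)
    within_y = mean_kernel(ys, ys, sigma)
    return math.log(within_x) + math.log(within_y) - 2.0 * math.log(cross)


if __name__ == "__main__":
    a = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
    b = [(2.0, 2.0), (2.1, 2.0), (2.0, 2.1)]
    print(cs_divergence(a, a))  # identical samples
    print(cs_divergence(a, b))  # separated samples give a larger value
```

The choice of `sigma` strongly affects the estimate. In practice it is usually set by a bandwidth heuristic, which this sketch leaves out.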

    Wireless Propagation Multipaths using Spectral Clustering and Three-Constraint Affinity Matrix Spectral Clustering

    This study focused on spectral clustering (SC) and three-constraint affinity matrix spectral clustering (3CAM-SC) to simultaneously determine the number of clusters and the cluster memberships of the COST 2100 channel model (C2CM) multipath dataset. Various multipath clustering approaches determine only the number of clusters without considering cluster membership. The problem with giving only the number of clusters is that there is no assurance that the membership of the multipath clusters is accurate even when the number of clusters is correct. SC and 3CAM-SC aimed to solve this problem by determining the membership of the clusters. The cluster and the cluster count were then computed through the cluster-wise Jaccard index of the membership of the multipaths to their clusters. The multipaths generated by C2CM were transformed using the directional cosine transform (DCT) and the whitening transform (WT). The transformed dataset was clustered using SC and 3CAM-SC. The clustering performance was validated using the Jaccard index by comparing the reference multipath dataset with the calculated multipath clusters. The results show that the effectiveness of SC is similar to that of state-of-the-art clustering approaches. However, 3CAM-SC outperforms SC in all channel scenarios. SC can be used in indoor scenarios based on accuracy, while 3CAM-SC is applicable in indoor and semi-urban scenarios. Thus, the clustering approaches can be applied as alternative clustering techniques in the field of channel modeling.
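
As a rough illustration of the validation idea, the sketch below computes the Jaccard index between a reference cluster and an estimated cluster, each given as a set of multipath identifiers, and then takes the best match for every reference cluster. The helper names, the best-match rule, and the convention for two empty sets are assumptions made for this example. The study's exact cluster-wise procedure is not specified in the abstract.

```python
def jaccard_index(a, b):
    """Jaccard index |A & B| / |A | B| between two collections of member IDs."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty clusters count as identical.
        return 1.0
    return len(a & b) / len(union)


def clusterwise_jaccard(reference, estimated):
    """For each reference cluster, the best Jaccard index over all estimated clusters."""
    return [
        max((jaccard_index(ref, est) for est in estimated), default=0.0)
        for ref in reference
    ]


if __name__ == "__main__":
    reference = [{0, 1, 2}, {3, 4}]   # hypothetical ground-truth memberships
    estimated = [{0, 1}, {2, 3, 4}]   # hypothetical clustering output
    print(clusterwise_jaccard(reference, estimated))
```

A score of 1.0 for every reference cluster means the estimated memberships match the reference exactly, which also implies the correct cluster count.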

    Improving Representation Learning for Deep Clustering and Few-shot Learning

    The amounts of data in the world have increased dramatically in recent years, and it is quickly becoming infeasible for humans to label all these data. It is therefore crucial that modern machine learning systems can operate with few or no labels. The introduction of deep learning and deep neural networks has led to impressive advancements in several areas of machine learning. These advancements are largely due to the unprecedented ability of deep neural networks to learn powerful representations from a wide range of complex input signals. This ability is especially important when labeled data is limited, as the absence of a strong supervisory signal forces models to rely more on intrinsic properties of the data and its representations. This thesis focuses on two key concepts in deep learning with few or no labels. First, we aim to improve representation quality in deep clustering - both for single-view and multi-view data. Current models for deep clustering face challenges related to properly representing semantic similarities, which is crucial for the models to discover meaningful clusterings. This is especially challenging with multi-view data, since the information required for successful clustering might be scattered across many views. Second, we focus on few-shot learning, and how geometrical properties of representations influence few-shot classification performance. We find that a large number of recent methods for few-shot learning embed representations on the hypersphere. Hence, we seek to understand what makes the hypersphere a particularly suitable embedding space for few-shot learning. Our work on single-view deep clustering addresses the susceptibility of deep clustering models to find trivial solutions with non-meaningful representations. 
To address this issue, we present a new auxiliary objective that - when compared to the popular autoencoder-based approach - better aligns with the main clustering objective, resulting in improved clustering performance. Similarly, our work on multi-view clustering focuses on how representations can be learned from multi-view data, in order to make the representations suitable for the clustering objective. Where recent methods for deep multi-view clustering have focused on aligning view-specific representations, we find that this alignment procedure might actually be detrimental to representation quality. We investigate the effects of representation alignment, and provide novel insights on when alignment is beneficial, and when it is not. Based on our findings, we present several new methods for deep multi-view clustering - both alignment and non-alignment-based - that outperform current state-of-the-art methods. Our first work on few-shot learning aims to tackle the hubness problem, which has been shown to have negative effects on few-shot classification performance. To this end, we present two new methods to embed representations on the hypersphere for few-shot learning. Further, we provide both theoretical and experimental evidence indicating that embedding representations as uniformly as possible on the hypersphere reduces hubness, and improves classification accuracy. Furthermore, based on our findings on hyperspherical embeddings for few-shot learning, we seek to improve the understanding of representation norms. In particular, we ask what type of information the norm carries, and why it is often beneficial to discard the norm in classification models. We answer this question by presenting a novel hypothesis on the relationship between representation norm and the number of a certain class of objects in the image. We then analyze our hypothesis both theoretically and experimentally, presenting promising results that corroborate the hypothesis.
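
To make "discarding the norm" concrete, the sketch below projects a representation onto the unit hypersphere by plain L2 normalization and compares two vectors by cosine similarity. This is only the simplest baseline, not either of the two embedding methods presented in the thesis. The function names and the `eps` threshold are assumptions made for illustration.

```python
import math


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit length, discarding its norm (hypothetical helper)."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm < eps:
        raise ValueError("cannot normalize a (near-)zero vector")
    return [v / norm for v in vec]


def cosine_similarity(a, b):
    """Dot product of the two vectors after projection onto the unit hypersphere."""
    ua, ub = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(ua, ub))


if __name__ == "__main__":
    # Two vectors with the same direction but very different norms
    # become indistinguishable once the norm is discarded.
    print(cosine_similarity([1.0, 0.0], [10.0, 0.0]))
    print(l2_normalize([3.0, 4.0]))
```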