DRBM-ClustNet: A Deep Restricted Boltzmann-Kohonen Architecture for Data Clustering
This paper proposes DRBM-ClustNet, a Bayesian Deep Restricted Boltzmann-Kohonen
architecture for data clustering. The core clustering engine consists of a
Deep Restricted Boltzmann Machine (DRBM) that processes unlabeled data by
creating new features that are uncorrelated with each other and have large
variance. Next, the number of clusters is predicted using the Bayesian
Information Criterion (BIC), followed by a Kohonen Network-based clustering
layer. The processing of unlabeled data is done in three stages for efficient
layer. The processing of unlabeled data is done in three stages for efficient
clustering of the non-linearly separable datasets. In the first stage, DRBM
performs non-linear feature extraction, capturing the highly complex data
representation by projecting the feature vectors from their original
dimensionality into a new one. Most clustering algorithms require the number
of clusters to be decided a priori; to automate this choice, the second stage
uses BIC. In the third stage, the number of clusters derived from BIC
forms the input for the Kohonen network, which performs clustering of the
feature-extracted data obtained from the DRBM. This method overcomes common
disadvantages of clustering algorithms, such as the need to specify the number
of clusters in advance, convergence to local optima, and poor clustering
accuracy on non-linear datasets. In this research we use two synthetic
datasets, fifteen benchmark datasets from the UCI Machine Learning repository,
and four image datasets to analyze the DRBM-ClustNet. The proposed framework is
evaluated based on clustering accuracy and ranked against other
state-of-the-art clustering methods. The obtained results demonstrate that the
DRBM-ClustNet outperforms state-of-the-art clustering algorithms.
Comment: 14 pages, 7 figures
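The second and third stages described in this abstract (BIC-based selection of the number of clusters, then a Kohonen-style clustering layer) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the function names are hypothetical, the DRBM feature-extraction stage is omitted, the Kohonen layer is reduced to winner-only competitive learning with deterministic farthest-point seeding, and the BIC uses a spherical-Gaussian classification likelihood, since the abstract does not give the exact formulation or say which model BIC is evaluated on.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(data, k):
    """Seed k units: start at data[0], then repeatedly take the point
    farthest from all units chosen so far (an assumed seeding scheme)."""
    units = [list(data[0])]
    while len(units) < k:
        nxt = max(data, key=lambda p: min(sq_dist(p, u) for u in units))
        units.append(list(nxt))
    return units


def kohonen_cluster(data, k, epochs=30, lr0=0.5, seed=0):
    """Winner-only competitive learning, a simplified Kohonen layer
    (no neighborhood function). Returns one unit index per sample."""
    rng = random.Random(seed)
    units = farthest_point_init(data, k)
    order = list(range(len(data)))
    for epoch in range(epochs):
        lr = lr0 * (1.0 - epoch / epochs)  # linearly decaying learning rate
        rng.shuffle(order)
        for i in order:
            x = data[i]
            w = min(range(k), key=lambda j: sq_dist(x, units[j]))
            # Move only the winning unit toward the sample.
            units[w] = [u + lr * (xi - u) for u, xi in zip(units[w], x)]
    return [min(range(k), key=lambda j: sq_dist(x, units[j])) for x in data]


def bic_score(data, labels, k):
    """BIC of a hard partition under a spherical Gaussian per cluster
    with one shared variance (an assumed model). Lower is better."""
    n, d = len(data), len(data[0])
    rss, mix = 0.0, 0.0
    for c in range(k):
        members = [x for x, lab in zip(data, labels) if lab == c]
        if not members:
            continue  # a unit that won no samples contributes nothing
        mean = [sum(col) / len(members) for col in zip(*members)]
        rss += sum(sq_dist(x, mean) for x in members)
        mix += len(members) * math.log(len(members) / n)
    var = max(rss / (d * (n - k)), 1e-12)  # shared per-dimension variance
    log_lik = mix - 0.5 * n * d * math.log(2 * math.pi * var) - rss / (2 * var)
    n_params = k * d + (k - 1) + 1  # centers, mixing weights, variance
    return -2.0 * log_lik + n_params * math.log(n)


def select_k_and_cluster(data, max_k=6):
    """Try k = 1..max_k and keep the partition with the lowest BIC."""
    best = None
    for k in range(1, max_k + 1):
        labels = kohonen_cluster(data, k)
        score = bic_score(data, labels, k)
        if best is None or score < best[0]:
            best = (score, k, labels)
    return best[1], best[2]


# Toy data: three well-separated 3x3 grids of points (9 points each).
offsets = (-0.3, 0.0, 0.3)
centers = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0)]
data = [(cx + dx, cy + dy) for cx, cy in centers
        for dx in offsets for dy in offsets]

best_k, labels = select_k_and_cluster(data)
print(best_k, labels)
```

In this sketch the candidate partitions scored by BIC come from the same Kohonen-style layer that performs the final clustering; that coupling is a convenience of the example rather than something the abstract states.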
BELMKN: Bayesian extreme learning machines Kohonen Network
This paper proposes the Bayesian Extreme Learning Machine Kohonen Network (BELMKN) framework to solve the clustering problem. The BELMKN framework processes nonlinearly separable datasets in three levels to obtain accurate clustering. In the first level, the Extreme Learning Machine (ELM)-based feature learning approach captures the nonlinearity in the data distribution by mapping it onto a d-dimensional space. In the second level, the ELM-based feature-extracted data is used as input to the Bayesian Information Criterion (BIC) to predict the number of clusters, termed the cluster prediction. In the final level, the feature-extracted data, along with the cluster prediction, is passed to the Kohonen Network to obtain improved clustering accuracy. The main advantage of the proposed method is that it does not require a priori identifiers or class labels for the data, which are difficult to obtain for most real-world datasets. The BELMKN framework is applied to 3 synthetic datasets and 10 benchmark datasets from the UCI Machine Learning repository and compared with state-of-the-art clustering methods. The experimental results show that BELMKN-based clustering outperforms other clustering algorithms on the majority of the datasets. Hence, the BELMKN framework can be used to improve the clustering accuracy of nonlinearly separable datasets.
Published version
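The first level, mapping the data onto a d-dimensional space, can be sketched as a random hidden layer of the kind used in Extreme Learning Machines: fixed random weights and biases followed by a sigmoid. This is only an illustration under stated assumptions. The function name `elm_feature_map`, the uniform weight range, and the sigmoid activation are choices made here, and the abstract does not say whether the paper's ELM-based feature learning adds a trained output layer on top of this projection.

```python
import math
import random


def elm_feature_map(data, d, seed=0):
    """Project each sample onto d features through a fixed random hidden
    layer with a sigmoid activation (hypothetical helper for illustration)."""
    rng = random.Random(seed)
    n_in = len(data[0])
    # Random input weights and biases are drawn once and never trained.
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(d)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(d)]

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    return [
        [sigmoid(sum(w * x for w, x in zip(row, sample)) + b)
         for row, b in zip(weights, biases)]
        for sample in data
    ]


# Four 2-dimensional samples mapped onto a 5-dimensional feature space.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
features = elm_feature_map(samples, d=5)
print(len(features), len(features[0]))
```

In the described framework, the resulting feature vectors would then feed the BIC-based cluster prediction and the Kohonen Network.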